I have recently been working on the Meltdown and Spectre bugs and assessing what the impact will be on our servers. As part of this testing we used a tool called SLOB to stress memory accesses (cached reads) and also I/O, both random and sequential.
The database in use was running on a Veritas file system, and when we ran the misc/awr_info.sh script to parse the results of multiple tests we were met with incorrect data.
    # ../misc/awr_info.sh *.patched_CachedReadOnly*
    FILE|SESSIONS|ELAPSED|DB CPU|DB Tm|EXECUTES|LIO|PREADS|READ_MBS|PWRITES|WRITE_MBS|REDO_MBS|DFSR_LAT|DPR_LAT|DFPR_LAT|DFPW_LAT|LFPW_LAT|TOP WAIT|
    awr.txt|12|0|0.171383|.1713828042|0|0|0| 0.0|0| 0|0| 0|0|0|0| 0|DB CPU 721 100.0|
However, when running on a local file system we got the correct data for LIO and the other fields.

    awr.txt|12|61|11.8|11.8|99401|6472752|0| 0.1|0| 0|0| 0|0|0|0| 0|DB CPU 721 100.0|
Breaking it down to a simple test case.
I started by reading the awr_info script, looking for any errors and checking what the script does, so that I could break the problem down to a simple test case.
The script copies the original awr file to a temporary dotfile, parses it, and then removes it. With this in mind I created the following test file, made up of junk data plus one specific line to read.
    A=0
    B=3500
    while [ $A -ne $B ]
    do
        echo "TESTING TO SEE IF WE CANT GREP THIS FILE" >> /var/tmp/bug_data
        ((A=$A+1))
    done
    echo "Logical read (blocks): 2391980.5 1024.3" >> /var/tmp/bug_data
Once we have the file created we just need to do the copy and then grep the appropriate data.
    cd /vxfs_fs/
    tmpfile=${RANDOM}0001 ; cp /var/tmp/bug_data $tmpfile ; grep 'Logical' $tmpfile

We then repeat until it fails. The output below shows a few runs.

    root# tmpfile=${RANDOM}0001 ; cp /var/tmp/bug_data $tmpfile ; grep 'Logical' $tmpfile
    Logical read (blocks): 2391980.5 1024.3
    root# tmpfile=${RANDOM}0001 ; cp /var/tmp/bug_data $tmpfile ; grep 'Logical' $tmpfile
    Binary file 234500001 matches
So there we have the bug: the grep command fails because it believes the file to be binary.
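The repeat-until-it-fails step can be automated. The following is a minimal sketch and not part of SLOB or awr_info.sh; the function name repro_binary_grep and its three arguments are hypothetical. It copies the test file into the target directory under a fresh name, greps it, and stops at the first attempt where grep reports a binary file. It captures both stdout and stderr because, as far as I know, newer GNU grep releases print that message on stderr with slightly different wording.

```shell
#!/bin/sh
# Hypothetical helper: repeat the copy-then-grep test until grep reports
# the copy as binary, or until MAX_TRIES attempts have been made.
# Usage: repro_binary_grep TARGET_DIR SOURCE_FILE MAX_TRIES
# Prints the attempt number of the first failure, or 0 if none was seen.
repro_binary_grep() {
    target_dir=$1
    source_file=$2
    max_tries=$3
    attempt=1
    while [ "$attempt" -le "$max_tries" ]; do
        # Use a fresh file name on every attempt, as in the manual test.
        tmpfile="$target_dir/repro.$$.$attempt"
        cp "$source_file" "$tmpfile"
        # Capture stdout and stderr; ignore grep's exit status here.
        result=$(grep 'Logical' "$tmpfile" 2>&1) || true
        rm -f "$tmpfile"
        case $result in
            *"inary file"*)
                # Matches both "Binary file X matches" and "binary file matches".
                echo "$attempt"
                return 0
                ;;
        esac
        attempt=$((attempt + 1))
    done
    echo 0
}
```

For example, repro_binary_grep /vxfs_fs /var/tmp/bug_data 100 would print the number of the first failing attempt on an affected file system, and 0 if all 100 copies were read as text.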
The issue here is that the Veritas file system has dalloc (delayed allocation) enabled by default, which allocates extents lazily (VxFS is an extent-based file system). The grep command uses lseek() with SEEK_HOLE to find where the file's data ends, and it treats a file that reports a hole before its end as binary. However, the Veritas vx_seek_hole_data() function has a bug and does not return this offset correctly.
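One way to tell this failure apart from a genuinely binary file is to check whether the file really contains NUL bytes. The sketch below is only a diagnostic aid under that assumption; the function name has_nul_bytes is hypothetical. It compares the byte count of the file with the byte count after stripping NULs. Note that grep can also classify a file as binary because of bytes that are invalid in the current locale's encoding, which this check does not cover.

```shell
#!/bin/sh
# Hypothetical helper: exit 0 if FILE contains at least one NUL byte,
# exit 1 otherwise.
# Usage: has_nul_bytes FILE
has_nul_bytes() {
    # Total size in bytes as seen through read().
    total=$(wc -c < "$1")
    # Size in bytes once every NUL byte has been deleted.
    without_nul=$(tr -d '\000' < "$1" | wc -c)
    # Arithmetic expansion normalises any padding in the wc output.
    [ "$((total))" -ne "$((without_nul))" ]
}
```

If grep reports a copy as binary while has_nul_bytes returns 1 for the same file, the content itself is plain text and the classification is coming from somewhere other than the data, which is consistent with the behaviour described above.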
To work around this issue until a patch is released, you can switch off dalloc on the file system and also update the tunefstab.
I will update once the patch is released.
*Update 16/05/2018* Change ported to InfoScale 7.3.1 via e3938258
1. Switch off dalloc on the mounted file system:

    vxtunefs -o dalloc_enable=0 /MOUNT_POINT

2. Create the /etc/vx/tunefstab file, and add the entry:

    /dev/vx/dsk/$DISKGROUP/$VOLUME dalloc_enable=0
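If the tunefstab entry has to be rolled out to several volumes, it helps to add it idempotently. The following is a minimal sketch; the function name add_dalloc_entry is hypothetical and the tunefstab path is passed in as an argument so the helper can be tried against a scratch file first. It only guards against appending an identical line twice. It does not merge with an existing line for the same device that sets other tunables, so check the file by hand in that case.

```shell
#!/bin/sh
# Hypothetical helper: append "DEVICE dalloc_enable=0" to a tunefstab file
# unless exactly that line is already present.
# Usage: add_dalloc_entry TUNEFSTAB DISKGROUP VOLUME
add_dalloc_entry() {
    tunefstab=$1
    device="/dev/vx/dsk/$2/$3"
    entry="$device dalloc_enable=0"
    # -F: fixed string, -x: whole line, -q: quiet. Only an exact match counts.
    if [ -f "$tunefstab" ] && grep -Fxq "$entry" "$tunefstab"; then
        return 0
    fi
    echo "$entry" >> "$tunefstab"
}
```

For example, add_dalloc_entry /etc/vx/tunefstab datadg datavol would add the line for a disk group called datadg and a volume called datavol (both names are placeholders), and running it a second time would leave the file unchanged.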