Finding a bug in VXFS 7.3 with grep

I have recently been working on the Meltdown and Spectre bugs, assessing what the impact will be on our servers. As part of this testing we used a tool called SLOB to stress memory accesses (cached reads) as well as random and sequential I/O.

The database in use was running on a Veritas file system, and when we ran the misc/awr_info.sh script to parse the results of multiple tests we were met with incorrect data.

# ../misc/awr_info.sh *.patched_CachedReadOnly*
FILE|SESSIONS|ELAPSED|DB CPU|DB Tm|EXECUTES|LIO|PREADS|READ_MBS|PWRITES|WRITE_MBS|REDO_MBS|DFSR_LAT|DPR_LAT|DFPR_LAT|DFPW_LAT|LFPW_LAT|TOP WAIT|
awr.txt|12|0|0.171383|.1713828042|0|0|0| 0.0|0| 0|0| 0|0|0|0| 0|DB CPU 721 100.0|

However, when running on a local file system we got the correct data for LIO and the other fields.

awr.txt|12|61|11.8|11.8|99401|6472752|0| 0.1|0| 0|0| 0|0|0|0| 0|DB CPU 721 100.0|

Breaking it down to a simple test case.

I started by reading the awr_info script, looking for any errors and checking what it does, so I could break this down to a simple test case.

The script copies the original awr file to a temporary dotfile, parses it and then removes it. With this in mind I created the following test file containing junk data followed by one specific line to read.

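# Build the test file: 3500 filler lines followed by the one line grep should find.
A=0
B=3500
while [ "$A" -ne "$B" ]
do
    echo "TESTING TO SEE IF WE CANT GREP THIS FILE" >> /var/tmp/bug_data
    A=$((A+1))
done
# The line the test greps for, in the same format as the AWR report.
echo "Logical read (blocks):       2391980.5           1024.3" >> /var/tmp/bug_data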

Once we have the file created we just need to do the copy and then grep the appropriate data.

cd  /vxfs_fs/
tmpfile=${RANDOM}0001 ; cp /var/tmp/bug_data $tmpfile ; grep 'Logical' $tmpfile

We then repeat this until it fails. A few runs are shown below.

root# tmpfile=${RANDOM}0001 ; cp /var/tmp/bug_data $tmpfile ; grep 'Logical' $tmpfile
Logical read (blocks):       2391980.5           1024.3
root# tmpfile=${RANDOM}0001 ; cp /var/tmp/bug_data $tmpfile ; grep 'Logical' $tmpfile
Binary file 234500001 matches

So there we have the bug: grep intermittently fails because it believes the freshly copied file is binary.
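Repeating the copy and grep by hand gets tedious, so the loop can be scripted. The following is a minimal sketch, not the method used for the runs above: the function name repro_until_binary and the temporary file naming are my own inventions. It treats any grep output that does not start with the expected text as a failure, since the wording of the binary-file message differs between grep versions.

```shell
#!/bin/sh
# Hypothetical helper: copy SRC into DIR up to MAX times and grep each copy.
# Prints the attempt on which grep stopped returning the line as text,
# or reports that every attempt succeeded.
repro_until_binary() {
    src=$1
    dir=$2
    max=${3:-100}
    i=1
    while [ "$i" -le "$max" ]
    do
        tmpfile="$dir/repro.$$.$i"
        cp "$src" "$tmpfile" || return 2
        # grep straight after the copy, as in the manual test
        out=$(grep 'Logical' "$tmpfile") || true
        rm -f "$tmpfile"
        case $out in
            "Logical read"*) ;;                       # line returned as text
            *) echo "failed on attempt $i"; return 1 ;;
        esac
        i=$((i+1))
    done
    echo "no failure in $max attempts"
}
```

For the test case here it would be called as repro_until_binary /var/tmp/bug_data /vxfs_fs 500. On a file system without the bug it should report no failures.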

The issue here is that the Veritas file system has dalloc enabled by default, which lazily allocates extents (VXFS is an extent-based file system). The grep command uses SEEK_HOLE to check the file for holes against its size, and a file that appears to contain a hole is treated as binary. The Veritas vx_seek_hole_data() function has a bug and does not return the correct offset, so a freshly copied file can look as if it has a hole.
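One rough way to look at lazy allocation from the shell is to compare a file's logical size with the space the file system says it has allocated. This is only a sketch: alloc_report is a hypothetical helper, and I am assuming (not confirming) that du on VxFS reflects extents that have not been allocated yet. It does not call SEEK_HOLE itself.

```shell
#!/bin/sh
# Hypothetical helper: print a file's logical size next to the space
# the file system reports as allocated for it.
alloc_report() {
    file=$1
    size=$(wc -c < "$file" | tr -d ' ')          # logical size in bytes
    alloc=$(du -k "$file" | awk '{print $1}')    # allocated space in KB
    echo "size_bytes=$size allocated_kb=$alloc"
}
```

Running alloc_report on the copy immediately after the cp, and again a few seconds later, would show whether the allocated figure lags behind the size.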

To work around this issue until a patch is released, you can switch off dalloc on the file system and also add the setting to the tunefstab file.

I will update once the patch is released.

*update 16/05/2018* Change ported to InfoScale 7.3.1 via e3938258

1. Switch off dalloc on the mounted file system:
  vxtunefs -o dalloc_enable=0 /MOUNT_POINT
2. Create the /etc/vx/tunefstab file, and add the entry:
  /dev/vx/dsk/$DISKGROUP/$VOLUME   dalloc_enable=0
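If you have several volumes to update, the tunefstab edit can be scripted so that the entry is not added twice. This is a sketch under assumptions: add_dalloc_entry is a hypothetical helper, and it only handles the simple one-device-per-line layout shown above. It takes the file path as an argument so it can be tried against a scratch file before touching /etc/vx/tunefstab.

```shell
#!/bin/sh
# Hypothetical helper: append "DEVICE   dalloc_enable=0" to a tunefstab-style
# file unless that exact device already has a dalloc_enable setting.
add_dalloc_entry() {
    file=$1
    device=$2
    touch "$file" || return 1
    # Match on the first field so one volume name cannot shadow another
    if awk -v d="$device" '$1 == d && /dalloc_enable=/ { found = 1 }
                           END { exit !found }' "$file"
    then
        echo "entry already present for $device"
    else
        printf '%s   dalloc_enable=0\n' "$device" >> "$file"
        echo "added entry for $device"
    fi
}
```

For example: add_dalloc_entry /etc/vx/tunefstab /dev/vx/dsk/$DISKGROUP/$VOLUME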
