SolidFire and Quality of Service – Part 1

The SolidFire array is primarily used as an iSCSI target and allows for quality of service (QOS) on a per-volume basis. You can read more about it here. The purpose of these posts is to talk about the QOS functionality and how we can monitor it on the arrays.

Some Background..

Recently we ran into an issue where a service was running slow. When we checked whether the volume was being impacted by QOS, we initially missed that it actually was.

Normally this would not be difficult to check, as the majority of array-based QOS implementations are rate limits based on IOPS, bandwidth (BW) or a combination of both. To check, you would normally look at the QOS policy and see if the limits are being hit.

The SolidFire QOS implementation, however, is not a classic rate limit. It is dynamic: the effective limit depends on the I/O size being issued at any point in time, measured against the Min/Max/Burst IOPS settings in place. So when checking general metrics such as IOPS and BW, we initially didn't see anything that jumped out saying "look, you're at the QOS limit here, based on your I/O size".

When we set up a volume for QOS we set a min, max and burst value for IOPS. An example is shown below.

QOS_SETTINGS

Above we can see a lookup table of I/O size against the corresponding QOS min/max/burst values for the volume. However, it only gives us 4, 8, 16 and 262 (!!) as a guide. The 262 should be 256 and looks like a typo in the code.

So what does this mean? Below is an extract from the SolidFire Element OS guide (pages 53/54), which puts it better than I could.

  • “Min IOPS: The minimum number of sustained inputs and outputs per second (IOPS) that the SolidFire cluster provides to a volume. The Min IOPS configured for a volume is the guaranteed level of performance for a volume. Performance does not drop below this level. 
  • Max IOPS: The maximum number of sustained IOPS that the SolidFire cluster provides to a volume. When cluster IOPS levels are critically high, this level of IOPS performance is not exceeded.
  • Burst IOPS: The maximum number of IOPS allowed in a short burst scenario. If a volume has been running below the Max IOPS, burst credits are accumulated. When performance levels become very high and are pushed to maximum levels, short bursts of IOPS are allowed on the volume.

SolidFire uses Burst IOPS when a cluster is running in a state of low cluster IOPS utilization. A single volume can accrue Burst IOPS and use the credits to burst above their Max IOPS up to their Burst IOPS level for a set “burst period”. A volume can burst for up to 60 seconds if the cluster has the capacity to accommodate the burst. A volume accrues one second of burst credit (up to a maximum of 60 seconds) for every second that the volume runs below its Max IOPS limit.

Burst IOPS are limited in two ways:

  1. A volume can burst above its Max IOPS for a number of seconds equal to the number of burst credits that the volume has accrued.
  2. When a volume is bursting above its Max IOPS setting, it is limited by its Burst IOPS setting. Therefore, the burst IOPS never exceeds the burst IOPS setting for the volume.
  • Effective Max Bandwidth: The maximum bandwidth is calculated by multiplying the number of IOPS (based on the QoS curve) by the IO size.”
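To make the burst credit rules in the extract a little more concrete, here is a toy model in Python. It is only a sketch of the quoted text, not how Element OS actually implements it: the function name `simulate_burst`, the one-second time step and the assumption that the cluster always has spare capacity are all mine.

```python
def simulate_burst(demand, max_iops, burst_iops, credit_cap=60):
    """Return the IOPS served each second under a toy burst-credit model.

    demand is a list of requested IOPS, one entry per second.
    Assumption: the cluster always has capacity to accommodate a burst.
    """
    credits = 0  # seconds of burst credit accrued so far
    served = []
    for wanted in demand:
        if wanted < max_iops:
            # Running below Max IOPS earns one second of credit (capped).
            credits = min(credit_cap, credits + 1)
            served.append(wanted)
        elif wanted > max_iops and credits > 0:
            # Spend one credit to burst, limited by the Burst IOPS setting.
            credits -= 1
            served.append(min(wanted, burst_iops))
        else:
            # No credits left (or demand exactly at Max): held at Max IOPS.
            served.append(max_iops)
    return served


# Three quiet seconds followed by five seconds of heavy demand.
demand = [1000] * 3 + [25000] * 5
result = simulate_burst(demand, max_iops=15000, burst_iops=20000)
print(result)
```

With three seconds of credit banked, the volume in this model bursts for three seconds at the Burst IOPS level and then drops back to Max IOPS.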

So from the above view you can see that if my application were to issue 256KB I/Os, the maximum I could get is 385 IOPS with a burst to 513 IOPS, which equates to roughly 96MB/sec and 128MB/sec respectively.
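The bandwidth figures are just IOPS multiplied by I/O size, as the extract describes. A quick sketch to check the arithmetic, assuming binary units (1MB = 1024KB); the helper name `effective_bandwidth_mb` is made up for this post:

```python
def effective_bandwidth_mb(iops, io_size_kb):
    """Bandwidth in MB/sec for a given IOPS level and I/O size in KB."""
    return iops * io_size_kb / 1024


# Max and burst IOPS for a 256KB I/O size, taken from the figures above.
max_bw = effective_bandwidth_mb(385, 256)
burst_bw = effective_bandwidth_mb(513, 256)
print(f"max: {max_bw} MB/sec, burst: {burst_bw} MB/sec")
```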

The QOS Curve.

SolidFire normalizes block sizes to 4KB to determine the IOPS level. As the block size increases, throughput increases but IOPS decrease.

So how do we know what the IOPS and BW values are for a 12KB block size, or maybe even a 64KB block size? Would it be fair to say that if we have a Min IOPS of 1111 at 16KB, we halve this for 32KB? Well, the answer is NO: we need to work out the cost of the I/O and measure this against the min/max/burst IOPS.

qos_curve

Above we can see that for 32KB we have a cost of 500, a min IOPS of 600, a max of 3000 and a burst of 4000. You can also see the throughput for each I/O size, calculated as IOPS * I/O size.
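As a rough sketch of how the cost relates to those numbers: the 32KB figures are exactly one fifth of the 3000/15000/20000 settings used in this post, which fits a cost of 500 if a 4KB I/O is assumed to cost 100. That base cost of 100 is my inference from the numbers here, and the helper names below are made up for illustration. Note that simply halving the 16KB min of 1111 would have given roughly 555, not 600.

```python
BASE_COST = 100  # assumed cost of a 4KB I/O, inferred from the figures above


def iops_at_cost(iops_4k, cost, base_cost=BASE_COST):
    """Scale a 4KB IOPS setting by the cost of a larger I/O size."""
    return round(iops_4k * base_cost / cost)


def throughput_mb(iops, io_size_kb):
    """Throughput in MB/sec, calculated as IOPS * I/O size."""
    return iops * io_size_kb / 1024


settings_4k = {"min": 3000, "max": 15000, "burst": 20000}
cost_32k = 500  # cost of a 32KB I/O, from the curve above

scaled = {name: iops_at_cost(value, cost_32k) for name, value in settings_4k.items()}
for name, iops in scaled.items():
    print(f"{name}: {iops} IOPS, {throughput_mb(iops, 32)} MB/sec")
```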

What do the QOS limits look like for any I/O size up to 32KB, based on a min/max/burst of 3000/15000/20000 IOPS?

1st 32kb

In the next post I will cover how we worked out the above data based on the QOS curve, and the tooling developed to monitor volume QOS in real time.
