3PAR Performance Monitoring – CPU

Overview

In the previous post we covered Cache. In this post we look at monitoring the CPU performance of the 3PAR array. Metrics are gathered using the statcpu command.

  • Cache 
    • statcmp
  • CPU
    • statcpu
  • Hosts
    • statvlun / statvlun -hostsum
  • Ports
    • statport -host
  • Volumes
    • statvlun -vvsum

Getting started

How do I run the statcpu command? The steps below show how to set up a password file so we don't need to enter a password for each command.

# PATH=$PATH:/opt/hp_3par_cli/bin/
# setpassword -saveonly -file mypass.your3par 
system: your3par 
user:  
password:

CPU

CPU performance metrics are well known across many system types, and the 3PAR is no different. For each node we get the CPU, user time, system time and idle time, plus overall interrupts and context switches per node.

Statcpu command output

Below is some example output, truncated to a single node.

EXAMPLES
  The following example displays an iteration of CPU statistics for
  a single node:

  %cli statcpu -iter 1 -d 15
  21:00:00 11/21/2017
  node,cpu user sys idle intr/s ctxt/s
     0,0    0  90   10
     0,1    0  54   46
     0,2    0  58   42
     0,3    0  49   51
     0,4    0  52   48
     0,5    0  44   56
     0,6    0  81   19
     0,7    0  42   58
     0,8    0  45   55
     0,9    0  51   49
    0,10    0  38   62
    0,11    0  36   64
    0,12    0  36   64
    0,13    0  66   34
    0,14    0  31   69
    0,15    0  88   12
    0,16    0  33   67
    0,17    0  31   69
    0,18    0  30   70
    0,19    0  30   70
    0,20    0  39   61
    0,21    0  31   69
    0,22    0  63   37
    0,23    0  32   68
    0,24    0  86   14
    0,25    0  28   72
    0,26    0  28   72
    0,27    0  44   56
    0,28    0  60   40
    0,29    0  38   62
    0,30    0  34   66
    0,31    0  74   26
 0,total    0  48   52 288010 298043

Let's walk through the output.

  • node,cpu
    • The node in the array and each CPU available on it
  • user
    • % of time spent in user processing
  • sys
    • % of time spent in system processing (kernel/syscalls)
  • idle
    • % of time the CPU was idle
  • intr/s
    • Number of interrupts per second during the sample period
  • ctxt/s
    • Number of context switches per second during the sample period
  • total
    • Per-node summary row: the overall average of CPU usage in each column for the sample period

Looking at the node as a whole, the total row gives a feel for how busy the CPUs are, and it is worth tracking over time. If system time starts to increase, we need to look at what the array is doing; this could be a garbage collection task, for example.

We also have per-CPU metrics, so we can look at each CPU and check whether some CPUs are running hot when we experience issues. In the example above, CPU 0 spent 90% of its time in system processing over the 15-second sample.
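As a rough sketch of how the per-CPU rows could be checked for hot CPUs, the shell function below wraps a small awk filter that lists every CPU whose sys value is at or above a threshold. The function name hot_cpus and the default threshold of 80 are assumptions for illustration; neither is part of the 3PAR CLI.

```shell
# hot_cpus [THRESHOLD]: read statcpu output on stdin and list the CPUs
# whose sys% is at or above THRESHOLD (hypothetical helper, default 80).
hot_cpus() {
  awk -v limit="${1:-80}" '
    # Timestamp line, e.g. "21:00:00 11/21/2017": remember the time
    $1 ~ /^[0-9][0-9]:[0-9][0-9]:[0-9][0-9]$/ { time = $1; next }
    # Per-CPU rows look like "0,15 0 88 12"; the "0,total" row is skipped
    $1 ~ /^[0-9]+,[0-9]+$/ && ($3 + 0) >= (limit + 0) {
      split($1, ncpu, ",")
      printf("%s node=%d cpu=%d sys=%d idle=%d\n", time, ncpu[1], ncpu[2], $3, $4)
    }
  '
}
```

It reads from a pipe or a saved file, for example `hot_cpus 80 < statcpu.out`. Run against the sample output above with a threshold of 80, it would list CPUs 0, 6, 15 and 24.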

** What to track: %sys > 60% on a node ** Also check that the metrics are balanced across nodes. If one node is busier than the others, we may have a multipath issue.
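The two checks above can be sketched as a small awk filter over the per-node total rows. This is only an illustration: the function name check_nodes, the HOT and IMBALANCE labels, and the 20-point spread used to judge balance are assumptions, and only the 60% sys figure comes from the guidance above.

```shell
# check_nodes [SYS_LIMIT] [SPREAD_LIMIT]: read statcpu output on stdin.
# Flags nodes whose total sys% exceeds SYS_LIMIT (default 60) and samples
# where the gap between the busiest and quietest node exceeds SPREAD_LIMIT
# (default 20, an assumed value). Hypothetical helper, not a 3PAR command.
check_nodes() {
  awk -v sys_limit="${1:-60}" -v spread_limit="${2:-20}" '
    # Report the sys% spread across nodes for the sample just finished
    function report_spread() {
      if (seen && (hi - lo) > (spread_limit + 0))
        printf("%s IMBALANCE sys min=%d max=%d\n", time, lo, hi)
      seen = 0
    }
    # A new timestamp line starts a new sample
    $1 ~ /^[0-9][0-9]:[0-9][0-9]:[0-9][0-9]$/ { report_spread(); time = $1; next }
    # Per-node summary row, e.g. "0,total 0 48 52 288010 298043"
    $1 ~ /^[0-9]+,total$/ {
      split($1, ncpu, ",")
      s = $3 + 0
      if (s > (sys_limit + 0))
        printf("%s HOT node=%d sys=%d\n", time, ncpu[1], s)
      if (!seen || s < lo) lo = s
      if (!seen || s > hi) hi = s
      seen = 1
    }
    END { report_spread() }
  '
}
```

Both limits are positional arguments so they can be tuned per array, for example `check_nodes 60 15 < statcpu.out`.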

The Parsing Script

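# awk_cpu: print one summary row per node from statcpu output
BEGIN {
  printf("%8s %4s %4s %4s %4s %4s %12s %12s \n","Time","Node","Cpu","User","Sys","Util","intr/s","ctxt/s");
}
{
  # Timestamp line (HH:MM:SS MM/DD/YYYY): remember the time for the rows that follow
  if ($1 ~ /[0-9][0-9]:/) {time=$1}
  # Per-node summary row, e.g. "0,total 0 48 52 288010 298043"
  if ($1 ~ /total/) {
    split($1,ncpu,",")
    # ncpu[2] is the string "total", so the Cpu column always prints 0; Util is 100 - idle
    printf("%8s %4d %4d %4d %4d %4d %12d %12d\n",time,ncpu[1],ncpu[2],$2,$3,100-$4,$5,$6);
  }
}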

We can run this script against a saved output file as follows.

# awk -f ./awk_cpu statcpu.out |more
    Time Node  Cpu User  Sys Util       intr/s       ctxt/s
00:00:27    0    0    0   48   48       288010       298043
00:00:27    1    0    1   48   49       288948       297952
00:00:27    2    0    8   49   57       304743       295164
00:00:27    3    0    4   51   55       304167       313328
00:00:42    0    0    1   52   53       287327       280944
00:00:42    1    0    1   51   52       286144       270951
00:00:42    2    0    8   52   59       301698       280315
00:00:42    3    0    6   52   58       307004       288337
00:00:58    0    0    1   48   49       296359       306778

Or in real time:

# statcpu -pwf ./mypass.mypar -sys testpar -iter 5 -d 1 |awk -f ./awk_cpu
    Time Node  Cpu User  Sys Util       intr/s       ctxt/s
22:20:14    0    0    0   11   11       133028       149000
22:20:14    1    0    0   12   12       150715       164091
22:20:14    2    0    7   10   17       145678       167100
22:20:14    3    0    0   12   12       149124       165820
22:20:15    0    0    0   11   11       145646       160334
22:20:15    1    0    0   15   16       161879       189288
22:20:15    2    0    6   12   19       148670       177994
22:20:15    3    0    0   16   16       152063       169221
22:20:16    0    0    0   14   14       158968       174704
22:20:16    1    0    0   15   16       165782       206909
22:20:16    2    0    6   14   21       167643       206980
22:20:16    3    0    0   13   13       145107       172812
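Once the rows are in this shape they are easy to summarise further. As a minimal sketch, the function below reads the parsed output and prints the sample count, average Util and peak Util for each node. The name node_summary and the output format are assumptions for illustration.

```shell
# node_summary: read the output of the awk_cpu script on stdin and print,
# per node, the sample count, average Util and peak Util.
# Hypothetical helper; assumes the column layout shown above.
node_summary() {
  awk '
    $1 == "Time" { next }            # skip the header row
    NF >= 8 {
      n = $2 + 0                     # Node column
      u = $6 + 0                     # Util column
      count[n]++
      total[n] += u
      if (!(n in peak) || u > peak[n]) peak[n] = u
      if (n > maxnode) maxnode = n
    }
    END {
      # Walk node numbers in order so the output is deterministic
      for (n = 0; n <= maxnode; n++)
        if (n in count)
          printf("node=%d samples=%d avg_util=%.1f peak_util=%d\n",
                 n, count[n], total[n] / count[n], peak[n])
    }
  '
}
```

It can be chained onto either form above, for example `awk -f ./awk_cpu statcpu.out | node_summary`.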

We mentioned above that we can also look at each CPU on the system. I will cover that when we build the main Grafana view for the overall array. In the meantime, here is an example of why we want to gather this data.


Looking above at Node 3 (orange) and Node 1 (yellow), we have a spike in %sys at 2pm. Let's see whether we can find what is causing the spike.

Here we have selected just the ports on Node 3 and we can see we have a port which is busy servicing write IOPS at the time of the %sys spike.

In the next post I will cover the metrics for Hosts and Volumes. I will show two ways of gathering the metrics and the pros and cons of each.
