Linux performance tuning

This article started life as a post I made to the HantsLUG mailing list. I thought it might be generally interesting, so I decided to put it here.

Simple performance tuning tips

> I don't know if it is IO bound and don't really know how to test
> such things. I just know access isn't as snappy as it used to
> be.

When the system is running fast/normally you need to take a baseline. Look at top and note what your system, user, idle and iowait CPU percentages are. They appear on this line:

Cpu(s):  8.6% us,  0.3% sy,  0.0% ni, 90.5% id,  0.6% wa,  0.0% hi, 0.0% si
us 
percentage of CPU time spent running userland code
sy 
percentage of CPU time spent running kernelspace code
ni 
like "us" but for "niced" processes
id 
idle
wa 
CPU is idle because it is waiting for IO to complete
hi 
time spent servicing hardware interrupts
si 
time spent servicing software interrupts

Get a feel for how this looks when your system is running well. In an ideal world you want to see a load average below 1 and "wa" below 5%, although what is considered "normal" varies from server to server.
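
If you want a record to compare against later, top can also be run non-interactively in batch mode. A minimal sketch (the log filename is just an example):

$ top -b -n 1 | head -n 5                      # one snapshot of the summary area
$ top -b -n 1 | head -n 5 >> top-baseline.log  # append it to a log for later comparison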

Then look at more detailed VM subsystem stats, again while the system is performing well. For this you can use the vmstat command.

vmstat 5 will print a header and one line of averages since boot, then an additional line every 5 seconds with stats for just that interval. Leave this going for a while so you have a good idea what it normally looks like, e.g.:

$ vmstat 5
procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in    cs us sy id wa
 2  0  94348  13440  13428  60304    0    0     2     2    2     1  9  0 90  1
 0  0  94348  13456  13452  60304    0    0     0    86   46    70  0  0 100  0
 0  0  94348  13464  13460  60304    0    0     0     5   30    65  0  0 100  0
 0  0  94348  13472  13472  60304    0    0     0    17   32    66  0  0 100  0
 0  0  94348  13472  13480  60304    0    0     0     3   30    64  0  0 100  0
 0  0  94348  13472  13496  60304    0    0     0    17   37    68  0  0 100  0
 0  0  94348  13472  13512  60304    0    0     0    18   33    67  0  0 100  0
r  
processes ready to run but waiting for the CPU
b  
processes in uninterruptible sleep (often IO-related)
swpd  
swap in use (KiB)
free  
memory doing nothing (KiB)
buff  
memory used for buffers (KiB)
cache 
memory used for disk cache (KiB)
si  
memory swapped in from disk (KiB/s)
so  
memory swapped out to disk (KiB/s)
bi  
blocks read in from block devices (blocks/s)
bo  
blocks written out to block devices (blocks/s)
in  
interrupts per second, hardware and software (e.g. "the clock ticked", "the ethernet card got a packet")
cs  
context switches per second (one process stops running on a CPU, another starts)
us  
userland CPU usage %
sy  
kernelspace CPU usage %
id  
idle CPU %
wa  
IO wait CPU %

Ignore the first line of figures, since it's the average since boot. You are looking for "r" and "b" to be low, certainly single digits, preferably zero. If they're not, then processes are short of resources, usually CPU or disk.

The amount of swap in use ("swpd") is not as important as swap churn ("si", "so"), which should be close to zero.
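
If you want to watch swap churn on its own, sar (from the same sysstat package that provides iostat) reports pages swapped in and out per second. A minimal example; the interval and count are arbitrary:

$ sar -W 5 12    # pswpin/s and pswpout/s, every 5 seconds, 12 samples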

The remaining io/system/cpu figures can give you an idea of how busy your machine normally is in terms of CPU and disk.

Now repeat that when your system is running slow. By looking at what is different you can narrow down performance bottlenecks.
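
It helps to capture both runs to files so you can compare them side by side afterwards. A simple sketch; the filenames are just examples:

$ vmstat 5 | tee vmstat-good.log    # while the system is healthy
$ vmstat 5 | tee vmstat-slow.log    # while the system is struggling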

e.g. top says load is high, "wa" is low but "us" is high, and this goes on for long periods and coincides with poor performance. Your processes are CPU-starved: give the machine more CPU (or run better code!) and things will improve (or you'll find a different bottleneck!)
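
In that case top itself will usually show you the culprits, since it sorts by CPU usage by default; with GNU ps you can get the same thing non-interactively, along these lines:

$ ps -eo pid,pcpu,comm --sort=-pcpu | head -n 10    # ten busiest processes by CPU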

e.g. top says load is high, "us" is low, "wa" is above 75% for a long time: your processes are waiting for IO. That could be network or disk or various other things, but it is most likely disk.

Well, OK, so you've found out your system is starved of IO, but that's a bit vague. You need to know what is wanting the IO. This can be a bit tricky, but something that can really help is iostat. iostat -x 5 will show you the IO to every disk and partition on your system, updated every 5 seconds:

(output from a RHEL box as none of my play machines have significant disk IO; 2.6 kernel output may be slightly different)

$ iostat -x 5
Linux 2.4.21-37.ELhugemem (xxxpdb02)    02/02/06

avg-cpu:  %user   %nice    %sys %iowait   %idle
           7.78    0.00    8.98   30.02   53.22

Device:    rrqm/s wrqm/s   r/s   w/s  rsec/s  wsec/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await  svctm  %util
sda          2.76   9.67  8.56 18.87   90.36  228.49    45.18   114.25    11.62     0.00    0.07   0.19   0.53
sda1         0.03   0.00  0.00  0.00    0.06    0.00     0.03     0.00    15.54     0.00   23.64   2.36   0.00
sda2         0.22   0.89  0.34  0.66    4.45   12.37     2.23     6.18    16.83     0.00    3.42   1.48   0.15
sda3         0.00   7.78  6.50 17.42   51.97  201.71    25.98   100.86    10.61     0.00    0.34   0.01   0.02
sda5         2.21   0.99  1.46  0.78   29.38   14.11    14.69     7.06    19.40     0.00    2.65   1.81   0.41
sda6         0.16   0.02  0.17  0.01    2.61    0.24     1.30     0.12    15.97     0.00   19.21   4.77   0.09
sda7         0.00   0.00  0.00  0.00    0.00    0.04     0.00     0.02    19.35     0.00    3.50   3.09   0.00
sda8         0.15   0.00  0.09  0.00    1.89    0.00     0.94     0.00    20.99     0.00   23.10   2.38   0.02
sdb          4.26   1.50  2.90  1.84   57.29   26.77    28.64    13.38    17.75     0.01    0.99   0.76   0.36
sdb1         4.26   1.50  2.90  1.84   57.29   26.77    28.64    13.38    17.75     0.01    0.99   0.76   0.36
sdc          2.67  14.97  1.96 13.93   36.97  231.19    18.49   115.59    16.88     0.00    0.18   0.18   0.29
sdc1         2.67  14.97  1.96 13.93   36.97  231.19    18.49   115.59    16.88     0.00    0.18   0.18   0.29
sdd         38.75   2.08 29.24  1.63   12.44   29.70     6.22    14.85     1.37     0.00    0.14   0.03   0.10
sdd1        38.75   2.08 29.24  1.63   12.44   29.70     6.22    14.85     1.37     0.00    0.14   0.03   0.10
sde          3.04   4.42  1.86  2.93   39.23   58.79    19.62    29.40    20.44     0.00    1.73   0.49   0.24
sde1         3.04   4.42  1.86  2.93   39.23   58.79    19.62    29.40    20.44     0.00    1.73   0.49   0.24
sdf         12.61   1.67  9.51  2.72  176.94   35.18    88.47    17.59    17.34     0.00    0.50   0.33   0.41
sdf1        12.61   1.67  9.51  2.72  176.94   35.18    88.47    17.59    17.34     0.00    0.50   0.33   0.41
sdg         22.09   6.71 19.70  6.03  334.32  101.95   167.16    50.98    16.96     0.00    0.24   0.18   0.48
sdg1        22.09   6.71 19.70  6.03  334.32  101.95   167.16    50.98    16.96     0.00    0.24   0.18   0.48
sdh         17.39   4.64 10.43  3.94  222.60   68.70   111.30    34.35    20.26     0.00    0.66   0.01   0.01
sdh1        17.39   4.64 10.43  3.94  222.60   68.70   111.30    34.35    20.26     0.00    0.66   0.01   0.01
sdi         25.18   1.21 23.47  1.35  389.22   20.46   194.61    10.23    16.51     0.00    0.20   0.05   0.14
sdi1        25.18   1.21 23.47  1.35  389.22   20.46   194.61    10.23    16.51     0.00    0.20   0.05   0.14
sdj         22.61   7.75 19.22  9.94  334.59  141.48   167.29    70.74    16.33     0.00    0.10   0.03   0.10
sdj1        22.61   7.75 19.22  9.94  334.59  141.48   167.29    70.74    16.33     0.00    0.10   0.03   0.10
sdk          9.67   2.77  5.95  1.51  124.83   34.25    62.41    17.12    21.34     0.00    0.79   0.60   0.45
sdk1         9.65   2.77  5.95  1.51  124.78   34.25    62.39    17.12    21.33     0.00    0.79   0.60   0.45
sdl         24.37   3.64 16.00  0.90  322.99   33.88   161.50    16.94    21.11     0.00    0.25   0.21   0.36
sdl1        24.37   3.64 16.00  0.90  322.99   33.88   161.50    16.94    21.11     0.00    0.25   0.21   0.36

avg-cpu:  %user   %nice    %sys %iowait   %idle
           9.72    0.00    8.69   58.08   23.52

Device:    rrqm/s wrqm/s   r/s   w/s  rsec/s  wsec/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await  svctm  %util
sda          0.20  30.80 14.00 204.20  113.60 1891.20    56.80   945.60     9.19     1.37    6.26   1.19  26.00
sda1         0.00   0.00  0.00  0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
sda2         0.20   3.00  0.60  0.60    6.40   28.80     3.20    14.40    29.33     0.01    6.67   6.67   0.80
sda3         0.00  26.80 13.40 202.80  107.20 1848.00    53.60   924.00     9.04     1.36    6.28   1.17  25.20
sda5         0.00   1.00  0.00  0.80    0.00   14.40     0.00     7.20    18.00     0.00    0.00   0.00   0.00
sda6         0.00   0.00  0.00  0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
sda7         0.00   0.00  0.00  0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
sda8         0.00   0.00  0.00  0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
sdb          0.00   0.20  0.00  0.60    0.00    6.40     0.00     3.20    10.67     0.00    0.00   0.00   0.00
sdb1         0.00   0.20  0.00  0.60    0.00    6.40     0.00     3.20    10.67     0.00    0.00   0.00   0.00
sdc          0.00   0.80  0.00  0.60    0.00   11.20     0.00     5.60    18.67     0.00    0.00   0.00   0.00
sdc1         0.00   0.80  0.00  0.60    0.00   11.20     0.00     5.60    18.67     0.00    0.00   0.00   0.00
sdd         34.20   1.60 39.60  2.20  593.60   30.40   296.80    15.20    14.93     1.14   27.18  12.11  50.60
sdd1        34.20   1.60 39.60  2.20  593.60   30.40   296.80    15.20    14.93     1.14   27.18  12.11  50.60
sde          0.00   0.80  0.00  0.60    0.00   11.20     0.00     5.60    18.67     0.00    0.00   0.00   0.00
sde1         0.00   0.80  0.00  0.60    0.00   11.20     0.00     5.60    18.67     0.00    0.00   0.00   0.00
sdf          4.40   0.00  3.80  0.20   65.60    1.60    32.80     0.80    16.80     0.11   27.00  14.00   5.60
sdf1         4.40   0.00  3.80  0.20   65.60    1.60    32.80     0.80    16.80     0.11   27.00  14.00   5.60
sdg         87.40   0.20 107.80  0.40 1563.20    4.80   781.60     2.40    14.49     2.98   27.52   7.76  84.00
sdg1        87.40   0.20 107.80  0.40 1563.20    4.80   781.60     2.40    14.49     2.98   27.52   7.76  84.00
sdh          0.00   0.00  0.00  0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
sdh1         0.00   0.00  0.00  0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
sdi        354.00   0.20 357.40  0.40 5686.40    4.80  2843.20     2.40    15.91     3.33    9.30   2.66  95.20
sdi1       354.00   0.20 357.40  0.40 5686.40    4.80  2843.20     2.40    15.91     3.33    9.30   2.66  95.20
sdj          8.60   0.40  8.80  7.40  139.20   62.40    69.60    31.20    12.44     0.07    4.07   2.59   4.20
sdj1         8.60   0.40  8.80  7.40  139.20   62.40    69.60    31.20    12.44     0.07    4.07   2.59   4.20
sdk          1.20   0.20  0.40  1.40   12.80   12.80     6.40     6.40    14.22     0.01    3.33   3.33   0.60
sdk1         1.20   0.20  0.40  1.40   12.80   12.80     6.40     6.40    14.22     0.01    3.33   3.33   0.60
sdl          0.00   0.00  0.00  0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
sdl1         0.00   0.00  0.00  0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00

avg-cpu:  %user   %nice    %sys %iowait   %idle
           7.47    0.00    6.30   44.70   41.52

Device:    rrqm/s wrqm/s   r/s   w/s  rsec/s  wsec/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await  svctm  %util
sda          0.00  49.00 23.20 152.00  187.20 1601.60    93.60   800.80    10.21     0.97    5.57   1.97  34.60
sda1         0.00   0.00  0.00  0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
sda2         0.00   0.20  0.00  0.40    0.00    4.80     0.00     2.40    12.00     0.00    0.00   0.00   0.00
sda3         0.00  47.80 23.20 150.80  187.20 1582.40    93.60   791.20    10.17     0.97    5.61   1.99  34.60
sda5         0.00   1.00  0.00  0.80    0.00   14.40     0.00     7.20    18.00     0.00    0.00   0.00   0.00
sda6         0.00   0.00  0.00  0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
sda7         0.00   0.00  0.00  0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
sda8         0.00   0.00  0.00  0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
sdb          0.00   0.00  0.00  0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
sdb1         0.00   0.00  0.00  0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
sdc          0.00   1.40  0.00  0.80    0.00   17.60     0.00     8.80    22.00     0.00    0.00   0.00   0.00
sdc1         0.00   1.40  0.00  0.80    0.00   17.60     0.00     8.80    22.00     0.00    0.00   0.00   0.00
sdd         52.00   1.40 47.20  0.80  800.00   17.60   400.00     8.80    17.03     0.95   19.62   9.58  46.00
sdd1        52.00   1.40 47.20  0.80  800.00   17.60   400.00     8.80    17.03     0.95   19.62   9.58  46.00
sde          0.00   1.40  0.00  0.80    0.00   17.60     0.00     8.80    22.00     0.00    0.00   0.00   0.00
sde1         0.00   1.40  0.00  0.80    0.00   17.60     0.00     8.80    22.00     0.00    0.00   0.00   0.00
sdf          1.80   0.00  1.80  0.00   28.80    0.00    14.40     0.00    16.00     0.04   21.11   8.89   1.60
sdf1         1.80   0.00  1.80  0.00   28.80    0.00    14.40     0.00    16.00     0.04   21.11   8.89   1.60
sdg         63.20   0.00 76.60  0.00 1121.60    0.00   560.80     0.00    14.64     1.68   21.72   6.71  51.40
sdg1        63.20   0.00 76.60  0.00 1121.60    0.00   560.80     0.00    14.64     1.68   21.72   6.71  51.40
sdh          0.00   0.00  0.00  0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
sdh1         0.00   0.00  0.00  0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
sdi        289.40   0.00 346.80  0.00 5089.60    0.00  2544.80     0.00    14.68     3.03    8.73   2.74  95.00
sdi1       289.40   0.00 346.80  0.00 5089.60    0.00  2544.80     0.00    14.68     3.03    8.73   2.74  95.00
sdj          4.80   4.00 24.40 48.80  235.20  422.40   117.60   211.20     8.98     0.09    1.17   0.71   5.20
sdj1         4.80   4.00 24.40 48.80  235.20  422.40   117.60   211.20     8.98     0.09    1.17   0.71   5.20
sdk          0.20   0.20  1.00  0.40    9.60    4.80     4.80     2.40    10.29     0.01   10.00   2.86   0.40
sdk1         0.20   0.20  1.00  0.40    9.60    4.80     4.80     2.40    10.29     0.01   10.00   2.86   0.40
sdl          0.00   0.00  0.00  0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
sdl1         0.00   0.00  0.00  0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00

Don't be scared by this mass of info! You can read the man page for iostat to see what all of the columns are, but for simple diagnostics you only need to look at a few things. Again, ignore the first set of results, since that's the average since system start.

Device 
which block device we're talking about
rkB/s  
KiB/s read from device
wkB/s  
KiB/s written to device
await  
Average time, in ms, that requests completed in the last interval (i.e. 5 secs) took from issue to completion, including both the time they spent waiting in the queue and the time they were actually being serviced
%util  
Percentage of elapsed time during which the device was busy servicing requests. When this gets close to 100 then the device is doing all it can, basically.

From the above you can see that /dev/sdi1 is saturated, /dev/sdg1 isn't too happy either, and that's the bottleneck on this server. In both cases it is reading rather than writing.
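
As a rough way of spotting such devices automatically, you can filter iostat's output on its final column. This sketch assumes %util really is the last field and that your devices are all named sd-something:

$ iostat -x 5 | awk '$1 ~ /^sd/ && $NF+0 > 80'    # show only sd* lines with %util above 80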

To try to track down which processes are doing the IO you can do a ps awux and look for processes with a "D" in the "STAT" column. This indicates uninterruptible sleep, which is usually IO-related. If you are lucky you will also be able to use lsof -p on their PIDs to see exactly what they are reading from and writing to, but usually iostat is specific enough.
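
A quick way of picking those processes out, assuming the usual ps aux-style output where STAT is the eighth column (the PID given to lsof is of course made up):

$ ps awux | awk '$8 ~ /^D/'    # processes currently in uninterruptible sleep
$ lsof -p 1234                 # what files that process has open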

Once you know which block devices are suffering, the answer is to buy more/faster disks and move those filesystems onto them. Use SoftwareRAID to increase performance while maintaining redundancy, which is essential with multiple disks, since every disk you add multiplies the chance of a device failure. See Linux RAID best practices for an existing diatribe on this subject.
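
Purely for illustration (the device names and mount point here are made up, and you should read that article first), creating a two-disk mirror with mdadm looks something like this:

$ mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/sdx1 /dev/sdy1
$ mkfs.ext3 /dev/md0
$ mount /dev/md0 /srv/data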

Basic performance diagnostics are really useful even in a home situation where you just want to know where to best spend 200 quid.