How To Monitor A System With Sysstat On CentOS 4.3

Submitted by Quantact-Tim (Contact Author) (Forums) on Fri, 2006-08-18 01:04. :: CentOS | Monitoring

A common task for system administrators is to monitor and care for a server. That's fairly easy to do at any given moment, but how do you keep a record of this information over time? One way to monitor your server is to use the Sysstat package.

Sysstat is actually a collection of utilities designed to collect information about the performance of a Linux installation and record it over time.

It's fairly easy to install too, since it is included as a package on many distributions.

To install on CentOS 4.3, just type the following:

yum install sysstat

We now have the sysstat scripts installed on the system. Let's try the sar command.

sar

Linux 2.6.16-xen (xen30)        08/17/2006

11:00:02 AM       CPU     %user     %nice   %system   %iowait     %idle
11:10:01 AM       all      0.00      0.00      0.00      0.00     99.99
Average:          all      0.00      0.00      0.00      0.00     99.99

Several bits of information are reported, such as the Linux kernel version, hostname, and date.
More importantly, sar shows how CPU time is being spent on the system.
%user, %nice, %system, %iowait, and %idle describe the ways the CPU may be utilized.
%user and %nice refer to your software programs, such as MySQL or Apache; %nice counts the time spent in programs running at an adjusted (nice) priority.
%system refers to the kernel’s internal workings.
%iowait is time the CPU sat idle while waiting for Input/Output, such as a disk read or write. Finally, since the percentages account for 100% of the CPU time, any unused time goes into %idle.
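
To make the column meanings concrete, here is a minimal sketch that reads sar CPU lines and derives an overall "busy" percentage as 100 minus %idle. The function name cpu_busy is hypothetical (it is not part of sysstat), and the field positions assume the layout shown above, with a 12-hour timestamp followed by AM/PM. Other sysstat versions print extra columns such as %steal, so treat this as an illustration rather than a general parser.

```shell
#!/bin/sh
# cpu_busy: hypothetical helper, not part of sysstat.
# Reads sar CPU lines on stdin and prints "<time> <AM/PM> <busy%>" per data row.
# Assumes the layout: time AM/PM CPU %user %nice %system %iowait %idle
cpu_busy() {
    # Data rows have "all" in the third field; the header has "CPU" there,
    # and the Average row has "all" in the second field, so both are skipped.
    awk '$3 == "all" { printf "%s %s %.2f\n", $1, $2, 100 - $NF }'
}

# Sample rows copied from the output above.
cpu_busy <<'EOF'
11:00:02 AM       CPU     %user     %nice   %system   %iowait     %idle
11:10:01 AM       all      0.00      0.00      0.00      0.00     99.99
Average:          all      0.00      0.00      0.00      0.00     99.99
EOF
```

On a live system the same function could be fed directly, for example with sar | cpu_busy.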

The output above covers only a single sampling interval. How can we keep track of that information over time?
If our system was consistently running heavy in %iowait, we might surmise that a disk was getting overloaded, or going bad.
At least, we would know to investigate.

So how do we track the information over time? We can schedule sar to run at regular intervals, say, every 10 minutes.
We then direct it to send the output to sysstat’s special log files for later reports.
The way to do this is with the cron daemon.

By creating a file called sysstat in /etc/cron.d, we can tell cron to run the sysstat collection scripts on a schedule.
Fortunately, the Sysstat package that yum installed already did this step for us.

more /etc/cron.d/sysstat

# run system activity accounting tool every 10 minutes
*/10 * * * * root /usr/lib/sa/sa1 1 1
# generate a daily summary of process accounting at 23:53
53 23 * * * root /usr/lib/sa/sa2 -A

The sa1 script collects the data and stores it in sysstat’s binary log file format, and sa2 reports it back in human-readable format.
Both the binary log and the report are written to files in /var/log/sa.

ls /var/log/sa

sa17  sar17

sa17 is the binary sysstat log and sar17 is the report (today’s date is the 17th).
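
Because the files are named after the day of the month, an older day can be read back by pointing sar at that day's binary log with its -f option. The sketch below assumes the /var/log/sa location shown above; sa_file is a hypothetical helper name, not something sysstat provides.

```shell
#!/bin/sh
# sa_file: hypothetical helper that builds the path of the binary log
# for a given day of the month, zero-padded the way the files are named.
# Pass the day without a leading zero (7, not 07), since printf would
# otherwise try to read the number as octal.
sa_file() {
    printf '/var/log/sa/sa%02d\n' "$1"
}

# Show the report for the 17th if that day's log exists.
# sar -f reads statistics from a saved file instead of today's log.
f=$(sa_file 17)
if [ -r "$f" ]; then
    sar -f "$f"
else
    echo "no log at $f" >&2
fi
```

Options such as -W or -q can be combined with -f in the same way to look at swap or load figures for that day.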

There is quite a lot of information contained in the sar report, but a few values can tell us how busy the server is.
Values to watch are swap usage, disk IO wait, and the run queue.
These can be obtained by running sar manually.

sar

Linux 2.6.16-xen (xen30)        08/17/2006

11:00:02 AM       CPU     %user     %nice   %system   %iowait     %idle
11:10:01 AM       all      0.00      0.00      0.00      0.00     99.99
11:20:01 AM       all      0.00      0.00      0.00      0.00    100.00
11:30:02 AM       all      0.01      0.26      0.19      1.85     97.68
11:39:20 AM       all      0.00      2.41      2.77      0.53     94.28
11:40:01 AM       all      1.42      0.00      0.18      3.24     95.15
Average:          all      0.03      0.62      0.69      0.64     98.02

There were a few moments where disk activity pushed the %iowait column up, but it didn’t stay that way for long. An average of 0.64% is pretty good.
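
Scanning a long report for those moments by eye gets tedious, so here is a small sketch that prints only the intervals where %iowait exceeded a limit. The name high_iowait and the 1.0 threshold are assumptions chosen for illustration, and the field position assumes the five-column layout above, where %iowait is the second-to-last column.

```shell
#!/bin/sh
# high_iowait: hypothetical helper, not part of sysstat.
# Prints "<time> <AM/PM> <%iowait>" for rows above the given limit.
# Assumes %iowait is the second-to-last field, as in the output above.
high_iowait() {
    awk -v limit="$1" '$3 == "all" && $(NF-1) + 0 > limit + 0 { print $1, $2, $(NF-1) }'
}

# Sample rows copied from the output above, with an arbitrary limit of 1.0.
high_iowait 1.0 <<'EOF'
11:00:02 AM       CPU     %user     %nice   %system   %iowait     %idle
11:10:01 AM       all      0.00      0.00      0.00      0.00     99.99
11:20:01 AM       all      0.00      0.00      0.00      0.00    100.00
11:30:02 AM       all      0.01      0.26      0.19      1.85     97.68
11:39:20 AM       all      0.00      2.41      2.77      0.53     94.28
11:40:01 AM       all      1.42      0.00      0.18      3.24     95.15
Average:          all      0.03      0.62      0.69      0.64     98.02
EOF
```

With this sample, only the 11:30:02 and 11:40:01 rows are printed.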

How about my swap usage? Am I running out of RAM? Occasional swapping is normal for the Linux kernel. Constant swapping is bad, and generally means you need more RAM.

sar -W

Linux 2.6.16-xen (xen30)        08/17/2006

11:00:02 AM  pswpin/s pswpout/s
11:10:01 AM      0.00      0.00
11:20:01 AM      0.00      0.00
11:30:02 AM      0.00      0.00
11:39:20 AM      0.00      0.00
11:40:01 AM      0.00      0.00
11:50:01 AM      0.00      0.00
Average:         0.00      0.00

Nope, we are looking good. No persistent swapping has taken place.
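
The same check can be sketched as a count of intervals with any swap traffic. The function name swap_active is hypothetical, and the parsing assumes the two-column sar -W layout shown above.

```shell
#!/bin/sh
# swap_active: hypothetical helper, not part of sysstat.
# Counts the intervals in `sar -W` output where pages were swapped in or out.
# Assumes the layout: time AM/PM pswpin/s pswpout/s
swap_active() {
    # Skip the Average row and any line whose third field is not a number
    # (the kernel banner and the column header).
    awk '$1 != "Average:" && $3 ~ /^[0-9.]+$/ && ($3 + $4) > 0 { n++ } END { print n + 0 }'
}

# Sample rows copied from the output above.
swap_active <<'EOF'
11:00:02 AM  pswpin/s pswpout/s
11:10:01 AM      0.00      0.00
11:20:01 AM      0.00      0.00
11:30:02 AM      0.00      0.00
Average:         0.00      0.00
EOF
```

A count that stays at zero, as it does here, matches the reading above; a count close to the number of intervals would point to constant swapping.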

How about system load? Are my processes waiting too long to run on the CPU?

sar -q

Linux 2.6.16-xen (xen30)        08/17/2006

11:00:02 AM   runq-sz  plist-sz   ldavg-1   ldavg-5  ldavg-15
11:10:01 AM         0        47      0.00      0.00      0.00
11:20:01 AM         0        47      0.00      0.00      0.00
11:30:02 AM         0        47      0.28      0.21      0.08
11:39:20 AM         0        45      0.01      0.24      0.17
11:40:01 AM         0        46      0.07      0.22      0.17
11:50:01 AM         0        46      0.00      0.02      0.07
Average:            0        46      0.06      0.12      0.08

No. An average load of 0.06 is really good.
Notice the 1, 5, and 15 minute load averages on the right.
Having the three time intervals gives you a feel for how much load the system is carrying.
A 3 or 4 in the 1 minute average is OK, but the same number in the 15 minute column may indicate that work is not clearing out, and that a closer look is warranted.
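
As a rough sketch of that rule, the function below prints the intervals whose 15-minute load average is above a limit you choose. The name load_sustained is hypothetical, and setting the limit near the number of CPUs is a common rule of thumb assumed here, not something sar prescribes. The parsing assumes the sar -q layout shown above, with ldavg-15 as the last column.

```shell
#!/bin/sh
# load_sustained: hypothetical helper, not part of sysstat.
# Prints "<time> <AM/PM> <ldavg-15>" for rows of `sar -q` output whose
# 15-minute load average exceeds the given limit.
load_sustained() {
    # Require seven fields and a numeric last field so the kernel banner,
    # the column header, and the Average row are all skipped.
    awk -v limit="$1" 'NF >= 7 && $1 != "Average:" && $NF ~ /^[0-9.]+$/ && $NF + 0 > limit + 0 { print $1, $2, $NF }'
}

# Sample rows copied from the output above; 0.10 is an artificially low
# limit so that this quiet system produces some output.
load_sustained 0.10 <<'EOF'
11:00:02 AM   runq-sz  plist-sz   ldavg-1   ldavg-5  ldavg-15
11:10:01 AM         0        47      0.00      0.00      0.00
11:30:02 AM         0        47      0.28      0.21      0.08
11:39:20 AM         0        45      0.01      0.24      0.17
11:40:01 AM         0        46      0.07      0.22      0.17
11:50:01 AM         0        46      0.00      0.02      0.07
Average:            0        46      0.06      0.12      0.08
EOF
```

With this sample, only the 11:39:20 and 11:40:01 rows are printed. On a real server the limit would be much higher, for example sar -q | load_sustained 4.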

This was a short look at the Sysstat package.

We only looked at the output of three of sar’s options, but there are others.
Now, armed with sar in your toolbox, your system administration job just became a little easier.


Submitted by Charles Stepp (not registered) on Tue, 2014-10-07 21:17.

In the crontab entry, you should not be limiting the interval to 1 second. Sar uses the same system resources no matter how long the interval is. It reads kernel values, sleeps, reads the values again and records/prints the difference value. 1 second, 10 seconds, 1200 seconds are the same as far as sar’s resource usage. 99.99% of sar’s usage is sleep, which is what the kernel does anyway when it’s not doing anything. Note below that the first sar sample of only a second showed an average cpu of 3%. The longer samples, averaging over a longer period, show that 6% is probably more of an accurate average, at this time. The web pages I’ve seen so far feed each other with this 1 second sample thing, almost like someone is afraid sar might bog the system down. It won’t. The same two sets of kernel reads happens no matter what the interval is:

time sar 1 1; time sar 10 1; time sar 100 1
Linux 2.6.18-194.el5 (blahblah) 10/07/14

12:04:51 CPU %user %nice %system %iowait %steal %idle
12:04:52 all 3.00 0.00 0.75 0.00 0.00 96.25
Average: all 3.00 0.00 0.75 0.00 0.00 96.25
sar 1 1 0.00s user 0.00s system 0% cpu 1.005 total
Linux 2.6.18-194.el5 (blahblah) 10/07/14

12:04:52 CPU %user %nice %system %iowait %steal %idle
12:05:02 all 6.21 0.00 0.93 0.20 0.00 92.67
Average: all 6.21 0.00 0.93 0.20 0.00 92.67
sar 10 1 0.00s user 0.00s system 0% cpu 10.005 total
Linux 2.6.18-194.el5 (blahblah) 10/07/14

12:05:02 CPU %user %nice %system %iowait %steal %idle
12:06:42 all 6.32 0.00 0.97 0.24 0.00 92.47
Average: all 6.32 0.00 0.97 0.24 0.00 92.47
sar 100 1 0.00s user 0.00s system 0% cpu 1:40.01 total

From the man page example it shows each hour having 3 20 minute samples. This provides accurate averaging and small sa## files. A 1 second interval each 10 minutes is 1/600th of the information available.

EXAMPLES
To create a daily record of sar activities, place the following entry in your root or adm crontab file:

0 8-18 * * 1-5 /usr/lib/sa/sa1 1200 3 &

Submitted by Anonymous (not registered) on Wed, 2006-08-23 17:20.
You can monitor all of this data and more with the dim_STAT tool (http://dimitrik.free.fr), as well as analyze several hosts at the same time, produce professional reports, etc. The tool is free and runs on most distros...
-sys
Submitted by Anonymous (not registered) on Tue, 2006-08-22 08:05.
Thank you for the article. I learned a lot.
Submitted by Anonymous (not registered) on Mon, 2006-08-21 23:48.
This is a particularly good monitoring tool to install. Let it run for at least a full day, and at the end of the day, run sar -A | less. You will be amazed at how many system resources are being monitored. Also, it is an "old school" tool from the commercial UNIX world, and has earned the respect of many gray-bearded admins. Finally, O'Reilly's "Swordfish book", System Performance Tuning, gives a great explanation of what useful data you can get from sar, and what it is trying to tell you about the system's health. The book is a bit dated now, but there's still plenty of great stuff in there. Nice to see someone pushing this tool, as it is usually one of the first packages I install after a clean build. Also check out http://www-128.ibm.com/developerworks/eserver/library/es-unix-perfmonsar.html
Submitted by Anonymous (not registered) on Fri, 2006-08-18 17:10.

Hi,

 I don't think you need to create cronjobs, as they are created automatically at the install of the package.

 Good article, though!

Submitted by AlexCunha (registered user) on Wed, 2006-08-30 03:35.

Normally I have this in my cronjob:
0 23 * * * /usr/bin/sar -q -r | /bin/mail -s "$HOSTNAME Daily_Sar_Report" alert@yourdomain.com






Submitted by shree (registered user) on Sat, 2006-08-19 14:51.

Yep, that was in there, just tucked away:

>By creating a file called sysstat in /etc/cron.d, we can tell cron to run sar every day.
>Fortunately, the Systat package that yum installed already did this step for us.

 

Timothy Doyle

CEO

Quantact Hosting Solutions, Inc.

http://www.quantact.com