Server Monitoring With Icinga On Ubuntu 11.10 - Page 3

Submitted by falko on Thu, 2012-03-29 14:40.

4 Adding A Remote Server (server2) To Icinga

Monitoring localhost is nice, but of course, it would be even better if we could monitor all of our servers in one location. This is possible with Icinga, and this chapter describes how we can add our second Ubuntu 11.10 server (server2.example.com) to the setup.

To do this, we need to install the Nagios NRPE (Nagios Remote Plugin Executor) server on server2, and the Nagios NRPE plugin on server1. The NRPE server listens on server2; server1 connects to it with the NRPE plugin and sends it commands, which the NRPE server executes on server2 before passing the results back to server1.

First we install the nagios-nrpe-plugin package on server1:

server1:

apt-get install nagios-nrpe-plugin

If the installer asks for a Nagios web administration password, answer as follows:

Nagios web administration password: <-- nagiosadmin_password
Password confirmation: <-- nagiosadmin_password

Now we go to server2:

server2:

Install the nagios-nrpe-server package:

apt-get install nagios-nrpe-server

Now open /etc/nagios/nrpe.cfg:

vi /etc/nagios/nrpe.cfg

We must configure the NRPE server to allow server1 (IP: 192.168.0.100) to connect, so we add 192.168.0.100 to the allowed_hosts line:

[...]
# ALLOWED HOST ADDRESSES
# This is an optional comma-delimited list of IP address or hostnames
# that are allowed to talk to the NRPE daemon.
#
# Note: The daemon only does rudimentary checking of the client's IP
# address.  I would highly recommend adding entries in your /etc/hosts.allow
# file to allow only the specified host to connect to the port
# you are running this daemon on.
#
# NOTE: This option is ignored if NRPE is running under either inetd or xinetd

allowed_hosts=127.0.0.1,192.168.0.100
[...]

(If you don't do this, you will get the following error when you run

/usr/lib/nagios/plugins/check_nrpe -H 192.168.0.101

on server1:

root@server1:/etc/nagios-plugins/config# /usr/lib/nagios/plugins/check_nrpe -H 192.168.0.101
CHECK_NRPE: Error - Could not complete SSL handshake.
root@server1:/etc/nagios-plugins/config#

)
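To confirm the edit before restarting the daemon, a small shell helper can read the allowed_hosts line back. This is only a minimal sketch: the function name allowed_hosts_contains is made up for illustration and is not part of NRPE, and it assumes the list sits on a single comma-separated line as in the excerpt above.

```shell
# Hypothetical helper (not part of NRPE): succeed if IP appears in the
# comma-separated allowed_hosts list of the given nrpe.cfg file.
allowed_hosts_contains() {
    cfg=$1
    ip=$2
    # Take the last uncommented allowed_hosts line, strip the key and any spaces.
    list=$(grep -E '^[[:space:]]*allowed_hosts[[:space:]]*=' "$cfg" | tail -n 1 \
        | sed -e 's/^[^=]*=//' -e 's/[[:space:]]//g')
    # Wrap both sides in commas so 192.168.0.10 does not match 192.168.0.100.
    case ",$list," in
        *",$ip,"*) return 0 ;;
        *) return 1 ;;
    esac
}
```

On server2 it could be called as allowed_hosts_contains /etc/nagios/nrpe.cfg 192.168.0.100 && echo "server1 is allowed".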

Also, server1 needs to be allowed to pass command line arguments to the NRPE server, so still in the same file we set dont_blame_nrpe to 1:

[...]
# COMMAND ARGUMENT PROCESSING
# This option determines whether or not the NRPE daemon will allow clients
# to specify arguments to commands that are executed.  This option only works
# if the daemon was configured with the --enable-command-args configure script
# option.
#
# *** ENABLING THIS OPTION IS A SECURITY RISK! ***
# Read the SECURITY file for information on some of the security implications
# of enabling this variable.
#
# Values: 0=do not allow arguments, 1=allow command arguments

dont_blame_nrpe=1
[...]

(If you don't do this, you will see the error

CHECK_NRPE: Received 0 bytes from daemon. Check the remote server logs for error messages.

for lots of remote service checks in the Icinga web interface, and in /var/log/syslog on server2 you will see these errors:

Aug 23 14:20:20 server2 nrpe[11496]: Error: Request contained command arguments, but argument option is not enabled!
Aug 23 14:20:20 server2 nrpe[11496]: Client request was invalid, bailing out...

)
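If you are unsure whether this setting is the cause of failing checks, you can count the matching error lines in the log on server2. The following is a rough sketch; count_nrpe_arg_errors is a name I made up, and the search string is taken from the syslog lines quoted above.

```shell
# Hypothetical helper: print how many "argument option is not enabled"
# errors NRPE has written to the given log file.
count_nrpe_arg_errors() {
    grep -c 'Request contained command arguments, but argument option is not enabled' "$1"
}
```

For example, count_nrpe_arg_errors /var/log/syslog prints 0 when no such error has been logged. Note that grep -c also returns a non-zero exit status in that case.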

Finally, we must add a command definition for each service check that we want to run on server2 and that is not already defined. I want to run the check_procs, check_all_disks, and check_mysql_cmdlinecred checks on server2; these are not defined in /etc/nagios/nrpe.cfg, so I add them now (I also want to run the check_users and check_load checks, but these are already defined):

[...]
command[check_procs]=/usr/lib/nagios/plugins/check_procs -w 250 -c 400
command[check_all_disks]=/usr/lib/nagios/plugins/check_disk -w '20%' -c '10%' -e
command[check_mysql_cmdlinecred]=/usr/lib/nagios/plugins/check_mysql -H localhost -u 'nagios' -p 'howtoforge'
[...]

(If you don't do this, you will get errors like

NRPE: Command 'check_all_disks' not defined
NRPE: Command 'check_mysql_cmdlinecred' not defined
NRPE: Command 'check_procs' not defined

in the Icinga web interface.)
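To see at a glance which commands the NRPE server knows about, the command[...] lines can be listed with sed. This is a minimal sketch; nrpe_defined_commands is a hypothetical helper name, and it only looks at uncommented lines in the one file you pass to it.

```shell
# Hypothetical helper: print the names of all commands defined in an
# NRPE config file, one per line. Commented-out definitions are skipped.
nrpe_defined_commands() {
    sed -n 's/^[[:space:]]*command\[\([^]]*\)\]=.*/\1/p' "$1"
}
```

To test for a single command, the output can be piped through grep, for example nrpe_defined_commands /etc/nagios/nrpe.cfg | grep -x check_all_disks.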

As you can see, I have hardcoded the command line arguments, because using variables like command[check_procs]=/usr/lib/nagios/plugins/check_procs -w $ARG1$ -c $ARG2$ did not work for me. Even so, when we configure the service checks for server2 on server1, we still have to pass command line arguments to these checks. server2 will ignore them because the arguments are hardcoded in /etc/nagios/nrpe.cfg, but if you leave them out, you will get errors like /usr/lib/nagios/plugins/check_nrpe: option requires an argument -- 'a' in the Icinga web interface.

Now save the file and restart the NRPE server:

/etc/init.d/nagios-nrpe-server restart

Now check if the NRPE server is listening:

netstat -tap | grep nrpe

root@server2:~# netstat -tap | grep nrpe
tcp        0      0 *:nrpe                  *:*                     LISTEN       23668/nrpe
root@server2:~#
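netstat shows the port by its service name (nrpe). As a cross-check, you can try to open a TCP connection to the port directly. The sketch below assumes that the NRPE port is still the default 5666 (the server_port setting in nrpe.cfg) and that bash and the timeout command are available; tcp_port_open is a made-up helper name.

```shell
# Hypothetical helper: succeed if a TCP connection to HOST:PORT can be opened.
# Uses bash's /dev/tcp pseudo-device and gives up after 3 seconds.
tcp_port_open() {
    timeout 3 bash -c 'exec 3<>"/dev/tcp/$1/$2"' _ "$1" "$2" 2>/dev/null
}
```

For example, tcp_port_open 192.168.0.101 5666 && echo "NRPE port reachable". This only tells you that something accepts connections on the port, not that server1 is listed in allowed_hosts.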

Now go back to server1...

server1:

... and check if it can connect to the NRPE server on server2:

/usr/lib/nagios/plugins/check_nrpe -H 192.168.0.101

Output should be as follows in case of success:

root@server1:~# /usr/lib/nagios/plugins/check_nrpe -H 192.168.0.101
NRPE v2.12
root@server1:~#
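Like other Nagios plugins, check_nrpe also reports its result through its exit status (0, 1, 2 or 3), which is what Icinga evaluates. A small sketch that translates the status into the familiar state names follows; nagios_state_name is a hypothetical helper, not something shipped with the plugins.

```shell
# Hypothetical helper: map a Nagios plugin exit status to its state name.
# 3 is UNKNOWN by convention; any other value is treated as UNKNOWN here too.
nagios_state_name() {
    case "$1" in
        0) echo OK ;;
        1) echo WARNING ;;
        2) echo CRITICAL ;;
        *) echo UNKNOWN ;;
    esac
}
```

It can be used right after a manual check, for example /usr/lib/nagios/plugins/check_nrpe -H 192.168.0.101; nagios_state_name $?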

Now go back to server2:

server2:

We want to check MySQL on server2; because we use the NRPE daemon, we can run the check locally on server2, i.e., we don't have to open MySQL to the outside to allow server1 to run the check. Therefore I create the MySQL user nagios for localhost and localhost.localdomain instead of for 192.168.0.100 and server1.example.com:

mysql -u root -p

GRANT USAGE ON *.* TO nagios@localhost IDENTIFIED BY 'howtoforge';
GRANT USAGE ON *.* TO nagios@localhost.localdomain IDENTIFIED BY 'howtoforge';
FLUSH PRIVILEGES;

quit;
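Before switching back to server1, it can be worth running the plugin by hand on server2 with the same arguments that the check_mysql_cmdlinecred definition in /etc/nagios/nrpe.cfg uses. The wrapper below is just a sketch; run_plugin is a made-up name, and its only purpose is to return UNKNOWN (3) instead of a shell error when the plugin is missing.

```shell
# Hypothetical wrapper: run a Nagios plugin if it is installed,
# otherwise print a message and return 3 (UNKNOWN).
run_plugin() {
    plugin=$1
    shift
    if [ ! -x "$plugin" ]; then
        echo "UNKNOWN: $plugin not found or not executable"
        return 3
    fi
    # Pass all remaining arguments through unchanged.
    "$plugin" "$@"
}
```

On server2 this would be called as run_plugin /usr/lib/nagios/plugins/check_mysql -H localhost -u nagios -p howtoforge; an exit status of 0 means the nagios user can log in.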

Now we go back to server1...

server1:

... and create the Icinga configuration for server2:

vi /etc/icinga/objects/server2_icinga.cfg

define host{
       use generic-host
       host_name server2.example.com
       alias server2
       address 192.168.0.101
}
define service{
       use generic-service
       host_name server2.example.com
       service_description PING
       check_command check_ping!100.0,20%!500.0,60%
}
define service{
       use                             generic-service         ; Name of service template to use
       host_name                       server2.example.com
       service_description             Disk Space
       check_command                   check_nrpe!check_all_disks!20%!10%
}
define service{
       use                             generic-service
       host_name                       server2.example.com
       service_description             Current Users
       check_command                   check_nrpe!check_users!20!50
}
define service{
       use                             generic-service
       host_name                       server2.example.com
       service_description             Total Processes
       check_command                   check_nrpe!check_procs!250!400
}
define service{
        use                             generic-service         ; Name of service template to use
        host_name                       server2.example.com
        service_description             Current Load
        check_command                   check_nrpe!check_load!5.0!4.0!3.0!10.0!6.0!4.0
}
define service{
       use                             generic-service
       host_name                       server2.example.com
       service_description             MySQL
       check_command                   check_nrpe!check_mysql_cmdlinecred!nagios!howtoforge
}
define service{
       use                             generic-service
       host_name                       server2.example.com
       service_description             SMTP
       check_command                   check_smtp
}
define service{
       use                             generic-service
       host_name                       server2.example.com
       service_description             POP3
       check_command                   check_pop
}
define service{
       use                             generic-service
       host_name                       server2.example.com
       service_description             IMAP
       check_command                   check_imap
}

(As I've mentioned before, although I have hardcoded the command line arguments for some commands into /etc/nagios/nrpe.cfg on server2, we still need to add command line arguments to some of these checks here.)

As you see, I use check_nrpe for some checks and pass the actual check (like check_all_disks) as a command line argument to check_nrpe. These are the checks that will be executed locally by the NRPE server on server2. check_nrpe is not needed for all checks. Checks that test a connection from the outside like check_ping or check_smtp can be run from server1.
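To make the mapping more concrete, the sketch below prints the command line that a check_command such as check_nrpe!check_procs!250!400 would turn into. It rests on an assumption that you should verify in /etc/nagios-plugins/config/check_nrpe.cfg on server1: that the check_nrpe command is defined as /usr/lib/nagios/plugins/check_nrpe -H $HOSTADDRESS$ -c $ARG1$ -a $ARG2$. The function expand_check_nrpe is made up for illustration and only imitates the macro substitution.

```shell
# Hypothetical illustration: show what a "check_nrpe!...!..." check_command
# expands to, ASSUMING the command is defined as
#   check_nrpe -H $HOSTADDRESS$ -c $ARG1$ -a $ARG2$
expand_check_nrpe() {
    host=$1
    spec=$2
    # Fields are separated by "!": command name, then $ARG1$, $ARG2$, ...
    arg1=$(printf '%s' "$spec" | cut -d '!' -f 2)
    arg2=$(printf '%s' "$spec" | cut -d '!' -f 3)
    printf '%s\n' "/usr/lib/nagios/plugins/check_nrpe -H $host -c $arg1 -a $arg2"
}
```

Under that assumption, expand_check_nrpe 192.168.0.101 'check_nrpe!check_procs!250!400' prints a call with -c check_procs -a 250. If the second argument is left out, -a ends up without a value, which would fit the "option requires an argument -- 'a'" error mentioned earlier.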

To check the SSH and HTTP services on server2, we can EITHER add the following stanzas to /etc/icinga/objects/server2_icinga.cfg...

[...]
define service {
        use                             generic-service
        host_name                       server2.example.com
        service_description             SSH
        check_command                   check_ssh
}
define service {
        use                             generic-service
        host_name                       server2.example.com
        service_description             HTTP
        check_command                   check_http
}

... OR we add server2.example.com to the http-servers and ssh-servers hostgroups in /etc/icinga/objects/hostgroups_icinga.cfg:

vi /etc/icinga/objects/hostgroups_icinga.cfg

# Some generic hostgroup definitions

# A simple wildcard hostgroup
define hostgroup {
        hostgroup_name  all
                alias           All Servers
                members         *
        }

# A list of your Debian GNU/Linux servers
define hostgroup {
        hostgroup_name  debian-servers
                alias           Debian GNU/Linux Servers
                members         localhost,server2.example.com
        }

# A list of your web servers
define hostgroup {
        hostgroup_name  http-servers
                alias           HTTP servers
                members         localhost,server2.example.com
        }

# A list of your ssh-accessible servers
define hostgroup {
        hostgroup_name  ssh-servers
                alias           SSH servers
                members         localhost,server2.example.com
        }
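After editing the file, you can read the members of a hostgroup back to make sure that server2.example.com really ended up in the right groups. The following awk-based helper is a minimal sketch; hostgroup_members is a hypothetical name, and it assumes the members list contains no spaces, as in the file above.

```shell
# Hypothetical helper: print the members value of one hostgroup.
# Usage: hostgroup_members FILE HOSTGROUP_NAME
hostgroup_members() {
    awk -v want="$2" '
        # Remember which hostgroup block we are currently inside.
        $1 == "hostgroup_name" { current = $2 }
        # Print the members value only for the requested hostgroup.
        $1 == "members" && current == want { print $2 }
    ' "$1"
}
```

For example, hostgroup_members /etc/icinga/objects/hostgroups_icinga.cfg http-servers should print the comma-separated list from the http-servers block.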

Restart Icinga:

/etc/init.d/icinga restart

Afterwards you should find server2, along with its service checks, in the Icinga web interface. If you have added server2 to the hostgroups, it should be listed under Service Overview For All Host Groups as well.

Submitted by Bill (not registered) on Sun, 2012-04-01 23:47.
One thing everyone forgets about is the -n parameter when running plugins (like nrpe). The -n means no ssl. If this is not specified, the server will try to initiate an ssl handshake first. If your clients don't have ssl turned on for NSClient++ or whatever, then hard-to-track errors occur. I use the -n flag a lot in my environment. All our monitoring happens on a private network with no external access.