The Perfect Load-Balanced & High-Availability Web Cluster With 2 Servers Running Xen On Ubuntu 8.04 Hardy Heron - Page 9

15. Custom scripts for monitoring (lb1, lb2, web1, web2)

I wrote a few bash scripts to monitor the whole setup (they are a bit ugly, but they work). If you improve them, feel free to mail them to me!

 

15.1 Monitoring from lb1.example.com

First we must install sendmail so lb1.example.com will be able to send mail :

apt-get install sendmail

The first script checks whether the backup load balancer (lb2.example.com) is still available to take over :

vi /root/lb2_check

#!/bin/bash
# Backup load balancer check
# Copyright (c) 2008 blogama.org
# This script is licensed under GNU GPL version 2.0 or above
# ---------------------------------------------------------------------
### This script does 1 verification ###
### 1) Check if backup load balancer failed and send mail notification ###
### To be modified ###
EMAIL="[email protected]"
###### Do not make modifications below ######
### Binaries ###
MAIL=$(which mail)
### To restore to original when problem fixed ###
# Note: the test needs quotes and spaces around "=", otherwise any argument counts as "fix"
if [ "$1" = "fix" ]; then
  rm -f /root/lb2_problem.txt
  > /var/log/ha-log
  exit 1;
fi
### Check if already notified ###
cd /root
if [ -f lb2_problem.txt ]; then
  exit 1;
fi
### Check if Heartbeat is running on hot standby ###
tail /var/log/ha-log 2>&1 | grep "Asking other side for ping node count"
if [ "$?" -ne "1" ]; then
  echo "Backup load balancer failed" > /root/lb2_problem.txt
  $MAIL -s "Backup load balancer problem" $EMAIL < /root/lb2_problem.txt
fi

We make this script executable :

chmod +x /root/lb2_check

If lb2.example.com fails, the script creates the file /root/lb2_problem.txt and sends a mail notification. As long as lb2_problem.txt exists, the script exits without checking again, so you only get one mail per incident. The log file must also be emptied once the problem is fixed, otherwise the old log entry would trigger a new alert on the next run.

Once the problem is fixed on lb2.example.com, please manually run :

/root/lb2_check fix
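
The "notify only once" behaviour described above boils down to a flag file. Here is a minimal sketch of that pattern on its own, assuming nothing about Heartbeat: the function name notify_once and the temporary directory are made up for illustration, and the message is printed instead of mailed so the sketch can be run safely anywhere.

```shell
#!/bin/bash
# Sketch of the "notify once" pattern: a flag file records that an alert
# has already been raised, so later runs stay silent until it is removed.
# notify_once and its arguments are hypothetical names for illustration.

# notify_once FLAG_FILE MESSAGE
# Prints MESSAGE and creates FLAG_FILE on the first call;
# returns 1 without printing anything while FLAG_FILE exists.
notify_once() {
  local flag_file="$1"
  local message="$2"
  if [ -f "$flag_file" ]; then
    return 1              # already notified, nothing to do
  fi
  echo "$message" > "$flag_file"
  cat "$flag_file"        # a real script would feed this file to mail instead
  return 0
}

# Demo in a temporary directory so nothing under /root is touched.
demo_dir=$(mktemp -d)
notify_once "$demo_dir/lb2_problem.txt" "Backup load balancer failed"
notify_once "$demo_dir/lb2_problem.txt" "Backup load balancer failed" || echo "(flag exists, staying silent)"
rm -f "$demo_dir/lb2_problem.txt"     # this is what the "fix" argument does
notify_once "$demo_dir/lb2_problem.txt" "Backup load balancer failed"
rm -rf "$demo_dir"
```

The trade-off of this design is that the script stays blind until somebody removes the flag, which is why the manual "fix" step matters.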

The next script checks whether any ports have failed on web1 or web2 by reading the ldirectord log file. ldirectord already has its own mail notification, but it sends a flood of messages; mine sends only one until you fix the problem :

vi /root/ports_failed

and make it look like this :

#!/bin/bash
# Ldirectord ports failure check
# Copyright (c) 2008 blogama.org
# This script is licensed under GNU GPL version 2.0 or above
# ---------------------------------------------------------------------
### This script does 1 verification ###
### 1) Check for port failure on load balanced servers ###
### To be modified ###
EMAIL="[email protected]"
###### Do not make modifications below ######
### Binaries ###
MAIL=$(which mail)

#to restore to original when problem fixed
# Note: the test needs quotes and spaces around "=", otherwise any argument counts as "fix"
if [ "$1" = "fix" ]; then
  rm -f /root/port_problem.txt
  > /var/log/ldirectord.log
fi
###check if already notified###
cd /root
if [ -f port_problem.txt ]; then
  cat /var/log/ldirectord.log | grep Deleted > /var/log/port_problem.log
  exit 1;
fi
### Check if port failed ###
cat /var/log/ldirectord.log 2>&1 | grep Deleted
if [ "$?" -ne "1" ]; then
  cat /var/log/ldirectord.log | grep Deleted > /var/log/port_problem.log
  echo "Ports problem see logfile /var/log/port_problem.log" > /root/port_problem.txt
  $MAIL -s "Some ports failed" $EMAIL < /root/port_problem.txt
fi

We make it executable :

chmod +x /root/ports_failed

This works like the first script: once the problem is fixed, you must run :

/root/ports_failed fix

so that the script starts checking again.
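
For readers who want to see the reset step in isolation, here is a small sketch of how a "fix" argument can be handled with case. One thing to keep in mind with bash: a test such as [ "$1" = "fix" ] needs spaces around the operator, because without them the whole expression is a single non-empty word and is always true. The function names reset_monitor and handle_args are hypothetical, and the paths are passed in as arguments so the sketch never touches the real logs.

```shell
#!/bin/bash
# Sketch of handling a "fix" argument. reset_monitor and handle_args are
# hypothetical helper names; the file paths are supplied by the caller.

# reset_monitor FLAG_FILE LOG_FILE
reset_monitor() {
  rm -f "$1"      # -f: no error if the flag file was never created
  : > "$2"        # truncate the log so old entries are not matched again
}

# handle_args ARGUMENT FLAG_FILE LOG_FILE
# Prints "reset" after a fix, "check" when no argument was given.
handle_args() {
  case "$1" in
    fix)
      reset_monitor "$2" "$3"
      echo "reset"
      ;;
    "")
      echo "check"
      ;;
    *)
      echo "unknown argument: $1" >&2
      return 2
      ;;
  esac
}

# Possible use at the top of a monitoring script (paths from this tutorial):
#   handle_args "$1" /root/port_problem.txt /var/log/ldirectord.log
```

Rejecting unknown arguments is my own addition here; it just avoids wiping a log because of a typo.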

Now add both scripts to your crontab :

crontab -e

* * * * * /root/ports_failed  >/dev/null 2>&1
* * * * * /root/lb2_check  >/dev/null 2>&1

 

15.2 Monitoring from lb2.example.com

Monitoring from the second load balancer is important because it tells us when the master load balancer has failed and, once it has, keeps an eye on port failures on web1 and web2.

First we must install sendmail so lb2.example.com will be able to send mail :

apt-get install sendmail

vi /root/ports_check

And paste this script :
#!/bin/bash
# Ldirectord ports failure check
# Copyright (c) 2008 blogama.org
# This script is licensed under GNU GPL version 2.0 or above
# ---------------------------------------------------------------------

### This script does 2 verifications ###
### 1) check if master load balancer failed and send mail notification ###
### 2) If master load balancer failed, check for port failure on load balanced servers ###
### To be modified ###
EMAIL="[email protected]"
###### Do not make modifications below ######
### Binaries ###
MAIL=$(which mail)
### Date ###
NOW=$(date)
### To restore to original when problem fixed ###
# Note: the test needs quotes and spaces around "=", otherwise any argument counts as "fix"
if [ "$1" = "fix" ]; then
  # rm -f does not complain when a file is already gone
  rm -f /root/lb1_problem.txt /root/port_problem.txt /root/server_problem_notified.txt
  > /var/log/ldirectord.log
  > /var/log/ha-log
  exit 1;
fi
#check if ldirectord is running on lb2.example.com (means that lb1.example.com failed)
#$LDIRECTORD /etc/ha.d/ldirectord.cf status 2>&1 | grep running
cat /var/log/ha-log | grep "takeover complete" > /dev/null 2>&1
if [ "$?" -ne "1" ]; then
  ###check if already notified###
  cd /root
  if [ -f port_problem.txt ]; then
    cat /var/log/ldirectord.log | grep Deleted > /var/log/port_problem.log
    exit 1;
  fi
  ### Check if port failed ###
  cat /var/log/ldirectord.log 2>&1 | grep Deleted
  if [ "$?" -ne "1" ]; then
    cat /var/log/ldirectord.log | grep Deleted > /var/log/port_problem.log
    echo "Ports problem see logfile /var/log/port_problem.log" > /root/port_problem.txt
    $MAIL -s "Some ports failed" $EMAIL < /root/port_problem.txt
  fi
       
  ### Check if already notified that master load balancer failed ###
  cd /root
  if [ -f server_problem_notified.txt ]; then
    exit 1;
  fi
        
  ### Notify that master load balancer failed ###
  cd /root
  MESSAGE="$NOW : Master load balancer failed"
  echo $MESSAGE > lb1_problem.txt
  $MAIL -s "Master load balancer failed" $EMAIL < /root/lb1_problem.txt
  echo "notified" > server_problem_notified.txt
fi

We make it executable :

chmod +x /root/ports_check

And we add it to our crontab :

crontab -e

* * * * * /root/ports_check  >/dev/null 2>&1

When you get a notification from the script, fix the problem and then run :

/root/ports_check fix
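
The logic on lb2 has two stages: first look for "takeover complete" in the Heartbeat log, and only if it is there look for "Deleted" in the ldirectord log. Here is a minimal sketch of just that decision, with the log paths passed in as arguments. The function names and the status words it prints are my own invention for illustration; only the two search strings come from the script above.

```shell
#!/bin/bash
# Sketch of the two-stage check done on the backup load balancer.
# Function names and the printed status words are hypothetical.

# master_failed HA_LOG: succeeds if the Heartbeat log records a takeover.
master_failed() {
  grep -q "takeover complete" "$1" 2>/dev/null
}

# ports_down LDIRECTORD_LOG: succeeds if ldirectord logged a "Deleted" entry.
ports_down() {
  grep -q "Deleted" "$1" 2>/dev/null
}

# check_cluster HA_LOG LDIRECTORD_LOG: prints one status line.
check_cluster() {
  if ! master_failed "$1"; then
    echo "master-ok"                    # lb1 is still active, nothing to report
  elif ports_down "$2"; then
    echo "master-failed ports-failed"
  else
    echo "master-failed"
  fi
}

# Possible use with the paths from this tutorial:
#   check_cluster /var/log/ha-log /var/log/ldirectord.log
```

Because the port check only runs after a takeover, lb2 stays quiet about web1 and web2 while lb1 is healthy and doing that job itself.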

 

15.3 Monitoring from web1 & web2

Monitoring of the web cluster is already partially done with monit and munin.

The part that is not covered yet is the monitoring of MySQL replication.

Please read the following article :

Repair MySQL master-master replication

MySQL monitoring is optional, but on a production server problems can happen with MySQL replication, so I really recommend using those scripts or something similar to check database consistency.
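
To give an idea of what "something similar" could look like, here is a rough sketch that is not taken from the linked article. It reads the output of SHOW SLAVE STATUS\G on standard input and succeeds only when both replication threads report Yes. The function name replication_ok is hypothetical, and this only tells you that replication is running, not that the data on both servers is identical.

```shell
#!/bin/bash
# Sketch: decide from SHOW SLAVE STATUS\G output whether replication runs.
# replication_ok is a hypothetical name; it reads the status text on stdin.

replication_ok() {
  local status
  status=$(cat)
  # Both the I/O thread and the SQL thread must report "Yes".
  echo "$status" | grep -q "Slave_IO_Running: Yes" &&
    echo "$status" | grep -q "Slave_SQL_Running: Yes"
}

# Possible use on web1 or web2, assuming the mysql client can log in
# without prompting (for example through credentials in ~/.my.cnf):
#   mysql -e 'SHOW SLAVE STATUS\G' | replication_ok || echo "replication broken"
```

Reading from standard input keeps the check independent of how you connect to MySQL, so it can be tried with saved output first.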

 

15.4 Monitoring from remote server

This part adds an extra layer of safety by checking the important ports (25, 53, 80, 443) from a remote server (install the dnsutils package for dig) :

#!/bin/bash
# Script to check important ports on remote webserver
# Copyright (c) 2008 blogama.org
# This script is licensed under GNU GPL version 2.0 or above
# ---------------------------------------------------------------------
### This script checks ports 25, 53, 80 and 443 ###
### After 2 failed checks on a port it sends a mail notification ###
### To be modified ###
WEBSERVERIP="192.168.1.106"
MAILSERVERIP="192.168.1.106"
EMAIL="[email protected]"
DNSSERVERIP="192.168.1.106"
DOMAINTOCHECKDNS="example.com"
DOMAINIP="192.168.1.106"

###### Do not make modifications below ######
### Binaries ###
MAIL=$(which mail)
TELNET=$(which telnet)
DIG=$(which dig)
### Check if already notified###
cd /root
if [ -f server_problem.txt ]; then
  exit 1;
fi
### Test SMTP ###
(
echo "quit"
) | $TELNET $MAILSERVERIP 25 | grep Connected > /dev/null 2>&1
if [ "$?" -ne "1" ]; then
  echo "PORT CONNECTED"
else
  if [ -f server_problem_first_time_25.txt ]; then
    echo "PORT 25 NOT CONNECTED" >> /root/server_problem.txt
  else
    echo "NOT CONNECTED" > /root/server_problem_first_time_25.txt
  fi
fi
### Test HTTP ###
(
echo "quit"
) | $TELNET $WEBSERVERIP 80 | grep Connected > /dev/null 2>&1
if [ "$?" -ne "1" ]; then
  echo "PORT CONNECTED"
else
  if [ -f server_problem_first_time_80.txt ]; then
    echo "PORT 80 NOT CONNECTED" >> /root/server_problem.txt
  else
    echo "NOT CONNECTED" > /root/server_problem_first_time_80.txt
  fi
fi
### Test HTTPS###
(
echo "quit"
) | $TELNET $WEBSERVERIP 443 | grep Connected > /dev/null 2>&1
if [ "$?" -ne "1" ]; then
  echo "PORT CONNECTED"
else
  if [ -f server_problem_first_time_443.txt ]; then
    echo "PORT 443 NOT CONNECTED" >> /root/server_problem.txt
  else
    echo "NOT CONNECTED" > /root/server_problem_first_time_443.txt
  fi
fi

### Test DNS ###
$DIG $DOMAINTOCHECKDNS @$DNSSERVERIP | grep $DOMAINIP
if [ "$?" -ne "1" ]; then
  echo "PORT CONNECTED"
else
  if [ -f server_problem_first_time_53.txt ]; then
    echo "PORT 53 NOT CONNECTED" >> /root/server_problem.txt
  else
    echo "NOT CONNECTED" > /root/server_problem_first_time_53.txt
  fi
fi
### Send mail notification after 2 failed check ###
if [ -f server_problem.txt ]; then
  $MAIL -s "Server problem" $EMAIL < /root/server_problem.txt
fi
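
The "two failed checks" rule used above can be looked at on its own. Here is a small sketch of it, assuming a state directory passed in as an argument: the first failure for a port only leaves a marker file, and the second one raises the alert. The function name record_result and the words it prints are hypothetical. Note one deliberate difference from the script above: this sketch removes the marker when the port answers again, which the original does not do.

```shell
#!/bin/bash
# Sketch of the "alert on the second failure" rule, one marker file per port.
# record_result and the printed words are hypothetical names.

# record_result STATE_DIR PORT up|down
# Prints "ok" on success, "warn" on the first failure, "alert" on the next one.
record_result() {
  local marker="$1/server_problem_first_time_$2.txt"
  if [ "$3" = "up" ]; then
    rm -f "$marker"       # assumption: forget the first strike once the port recovers
    echo "ok"
  elif [ -f "$marker" ]; then
    echo "alert"          # second failure in a row: time to send mail
  else
    echo "NOT CONNECTED" > "$marker"
    echo "warn"           # first failure: remember it, stay quiet
  fi
}
```

Called once per cron run and per port, for example record_result /root 80 down, it avoids a mail for a single dropped connection while still reporting a real outage on the following run.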

Et voila! Feel free to send me private emails at admin [at] marchost.com or post comments here or on my page : blogama.org
