The Perfect Load-Balanced & High-Availability Web Cluster With 2 Servers Running Xen On Ubuntu 8.04 Hardy Heron - Page 6
12. Setting up the load balancers (lb1, lb2)
12.1 Enable IPVS On The Load Balancers
First we must enable IPVS on our load balancers. IPVS (IP Virtual Server) implements transport-layer load balancing inside the Linux kernel, so-called Layer-4 switching.
echo ip_vs_dh >> /etc/modules
echo ip_vs_ftp >> /etc/modules
echo ip_vs >> /etc/modules
echo ip_vs_lblc >> /etc/modules
echo ip_vs_lblcr >> /etc/modules
echo ip_vs_lc >> /etc/modules
echo ip_vs_nq >> /etc/modules
echo ip_vs_rr >> /etc/modules
echo ip_vs_sed >> /etc/modules
echo ip_vs_sh >> /etc/modules
echo ip_vs_wlc >> /etc/modules
echo ip_vs_wrr >> /etc/modules
Then we load the modules right away (so we don't have to reboot):
modprobe ip_vs_dh
modprobe ip_vs_ftp
modprobe ip_vs
modprobe ip_vs_lblc
modprobe ip_vs_lblcr
modprobe ip_vs_lc
modprobe ip_vs_nq
modprobe ip_vs_rr
modprobe ip_vs_sed
modprobe ip_vs_sh
modprobe ip_vs_wlc
modprobe ip_vs_wrr
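As an optional sanity check, you can verify that the kernel has actually picked up the IPVS modules:

lsmod | grep ip_vs

You should see ip_vs and the scheduler modules (ip_vs_rr, ip_vs_wlc, etc.) in the output.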
12.2 Install Ultra Monkey (packages) On The Load Balancers
Install the Ultra Monkey packages (ipvsadm, ldirectord, and heartbeat) on the load balancers:
apt-get install ipvsadm ldirectord heartbeat
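If you want to make sure all three packages were installed correctly, you can query dpkg (a quick optional check):

dpkg -l ipvsadm ldirectord heartbeat

Each package should be listed with status ii (installed).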
12.3 Enable Packet Forwarding On The Load Balancers
The load balancers must be able to route traffic to the Apache nodes. Therefore we must enable packet forwarding on the load balancers. Add the following line to /etc/sysctl.conf:
vi /etc/sysctl.conf
# Enables packet forwarding
net.ipv4.ip_forward = 1
Then apply the setting:
sysctl -p
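To confirm that packet forwarding is really active, read the value back from /proc (it must be 1):

cat /proc/sys/net/ipv4/ip_forward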
12.4 Configure heartbeat And ldirectord
Now we have to create three configuration files for heartbeat on both lb1 and lb2 (be careful with spaces and tabs if you edit them in some text editors; ldirectord is very picky!):
vi /etc/ha.d/ha.cf
logfacility        local0
bcast        eth0                # Linux
mcast eth0 225.0.0.1 694 1 0
auto_failback on
node        lb1.example.com
node        lb2.example.com
respawn hacluster /usr/lib/heartbeat/ipfail
apiauth ipfail gid=haclient uid=hacluster
Important: As node names we must use the output of the following command on both load balancers:
uname -n
vi /etc/ha.d/haresources
lb1.example.com \
        ldirectord::ldirectord.cf \
        LVSSyncDaemonSwap::master \
        IPaddr2::192.168.1.106/24/eth0/192.168.1.255 \
        IPaddr2::192.168.1.107/24/eth0/192.168.1.255
The IPs 192.168.1.106 and 192.168.1.107 will be used later for the websites (example.com and yoursite.com).
This file must be identical on both nodes, no matter whether you create it first on lb1 or lb2!
vi /etc/ha.d/authkeys
auth 3
3 md5 somerandomstring
somerandomstring is a password which the two heartbeat daemons on lb1 and lb2 use to authenticate against each other. Use your own string here. You have the choice between three authentication mechanisms. I use md5 as I believe it is the most secure one.
/etc/ha.d/authkeys should be readable by root only, therefore we do this:
chmod 600 /etc/ha.d/authkeys
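A quick check that the permissions are correct:

ls -l /etc/ha.d/authkeys

The output should show -rw------- 1 root root, i.e. only root can read the file.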
ldirectord is the actual load balancer. We are going to configure our two load balancers (lb1.example.com and lb2.example.com) in an active/passive setup, which means we have one active load balancer, and the other one is a hot-standby and becomes active if the active one fails. To make it work, we must create the ldirectord configuration file /etc/ha.d/ldirectord.cf which again must be identical on lb1 and lb2.
vi /etc/ha.d/ldirectord.cf
checktimeout=5
checkinterval=5
autoreload=no
logfile="/var/log/ldirectord.log"
quiescent=no
#fork=yes

#FOR SMTP
virtual=192.168.1.106:25
        real=192.168.1.104:25 gate
        fallback=192.168.1.105:25 gate
        service=none
        scheduler=wlc
        protocol=tcp
        checktype=connect

virtual=192.168.1.107:25
        real=192.168.1.104:25 gate
        fallback=192.168.1.105:25 gate
        service=none
        scheduler=wlc
        protocol=tcp
        checktype=connect

#FOR DNS - CONNECT DOESN'T WORK, MUST BE PATCHED, BUT PING IS OK
virtual=192.168.1.106:53
        real=192.168.1.104:53 gate
        fallback=192.168.1.105:53 gate
        service=none
        scheduler=wlc
        checktype=ping
        protocol=udp

virtual=192.168.1.106:53
        real=192.168.1.104:53 gate
        fallback=192.168.1.105:53 gate
        service=dns
        scheduler=wlc
        checktype=ping
        protocol=tcp

#FOR HTTP
virtual=192.168.1.106:80
        real=192.168.1.104:80 gate
        real=192.168.1.105:80 gate
        service=http
        request="ldirectord.php"
        receive="Connected to MySQL"
        scheduler=wlc
        protocol=tcp
        checktype=negotiate
        persistent=28800

virtual=192.168.1.107:80
        real=192.168.1.104:80 gate
        real=192.168.1.105:80 gate
        service=http
        request="ldirectord.php"
        receive="Connected to MySQL"
        scheduler=wlc
        protocol=tcp
        checktype=negotiate
        persistent=28800

#FOR WEBMAIL
virtual=192.168.1.106:81
        real=192.168.1.104:81 gate
        fallback=192.168.1.105:81 gate
        service=http
        request="ldirectord.php"
        receive="Connected to MySQL"
        scheduler=wlc
        protocol=tcp
        checktype=negotiate

virtual=192.168.1.107:81
        real=192.168.1.104:81 gate
        fallback=192.168.1.105:81 gate
        service=http
        request="ldirectord.php"
        receive="Connected to MySQL"
        scheduler=wlc
        protocol=tcp
        checktype=negotiate

#FOR POP3
virtual=192.168.1.106:110
        real=192.168.1.104:110 gate
        fallback=192.168.1.105:110 gate
        service=pop
        checktype=connect
        scheduler=wlc
        protocol=tcp

#FOR IMAP
virtual=192.168.1.106:143
        real=192.168.1.104:143 gate
        fallback=192.168.1.105:143 gate
        service=imap
        scheduler=wlc
        protocol=tcp

#FOR HTTPS
###Un-comment this part if you will use HTTPS
#virtual=192.168.1.106:443
#        real=192.168.1.104:443 gate
#        real=192.168.1.105:443 gate 2
#        service=http
#        request="ldirectord.php"
#        receive="Connected to MySQL"
#        scheduler=wlc
#        protocol=tcp
#        checktype=negotiate
#        persistent=28800
#
#virtual=192.168.1.107:443
#        real=192.168.1.104:443 gate
#        real=192.168.1.105:443 gate 2
#        service=http
#        request="ldirector.html"
#        receive="Test Page"
#        scheduler=wlc
#        protocol=tcp
#        checktype=negotiate
#        persistent=28800

#FOR IMAP SSL
virtual=192.168.1.106:993
        real=192.168.1.104:993 gate
        fallback=192.168.1.105:993 gate
        service=imaps
        scheduler=wlc
        protocol=tcp

#FOR POP3 SSL
virtual=192.168.1.106:995
        real=192.168.1.104:995 gate
        fallback=192.168.1.105:995 gate
        service=pops
        checktype=ping
        scheduler=wlc
        protocol=tcp

#FOR MONIT MONITORING #1
virtual=192.168.1.106:10001
        real=192.168.1.104:10001 gate
        checktype=on

#FOR MONIT MONITORING #2
virtual=192.168.1.106:20001
        real=192.168.1.105:20001 gate
        checktype=on
virtual is the virtual IP address of the service (e.g. 192.168.1.106 and 192.168.1.107).
real are the IP addresses of the real servers in the cluster (192.168.1.104 and 192.168.1.105).
fallback is the backup server. If the real IP fails, requests are forwarded to the fallback IP, but they are not load balanced.
This config is based on my personal experience. Some services are load balanced, others are not. Everything related to mail is not load balanced: you don't want one message to arrive on the first server and the next one on the other (unless you have shared storage). If you don't have very high mail traffic, there is no point in load balancing mail (it is still highly available); the same goes for DNS. Later on we will rsync the messages to the second server so we have a backup in the event that the first server fails.
About port 81: we will use it for webmail. I also use it for our e-commerce website administration because of image uploads. Later on we will set up rsync from web1.example.com to web2.example.com, but not the other way around. Basically, you don't want to upload a file to web2.example.com (unless you use shared storage).
If you want more information on the subject, see the ldirectord man page (man ldirectord).
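Note that the HTTP checks above fetch ldirectord.php from each real server and expect the string "Connected to MySQL" in the response; this check file is created on the web servers later in this series. As a rough sketch only (the path /var/www, the user checkuser, and the password are placeholders; adjust them to your setup), such a file could look like this:

vi /var/www/ldirectord.php

<?php
// Hypothetical health check: report success only if MySQL is reachable,
// so ldirectord takes a node out of rotation when its database is down.
$link = @mysql_connect('localhost', 'checkuser', 'checkpassword');
if ($link) {
        echo 'Connected to MySQL';
        mysql_close($link);
} else {
        echo 'MySQL connection failed';
}
?>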
Afterwards we make sure heartbeat starts at boot (the Ubuntu package normally creates its startup links during installation) and remove the startup links of ldirectord, because ldirectord will be started by the heartbeat daemon:
update-rc.d -f ldirectord remove
Finally we start heartbeat (and with it ldirectord):
/etc/init.d/ldirectord stop
/etc/init.d/heartbeat start
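heartbeat needs a moment to bring up the virtual IP addresses. You can follow what it is doing in the logs (with logfacility local0 the messages usually end up in the syslog; the exact file depends on your syslog configuration):

tail -f /var/log/syslog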
12.5 Test the load balancers
Let's check if both load balancers work as expected:
ip addr sh eth0
The active load balancer lb1.example.com should list the virtual IP addresses (192.168.1.106 and 192.168.1.107):
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast qlen 1000
    link/ether 00:00:00:00:00:00 brd ff:ff:ff:ff:ff:ff
    inet 192.168.1.102/24 brd 192.168.1.255 scope global eth0
    inet 192.168.1.106/24 brd 192.168.1.255 scope global eth0
    inet 192.168.1.107/24 brd 192.168.1.255 scope global secondary eth0
The hot-standby (lb2.example.com) should show something like this:
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast qlen 1000
    link/ether 00:0c:29:34:d7:7e brd ff:ff:ff:ff:ff:ff
    inet 192.168.1.103/24 brd 192.168.1.255 scope global eth0
Now try:
ldirectord ldirectord.cf status
Output on the active load balancer (lb1):
ldirectord for /etc/ha.d/ldirectord.cf is running with pid: 5321
Output on the hot-standby load balancer (lb2):
ldirectord is stopped for /etc/ha.d/ldirectord.cf
Now we will check if the ports are forwarded correctly:
ipvsadm -L -n | grep :80
You should see something like this on lb1.example.com:
-> RemoteAddress:Port           Forward Weight ActiveConn InActConn
TCP  192.168.1.106:80 wlc
  -> 192.168.1.104:80           Route   1      0          0
  -> 192.168.1.105:80           Route   0      0          0
TCP  192.168.1.107:80 wlc
  -> 192.168.1.104:80           Route   1      0          0
  -> 192.168.1.105:80           Route   0      0          0
And nothing on lb2.example.com.
One last test:
/etc/ha.d/resource.d/LVSSyncDaemonSwap master status
Output on the active load balancer:
master running (ipvs_syncmaster pid: 5470)
Output on the hot-standby:
master stopped
12.6 Testing load balancer failover
Stop heartbeat on lb1.example.com:
/etc/init.d/heartbeat stop
The ipvsadm command:
ipvsadm -L -n | grep :80
should now output nothing on lb1.example.com.
On lb2.example.com, run:
ipvsadm -L -n | grep :80
and you should see the following:
-> RemoteAddress:Port           Forward Weight ActiveConn InActConn
TCP  192.168.1.106:80 wlc
  -> 192.168.1.104:80           Route   1      0          0
  -> 192.168.1.105:80           Route   0      0          0
TCP  192.168.1.107:80 wlc
  -> 192.168.1.104:80           Route   1      0          0
  -> 192.168.1.105:80           Route   0      0          0
Restart the heartbeat service on lb1.example.com:
/etc/init.d/heartbeat start
If you retry the ipvsadm command on both load balancers, you will see that lb1.example.com is active again while lb2.example.com has gone back to standby.
If your tests went fine, you can go on.
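Optionally, you can also watch a failover from the client side. A simple loop like the following (run from any other machine in the network; the URL assumes the HTTP check file sketched above) keeps requesting the virtual IP while you stop and start heartbeat on lb1; the requests should keep succeeding with only a short interruption:

while true; do wget -q -O - --timeout=2 http://192.168.1.106/ldirectord.php || echo "request failed"; sleep 1; done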