
Loadbalanced High-Availability Apache Cluster Problem


jason331
2nd October 2008, 19:13
Hi everyone,

I am following the HowToForge tutorial on how to set up a high-availability, load-balanced Apache2 cluster (http://www.howtoforge.com/high_availability_loadbalanced_apache_cluster) and have run into a problem. I am at step 7 on page 4 of the tutorial, where the author states “You can now access the web site that is hosted by the two Apache nodes by typing http://192.168.0.105 in the browser”, but that step is not working for me. When I browse to my virtual IP, I get “the page could not be displayed”. I also cannot telnet to port 80 or port 443 on my virtual IP. All the tests on page 3 of the tutorial (ip addr sh eth0, ldirectord ldirectord.cf status, ipvsadm -L -n, and /etc/ha.d/resource.d/LVSSyncDaemonSwap master status) pass, with exactly the same results as shown in the examples.

Any thoughts as to what I am doing wrong?

Here’s what I have:


All servers are on the same network segment with no firewalls in between.
balancer1 – load balancer running Debian etch 4.0r4, IP address: 192.168.0.12
balancer2 – load balancer running Debian etch 4.0r4, IP address: 192.168.0.13
maia1 – web server running OpenSuSE 11 and Apache2, IP address: 192.168.0.7
maia2 – web server running OpenSuSE 11 and Apache2, IP address: 192.168.0.6
Virtual cluster IP: 192.168.0.8


Balancer1 ha.cf:
logfacility local0
bcast eth0 # Linux
mcast eth0 225.0.0.1 694 1 0
auto_failback off
node balancer1
node balancer2
respawn hacluster /usr/lib/heartbeat/ipfail
apiauth ipfail gid=haclient uid=hacluster

Balancer2 ha.cf:
logfacility local0
bcast eth0 # Linux
mcast eth0 225.0.0.1 694 1 0
auto_failback off
node balancer1
node balancer2
respawn hacluster /usr/lib/heartbeat/ipfail
apiauth ipfail gid=haclient uid=hacluster

Balancer1 haresources:
balancer1 \
ldirectord::ldirectord.cf \
LVSSyncDaemonSwap::master \
IPaddr2::192.168.0.8/24/eth0/192.168.0.255

Balancer2 haresources:
balancer1 \
ldirectord::ldirectord.cf \
LVSSyncDaemonSwap::master \
IPaddr2::192.168.0.8/24/eth0/192.168.0.255

Balancer1 ldirectord.cf:
checktimeout=10
checkinterval=2
autoreload=no
logfile="local0"
quiescent=yes

## HTTPS
virtual=192.168.0.8:443
	real=192.168.0.7:443 gate
	real=192.168.0.6:443 gate
	fallback=127.0.0.1:443 gate
	service=https
	request="ldirector.html"
	receive="Test Page"
	scheduler=rr
	protocol=tcp
	checktype=negotiate

## HTTP
virtual=192.168.0.8:80
	real=192.168.0.7:80 gate
	real=192.168.0.6:80 gate
	fallback=127.0.0.1:80 gate
	service=http
	request="ldirector.html"
	receive="Test Page"
	scheduler=rr
	protocol=tcp
	checktype=negotiate

Balancer2 ldirectord.cf:
checktimeout=10
checkinterval=2
autoreload=no
logfile="local0"
quiescent=yes

## HTTPS
virtual=192.168.0.8:443
	real=192.168.0.7:443 gate
	real=192.168.0.6:443 gate
	fallback=127.0.0.1:443 gate
	service=https
	request="ldirector.html"
	receive="Test Page"
	scheduler=rr
	protocol=tcp
	checktype=negotiate

## HTTP
virtual=192.168.0.8:80
	real=192.168.0.7:80 gate
	real=192.168.0.6:80 gate
	fallback=127.0.0.1:80 gate
	service=http
	request="ldirector.html"
	receive="Test Page"
	scheduler=rr
	protocol=tcp
	checktype=negotiate

Thank you very much in advance for looking into this. Please let me know if there is any other information I can provide.
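
For reference, here is a rough sketch of what the checktype=negotiate settings in the ldirectord.cf above ask for, as far as I understand them: fetch the request path from a real server and look for the receive string in the reply. This is not ldirectord's actual code. The function names response_matches and negotiate_check are made up for illustration, it only covers plain HTTP, and it assumes receive is treated as a regular expression.

```python
import http.client
import re
import urllib.request


def response_matches(body, receive):
    """Return True if the receive pattern occurs anywhere in the response body."""
    return re.search(receive, body) is not None


def negotiate_check(host, port, request, receive, timeout=10):
    """Hypothetical stand-in for an HTTP negotiate check against one real server."""
    url = f"http://{host}:{port}/{request.lstrip('/')}"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            body = resp.read().decode("utf-8", errors="replace")
    except (OSError, http.client.HTTPException):
        # Connection refused, timeout, HTTP error status, or malformed reply:
        # treat all of these as a failed check.
        return False
    return response_matches(body, receive)
```

Something like negotiate_check("192.168.0.7", 80, "ldirector.html", "Test Page") run from a balancer would test one web server directly, without going through the virtual IP.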

falko
3rd October 2008, 13:24
Are there any errors in your logs?

jason331
3rd October 2008, 18:04
I assume you mean /var/log/messages, right? If so, here's an excerpt of what I see in that log (repeating every so often):

Oct 3 03:54:03 balancer1 ldirectord[2547]: Quiescent real server: 192.168.0.6:443 ( x 192.168.0.8:443) (Weight set to 0)
Oct 3 03:54:08 balancer1 ldirectord[2547]: Restored real server: 192.168.0.6:443 ( x 192.168.0.8:443) (Weight set to 1)
Oct 3 04:13:22 balancer1 -- MARK --
Oct 3 04:33:22 balancer1 -- MARK --
Oct 3 04:53:23 balancer1 -- MARK --

Is there another log somewhere I can check?

falko
4th October 2008, 14:33
What's in /var/log/syslog?

Also, did you check the logs on the web servers?

jason331
4th October 2008, 21:19
Here's the /var/log/syslog contents from balancer1:
Oct 4 06:25:06 balancer1 syslogd 1.4.1#18: restart.
Oct 4 06:53:43 balancer1 -- MARK --
Oct 4 07:13:43 balancer1 -- MARK --
Oct 4 07:17:01 balancer1 /USR/SBIN/CRON[4509]: (root) CMD ( cd / && run-parts --report /etc/cron.hourly)
Oct 4 07:33:43 balancer1 -- MARK --
Oct 4 07:40:20 balancer1 ldirectord[2547]: Quiescent real server: 192.168.0.6:443 ( x 192.168.0.8:443) (Weight set to 0)
Oct 4 07:40:24 balancer1 ldirectord[2547]: Restored real server: 192.168.0.6:443 ( x 192.168.0.8:443) (Weight set to 1)
Oct 4 07:53:44 balancer1 -- MARK --

Here's the /var/log/syslog contents from balancer2:
Oct 4 06:25:27 balancer2 syslogd 1.4.1#18: restart.
Oct 4 06:50:17 balancer2 -- MARK --
Oct 4 07:10:18 balancer2 -- MARK --
Oct 4 07:17:02 balancer2 /USR/SBIN/CRON[2581]: (root) CMD ( cd / && run-parts --report /etc/cron.hourly)
Oct 4 07:30:21 balancer2 -- MARK --
Oct 4 07:50:22 balancer2 -- MARK --
Oct 4 08:10:22 balancer2 -- MARK --
Oct 4 08:17:01 balancer2 /USR/SBIN/CRON[2586]: (root) CMD ( cd / && run-parts --report /etc/cron.hourly)
Oct 4 08:30:26 balancer2 -- MARK --
Oct 4 08:50:38 balancer2 -- MARK --
Oct 4 09:10:38 balancer2 -- MARK --
Oct 4 09:17:01 balancer2 /USR/SBIN/CRON[2591]: (root) CMD ( cd / && run-parts --report /etc/cron.hourly)
Oct 4 09:30:38 balancer2 -- MARK --
Oct 4 09:50:44 balancer2 -- MARK --
Oct 4 10:10:49 balancer2 -- MARK --
Oct 4 10:17:01 balancer2 /USR/SBIN/CRON[2596]: (root) CMD ( cd / && run-parts --report /etc/cron.hourly)

In the Apache2 logs on maia1 there are hundreds of the entries below, repeating over and over. There is pretty much nothing else apart from unrelated HTTP requests to this server (it is currently an active web server):
192.168.0.12 - - [04/Oct/2008:13:13:16 -0500] "GET /ldirector.html HTTP/1.1" 200 9 "-" "libwww-perl/5.805"
192.168.0.12 - - [04/Oct/2008:13:13:18 -0500] "GET /ldirector.html HTTP/1.1" 200 9
192.168.0.12 - - [04/Oct/2008:13:13:18 -0500] "GET /ldirector.html HTTP/1.1" 200 9 "-" "libwww-perl/5.805"
192.168.0.12 - - [04/Oct/2008:13:13:20 -0500] "GET /ldirector.html HTTP/1.1" 200 9
192.168.0.12 - - [04/Oct/2008:13:13:21 -0500] "GET /ldirector.html HTTP/1.1" 200 9 "-" "libwww-perl/5.805"

I see the same thing in the Apache2 logs on maia2:
192.168.0.12 - - [05/Oct/2008:21:21:52 -0500] "GET /ldirector.html HTTP/1.1" 200 9 "-" "libwww-perl/5.805"
192.168.0.12 - - [05/Oct/2008:21:21:54 -0500] "GET /ldirector.html HTTP/1.1" 200 9
192.168.0.12 - - [05/Oct/2008:21:21:54 -0500] "GET /ldirector.html HTTP/1.1" 200 9 "-" "libwww-perl/5.805"
192.168.0.12 - - [05/Oct/2008:21:21:57 -0500] "GET /ldirector.html HTTP/1.1" 200 9
192.168.0.12 - - [05/Oct/2008:21:21:57 -0500] "GET /ldirector.html HTTP/1.1" 200 9 "-" "libwww-perl/5.805"
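
One way to check whether anything other than the balancer's health checks reaches the web servers is to tally the access log by client IP. The snippet below is only a sketch under assumptions: the names LOG_LINE and tally_requests are made up, and the regex only handles lines shaped like the ones pasted above. With gate (direct routing) forwarding, requests that arrive through the virtual IP should, as far as I know, be logged with the client's own address rather than the balancer's.

```python
import re
from collections import Counter

# Matches the start of an Apache access-log line: client IP, timestamp, request line.
LOG_LINE = re.compile(r'^(\S+) \S+ \S+ \[[^\]]+\] "(\S+) (\S+) [^"]*"')


def tally_requests(lines, check_path="/ldirector.html"):
    """Count requests per client IP, split into health checks and other traffic."""
    checks, other = Counter(), Counter()
    for line in lines:
        m = LOG_LINE.match(line)
        if not m:
            continue  # skip lines that do not look like access-log entries
        ip, _method, path = m.groups()
        (checks if path == check_path else other)[ip] += 1
    return checks, other


# Two of the log lines from maia1, copied from this post.
sample = [
    '192.168.0.12 - - [04/Oct/2008:13:13:16 -0500] "GET /ldirector.html HTTP/1.1" 200 9 "-" "libwww-perl/5.805"',
    '192.168.0.12 - - [04/Oct/2008:13:13:18 -0500] "GET /ldirector.html HTTP/1.1" 200 9',
]
checks, other = tally_requests(sample)
print("health checks:", dict(checks))
print("other traffic:", dict(other))
```

If a test request sent to the virtual IP never shows up under "other traffic" on either web server, that would suggest the packets are not being forwarded to (or accepted by) the real servers.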

Does that help?

falko
5th October 2008, 21:03
What's the output of ifconfig on both load balancers?
Can you ping the virtual IP?

jason331
5th October 2008, 23:11
Balancer1 ifconfig:
balancer1:~# ifconfig
eth0 Link encap:Ethernet HWaddr 00:08:74:9E:47:12
inet addr:192.168.0.12 Bcast:192.168.0.255 Mask:255.255.255.0
inet6 addr: fe80::208:74ff:fe9e:4712/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:3845629 errors:0 dropped:0 overruns:1 frame:0
TX packets:3192221 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:1453406895 (1.3 GiB) TX bytes:371182676 (353.9 MiB)
Interrupt:11 Base address:0x2c00

lo Link encap:Local Loopback
inet addr:127.0.0.1 Mask:255.0.0.0
inet6 addr: ::1/128 Scope:Host
UP LOOPBACK RUNNING MTU:16436 Metric:1
RX packets:8 errors:0 dropped:0 overruns:0 frame:0
TX packets:8 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:560 (560.0 b) TX bytes:560 (560.0 b)


Balancer2 ifconfig:
eth0 Link encap:Ethernet HWaddr 00:03:FF:92:95:F0
inet addr:192.168.0.13 Bcast:192.168.0.255 Mask:255.255.255.0
inet6 addr: fe80::203:ffff:fe92:95f0/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:1042693 errors:0 dropped:0 overruns:0 frame:0
TX packets:466587 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:809290912 (771.7 MiB) TX bytes:85876112 (81.8 MiB)
Interrupt:11 Base address:0xec00

lo Link encap:Local Loopback
inet addr:127.0.0.1 Mask:255.0.0.0
inet6 addr: ::1/128 Scope:Host
UP LOOPBACK RUNNING MTU:16436 Metric:1
RX packets:8 errors:0 dropped:0 overruns:0 frame:0
TX packets:8 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:560 (560.0 b) TX bytes:560 (560.0 b)

The virtual IP is pingable:
C:\Documents and Settings\User>ping balancer

Pinging balancer.mydomain.com [192.168.0.8] with 32 bytes of data:

Reply from 192.168.0.8: bytes=32 time<1ms TTL=64
Reply from 192.168.0.8: bytes=32 time<1ms TTL=64
Reply from 192.168.0.8: bytes=32 time<1ms TTL=64
Reply from 192.168.0.8: bytes=32 time<1ms TTL=64

Ping statistics for 192.168.0.8:
Packets: Sent = 4, Received = 4, Lost = 0 (0% loss),
Approximate round trip times in milli-seconds:
Minimum = 0ms, Maximum = 0ms, Average = 0ms

There are no firewalls or anything else between any of the 4 nodes I am working with. Everything else is pingable as well (both balancers individually, both web servers, etc.). It's almost acting like port forwarding isn't working correctly: I can't telnet to ports 80 or 443 on the virtual IP even though I can ping it. As I noted in my original post, all of the tests that verify the cluster is running pass (ip addr sh eth0, ldirectord ldirectord.cf status, ipvsadm -L -n, and /etc/ha.d/resource.d/LVSSyncDaemonSwap master status). I literally copied and pasted the tutorial examples into PuTTY windows when I set these up (changing IPs where appropriate). I even went so far as to download Debian sarge and go through the tutorial again, thinking it was a problem with etch, but I got the same results. One more note: I can telnet directly to ports 80 and 443 on the web servers, and I can browse web pages on them, so I know they are working.
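
The telnet tests described above can be repeated in one go with a small script. This is just a sketch: port_open and probe_all are made-up helper names, and the 3 second timeout is an arbitrary choice.

```python
import socket


def port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, unreachable host, and timeouts.
        return False


def probe_all(hosts, ports=(80, 443), timeout=3.0):
    """Probe every host/port pair and return a {(host, port): bool} mapping."""
    return {(h, p): port_open(h, p, timeout) for h in hosts for p in ports}
```

Calling probe_all(["192.168.0.8", "192.168.0.7", "192.168.0.6"]) from a client machine would show at a glance whether the virtual IP and the two web servers accept connections on ports 80 and 443.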

falko
6th October 2008, 16:35
To be honest, I'm not sure what's wrong. Maybe you should try this tutorial instead: http://www.howtoforge.com/high-availability-load-balancer-haproxy-heartbeat-debian-etch

fazi_puri
12th November 2009, 09:04
Hey buddy, have you resolved the problem you were facing? I am facing the same problem and am stuck at test #7 on page 4.