29th November 2006, 09:50
nekromancer
Gentoo Cluster using heartbeat and drbd problem

Hi, I hope this is the right place to post this, since I saw others posting drbd/heartbeat questions here. If there is a mailing list or forum that deals specifically with such things, please direct me to it.

I have set up 2 identical PCs running Gentoo. Both have DRBD v0.7.21 and Heartbeat v1.2.7 (using ldirectord) installed. I am going for a hot-standby (active/passive) system. Each node is connected to the LAN with one ethernet cable (eth1). A crossover ethernet cable connects the 2 PCs directly (eth0); this link is dedicated to DRBD replication.

Heartbeat is set up to use eth1 to connect to the LAN and send heartbeats. Both nodes start, everything is fine, and they share the virtual ip address perfectly. The failover works fine if I test it by turning off heartbeat on the primary node. It also works fine if I unplug the power supply from the primary node. But if I unplug the eth1 network cable, the ip address fails over but the DRBD disk does not switch. The drbd disk remains mounted on the primary node and is not mounted on the secondary node, even though heartbeat failed over to the secondary node and the secondary node took over the virtual ip address.

The only way I got this to work is by unplugging both the crossover cable (eth0) and the network cable (eth1) at the same time. That way drbd gets cut off and so does the network, and only then does the secondary node take over both the drbddisk and the ip together.

Seeing this, I decided to use just 1 interface for both drbd and heartbeat (eth1). When I simulate a network failure (unplug the eth1 cable), both fail over. Then when I reconnect the cable it fails back automatically, even though I set auto_failback off! Not only that, the data on the drbd disk does not get replicated to the other node once reconnected. Doing a cat /proc/drbd shows both disks in a consistent state somehow.
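
The /proc/drbd check mentioned above can be scripted so that both nodes are easy to compare after a cable pull. This is only a minimal sketch: it assumes the DRBD 0.7 layout, where each device line carries cs: (connection state), st: (local/peer role) and ld: (local data state) fields, and the function name drbd_state is made up for illustration.

```shell
#!/bin/sh
# drbd_state: hypothetical helper, prints "cs st ld" for each DRBD device.
# Assumes DRBD 0.7-style device lines such as:
#    0: cs:Connected st:Primary/Secondary ld:Consistent
# Reads /proc/drbd by default; another file can be passed as $1 for testing.
drbd_state() {
    awk '/cs:/ {
        cs = st = ld = "?"
        # pick the three fields out of the device line, wherever they sit
        for (i = 1; i <= NF; i++) {
            if ($i ~ /^cs:/) cs = substr($i, 4)
            if ($i ~ /^st:/) st = substr($i, 4)
            if ($i ~ /^ld:/) ld = substr($i, 4)
        }
        print cs, st, ld
    }' "${1:-/proc/drbd}"
}
```

Running drbd_state on each node after reconnecting the cable shows whether the two sides are actually connected to each other or only each report their own local disk as consistent.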

If I set the drbd conf to go stand-alone (on-disconnect stand_alone) instead of reconnect, and I handle the drbd disks manually, the data does get replicated! I do this using drbdadm commands and then start heartbeat on the failed node so it can be failed back to.
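
A manual recovery of that kind could look roughly like the sketch below. The exact command sequence is an assumption and not a record of what was actually typed; the names run and resync_node are hypothetical, and the resource name mirror and mount point /ha are taken from the configs further down. By default the script only prints the commands (DRY_RUN=1), so nothing is touched until that is changed.

```shell
#!/bin/sh
# Hypothetical manual resync, to be run on the node that was cut off,
# with heartbeat stopped on that node. The command order is an assumption.
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands, 0 = execute them

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

resync_node() {
    res="$1"   # DRBD resource name, e.g. mirror
    mnt="$2"   # mount point of the replicated filesystem, e.g. /ha
    run umount "$mnt"             # release the filesystem if it is still mounted
    run drbdadm secondary "$res"  # demote this node so it can resync from the peer
    run drbdadm connect "$res"    # leave StandAlone and reconnect to the peer
    run cat /proc/drbd            # check the connection state afterwards
}
```

Calling resync_node mirror /ha prints the four steps; once /proc/drbd reports the nodes as connected and in sync, heartbeat can be started again on that node.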

Basically I don't know why this is happening. What I want is an active/passive hot-standby setup, with drbd on the crossover cable and heartbeat on the LAN. Once one node fails, the other should take over. When the failed node comes back online it should NOT fail back; I want the admin to tend to the node and decide whether to fail back or not, so that drbd can do a sync first.

Below is the drbd.conf file I am using (at least the relevant parts)
Note: 172.22.0.x is the crossover cable
Note: 192.168.1.x is the LAN network

Code:
resource mirror {

   protocol C;
   incon-degr-cmd "echo '!DRBD! pri on incon-degr' | wall ; sleep 60 ; halt -f";

   startup {
      degr-wfc-timeout 20;    # 20 seconds
   }


   disk {
      on-io-error   detach;
   }


   net {
      ko-count 4;
      on-disconnect stand_alone;
      #on-disconnect reconnect;
   }


   syncer {
      rate 10M;
      group 1;
      al-extents 257;
   }


   on gentoo1 {
      device     /dev/drbd0;
      disk       /dev/sda4;
      address    172.22.0.1:7788;
      meta-disk  internal;
   }


   on gentoo2 {
      device    /dev/drbd0;
      disk      /dev/sda4;
      address   172.22.0.2:7788;
      meta-disk internal;
   }
}

This is the ha.cf config file

Code:
logfile   /var/log/ha-log
logfacility   local0

keepalive 1
deadtime 15
warntime 5

bcast   eth1

auto_failback off

node   gentoo1
node   gentoo2

This is the haresources file

Code:
gentoo1 drbddisk::mirror Filesystem::/dev/drbd0::/ha::reiserfs 192.168.1.3/8/eth1 ldirectord apache2

Thanks in advance!