HowtoForge Forums > Linux Forums > Server Operation

#1  29th November 2006, 09:50
nekromancer (Junior Member)

Gentoo Cluster using heartbeat and drbd problem

Hi, I hope this is the right place to post this, since I saw others posting drbd/heartbeat questions here. If there is a mailing list or forum that deals specifically with such things, please direct me to it.

I have set up 2 identical PCs running Gentoo. Both have DRBD v0.7.21 and Heartbeat v1.2.7 (using ldirectord) installed. I am going for a hot-standby (active/passive) system. Each node has one ethernet cable connecting it to the LAN (eth1), and a crossover ethernet cable connects the 2 PCs directly (eth0); the crossover link is dedicated to DRBD replication.

Heartbeat is set up to use eth1 to connect to the LAN and send heartbeats. Both nodes start up fine and they share the virtual IP address perfectly. Failover works if I test it by turning off heartbeat on the primary node. It also works if I unplug the power supply from the primary node. But if I unplug the eth1 network cable, the IP address fails over but the DRBD disk does not switch. The DRBD disk stays mounted on the primary node and is not mounted on the secondary node, even though heartbeat failed over to the secondary node and the secondary node took over the virtual IP address.

The only way I got this to work is by unplugging both the crossover cable (eth0) and the network cable (eth1) at the same time. With both DRBD and the network cut off, the secondary node takes over the drbddisk and the IP together.

Seeing this, I decided to use just 1 interface (eth1) for both DRBD and heartbeat. When I simulate a network failure (unplug the eth1 cable), both fail over. But when I reconnect the cable, it fails back automatically even though I set auto_failback off! Not only that, the data on the drbddisk does not get replicated to the other node once it is reconnected. Doing a cat /proc/drbd somehow shows both disks in a consistent state.

If I set the drbd conf to go stand_alone on disconnect instead of reconnect, and I handle the DRBD disks manually, the data does get replicated! I do this with drbdadm commands, and then start heartbeat on the failed node so that it can be failed back to.

Basically, I don't know why this is happening. What I want is an active/passive hot-standby setup, with DRBD on the crossover cable and heartbeat on the LAN. When one node fails the other should take over. When the failed node comes back online it should NOT fail back; I want the admin to tend to the node and decide whether to fail back or not, so that DRBD can do a sync first.

Below is the drbd.conf file I am using (at least the relevant parts)
Note: 172.22.0.x is the crossover cable
Note: 192.168.1.x is the LAN network

Code:
resource mirror {

   protocol C;
   incon-degr-cmd "echo '!DRBD! pri on incon-degr' | wall ; sleep 60 ; halt -f";

   startup {
      degr-wfc-timeout 20;    # 20 seconds
   }


   disk {
      on-io-error   detach;
   }


   net {
      ko-count 4;
      on-disconnect stand_alone;
      #on-disconnect reconnect;
   }


   syncer {
      rate 10M;
      group 1;
      al-extents 257;
   }


   on gentoo1 {
      device     /dev/drbd0;
      disk       /dev/sda4;
      address    172.22.0.1:7788;
      meta-disk  internal;
   }


   on gentoo2 {
      device    /dev/drbd0;
      disk      /dev/sda4;
      address   172.22.0.2:7788;
      meta-disk internal;
   }
}

This is the ha.cf config file

Code:
logfile   /var/log/ha-log
logfacility   local0

keepalive 1
deadtime 15
warntime 5

bcast   eth1

auto_failback off

node   gentoo1
node   gentoo2
This is the haresources file

Code:
gentoo1 drbddisk::mirror Filesystem::/dev/drbd0::/ha::reiserfs 192.168.1.3/8/eth1 ldirectord apache2
Thanks in advance!
#2  30th November 2006, 09:16
nekromancer (Junior Member)

Meh, so much for this post.
The nodes were split-brained. The fix was to add a STONITH device (a manageable remote power switch) to each node.
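
For anyone hitting the same symptom, here is a rough sketch of what the ha.cf additions might look like. It is not the poster's actual file. The stonith_host type and parameters are the placeholder values from heartbeat's sample ha.cf and must be replaced with those of the real power switch. Sending heartbeats over eth0 as well as eth1 is an extra assumption on top of the fix described above; the poster's ha.cf only used eth1.

```text
# Sketch only: changes to the ha.cf shown in post #1.

# Assumption: heartbeat over both the LAN (eth1) and the crossover
# link (eth0), replacing the original "bcast eth1" line, so that
# pulling one cable does not make each node declare the other dead.
bcast eth0 eth1

# STONITH, general form:
#   stonith_host <host allowed to use the device, or *> <device type> <device parameters...>
# Placeholder values taken from heartbeat's sample ha.cf:
stonith_host * baytech 10.0.0.3 mylogin mysecretpassword
```

Note that with heartbeats on both links, unplugging eth1 alone would no longer trigger a failover by itself, because the nodes still see each other over eth0. Detecting loss of the LAN would then need something extra, such as heartbeat's ipfail with a ping node.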
All times are GMT +2.

