Old 5th August 2008, 14:39
sebastienp
VMWare replication and failover

OK, to be precise:

I have no problem with vm1 when it is started on srv1: it gets its IP (192.168.1.20, statically configured) and I can access it.
But when I disconnect srv1, even though the instance comes online on srv2, vm1 on srv2 doesn't get any IP, because eth0 no longer exists on srv2.

Is this normal?
Does anyone have a clue?

Thank you in advance,
S.

=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=

Hi there,

Once again, many thanks for the time you spent putting these howtos together. They help a lot!!!

Sorry to be a burden, but I have some questions regarding the "Virtual Machine Replication & Failover with VMWare Server & Debian Etch (4.0)" howto.

It looks like I missed something...

OK, I have 2 physical nodes:
srv1:
eth0 : 192.168.1.11/24 - eth1 : 172.16.0.1/20 (heartbeat)
srv2:
eth0 : 192.168.1.12/24 - eth1 : 172.16.0.2/20 (heartbeat)

DRBD and Heartbeat are working well.
srv1:~# cat /proc/drbd
version: 0.7.21 (api:79/proto:74)
SVN Revision: 2326 build by root@srv1.site.local, 2008-07-22 22:14:19
 0: cs:Connected st:Primary/Secondary ld:Consistent
    ns:2236 nr:0 dw:100 dr:2237 al:0 bm:27 lo:0 pe:0 ua:0 ap:0
srv1:~#
srv1:~# /etc/init.d/heartbeat status
heartbeat OK [pid 2645 et al] is running on srv1.site.local [srv1.site.local]...
srv1:~#

Here are the config files:

*drbd.conf :
resource vm1 {
  protocol C;
  incon-degr-cmd "echo '!DRBD! pri on incon-degr' | wall ; sleep 60 ; halt -f";
  startup {
    wfc-timeout 10;
    degr-wfc-timeout 30;
  }
  disk {
    on-io-error detach;
  }
  net {
    max-buffers 20000;
    unplug-watermark 12000;
    max-epoch-size 20000;
  }
  syncer {
    rate 500M;
    group 1;
    al-extents 257;
  }
  on srv1.site.local {
    device /dev/drbd0;
    disk /dev/cciss/c0d0p7;
    address 172.16.0.1:7789;
    meta-disk internal;
  }
  on srv2.site.local {
    device /dev/drbd0;
    disk /dev/cciss/c0d0p7;
    address 172.16.0.2:7789;
    meta-disk internal;
  }
}

*ha.cf :
logfile /var/log/ha-log
logfacility local0
keepalive 1
deadtime 10
warntime 10
udpport 694
bcast eth1
auto_failback on
node srv1.site.local
node srv2.site.local
ping 192.168.1.1
respawn hacluster /usr/lib/heartbeat/ipfail

*authkeys :
auth 1
1 md5 secret

*haresources :
srv1.site.local 192.168.1.10 drbddisk::vm1 Filesystem::/dev/drbd0::/var/vm::ext3 vmstart

vmstart points to the correct files in /var/vm.
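In case it helps pin down the problem, here is a rough sketch of what I imagine a vmstart resource script looks like (this is my assumption, not the howto's actual script); it wraps VMware Server 1.x's vmware-cmd, and the .vmx path is made up for illustration. It defaults to echoing the command so it can be dry-run on a box without VMware:

```shell
#!/bin/sh
# Hypothetical sketch of the "vmstart" resource script named in
# haresources. The .vmx path is an assumption for illustration.

VMX=/var/vm/vm1/vm1.vmx
# Default to 'echo vmware-cmd' for a dry run; set VMWARE_CMD=vmware-cmd
# on the real hosts so the commands actually execute.
VMWARE_CMD=${VMWARE_CMD:-"echo vmware-cmd"}

vm_ctl() {
    case "$1" in
        # "trysoft" asks VMware Tools for a clean power operation first
        start)  $VMWARE_CMD "$VMX" start trysoft ;;
        stop)   $VMWARE_CMD "$VMX" stop trysoft ;;
        status) $VMWARE_CMD "$VMX" getstate ;;
        *)      echo "Usage: vmstart {start|stop|status}" >&2; return 1 ;;
    esac
}

vm_ctl "${1:-status}"
```

Heartbeat calls such a script with "start" on the node taking over and "stop" on the node giving up the resources.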

VMWare server v.1.0.5 is installed and working on both servers, and the VMWare instance vm1.site.local is created on srv1.
Hosts are declared in /etc/hosts.

What I understood is that when vm1 boots, it will get the IP address configured in haresources (192.168.1.10 in this case).

But when I boot vm1, it gets an IP via DHCP.
I can access its services via this IP, but I don't have failover.
When I disconnect srv1, the instance comes online on srv2, but eth0 doesn't exist anymore! It is declared in /etc/network/interfaces as dhcp, but it's not up.
Trying ifup eth0, I get:
SIOCSIFADDR: No such device
eth0: ERROR while getting interface flags: No such device (twice)
Bind socket to interface: No such device
Failed to bring up eth0
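One pattern that would explain these errors (my assumption, not something from the howto): Debian's udev writes persistent-net rules keyed to the NIC's MAC address (on Etch, /etc/udev/rules.d/z25_persistent-net.rules inside the guest). If VMware generates a different MAC when the VM starts on srv2, the old rule still claims the name eth0, so the new NIC comes up under another name and eth0 "disappears". Pinning a static MAC in the .vmx should keep the guest NIC identical on both hosts; a hypothetical fragment (the address must come from VMware's static range 00:50:56:00:00:00 - 00:50:56:3F:FF:FF):

```
ethernet0.present = "TRUE"
ethernet0.connectionType = "bridged"
ethernet0.addressType = "static"
ethernet0.address = "00:50:56:00:00:01"
```

Alternatively, deleting /etc/udev/rules.d/z25_persistent-net.rules in the guest and rebooting lets udev regenerate the rule for the current MAC.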

If I set another IP statically on vm1 (let's say 1.20), I don't have failover, since I lose 1.20 as soon as I disconnect srv1... even though the VM switches to srv2 with the 1.10 IP!
Once again, eth0 disappears.

If I set the haresources IP statically on vm1 (iface eth0 inet static, address 192.168.1.10, ...),
then I reach srv1 (or srv2, depending on which server holds eth0:0...) instead of vm1.
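This is consistent with my understanding (again, an assumption) that the IPaddr resource from haresources adds 192.168.1.10 as an alias on the active physical HOST, not inside the guest, so the host answers on that address. A quick check, run on srv1 and srv2:

```shell
# Hypothetical check: see whether this physical node currently holds
# the heartbeat cluster IP from haresources (192.168.1.10).
if ip -4 addr show 2>/dev/null | grep -qF '192.168.1.10'; then
    echo "this node holds the cluster IP"
else
    echo "cluster IP is not on this node"
fi
```

If the active node reports holding it while the guest does not, the haresources IP and the VM's own IP are two separate things, which would match what I am seeing.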

Could you please be so kind as to explain in more detail what should theoretically happen?
What if I want to configure several virtual machines?
Did I miss something? Did I misunderstand?

Many thanks for your support,
S.

Last edited by sebastienp; 6th August 2008 at 11:43.