HowtoForge Forums | HowtoForge - Linux Howtos and Tutorials


djalex 18th August 2006 17:00

High Availability Samba cluster - DRBD + Heartbeat
 
Hello everyone,

This is my first experience with Linux, and I am trying to set up a high-availability Samba cluster with DRBD and Heartbeat.

E N V I R O N M E N T _ D E T A I L S

Primary server
Server name: test02
IP address: 192.168.50.152
Subnet mask: 255.255.255.0
OS: CentOS 4.3 (Kernel version: 2.6.9-34.EL)

Applications installed: DRBD version 0.7.9, Heartbeat version 2.0.7, SAMBA version 3.0.10-1.4E.6

Secondary server
Server name: test01
IP address: 192.168.50.151
Subnet mask: 255.255.255.0
OS: CentOS 4.3 (Kernel version: 2.6.9-34.EL)

Applications installed: DRBD version 0.7.9, Heartbeat version 2.0.7, SAMBA version 3.0.10-1.4E.6

Client system
System name: test03
IP address: 192.168.50.153
OS: Windows XP Professional sp2

Samba is served on the service IP address 192.168.50.195.

Configuration files are as follows:

drbd.conf (test01/test02)

resource r0 {
    protocol A;
    incon-degr-cmd "halt -f";

    startup {
        degr-wfc-timeout 120; # 2 minutes
    }

    disk {
        on-io-error detach;
    }

    net {
    }

    syncer {
        rate 10M;
        group 1;
        al-extents 257;
    }

    on test01 {
        device /dev/drbd0;
        disk /dev/hda5;
        address 192.168.50.151:7789;
        meta-disk internal;
    }

    on test02 {
        device /dev/drbd0;
        disk /dev/hda5;
        address 192.168.50.152:7789;
        meta-disk internal;
    }
}

ha.cf (test01/test02)

logfacility local0
logfile /var/log/ha-log
debug 1
bcast eth0
keepalive 2
deadtime 10
auto_failback off
node test01
node test02
ping test01
ping test02
#respawn hacluster /user/lib/heartbeat/ipfail

haresources (test01/test02)
test02 IPaddr::192.168.50.195
test02 drbddisk::r0 Filesystem::/dev/drbd0 smb

authkeys (test01/test02)
auth 3
3 md5 goose

smb.conf (test01)
[global]
workgroup = Workgroup
server string = SAMBA_TEST
admin users = root
share modes = yes
browseable = yes
username map = /etc/samba/smbusers
interfaces = 192.168.50.195

[goose01]
path = /mnt/goose01
writeable = yes
guest ok = yes

smb.conf (test02)
[global]
workgroup = Workgroup
server string = SAMBA_TEST
admin users = root
share modes = yes
browseable = yes
username map = /etc/samba/smbusers
interfaces = 192.168.50.195

[goose02]
path = /mnt/goose02
writeable = yes
guest ok = yes

smbusers (test01/test02)
# Unix_name = SMB_name1 SMB_name2 ...
# root = administrator admin
# nobody = guest pcguest smbguest
root = root

P R O B L E M

While client test03 attempts to access SAMBA services on 192.168.50.195, the primary server reboots.

T R O U B L E S H O O T I N G

The steps taken (to the point of failure) are as follows:

1. Started drbd on test02 (primary)
2. Started drbd on test01 (secondary)
3. Ran the command drbdadm primary all on test02
4. Ran the command mount /dev/drbd0 /mnt/goose02 on test02
5. Started samba on test02 (primary)
6. Created test files hello and world in the /mnt/goose02 share. (SAMBA was already configured with the /mnt/goose02 folder.)
7. I then tried accessing the share from the Windows system using the service IP address 192.168.50.195. When the server does not crash immediately, I can browse the files on 192.168.50.195 for a moment. Then the primary server reboots without warning.
8. After the primary server crashed, I ran the command drbdadm primary all on the secondary server (test01) so that the DRBD device could be mounted there.
9. Then I ran the command mount /dev/drbd0 /mnt/goose01 on test01. (SAMBA was already configured with the /mnt/goose01 folder.)
10. Started the samba service on test01.
11. The files are accessible from the windows system on service IP 192.168.50.195

I tried to review the logs present in /var/log but I was not able to find any
conclusive evidence for the cause of the crash. High availability seems to be
working... but the tasks are manual as described in the above steps.
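
Until heartbeat takes over these tasks, the manual failover in steps 8 to 10 could be collected into a small script as a stopgap. This is only a sketch: the run helper and the DRY_RUN variable are made up for illustration, and "service smb start" is assumed to be how Samba is started on CentOS (the steps do not say which command was used).

```shell
#!/bin/sh
# Sketch of the manual takeover on the surviving node (steps 8 to 10).
# DRY_RUN and run() are illustrative helpers, not part of DRBD or heartbeat.

run() {
    # Print the command instead of executing it when DRY_RUN=1.
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

manual_takeover() {
    mount_point="$1"    # e.g. /mnt/goose01 on test01
    # Stop at the first command that fails.
    run drbdadm primary all &&
    run mount /dev/drbd0 "$mount_point" &&
    run service smb start    # assumption: Samba init script is named smb
}
```

With DRY_RUN=1 the function only prints the three commands. Without it, calling manual_takeover /mnt/goose01 on test01 runs them in order and stops at the first failure.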

O B S E R V A T I O N

I suspect that heartbeat may be the problem, specifically the virtual IP address. I have noticed that when I start up heartbeat, both the primary and the secondary server hold the virtual IP address 192.168.50.195 for an initial period. After some time, the virtual IP disappears from the secondary server (giving me the impression that heartbeat takes a while to settle), but then the Windows system is not able to ping the virtual IP address. Only after making manual entries for the IP address on both servers is it possible to ping the service address from the Windows client. (The manual entry is made by running /etc/ha.d/resource.d/IPaddr 192.168.50.195 start on the primary server and /etc/ha.d/resource.d/IPaddr 192.168.50.195 stop on the secondary server.)
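
To see which node actually holds the service address at a given moment, the output of "ip -o -4 addr show" can be checked on each server. A minimal sketch, assuming the iproute ip tool is installed; has_address is a hypothetical helper name, not part of heartbeat:

```shell
# has_address ADDRESS
# Reads `ip -o -4 addr show` output on stdin and succeeds (exit 0)
# if ADDRESS is configured on any interface, whatever the prefix length.
has_address() {
    awk -v want="$1" '
        { split($4, parts, "/"); if (parts[1] == want) found = 1 }
        END { exit (found ? 0 : 1) }
    '
}

# Example (run on test01 and on test02):
#   ip -o -4 addr show | has_address 192.168.50.195 && echo "service IP is up here"
```

If both nodes report the address at the same time, the client may be reaching either machine, which would fit the ping behaviour described above.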

I need help with the following issues:
1. Feedback on the cause of the server crash and how to avoid it.
2. Suggestions to automate these manual tasks.
3. Feedback on the cluster configuration and scope for improvement.

Regards,
Alex

falko 19th August 2006 15:07

Did you check this tutorial? Sounds like something is wrong with your heartbeat configuration. Please compare your heartbeat configuration with the one from the tutorial.

djalex 21st August 2006 20:04

Hi Falko,

Extremely pleased to see your response to this thread. I have read your articles on high availability installations and was very impressed with the step-by-step explanation of how it was implemented (I have used your articles as a guide for my installation).

In reference to the high availability samba setup, I checked the following tutorial links:
http://www.linux-ha.org/ha.cf
http://www.linux-ha.org/haresources
http://www.linux-ha.org/authkeys

However, I could not find any option there that seemed to apply to my existing heartbeat configuration. If you have any suggestions for changes to the heartbeat configuration files (based on your experience), I will try them out. It's just that I have tried everything I can think of, to no avail. Your guidance could provide the breakthrough I need to complete the setup successfully.
Awaiting your response.

Cheers,
Alex

falko 22nd August 2006 15:02

Seems I forgot the link in my previous post... :o http://www.howtoforge.com/high_avail...drbd_heartbeat

What's in /etc/heartbeat/ha.cf and /etc/heartbeat/haresources?

Just found haresources in your first post:
Code:

test02 IPaddr::192.168.50.195
test02 drbddisk::r0 Filesystem::/dev/drbd0 smb

This should be just one line. In my tutorial, it's

Code:

server1  IPaddr::192.168.0.174/24/eth0 drbddisk::r0 Filesystem::/dev/drbd0::/data::ext3 nfs-kernel-server
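
Each non-comment line in haresources is treated as a separate resource group, so one quick sanity check is to count those lines. A rough sketch; count_resource_lines is a made-up helper, the /etc/ha.d path in the example is an assumption for CentOS, and lines continued with a backslash are not handled:

```shell
# count_resource_lines FILE
# Prints the number of lines in FILE that are neither blank nor comments.
count_resource_lines() {
    grep -v -e '^[[:space:]]*#' -e '^[[:space:]]*$' "$1" | wc -l | tr -d ' '
}

# Example (path is an assumption):
#   count_resource_lines /etc/ha.d/haresources
```

For a setup where the IP address, the DRBD device, the filesystem and the service must move together, the expected result would be 1.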

djalex 25th August 2006 16:23

Hi Falko,

The updated configuration files are as follows (the changes are the added initdead 30 line in ha.cf and the single-line haresources entry):

ha.cf (test01/test02)

logfacility local0
logfile /var/log/ha-log
debug 1
bcast eth0
keepalive 2
deadtime 10
initdead 30
auto_failback off
node test01
node test02
ping test01
ping test02
#respawn hacluster /user/lib/heartbeat/ipfail

haresources (test01/test02)

test02 IPaddr::192.168.50.195/24/eth0 drbddisk::r0 Filesystem::/dev/drbd0 smb

However, the primary Linux server still crashes when the Windows client tries to
access the files through Samba. Please help.
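
Compared with the tutorial line, the Filesystem entry in this haresources still names only a device, with no mount point or filesystem type, which may be worth a look. A small sketch that flags such entries; check_filesystem_args is a hypothetical helper, and the rule "at least a device plus a mount point" is an assumption based on the tutorial's Filesystem::/dev/drbd0::/data::ext3 form:

```shell
# check_filesystem_args
# Reads haresources lines on stdin, prints every Filesystem:: entry that
# has fewer than two arguments, and exits 1 if any such entry was found.
check_filesystem_args() {
    tr ' \t' '\n\n' |
        grep '^Filesystem::' |
        awk -F'::' 'NF < 3 { print; bad = 1 } END { exit bad + 0 }'
}

# Example (path is an assumption):
#   check_filesystem_args < /etc/ha.d/haresources
```

Note that the two smb.conf files use different paths (/mnt/goose01 and /mnt/goose02), so a single shared haresources line would presumably need one mount point that exists on both nodes.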

Regards,
Alex

falko 26th August 2006 15:31

Quote:

Originally Posted by djalex
However, primary linux server crashes when windows client tries to
access the files with Samba. Please help.

Are there any errors in the logs?

djalex 27th August 2006 13:03

Hi Falko,

I haven't been able to find any particular errors that point to the cause of the crash. I guess I wouldn't be struggling so much if I had to tackle this problem on Windows. Nevertheless, I consider this Linux issue a challenging learning experience. Moreover, I am extremely glad that you are willing to offer your guidance on this problem. If there are any specific logs that you require, I am willing to post them for your review.

Regards,
Alex

falko 28th August 2006 11:17

Usually the logs are in /var/log.
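
For a crash like this, the lines mentioning drbd, heartbeat or a kernel panic are usually the interesting ones. A minimal sketch for pulling them out; /var/log/ha-log comes from the ha.cf in the first post, while /var/log/messages and the keyword list are assumptions, and cluster_log_lines is just an illustrative name:

```shell
# cluster_log_lines
# Reads a log on stdin and prints the last 50 lines that mention
# drbd, heartbeat, panic or oops (case-insensitive).
cluster_log_lines() {
    grep -i -E 'drbd|heartbeat|panic|oops' | tail -n 50
}

# Example:
#   cluster_log_lines < /var/log/messages
#   tail -n 100 /var/log/ha-log
```

The last entries written before the reboot are the ones most likely to be useful.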

djalex 30th August 2006 13:02

2 Attachment(s)
Hi Falko,

I made the following changes in drbd.conf (protocol changed from A to C, and incon-degr-cmd commented out):

drbd.conf (test01/test02)

resource r0 {
    protocol C;
    #incon-degr-cmd "halt -f";

    startup {
        degr-wfc-timeout 120; # 2 minutes
    }

    disk {
        on-io-error detach;
    }

    net {
    }

    syncer {
        rate 10M;
        group 1;
        al-extents 257;
    }

    on test01 {
        device /dev/drbd0;
        disk /dev/hda5;
        address 192.168.50.151:7789;
        meta-disk internal;
    }

    on test02 {
        device /dev/drbd0;
        disk /dev/hda5;
        address 192.168.50.152:7789;
        meta-disk internal;
    }
}

Then I cleared the files in /var/log and started up drbd, heartbeat and samba on test01 and test02. While I was trying to access the Samba files on the service address 192.168.50.195, the primary server crashed again. The fresh logs are attached to this post. Kindly help.
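
Before the client connects, it may also help to confirm that DRBD reports the expected state on both nodes. A sketch that pulls the connection state, role and disk state tokens out of /proc/drbd; the cs:, st: and ld: field names are what DRBD 0.7 is assumed to print, and drbd_fields is a made-up helper:

```shell
# drbd_fields
# Reads /proc/drbd-style text on stdin and prints the cs: (connection
# state), st: (roles) and ld: (local disk) tokens, one per line.
drbd_fields() {
    tr -s ' ' '\n' | grep -E '^(cs|st|ld):'
}

# Example:
#   drbd_fields < /proc/drbd
```

On the node that is meant to serve Samba, the role token would be expected to start with Primary.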

Regards,
Alex

Attachment 225

Attachment 226

falko 1st September 2006 00:09

Please post the logs here directly instead of attaching them.


All times are GMT +2.