Openfiler 2.3 Active/Passive Cluster (heartbeat,DRBD) With Offsite Replication Node - Page 3

10. Test Recovery Of filer01 And filer02

Now we will see what happens if filer01 and filer02 are both destroyed for whatever reason and we have to rebuild them from our replication node.

First shut down filer01 and filer02:

[email protected] ~# shutdown -h now

[email protected] ~# shutdown -h now

Now set up two completely new machines as filer01 and filer02 by following Step 1 through Step 3. From there on, the recovery differs slightly from the original installation.


10.1 DRBD Configuration

Copy the drbd.conf and lvm.conf files from filer03 to filer01 and filer02:

[email protected] ~# scp /etc/drbd.conf [email protected]:/etc/drbd.conf
[email protected] ~# scp /etc/drbd.conf [email protected]:/etc/drbd.conf
[email protected] ~# scp /etc/lvm/lvm.conf [email protected]:/etc/lvm/lvm.conf
[email protected] ~# scp /etc/lvm/lvm.conf [email protected]:/etc/lvm/lvm.conf

Create the DRBD metadata for the lower resources (meta and data) on filer01 and filer02:

[email protected] ~# drbdadm create-md meta
[email protected] ~# drbdadm create-md data

[email protected] ~# drbdadm create-md meta
[email protected] ~# drbdadm create-md data

Start DRBD on filer01 and filer02:

[email protected] ~# service drbd start
[email protected] ~# service drbd start

Set the lower DRBD resources (/dev/drbd0 and /dev/drbd1) primary on filer01:

[email protected] ~# drbdsetup /dev/drbd0 primary -o
[email protected] ~# drbdsetup /dev/drbd1 primary -o

Create the DRBD Metadata on the stacked resource:

[email protected] ~# drbdadm --stacked create-md meta-U
[email protected] ~# drbdadm --stacked create-md data-U

Enable the stacked resource:

[email protected] ~# drbdadm --stacked up meta-U
[email protected] ~# drbdadm --stacked up data-U

At this point DRBD will recognize the inconsistent data and start to sync from filer03.

[email protected] ~# service drbd status

service drbd status
drbd driver loaded OK; device status:
version: 8.3.7 (api:88/proto:86-91)
GIT-hash: ea9e28dbff98e331a62bcbcc63a6135808fe2917 build by [email protected], 2010-01-13 17:17:27
m:res cs ro ds p mounted fstype
... sync'ed: 0.2% (11740/11756)M
... sync'ed: 1.8% (11560/11756)M
... sync'ed: 35.7% (351792/538088)K
... sync'ed: 6.1% (509624/538032)K
0:meta SyncSource Primary/Secondary UpToDate/Inconsistent C
1:data SyncSource Primary/Secondary UpToDate/Inconsistent C
10:meta-U^^0 SyncTarget Secondary/Secondary Inconsistent/UpToDate C
11:data-U^^1 SyncTarget Secondary/Secondary Inconsistent/UpToDate C

For the lower resources meta and data, filer01 is the SyncSource, while for the upper resources meta-U and data-U it is the SyncTarget. This shows us that the rebuild process has started.

While the synchronisation is still running, you can already prepare the configuration for Openfiler and its storage services.
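If you want to script the wait for the synchronisation, a minimal sketch could poll the DRBD status. It assumes the status is readable from /proc/drbd (which DRBD 8.3 provides); the function names drbd_sync_pending and wait_for_drbd_sync are hypothetical helpers, not part of DRBD, and the pattern only covers the states shown in the output above.

```shell
# Hypothetical helper: succeeds (exit 0) while any DRBD device listed in the
# status file is still syncing or inconsistent.
# The status file defaults to /proc/drbd; a different file can be passed for testing.
drbd_sync_pending() {
    local status_file="${1:-/proc/drbd}"
    grep -Eq 'cs:Sync(Source|Target)|Inconsistent' "$status_file"
}

# Hypothetical helper: poll until no resource reports a running sync.
# $1 = status file (default /proc/drbd), $2 = poll interval in seconds (default 30)
wait_for_drbd_sync() {
    local status_file="${1:-/proc/drbd}"
    local interval="${2:-30}"
    while drbd_sync_pending "$status_file"; do
        sleep "$interval"
    done
}
```

Calling wait_for_drbd_sync on filer01 before continuing with Step 10.3 would block until the devices no longer show SyncSource, SyncTarget or Inconsistent.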


10.2 filer01 And filer02 Redo Configuration

As filer01 and filer02 are fresh installations again, we need to redo the Openfiler configuration on these nodes, just as we did during the original installation.

Openfiler Configuration:

mkdir /meta
mv /opt/openfiler/ /opt/openfiler.local
ln -s /meta/opt/openfiler /opt/openfiler

Samba/NFS/ISCSI/PROFTPD Configuration Files to Meta Partition:

service nfslock stop
service nfs stop
service rpcidmapd stop
umount -a -t rpc-pipefs
rm -rf /etc/samba/
ln -s /meta/etc/samba/ /etc/samba
rm -rf /var/spool/samba/
ln -s /meta/var/spool/samba/ /var/spool/samba
rm -rf /var/lib/nfs/
ln -s /meta/var/lib/nfs/ /var/lib/nfs
rm -rf /etc/exports
ln -s /meta/etc/exports /etc/exports
rm /etc/ietd.conf
ln -s /meta/etc/ietd.conf /etc/ietd.conf
rm /etc/initiators.allow
ln -s /meta/etc/initiators.allow /etc/initiators.allow
rm /etc/initiators.deny
ln -s /meta/etc/initiators.deny /etc/initiators.deny
rm -rf /etc/proftpd
ln -s /meta/etc/proftpd/ /etc/proftpd
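The rm/ln pairs above all follow the same pattern, so they can be sketched as one small function. This is only an illustration: relink_to_meta and the META_ROOT variable are hypothetical names, and the function assumes the target under /meta mirrors the original path, as it does for every file in the list above.

```shell
# Hypothetical helper: replace a local path with a symlink to its counterpart
# on the meta partition. META_ROOT defaults to /meta, the mount point used in
# this howto; it is only a variable so the function can be tried elsewhere.
relink_to_meta() {
    local path="${1%/}"                      # strip a trailing slash
    local meta_root="${META_ROOT:-/meta}"
    [ -n "$path" ] || return 1               # refuse an empty path or "/"
    local target="${meta_root}${path}"
    # Already linked to the right place: nothing to do.
    if [ -L "$path" ] && [ "$(readlink "$path")" = "$target" ]; then
        return 0
    fi
    rm -rf -- "$path"                        # same removal as in the manual steps
    ln -s -- "$target" "$path"
}
```

For example, relink_to_meta /etc/samba would have the same effect as the first rm/ln pair above, and running it a second time changes nothing.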

Again we need to enable heartbeat and DRBD at boot and disable the services that are handled by heartbeat:

[email protected] ~# chkconfig --level 2345 heartbeat on
[email protected] ~# chkconfig --level 2345 drbd on
[email protected] ~# chkconfig --level 2345 openfiler off
[email protected] ~# chkconfig --level 2345 open-iscsi off

[email protected] ~# chkconfig --level 2345 heartbeat on
[email protected] ~# chkconfig --level 2345 drbd on
[email protected] ~# chkconfig --level 2345 openfiler off
[email protected] ~# chkconfig --level 2345 open-iscsi off
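To double-check the boot settings afterwards, the output of chkconfig --list can be inspected. The sketch below assumes the usual one-line output format (service name followed by 0:off 1:off 2:on ...); the function name runlevels_2345_are is a hypothetical helper.

```shell
# Hypothetical helper: reads one line of `chkconfig --list <service>` output
# on stdin and succeeds only if runlevels 2, 3, 4 and 5 all show the expected
# state ("on" or "off").
runlevels_2345_are() {
    local want="$1" line level
    read -r line || return 1
    for level in 2 3 4 5; do
        case "$line" in
            *"${level}:${want}"*) ;;         # this runlevel matches
            *) return 1 ;;                   # mismatch: report failure
        esac
    done
}
```

On each node, chkconfig --list heartbeat | runlevels_2345_are on and chkconfig --list openfiler | runlevels_2345_are off should then both succeed.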


10.3 Retake Resources And Run Cluster Again


When the synchronisation has finished, we can prepare the cluster to run the services on filer01 again. If you are running the cluster services on filer03 (Step 11), you have to stop them as described in Step 11.1 before you continue.

Set the stacked resource primary on filer01:

[email protected] ~# drbdadm --stacked primary meta-U
[email protected] ~# drbdadm --stacked primary data-U

Mount the meta partition and generate a new haresources file with Openfiler:

[email protected] ~# mount -t ext3 /dev/drbd10 /meta
[email protected] ~# service openfiler restart

Now log in to the Openfiler web GUI and start and stop some service you don't use; this regenerates the /etc/ha.d/haresources file.

Then we can copy this file to filer02, start the heartbeat services on both machines and do a takeover.

[email protected] ~# service openfiler stop
[email protected] ~# service heartbeat start
[email protected] ~# service heartbeat start
[email protected] ~# /usr/lib/heartbeat/hb_takeover

After the network and filesystem mounts have come up, you should see everything running fine again under the cluster IP.

You can check this by trying to log in to the Openfiler web GUI on the cluster IP. Try a manual failover on filer02 now, too.

[email protected] ~# /usr/lib/heartbeat/hb_takeover


11. Use Replication Node As Main Node

There are scenarios where you may want to use the replication node to deliver the storage, so that services keep running until you have recovered the hardware for filer01 and filer02. This can even be done while filer01 and filer02 are recovering from filer03.

Make the DRBD resources primary and mount the partitions:

[email protected] ~# drbdadm primary meta-U
[email protected] ~# drbdadm primary data-U
[email protected] ~# mount -t ext3 /dev/drbd10 /meta
[email protected] ~# /etc/ha.d/resource.d/LVM data start

At this point we are able to start openfiler and the services we need, but we need the virtual IP which the cluster used to deliver services first. We use the resource.d scripts from heartbeat to do this.

[email protected] ~# /etc/ha.d/resource.d/IPaddr start

Then start all the services you need on filer03:

[email protected] ~# service openfiler start


11.1 Finished Replication, How To Turn Replication Node In Standby Again

First disable the services that you started on the machine ( openfiler, iscsi, etc. ):

[email protected] ~# service openfiler stop

Give up the cluster IP by using the resource.d scripts from heartbeat again.

[email protected] ~# /etc/ha.d/resource.d/IPaddr stop

Unmount the partitions and put DRBD into secondary mode.

[email protected] ~# umount /dev/drbd10
[email protected] ~# /etc/ha.d/resource.d/LVM data stop
[email protected] ~# drbdadm secondary meta-U
[email protected] ~# drbdadm secondary data-U
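The order of these steps matters (services first, then the IP, then the filesystems, then DRBD), so it may help to keep them in one function. The following is a sketch under assumptions: CLUSTER_IP is a hypothetical variable that must hold your cluster's virtual IP (the address is not shown above), it is passed to the IPaddr script before the action as heartbeat resource scripts expect, and run and filer03_to_standby are hypothetical helper names. With DRY_RUN=1 the commands are only printed.

```shell
# Hypothetical helper: execute a command, or only print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Sketch of the standby sequence from this step, stopping at the first failure.
# CLUSTER_IP is an assumption: set it to the virtual IP your cluster uses.
filer03_to_standby() {
    local ip="${CLUSTER_IP:?set CLUSTER_IP to the cluster address}"
    run service openfiler stop &&
    run /etc/ha.d/resource.d/IPaddr "$ip" stop &&
    run umount /dev/drbd10 &&
    run /etc/ha.d/resource.d/LVM data stop &&
    run drbdadm secondary meta-U &&
    run drbdadm secondary data-U
}
```

Running it once with DRY_RUN=1 on filer03 lets you review the exact commands before executing them for real.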

After this you can take over all services on filer01 again as described in Step 10.3.


12. Add Another Storage Partition

12 GB isn't that much, so you might want to add more storage to your cluster at a later point.

This is a very easy process: first shut down the passive nodes and install the additional storage, then create an LVM partition on it with fdisk as described in Step 2. Note: You don't need to add another Linux type partition for configuration files, only another LVM partition.

After this, add the new partition to the drbd.conf file on each node.

Add this to the drbd.conf file on filer01 and copy it to filer02 and filer03:

# Note: each "on" and "stacked-on-top-of" section also needs an
# "address <ip>:<port>;" line, as in your existing resources.
resource data2 {
 on filer01 {
  device /dev/drbd2;
  disk /dev/sdc1;
  meta-disk internal;
 }
 on filer02 {
  device /dev/drbd2;
  disk /dev/sdc1;
  meta-disk internal;
 }
}
resource data2-U {
 stacked-on-top-of data2 {
  device /dev/drbd12;
 }
 on filer03 {
  device /dev/drbd12;
  disk /dev/sdc1;
  meta-disk internal;
 }
}

Note: filer01 must be the active node for this to work!

Create the metadata on the lower resource before we can start the upper resource again.

[email protected] ~# drbdadm create-md data2

[email protected] ~# drbdadm create-md data2

Start the lower resource:

[email protected] ~# drbdadm up data2

[email protected] ~# drbdadm up data2

Make it primary:

[email protected] ~# drbdsetup /dev/drbd2 primary -o

Create the upper resource and make it primary, too.

[email protected] ~# drbdadm --stacked create-md data2-U
[email protected] ~# drbdadm --stacked up data2-U
[email protected] ~# drbdsetup /dev/drbd12 primary -o

Create the meta-data on filer03 and start the resource:

[email protected] ~# drbdadm create-md data2-U
[email protected] ~# drbdadm up data2-U

After this we are ready to add the new device to our existing LVM Device and increase our storage. Note: It's out of scope of this manual to resize the storage that you actually use on it.

Now we create a PV on the new stacked resource device and add it to the existing VolumeGroup:

[email protected] ~# pvcreate /dev/drbd12
[email protected] ~# vgextend data /dev/drbd12

Don't forget to add your new device to your heartbeat configuration:

<?xml version="1.0" ?>
<clustering state="on" />
<nodename value="filer01" />
<resource value="MailTo::[email protected]::ClusterFailover"/>
<resource value="IPaddr::" />
<resource value="IPaddr::" />
<resource value="drbdupper::meta-U" />
<resource value="drbdupper::data-U" />
<resource value="drbdupper::data2-U" />
<resource value="LVM::data" />
<resource value="Filesystem::/dev/drbd10::/meta::ext3::defaults,noatime" />
<resource value="MakeMounts"/>

Recreate the /etc/ha.d/haresources file as we have done before by restarting some unused service in the Openfiler GUI, then copy the new haresources file to filer02.

After this you can log in to Openfiler on your cluster IP and use the extended data storage. Instead of extending the existing VolumeGroup, you could also just create another VolumeGroup. Refer to Step 6 for this.


Misc: Openfiler iSCSI Citrix Xen Modifications

Openfiler has some problems with storage created by Citrix Xen, so after a reboot you will have trouble finding and adding your LUNs. The main cause seems to be the AoE (ATA over Ethernet) service, which can be disabled with the following command. Do this on all 3 nodes.

chkconfig --level 2345 aoe off

Another problem seems to lie in the discovery of LVM devices by Openfiler. The LVM configuration I posted works for a system with stacked resources, but is probably not right for a DRBD-only system. The DRBD documentation mentions the following LVM filter configurations, which expose only the DRBD devices or only the stacked DRBD resources to LVM. For a DRBD-only system:

filter = [ "a|drbd.*|", "r|.*|" ]

For stacked resources, as used in this howto:

filter = [ "a|drbd1[0-9]|", "r|.*|" ]

This exposes the devices /dev/drbd10 to /dev/drbd19 to LVM. If you need more devices, you have to adjust your LVM configuration accordingly. You can find the example configurations in the DRBD documentation.
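To get a feeling for which device names an accept pattern covers, the pattern can be tried against a few paths. This is only an approximation: it uses grep -E rather than LVM's own matcher, which behaves the same for simple patterns like these, and lvm_pattern_matches is a hypothetical helper name.

```shell
# Hypothetical helper: succeeds if the device path matches the accept pattern.
# Like LVM filter patterns, the regular expression is not anchored, so it may
# match anywhere in the path.
# $1 = pattern taken from between the "a|" and "|" of the filter entry
# $2 = device path to test
lvm_pattern_matches() {
    printf '%s\n' "$2" | grep -Eq -- "$1"
}
```

For example, lvm_pattern_matches 'drbd1[0-9]' /dev/drbd12 succeeds, so the stacked device added in Step 12 is covered, while the same pattern rejects the lower device /dev/drbd2.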

Edit the /etc/rc.sysinit file and comment out lines 333-337:


     if [ -x /sbin/lvm.static ]; then
                if /sbin/lvm.static vgscan --mknodes --ignorelockingfailure > /dev/null 2>&1 ; then
                        action $"Setting up Logical Volume Management:" /sbin/lvm.static vgchange -a y --ignorelockingfailure
                fi
        fi

so that they look like this:

#    if [ -x /sbin/lvm.static ]; then
#                if /sbin/lvm.static vgscan --mknodes --ignorelockingfailure > /dev/null 2>&1 ; then
#                        action $"Setting up Logical Volume Management:" /sbin/lvm.static vgchange -a y --ignorelockingfailure
#                fi
#        fi

Restart your filers now so that the changes take effect. You should then be able to discover the iSCSI LUNs with your Citrix Xen systems.


Misc: Notes About Openfiler Clusters

Not all services are HA with this setup; some original configuration files that Openfiler can modify remain on each node's own partitions. During the initial setup you can move these files to the meta partition as well. These files are:

  • /etc/ldap.conf
  • /etc/openldap/ldap.conf
  • /etc/ldap.secret
  • /etc/nsswitch.conf
  • /etc/krb5.conf

At the time of writing this howto, rPath Linux (which Openfiler is based on) ships heartbeat version 2.1.3, which would in theory be able to create n+1 clusters, but I haven't found any reports of even basic CRM cluster configurations running successfully. I tried to create cib.xml files with the onboard script /usr/lib/heartbeat/ but the cluster did not start with them.

If you have finished all steps of this howto successfully, it's time to have one of your favourite drinks; you've earned it.
