How To Install Ubuntu 8.04 With Software RAID1

This short guide explains how you can configure software RAID1 during the initial installation of an Ubuntu 8.04 ("Hardy Heron") system.

During Ubuntu installation:

From the “Partition Disks” dialog box, select “Manually edit the partition table”.
Select the first disk (“sda”).
Say yes to “Create a new empty partition table on this device?”.
Use the dialog boxes to create one primary partition large enough to hold the root filesystem.
For “How to use this partition” select “physical volume for RAID”, not the default “Ext3 journaling file system”.
Make the partition bootable.
Use the dialogs to create one other primary partition taking up the remaining disk space. Later this will be used for swap.
For “How to use this partition” select “physical volume for RAID”, not the default “Ext3 journaling file system” and not “swap area”.
Repeat the above steps to create identical partitions on the second drive. Remember to mark partition one on both drives as “bootable”.
Once the partitions are configured, at the top of the “Partition Disks” main dialog select “Configure Software RAID”.
When asked “Write the changes to the storage devices and configure RAID” select “Yes”.
For “Multidisk configuration actions” select “Create MD device”.
For “Multidisk device type” select “RAID1”.
For “Number of active devices for the RAID1 array” enter “2”.
For “Number of spare devices for the RAID1 array” enter “0” (zero).
When asked to select “Active devices for the RAID1 multidisk device” select both /dev/sda1 and /dev/sdb1.
From the next dialog select “create MD device”.
Repeat the above steps to create an MD device that contains /dev/sda2 and /dev/sdb2.
Finally, from the dialog “Multidisk configuration actions” select “Finish”.

Next configure device md0 to be mounted as the “/” filesystem and device md1 to be mounted as swap:

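From the “Partition Disks” dialog, move the cursor bar under “RAID device #0” and select “#1”.
Configure the device as an Ext3 filesystem mounted on /.
Then do the same under “RAID device #1”, this time selecting “swap area” for “How to use this partition”.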

Making every drive bootable

Boot your freshly installed Ubuntu system and run the following as root:

grub

device (hd1) /dev/sdb

root (hd1,0)

setup (hd1)

quit
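
The interactive session above can also be scripted. The following is a minimal sketch, assuming the GRUB Legacy shell shipped with 8.04 and its batch mode; the helper name grub_setup_script is hypothetical and only prints the commands so they can be reviewed before being piped to GRUB.

```shell
# Print the GRUB Legacy commands that install the boot loader on one disk.
# $1 = GRUB device name (e.g. hd1), $2 = Linux device node (e.g. /dev/sdb).
# Assumes the boot files live on the first partition, as in this guide.
grub_setup_script() {
    printf 'device (%s) %s\n' "$1" "$2"
    printf 'root (%s,0)\n' "$1"
    printf 'setup (%s)\n' "$1"
    printf 'quit\n'
}

# Review the output first, then feed it to GRUB as root:
#   grub_setup_script hd1 /dev/sdb
#   grub_setup_script hd1 /dev/sdb | sudo grub --batch
```

Printing first and piping second keeps the destructive step explicit, which matters when the device names differ from the ones used here.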

Adding second HD to grub

vi /boot/grub/menu.lst

Add an entry like this:

### To boot if sda fails ###
title           Ubuntu 8.04.1, kernel 2.6.24-19-generic /dev/sda fail
root            (hd1,0)
kernel          /boot/vmlinuz-2.6.24-19-generic root=/dev/md0 ro quiet splash
initrd          /boot/initrd.img-2.6.24-19-generic
### End mod ###

Replace the kernel and initrd filenames if necessary. Then reboot and check that Ubuntu boots from the second hard drive.
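
To avoid typing the kernel version by hand, the entry can be generated from a version string. This is only a sketch: the helper name fallback_stanza is hypothetical, and the title, root device and kernel options are copied from the example entry above, so adjust them if your system differs.

```shell
# Print a menu.lst entry that boots from the second disk.
# $1 = kernel version string, e.g. the output of "uname -r".
fallback_stanza() {
    ver=$1
    cat <<EOF
### To boot if sda fails ###
title           Ubuntu 8.04.1, kernel $ver /dev/sda fail
root            (hd1,0)
kernel          /boot/vmlinuz-$ver root=/dev/md0 ro quiet splash
initrd          /boot/initrd.img-$ver
### End mod ###
EOF
}

# Print the entry for the running kernel, then paste it into menu.lst:
#   fallback_stanza "$(uname -r)"
```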

Care and feeding

Having two drives configured in a RAID1 mirror allows the server to continue to function when either drive fails. When a drive fails completely, the kernel RAID driver automatically removes it from the array.

However, a drive may start having seek errors without failing completely. In that situation the RAID driver may not remove it from service and performance will degrade. Luckily you can manually remove a failing drive using the “mdadm” command. For example, to manually mark both of the RAID devices on drive sda as failed:

mdadm /dev/md0 --fail /dev/sda1
mdadm /dev/md1 --fail /dev/sda2

The above removes both RAID devices on drive sda from service, leaving only the partitions on drive sdb active.
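
On a system with more arrays it is easy to miss a partition. The sketch below, which assumes the usual /proc/mdstat layout, lists every array member that lives on a given disk; the helper name members_on_disk is hypothetical, and the suggested loop only prints the mdadm commands rather than running them.

```shell
# List "array member" pairs for every RAID member that lives on one disk.
# Reads /proc/mdstat-formatted text on stdin; $1 = disk name, e.g. sda.
members_on_disk() {
    awk -v disk="$1" '
        /^md[0-9]+ :/ {
            for (i = 3; i <= NF; i++) {
                member = $i
                sub(/\[.*/, "", member)   # strip "[0]" and any "(F)" flag
                if (member ~ ("^" disk "[0-9]+$"))
                    print $1, member
            }
        }'
}

# Dry run: print the commands, check them, then run them as root.
#   members_on_disk sda < /proc/mdstat |
#       while read -r md part; do echo mdadm "/dev/$md" --fail "/dev/$part"; done
```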

Removing a failed drive

When a drive fails it is vital to act immediately. RAID drives have an eerie habit of all failing around the same time, especially when they are identical models purchased together and put into service at the same time. Even drives from different manufacturers sometimes fail at nearly the same time… probably because they all experience the same environmental factors (power events, same number of power downs, the same janitor banging the vacuum into the server every night, etc.)

When Ubuntu sees that RAID has been configured, it automatically runs the mdadm command in “monitor mode” to watch each device and send email to root when a problem is noticed. You can also manually inspect RAID status using commands like the following:

cat /proc/mdstat
mdadm --query --detail /dev/md0
mdadm --query --detail /dev/md1
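
For a quick scripted check, the status string at the end of each “blocks” line in /proc/mdstat can be inspected: “[UU]” means both mirrors are up, while an underscore marks a missing member. The following is a rough sketch under that assumption; the helper name degraded_arrays is hypothetical.

```shell
# Print the name of every array whose status string (e.g. "[U_]") shows a
# missing member. Reads /proc/mdstat-formatted text on stdin.
# Exit status is 1 if at least one array is degraded, 0 otherwise.
degraded_arrays() {
    awk '
        /^md[0-9]+ :/ { name = $1 }
        /blocks/ && $NF ~ /^\[[U_]+\]$/ && $NF ~ /_/ { print name; bad = 1 }
        END { exit bad + 0 }'
}

# Example use from a cron job or login script:
#   degraded_arrays < /proc/mdstat || echo "WARNING: RAID array degraded"
```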

It’s also wise to use “smartctl” to monitor each drive’s internal failure stats. However, as noted in a recent analysis by Google (PDF link), drives are perfectly capable of dying without any warning showing in their SMART monitors.

To replace a drive that has been marked as failed (either automatically or by using “mdadm --fail”), first remove all partitions on that drive from the array. For example, to remove all partitions from drive sda:

mdadm --remove /dev/md0 /dev/sda1
mdadm --remove /dev/md1 /dev/sda2

Once removed it is safe to power down the server and replace the failed drive.

Preparing the new drive

Once the system has been rebooted with the new unformatted replacement drive in place, some manual intervention is required to partition the drive and add it to the RAID array.

The new drive must have an identical (or nearly identical) partition table to the other. You can use fdisk to manually create a partition table on the new drive identical to the table of the other, or if both drives are identical you can use the “sfdisk” command to duplicate the partition table. For example, to copy the partition table from the second drive “sdb” onto the first drive “sda”, the sfdisk command is as follows:

sfdisk -d /dev/sdb | sfdisk /dev/sda

Warning: be careful to specify the right source and destination drives when using sfdisk, or you could blank out the partition table on your good drive.
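
One way to reduce that risk is to dump the good drive's table to a file, inspect it, and only then write it to the new drive. The sketch below just prints the two commands and refuses to continue when both arguments name the same disk; the helper name ptable_copy_plan and the dump filename are assumptions made for illustration.

```shell
# Print (not run) the commands that save the partition table of a healthy
# disk to a file and then write that table to a replacement disk.
# $1 = healthy source disk, $2 = new target disk.
ptable_copy_plan() {
    if [ "$1" = "$2" ]; then
        echo "source and target are the same disk: $1" >&2
        return 1
    fi
    dump="${1##*/}.ptable"              # e.g. /dev/sdb -> sdb.ptable
    echo "sfdisk -d $1 > $dump"
    echo "sfdisk $2 < $dump"
}

# Example: print the plan, read the dump file, then run the second step.
#   ptable_copy_plan /dev/sdb /dev/sda
```

Keeping the dump file around also gives you a copy of the partition layout if it ever needs to be recreated by hand.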

Once the partitions have been created, you can add them to the corresponding RAID devices using “mdadm --add” commands. For example:

mdadm --add /dev/md0 /dev/sda1
mdadm --add /dev/md1 /dev/sda2

Once added, the Linux kernel immediately starts re-syncing the contents of the arrays onto the new drive. You can monitor progress via “cat /proc/mdstat”. The kernel throttles re-syncing when the array is busy with other disk I/O to avoid overloading a production system, so performance should not be affected too badly. The busier the server (and the larger the partitions), the longer the re-sync will take.

Note that you don’t have to wait until all partitions are re-synced… servers can be on-line and in production while syncing is in progress: no data will be lost and eventually all drives will become synchronized.
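
If you want a script to know when the rebuild has finished, the progress line in /proc/mdstat can be parsed. This is a rough sketch that assumes the usual “recovery = NN.N%” or “resync = NN.N%” wording on that line; the helper name resync_progress is hypothetical.

```shell
# Print "array percent" for each array that is currently rebuilding.
# Reads /proc/mdstat-formatted text on stdin; prints nothing when idle.
resync_progress() {
    awk '
        /^md[0-9]+ :/ { name = $1 }
        /(recovery|resync) *=/ {
            for (i = 1; i <= NF; i++)
                if ($i ~ /^[0-9.]+%$/) print name, $i
        }'
}

# Example: poll once a minute until no array reports progress any more.
#   while [ -n "$(resync_progress < /proc/mdstat)" ]; do sleep 60; done
```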

Creating a new array under Linux with free space

First use fdisk to create a partition of the same size in the remaining space on both hard drives. Make sure the partition type is set to Linux RAID autodetect (code fd).

Then run something like:

sudo mdadm --create --verbose /dev/md3 --level=1 --raid-devices=2 /dev/sda5 /dev/sdb5

Summary

Linux software RAID is far more cost effective and flexible than hardware RAID, though it is more complex and requires manual intervention when replacing drives. In most situations, software RAID performance is as good as (and often better than) an equivalent hardware RAID solution. When all you need are mirrored drives, software RAID is often the best choice.