How To Set Up Software RAID1 On A Running System (Incl. GRUB2 Configuration) (Ubuntu 10.04) - Page 4

9 Testing

Now let's simulate a hard drive failure. It doesn't matter if you select /dev/sda or /dev/sdb here. In this example I assume that /dev/sdb has failed.

To simulate the hard drive failure, you can either shut down the system and physically remove /dev/sdb, or soft-remove it by marking its partitions as failed and then removing them from their arrays:

mdadm --manage /dev/md0 --fail /dev/sdb1
mdadm --manage /dev/md1 --fail /dev/sdb2
mdadm --manage /dev/md2 --fail /dev/sdb3

mdadm --manage /dev/md0 --remove /dev/sdb1
mdadm --manage /dev/md1 --remove /dev/sdb2
mdadm --manage /dev/md2 --remove /dev/sdb3
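
The six commands above follow one pattern, so they can also be generated in a loop. The following is a minimal sketch that only prints the commands for review and does not run them. The function name soft_remove_cmds is made up for this example, and the partition-to-array mapping (partition 1 in /dev/md0, 2 in /dev/md1, 3 in /dev/md2) is the layout assumed throughout this tutorial.

```shell
#!/bin/sh
# Hypothetical helper: print the mdadm commands that soft-remove all
# partitions of one disk. Nothing is executed; the output is for review.
# Assumed layout (from this tutorial): <disk>1 -> md0, <disk>2 -> md1, <disk>3 -> md2.
soft_remove_cmds() {
    disk="$1"                      # e.g. /dev/sdb
    for action in --fail --remove; do
        n=1
        for md in /dev/md0 /dev/md1 /dev/md2; do
            echo "mdadm --manage $md $action ${disk}${n}"
            n=$((n + 1))
        done
    done
}

soft_remove_cmds /dev/sdb
```

Once the printed commands look right, they could be run as root by piping the output to sh (soft_remove_cmds /dev/sdb | sh).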

Shut down the system:

shutdown -h now

Then install a new drive as /dev/sdb and boot the system. (If you simulated a failure of /dev/sda instead, move the old /dev/sdb into /dev/sda's place and connect the new HDD as /dev/sdb.) The system should still start without problems.

Now run

cat /proc/mdstat

and you should see that we have a degraded array:

root@server1:~# cat /proc/mdstat
Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]
md2 : active raid1 sda3[0]
      4242368 blocks [2/1] [U_]

md1 : active raid1 sda2[0]
      499648 blocks [2/1] [U_]

md0 : active raid1 sda1[0]
      498624 blocks [2/1] [U_]

unused devices: <none>
root@server1:~#
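
In the output above, [2/1] means that only one of two members is present, and the underscore in [U_] marks the missing one. As a rough sketch, the degraded arrays can be picked out of mdstat-style text automatically. The function name degraded_arrays is hypothetical, and the parsing assumes the two-line-per-array format shown in this tutorial.

```shell
#!/bin/sh
# Hypothetical helper: read /proc/mdstat-style text on stdin and print the
# name of every array whose status field (e.g. [U_]) contains an underscore,
# which marks a missing member.
degraded_arrays() {
    awk '
        /^md[0-9]+ :/                 { name = $1 }     # remember current array
        /blocks/ && /\[[U_]*_[U_]*\]/ { print name }    # status field has an underscore
    '
}

# Demo on a sample in the format shown above.
degraded_arrays <<'EOF'
md1 : active raid1 sda2[0]
      499648 blocks [2/1] [U_]

md0 : active raid1 sdb1[1] sda1[0]
      498624 blocks [2/2] [UU]
EOF
```

On the real system the same function would be fed the live status: degraded_arrays < /proc/mdstat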

The output of

fdisk -l

should look as follows:

root@server1:~# fdisk -l

Disk /dev/sda: 5368 MB, 5368709120 bytes
255 heads, 63 sectors/track, 652 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x000246b7

   Device Boot      Start         End      Blocks   Id  System
/dev/sda1   *           1          63      498688   fd  Linux raid autodetect
Partition 1 does not end on cylinder boundary.
/dev/sda2              63         125      499712   fd  Linux raid autodetect
Partition 2 does not end on cylinder boundary.
/dev/sda3             125         653     4242432   fd  Linux raid autodetect
Partition 3 does not end on cylinder boundary.

Disk /dev/sdb: 5368 MB, 5368709120 bytes
255 heads, 63 sectors/track, 652 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00000000

Disk /dev/sdb doesn't contain a valid partition table

Disk /dev/md0: 510 MB, 510590976 bytes
2 heads, 4 sectors/track, 124656 cylinders
Units = cylinders of 8 * 512 = 4096 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00000000

Disk /dev/md0 doesn't contain a valid partition table

Disk /dev/md1: 511 MB, 511639552 bytes
2 heads, 4 sectors/track, 124912 cylinders
Units = cylinders of 8 * 512 = 4096 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00000000

Disk /dev/md1 doesn't contain a valid partition table

Disk /dev/md2: 4344 MB, 4344184832 bytes
2 heads, 4 sectors/track, 1060592 cylinders
Units = cylinders of 8 * 512 = 4096 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00000000

Disk /dev/md2 doesn't contain a valid partition table
root@server1:~#

Now we copy the partition table of /dev/sda to /dev/sdb:

sfdisk -d /dev/sda | sfdisk --force /dev/sdb

root@server1:~# sfdisk -d /dev/sda | sfdisk --force /dev/sdb
Checking that no-one is using this disk right now ...
OK

Disk /dev/sdb: 652 cylinders, 255 heads, 63 sectors/track

sfdisk: ERROR: sector 0 does not have an msdos signature
 /dev/sdb: unrecognized partition table type
Old situation:
No partitions found
New situation:
Units = sectors of 512 bytes, counting from 0

   Device Boot    Start       End   #sectors  Id  System
/dev/sdb1   *      2048    999423     997376  fd  Linux raid autodetect
/dev/sdb2        999424   1998847     999424  fd  Linux raid autodetect
/dev/sdb3       1998848  10483711    8484864  fd  Linux raid autodetect
/dev/sdb4             0         -          0   0  Empty
Warning: partition 1 does not end at a cylinder boundary
Successfully wrote the new partition table

Re-reading the partition table ...

If you created or changed a DOS partition, /dev/foo7, say, then use dd(1)
to zero the first 512 bytes:  dd if=/dev/zero of=/dev/foo7 bs=512 count=1
(See fdisk(8).)
You have new mail in /var/mail/root
root@server1:~#
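
To double-check that the copy worked, the two partition table dumps can be compared once the device names are masked. This is only a sketch: the function names normalize_dump and same_layout are invented here, and it assumes that the sfdisk -d dumps of two identically partitioned disks differ only in the device name, which may not hold for every sfdisk version.

```shell
#!/bin/sh
# Hypothetical helper: replace every occurrence of a disk name in dump
# text (read from stdin) with the placeholder DISK.
normalize_dump() {
    sed "s|$1|DISK|g"              # $1: disk path, e.g. /dev/sda
}

# Hypothetical helper: succeed when both disks report the same layout.
# Needs root, because sfdisk has to read the disks.
same_layout() {
    a=$(sfdisk -d "$1" | normalize_dump "$1")
    b=$(sfdisk -d "$2" | normalize_dump "$2")
    [ -n "$a" ] && [ "$a" = "$b" ]
}

# Usage on the real system:
#   same_layout /dev/sda /dev/sdb && echo "partition tables match"
```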

Afterwards we remove any remnants of a previous RAID array from the partitions on /dev/sdb...

mdadm --zero-superblock /dev/sdb1
mdadm --zero-superblock /dev/sdb2
mdadm --zero-superblock /dev/sdb3

... and add the /dev/sdb partitions to their RAID arrays:

mdadm -a /dev/md0 /dev/sdb1
mdadm -a /dev/md1 /dev/sdb2
mdadm -a /dev/md2 /dev/sdb3
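
Because these commands modify disks, it can help to preview them first. Below is a minimal dry-run sketch; run, readd_disk and the DRY_RUN variable are names invented for this example, and the partition-to-array mapping is again the one assumed in this tutorial.

```shell
#!/bin/sh
# Hypothetical dry-run wrapper: with DRY_RUN=1 (the default in this sketch)
# commands are only printed; set DRY_RUN=0 to really execute them as root.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Assumed layout (from this tutorial): <disk>1 -> md0, <disk>2 -> md1, <disk>3 -> md2.
readd_disk() {
    disk="$1"                      # e.g. /dev/sdb
    for n in 1 2 3; do
        run mdadm --zero-superblock "${disk}${n}"
    done
    for n in 1 2 3; do
        run mdadm -a "/dev/md$((n - 1))" "${disk}${n}"
    done
}

readd_disk /dev/sdb
```

With the default setting this only prints six lines prefixed with "+"; nothing is changed until DRY_RUN=0 is set explicitly.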

Now take a look at the arrays while they resynchronize:

cat /proc/mdstat

root@server1:~# cat /proc/mdstat
Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]
md2 : active raid1 sdb3[2] sda3[0]
      4242368 blocks [2/1] [U_]
      [===>.................]  recovery = 16.1% (683520/4242368) finish=0.6min speed=97645K/sec

md1 : active raid1 sdb2[2] sda2[0]
      499648 blocks [2/1] [U_]
        resync=DELAYED

md0 : active raid1 sdb1[1] sda1[0]
      498624 blocks [2/2] [UU]

unused devices: <none>
root@server1:~#

Wait until the synchronization has finished:

root@server1:~# cat /proc/mdstat
Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]
md2 : active raid1 sdb3[1] sda3[0]
      4242368 blocks [2/2] [UU]

md1 : active raid1 sdb2[1] sda2[0]
      499648 blocks [2/2] [UU]

md0 : active raid1 sdb1[1] sda1[0]
      498624 blocks [2/2] [UU]

unused devices: <none>
root@server1:~#
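
Instead of re-running cat /proc/mdstat by hand, the wait can be scripted. The following sketch polls the status until it no longer mentions a recovery or resync. The function names and the POLL_SECONDS variable are made up for this example, and the check relies on the words "recovery" and "resync" appearing in the output as they do above.

```shell
#!/bin/sh
# Hypothetical helper: succeed while the given mdstat-style file still shows
# a rebuild in progress or queued ("recovery = ...", "resync=DELAYED", ...).
sync_in_progress() {
    grep -Eq 'recovery|resync' "${1:-/proc/mdstat}"
}

# Poll until no array is rebuilding any more.
wait_for_sync() {
    f="${1:-/proc/mdstat}"
    [ -r "$f" ] || { echo "cannot read $f" >&2; return 1; }
    while sync_in_progress "$f"; do
        sleep "${POLL_SECONDS:-10}"
    done
    echo "no rebuild in progress"
}

# Usage on the real system:
#   wait_for_sync
```

Note that this only reports that no rebuild is running; it does not prove that every array shows [UU], so a final look at /proc/mdstat is still worthwhile. mdadm also has a --wait option that blocks until resync or recovery on the given array has finished, for example mdadm --wait /dev/md2.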

Then install the bootloader on both HDDs:

grub-install /dev/sda
grub-install /dev/sdb
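
As a quick sanity check that a boot sector was written to each disk, the first 512 bytes can be searched for the string "GRUB". This is a rough sketch under the assumption that GRUB's boot sector contains that ASCII string; the function name has_grub_string is invented here. Finding the string does not prove that the disk actually boots, so a real boot test with one drive disconnected remains the reliable check.

```shell
#!/bin/sh
# Hypothetical check: does the first 512-byte sector of a disk (or image
# file) contain the string "GRUB"? Reading a real disk needs root.
has_grub_string() {
    dd if="$1" bs=512 count=1 2>/dev/null | LC_ALL=C grep -q 'GRUB'
}

# Usage on the real system:
#   for d in /dev/sda /dev/sdb; do
#       if has_grub_string "$d"; then
#           echo "$d: GRUB string found"
#       else
#           echo "$d: no GRUB string"
#       fi
#   done
```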

That's it. You've just replaced a failed hard drive in your RAID1 array.

 

10 Links


Comments

From: chandpriyankara at: 2010-07-22 11:59:55

This is a great tutorial on RAID....

We are looking into implementing other RAID systems as well.

cheers.

 

From: Anonymous at: 2010-12-13 21:51:22

This tutorial also works for Debian squeeze. The only problem is with GRUB: delete recordfail and replace set root='(md0)' with set root='(md/0)'.

From: Alexandre Gambini at: 2011-03-04 18:34:27

In my attempt at implementing RAID, the best choice was to edit /etc/default/grub and uncomment the line GRUB_DISABLE_LINUX_UUID=true. After that, GRUB worked fine for me.

Thanks for the tutorial, it is a great job.

From: at: 2011-08-15 12:08:57

Before failing a drive (testing), open a second terminal window to monitor mdstat. In that window run the command "watch cat /proc/mdstat". If the array is rebuilding, you must let it finish or you might kill your project. You can also monitor other actions in real time, like failing partitions, etc.

A wonderful project, a wonderful way to learn Linux. Thank you.

From: ecellingsworth at: 2011-11-09 03:53:43

This tutorial assumes you are issuing commands as root. If you are instead working as a less privileged user with sudo, remember that each sfdisk command in the pipeline needs its own sudo. Otherwise you will get a "permission denied" error.

sudo sfdisk -d /dev/sda | sudo sfdisk --force /dev/sdb

I used this tutorial months ago to get my raid array started. A drive failed and I returned to this page today to remember how to rebuild a new drive. Forgetting the sudo tripped me up for a while. Good tutorial. I'm glad I took the time to set up the raid array. It saved me this time.

From: MC at: 2012-12-04 13:38:33

I replaced a failing /dev/sda and put the old /dev/sdb in /dev/sda's place.

But it doesn't restart; it simply displays GRUB on boot.

Before shutting it down I did install GRUB on /dev/sdb.

I had to put the failing drive back in, but it will probably fail soon.

Any help? Maybe I have to flag it as bootable or do something in the BIOS?

 thanks!

From: jlinkels at: 2014-02-15 21:59:51

In this tutorial, "failing" a device is presented as a sufficient test of whether an array is still operational and bootable.

The operational part is fine; the bootable part is not.

If you made a mistake or forgot to install the boot sector on both drives, the system will still boot with a device marked "failed" in mdadm, but it will not boot when a drive is disconnected, defective or gone.

So I strongly recommend that you actually disconnect one drive and see if the system boots. Then after resyncing, disconnect the other disk and try booting. 

Although failing and removing a device in mdadm is a good way to see if RAID is operational and can handle a disk failure during operation, it doesn't tell whether you correctly installed the boot loader. Often disks fail after a power cycle (as all hardware does...) and you don't want just to see a blinking cursor.

jlinkels


From: Bogdan STORM at: 2014-08-07 04:52:24

Thank you for putting all this information together for everyone.

Very helpful.