HowtoForge Forums > Linux Forums > Installation/Configuration
#1
9th February 2006, 03:39
ryoken
Member
 
Join Date: Feb 2006
Posts: 33
Thanks: 0
Thanked 0 Times in 0 Posts
md RAID reconstruction hints & tips?

This may sound like a trivial question, but could someone please confirm whether I need to unmount the hard drives before performing RAID reconstruction with the mdadm tool? Should I boot into single-user mode first, or boot from a live CD such as Knoppix? Which is the accepted or safest method, and how would I accomplish it?

Here's the deal: I've been experimenting with Linux software RAID (md) to improve the availability of my server. The idea is that I can replace a faulty drive with a new one while everyone's asleep.

I have /dev/sda and /dev/sdb mirrored (RAID1). Both sda and sdb have a "/" [md0] and a "swap" [md1] partition, and GRUB is installed in the MBR of both drives. Now, when sdb gets unplugged (after powering off, of course!), the system still boots into Linux. Reconnecting sdb and disconnecting sda gives the same result: flawless booting!

All is good so far. Now, let's plug sda and sdb back in and boot into Linux. Running "cat /proc/mdstat" and "mdadm --detail /dev/md0" reveals a degraded RAID array (with sdb flagged as faulty/foreign). So, using mdadm again, we can perform a "hot insert" of sdb. After a few moments, we can confirm (through /proc/mdstat and mdadm) that rebuilding has completed. OK, on to rebooting the system. The first reboot after reconstruction seems flawless. So we shut down the system again and unplug sda again.
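
For anyone who wants to script the "is the array degraded?" check described above, here is a minimal sketch in Python. It only parses text in the /proc/mdstat format; the sample text, the function name parse_mdstat and the field names are made up for illustration, and the parser assumes simple "active" arrays like the ones in this thread, so treat it as a rough starting point rather than a complete parser.

```python
import re

# Hypothetical sample in the style of /proc/mdstat on a 2.6 kernel.
# md0 has lost one mirror half ([2/1] [U_]); md1 is healthy ([2/2] [UU]).
SAMPLE_MDSTAT = """\
Personalities : [raid1]
md1 : active raid1 sdb2[1] sda2[0]
      979840 blocks [2/2] [UU]

md0 : active raid1 sda1[0]
      77168128 blocks [2/1] [U_]

unused devices: <none>
"""

# "md0 : active raid1 sda1[0] sdb1[1]"
ARRAY_RE = re.compile(r"^(md\d+) : (\S+) (\S+) (.*)$")
# "[2/1] [U_]" means 2 devices expected, 1 working
STATUS_RE = re.compile(r"\[(\d+)/(\d+)\] \[([U_]+)\]")


def parse_mdstat(text):
    """Return {array_name: info} for every md array found in the text."""
    arrays = {}
    current = None
    for line in text.splitlines():
        header = ARRAY_RE.match(line)
        if header:
            name, state, level, members = header.groups()
            current = {
                "state": state,
                "level": level,
                # strip the "[n]" role index (and any "(F)" flag) from each member
                "members": re.findall(r"(\w+)\[\d+\]", members),
                "status": None,
                "degraded": None,
            }
            arrays[name] = current
            continue
        status = STATUS_RE.search(line)
        if status and current is not None:
            expected, working = int(status.group(1)), int(status.group(2))
            current["status"] = status.group(3)
            current["degraded"] = working < expected
    return arrays


if __name__ == "__main__":
    for name, info in sorted(parse_mdstat(SAMPLE_MDSTAT).items()):
        print(name, info["members"], info["status"], "degraded:", info["degraded"])
```

On a live system you would pass in open("/proc/mdstat").read() instead of the sample. The hot insert itself would typically be something like "mdadm /dev/md0 --add /dev/sdb1" (the partition name here is an assumption based on the layout in this thread).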

Bad news this time around. On powering up the system, we immediately notice GRUB's failed attempt to load the Linux kernel, citing CRC errors. And I thought this RAID1 mirror was perfect, even after reconstruction. So what went wrong? Here's the error message, FYI:

Booting 'Debian GNU/Linux, kernel 2.6.8-2-386'

[..snip..]

Uncompressing Linux...

crc error

-- System halted


Am I correct in suspecting that mounted drives can't be mirrored (especially the boot blocks)? Is it possible that mounted drives have "locked" files or open "handles" which cannot be mirrored? Or am I way off the mark? Let me know. Looking forward to hearing your responses!!

edit: Whilst installing Debian 3.1 "Sarge", should you wait for RAID to finish syncing during the partitioning phase (by going to console "Alt+F2" & running "cat /proc/mdstat"), or is it OK to go right ahead and format them and install base system files while syncing in the background?

Last edited by ryoken; 9th February 2006 at 03:43.

#2
9th February 2006, 09:49
ryoken

After further testing, it seems I can no longer boot from md0.

After md0 was resynced with sda and sdb plugged in together, I rebooted and started getting errors after the kernel loaded, such as missing files and segmentation faults.

So I unplugged sdb again, and guess what? I booted fine! Does this prove that RAID arrays cannot be reconstructed while one of the drives is mounted? Or is there something more sinister going on here?

edit: Yes, I did wait until md0 had completed resyncing before I hit reboot...
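
A hedged sketch of how "wait until the resync has completed" could be checked from a script rather than by eye. It looks for the recovery/resync progress line that /proc/mdstat shows under an array while a rebuild is running. The sample text and the name sync_activity are assumptions for illustration only.

```python
import re

# Hypothetical /proc/mdstat text captured while md0 is rebuilding onto sdb1.
SAMPLE_REBUILDING = """\
Personalities : [raid1]
md0 : active raid1 sdb1[2] sda1[0]
      77168128 blocks [2/1] [U_]
      [=>...................]  recovery =  8.5% (6559424/77168128) finish=25.3min speed=46402K/sec

unused devices: <none>
"""

# Same array after the rebuild has finished: no progress line any more.
SAMPLE_FINISHED = """\
Personalities : [raid1]
md0 : active raid1 sdb1[1] sda1[0]
      77168128 blocks [2/2] [UU]

unused devices: <none>
"""

ARRAY_RE = re.compile(r"^(md\d+) :")
# Matches "recovery =  8.5%" as well as queued states such as "resync=DELAYED".
SYNC_RE = re.compile(r"\b(recovery|resync)\s*=\s*(?:([\d.]+)%|(\w+))")


def sync_activity(text):
    """Return {array: {"action": ..., "percent": ...}} for arrays still syncing."""
    busy = {}
    current = None
    for line in text.splitlines():
        header = ARRAY_RE.match(line)
        if header:
            current = header.group(1)
            continue
        sync = SYNC_RE.search(line)
        if sync and current is not None:
            percent = float(sync.group(2)) if sync.group(2) else None
            busy[current] = {"action": sync.group(1), "percent": percent}
    return busy


if __name__ == "__main__":
    print("rebuilding:", sync_activity(SAMPLE_REBUILDING))
    print("finished:  ", sync_activity(SAMPLE_FINISHED))
```

An empty result means no array reported a rebuild in progress in the text that was parsed. It does not prove the data on both halves is identical, which turns out to matter later in this thread.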

#3
9th February 2006, 11:45
falko
Super Moderator
 
Join Date: Apr 2005
Location: Lüneburg, Germany
Posts: 41,701
Thanks: 1,900
Thanked 2,748 Times in 2,579 Posts

I found this about your CRC error: http://aplawrence.com/SME/raidfailure.html

#4
9th February 2006, 12:33
ryoken

Hi falko, thanks for the link. All my hard drives are new, so I don't think it's a hardware failure.

#5
9th February 2006, 12:53
mphayesuk
Senior Member
 
Join Date: Sep 2005
Location: UK, East Midlands
Posts: 517
Thanks: 1
Thanked 3 Times in 3 Posts

Just a quick question, as I am thinking of doing this. I take it that the right way of setting up the RAID is through Linux, as you have done: create duplicate partitions on both drives and then RAID them together in software. It's just that I thought that if your motherboard has onboard RAID, you could install everything to the first hard drive and then let the onboard RAID mirror it to the other hard drive. Or does it not work like that? Any advice you can give would be helpful.

In your situation it sounds like you have two RAID partitions working, swap and /. When you repaired the RAID by syncing, did you sync both arrays (swap and /) or just one of them? Just thinking out loud: if you did not do both, then perhaps that caused your problem. I don't know much about this yet; I am only trying it today.

If you have any pointers, please share.

Thanks

#6
9th February 2006, 13:50
ryoken

Hi mphayesuk! Yes, I've set mine up using the md software RAID provided by Linux. Even though I do have a motherboard with onboard RAID (aka fakeraid; see http://linuxmafia.com/faq/Hardware/sata.html), the Debian Sarge 2.6 stock kernel does not support dmraid natively (or at least not during installation). Therefore, you will always see two separate hard drives, even if they are set up as an array in the onboard RAID's BIOS. FYI, I'm using a Silicon Image 3112 onboard RAID controller.

During the Debian install, I did exactly what you've described. /dev/sda1 and /dev/sdb1 combine to become md0, and /dev/sda2 and /dev/sdb2 combine to become md1. In turn, md0 becomes "/" and md1 becomes "swap".
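
For reference, the layout just described can be written down as a small table and turned into the mdadm commands that would create it by hand. This is an assumed manual equivalent only: the arrays in this thread were created by the Debian installer, and the names LAYOUT, create_command and layout_commands are hypothetical. The script only builds and prints the command lines; it does not run them.

```python
# Array layout as described in this thread: two RAID1 mirrors across sda and sdb.
LAYOUT = {
    "/dev/md0": {"members": ["/dev/sda1", "/dev/sdb1"], "use": "/"},
    "/dev/md1": {"members": ["/dev/sda2", "/dev/sdb2"], "use": "swap"},
}


def create_command(array, members, level=1):
    """Build (but do not execute) an 'mdadm --create' argument list."""
    return [
        "mdadm", "--create", array,
        f"--level={level}",
        f"--raid-devices={len(members)}",
        *members,
    ]


def layout_commands(layout):
    """Return one printable command line per array in the layout."""
    return [
        " ".join(create_command(array, spec["members"]))
        for array, spec in layout.items()
    ]


if __name__ == "__main__":
    for line in layout_commands(LAYOUT):
        print(line)
```

Printing rather than executing is deliberate: creating an array over partitions that already hold data is destructive, so the output is meant to be read, not piped into a shell.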

Good question about md1 (swap). From memory, it seems as though md1 was automagically synced after plugging in both sda and sdb. Maybe that's because it's designated as swap. Maybe it's a bug. Hopefully someone can shed some light on this. I say "from memory" because since then I've reformatted the drives and now only have a "/" partition, with no swap.

This second install means that I now have only md0. Unfortunately, the segmentation faults still occur after simulating a hardware fault (as explained in the posts above), so I suspect sdb in md0 is corrupted. Given this, I think it's safe to say swap hasn't caused the problem here.

Hope this extra info helps!

#7
14th February 2006, 07:17
ryoken
Problem Solved!!! :-)

I've finally figured out what was going wrong. First, let me clear up any misconceptions. Linux software RAID (provided by md) works well: it allows you to rebuild a RAID1 array while the drives are mounted. Yes, it is perfectly safe to rebuild/sync an array while the drives are mounted.

The problems I experienced (CRC errors, segfaults, missing files, etc.) were caused by data corruption!! falko, you were on the right track with your link about the CRC errors. However, the data corruption was NOT caused by old/aged hardware; it was actually a "bug" in the implementation of SATA on my motherboard.

Basically, if your motherboard is based on the nForce 2 chipset and has an onboard Silicon Image 3112 (3114 too?) SATA controller (and I'm sure there are many people out there using this combination!), then you will experience data corruption on your SATA drives. The fix? Update your motherboard BIOS to the latest version and apply this BIOS setting (located in Integrated Peripherals):

EXT-P2P's Discard Time = 1ms

And that's it. No more data corruption. No more errors after resyncing/rebuilding the RAID arrays! So md was not the cause of the problem after all! And it took me more than a week to figure this out. I wish the motherboard manual was more informative about this BIOS setting (the motherboard was released after the problem was resolved months ago!).
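
A rough sketch of how this kind of silent corruption could be checked for from userspace: write random data, copy it, and compare SHA-256 checksums of the two files. This is not the procedure used in this thread, the function names are hypothetical, and it has an important limitation: with small files the reads may be served from the page cache rather than the disk, so a meaningful test on real hardware would need data considerably larger than RAM.

```python
import hashlib
import os
import shutil
import tempfile


def sha256_of(path, chunk_size=1024 * 1024):
    """Hash a file in chunks so large files do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def write_copy_verify(directory, size=4 * 1024 * 1024):
    """Write random data, copy it, and report whether both checksums match."""
    src = os.path.join(directory, "corruption_test.src")
    dst = os.path.join(directory, "corruption_test.dst")
    try:
        with open(src, "wb") as f:
            f.write(os.urandom(size))
        shutil.copyfile(src, dst)
        return sha256_of(src) == sha256_of(dst)
    finally:
        # clean up the test files whether or not the comparison succeeded
        for path in (src, dst):
            if os.path.exists(path):
                os.remove(path)


if __name__ == "__main__":
    # A temporary directory is used here; on a real system you would point
    # this at a directory on the filesystem that lives on the array.
    with tempfile.TemporaryDirectory() as workdir:
        ok = write_copy_verify(workdir)
        print("checksums match" if ok else "MISMATCH: possible data corruption")
```

A mismatch would point at the storage path (controller, cabling, memory) rather than at md, which is consistent with what was found here.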

mphayesuk, you may see multiple arrays (e.g. md0, md1, etc.) depending on how you set it up. Also, unless the kernel (and/or dmraid, if it is available) supports your RAID controller, you will still see your hard drives individually (regardless of how you set them up in the RAID BIOS).

Thanks again for everyone's help! I hope my description above will help others with similar problems in the future!

#8
14th February 2006, 14:56
mphayesuk

ryoken, can you explain step by step what you did? I am still having trouble with this. This is what I am assuming you did:

You DID NOT set up RAID on the motherboard, i.e. you did not use Ctrl+A to access the RAID controller and configure RAID1.

You DID use the partition options in the Linux installer to mirror/copy the partitions on both drives.

After Linux was installed, you installed other software to make RAID work.

Thanks

#9
16th February 2006, 03:22
ryoken

Yes, I DID NOT set up RAID on the motherboard.

Yes, I DID use the partition options in the Linux installer to mirror/copy the partitions on both drives.

BUT after Linux was installed, I DID NOT install other software to make RAID work. Software RAID support (md) was already provided during the Debian installation, hence RAID was working immediately after I finished installing. There was nothing to reconfigure or fiddle around with afterwards.

This howto pretty much sums up what I did to get md software RAID working:

http://emidio.planamente.ch/pages/li...t_lvm_raid.php

Please note that I DID NOT set up LVM; I skipped those steps.

Hope this helps...

All times are GMT +2.