Pulling a JournalSpace ... almost

On Wednesday, I found this very friendly mail in my inbox, from mdadm:

QUOTE:

This is an automatically generated mail message from mdadm running on tosca

A Fail event had been detected on md device /dev/md1.

Faithfully yours, etc.


There was little reason to panic, since I had a spare disk installed already. A few man- and mdadm commands later, the array was resyncing. I walked away from the screen, confident that there was nothing to worry about.

How wrong could I be! Somewhere around 99% of the recovery process, one of the existing disks threw a read error. This was too much for md, and to punish me it marked two disks as spares, one as failed and only one as still working. RAID5 may be designed to deal with a broken disk, but it does have its limits. ;-) This is where it got scary, especially since this machine is, say, 1300km away from me.

With some remote help, I managed to get the machine rebooted in single user mode with an sshd running. The disks were all available again, but the superblocks were sufficiently broken that md didn't want to construct the LVM array anymore. (Fortunately the rootfs array was not a problem!)

Anyway, I spent that night copying the raw partitions over to another machine. I tried to construct an array there, but things were unstable: md often tried to resync the partitions (bad idea), or simply didn't want to run the array for me. Instead of trying to make this go, I decided to write a little tool to do the dirty work for me. And that, reader, is why I'm writing this. :-)

To my surprise, I couldn't find any tool to do this on FreshMeat. Sure, when I search for "RAID recovery" on Google, I get plenty of ads for "Professional RAID recovery tools", but I don't want to pay $$$$ for a program that XORs a couple of GBs of data for me. :-P
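For the curious, the XOR part really is small. Here is a minimal Python sketch of the idea, and I should stress it is not the raidrec code itself: the function names and the toy four-byte chunks are made up for illustration. RAID5 parity is the XOR of the data chunks in a stripe, so XORing all the surviving chunks (parity included) gives back the one that is missing.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equally sized byte blocks together, byte by byte."""
    size = len(blocks[0])
    if any(len(b) != size for b in blocks):
        raise ValueError("all blocks must have the same length")
    # zip(*blocks) yields one tuple of bytes per offset; fold each with XOR.
    return bytes(reduce(lambda x, y: x ^ y, column) for column in zip(*blocks))


def rebuild_missing(surviving_blocks):
    """Recover the single missing block of a stripe from all the others."""
    return xor_blocks(surviving_blocks)


# Toy stripe (hypothetical data): three data chunks plus their parity chunk.
data = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(data)

# Pretend the disk holding data[1] died; XOR the rest to get it back.
recovered = rebuild_missing([data[0], data[2], parity])
print(recovered)  # b'BBBB'
```

This only works while exactly one chunk per stripe is missing. Lose two and there is nothing left to XOR against, which is the limit mentioned above. A real tool also has to work out which disk holds the parity for each stripe, and the sketch ignores that entirely.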

So, behold: raidrec. Comes with no documentation, other than this blog article and the (long) comment at the top of the file. I hope it'll be useful for someone some day... It was for me. I have all my data back, LVM picked up all my volumes perfectly, and a few fscks later I managed to migrate all my VMs to OpenVZ running on a desktop elsewhere in the house. At least ruby (my workstation here in Dublin) isn't the only desktop machine being a part-time gaast.net webserver anymore. :-/

If anyone didn't get the JournalSpace reference (which fortunately isn't very accurate here anyway): http://journalspace.com/

Comments

Tony Perrie on :

Just thought you should know, I'm reading your post whilst riding a bus in the middle of Yosemite (near Curry Village).

Wilmer on :

I do not believe that! :-P When I was at Yosemite, my mobile phone didn't work for two days...
