becks.strugglers.net

http://gallery.strugglers.net/albums/Hardware/becks.sized.jpg

  • AMD Sempron(tm) Processor 3100+
  • 512MB RAM
  • nVidia nForce3-based motherboard
  • 100Mbit onboard ethernet (forcedeth)
  • 3x Maxtor 120GB, 1x Seagate 120GB, configured mostly as RAID-5

RAID mishap, 2005-11-07

Here's an email I sent to a mailing list about it; just now I thought other people might be interested too, if only to point and laugh.

Date: Tue, 8 Nov 2005 23:04:37 +0000
From: Andy Smith <andy@lug.org.uk>
Subject: Software RAID slight problem

I have a cheap fileserver for which, when I built it, I decided I'd
use 4x120GB Maxtor SATA disks, cos they were cheap.  Yes -- Maxtor --
you can stop laughing now.

On Monday morning, some 11 months after the machine was built,
/dev/sdb decided it didn't want to read from 50 or so of its
sectors and that it had reached its maximum sector reallocation
limit, and was very unwell, and would degrade my array.  I might
have known sooner with smartd, but smartd does not work with libata
SATA drives in Linux currently without a kernel patch.
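
For reference, once SMART passthrough is available the check I'm
talking about is roughly this (a sketch only; the mail address is
just a placeholder):

# one-off SMART health report, forcing ATA passthrough mode
smartctl -a -d ata /dev/sdb

plus a "/dev/sdX -d ata -a -m root@localhost" line per disk in
/etc/smartd.conf so smartd keeps watching them.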

Anyway, I immediately ordered a replacement and figured I wouldn't
scrape the bottom of the barrel this time, so I went for a Seagate
120GB.  Now I'm sure some of you have problems with Seagate too, but
let's just agree they aren't as shonky as Maxtor and get on with the
tale.

I identified sdb, removed it and threw it in a pile marked "to swear
at and then RMA" and stuck in the Seagate.  Here's what I get when I
boot the machine now:

scsi0 : sata_nv
  Vendor: ATA       Model: Maxtor 6Y120M0    Rev: YAR5
  Type:   Direct-Access                      ANSI SCSI revision: 05
scsi1 : sata_nv
  Vendor: ATA       Model: ST3120827AS       Rev: 3.42
  Type:   Direct-Access                      ANSI SCSI revision: 05
scsi2 : sata_nv
  Vendor: ATA       Model: Maxtor 6Y120M0    Rev: YAR5
  Type:   Direct-Access                      ANSI SCSI revision: 05
scsi3 : sata_nv
  Vendor: ATA       Model: Maxtor 6Y120M0    Rev: YAR5
  Type:   Direct-Access                      ANSI SCSI revision: 05
SCSI device sda: 240121728 512-byte hdwr sectors (122942 MB)
SCSI device sdb: 234441648 512-byte hdwr sectors (120034 MB)
SCSI device sdc: 240121728 512-byte hdwr sectors (122942 MB)
SCSI device sdd: 240121728 512-byte hdwr sectors (122942 MB)

Anyone spotted the current thorn in my side yet?

Yes, thanks Seagate, your "120GB" hard disk is 2908MB smaller than the
3 "120GB" Maxtors.

So I haven't even bothered trying to partition it the same as
sd[acd] yet, as it's not going to bloody work, is it.

Fortunately, I can probably bodge this.  You see, the partition
table of each of the 3 Maxtors is like this:

Partition       Start sector    End sector      Purpose
1               63              144584          /boot
2               144585          10635029        /stripe
3               10635029        229472460       everything else

/boot is a RAID-1 on sda1 and sdb1 with sdc1 and sdd1 being hot
spares.[1] (md0)

/stripe is/was a RAID-0 across sd[abcd]2 for use as scratch space.
(md1)

"everything else" is a RAID-5 of /dev/sd[abcd]3 which I use as an
LVM PV and carve up lots of LVs for my root filesystem and all other
things.
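
Something along these lines would build that layout, for anyone
wondering (a sketch -- the VG/LV names and sizes are purely
illustrative, not necessarily what's on the box):

# /boot: RAID-1 with two active mirrors and two hot spares
mdadm --create /dev/md0 --level=1 --raid-devices=2 \
      --spare-devices=2 /dev/sd[abcd]1

# /stripe: RAID-0 scratch space across all four disks
mdadm --create /dev/md1 --level=0 --raid-devices=4 /dev/sd[abcd]2

# "everything else": RAID-5 with 256k chunks, used as an LVM PV
mdadm --create /dev/md2 --level=5 --raid-devices=4 \
      --chunk=256 /dev/sd[abcd]3
pvcreate /dev/md2
vgcreate vg0 /dev/md2          # example VG name
lvcreate -L 10G -n root vg0    # one LV per filesystem; root shown here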

So the situation right now is:

$ cat /proc/mdstat
Personalities : [raid0] [raid1] [raid5] [multipath]
md2 : active raid5 sdd3[3] sdc3[2] sda3[0]
      344208384 blocks level 5, 256k chunk, algorithm 2 [4/3] [U_UU]

md0 : active raid1 sdd1[1] sdc1[2] sda1[0]
      72192 blocks [2/2] [UU]

unused devices: <none>

i.e. md0 (/boot) has rebuilt itself onto sdd1 and has sdc1 as a hot
spare.  md1 (/stripe) is totally screwed and was commented out of
/etc/fstab since it is now dead.  There wasn't anything on it
anyway; it really was scratch space.  md2 is running as a degraded
RAID-5 with sdb3 missing.
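
mdadm --detail on each array is the quick way to confirm which
devices are active, which are spares and which are missing:

# show per-device status (active / spare / faulty / removed)
mdadm --detail /dev/md0
mdadm --detail /dev/md2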

I figure I can create sdb3 first and make it the same size as
partition 3 on the other disks.  I can then compromise by taking the
excess out of sdb2, and not bother about sdb1 at all.  md1 will end
up using less than the full extent of sd[acd]2 since sdb2 will still
be smaller than those others, but it's only scratch space so I don't
care.

But what's the easiest way to do this?  I can't start creating
partitions at the end, can I?  Or maybe I can, in something other
than fdisk.

It would no doubt be easier to create sdb1 first as the correct size
to match sd[acd]3 then use the rest for sdb2, but then I end up with
a "less pretty" md setup where md2 is made of {sda3, sdb1, sdc3,
sdd3}.

So what would you guys do? :)

Answers of the form "remove all Maxtor disks, replace with all the
same drives from one sane vendor, and restore from backups" not
welcome thanks ;)

Oh, and, will I need to do anything special (like shrink sd[acd]2)
when trying to rebuild the stripe (md1) with the smaller partition
from sdb?

Cheers,
Andy

[1] I did it like that because I wanted /boot on a RAID 1, but I keep
    it read-only most of the time and didn't see the point in making
    it a 4-way RAID 1.  And there isn't much else to do with two
    70-odd MB disk partitions, so I made them hot spares.

Suggestions

Neil Brown suggested I use cfdisk to create the correctly-sized partitions, and pointed out that md's RAID-0 doesn't need its devices to all be the same size. Hugo Mills and Adrian Bridgett, both of HantsLUG, also suggested cfdisk.
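
cfdisk is interactive; for anyone wanting to script the same trick, sfdisk can dump a partition table in a format it can re-read, and an edited copy can then be applied to the new disk. A rough sketch (not the exact commands I ran):

# dump sda's partition layout in sfdisk's re-readable format
sfdisk -d /dev/sda > sda.layout

# hand-edit sda.layout: keep partitions 1 and 3 at their original
# sizes, shrink partition 2 by the shortfall and pull partition 3's
# start back to match, then write the edited table to the new disk
sfdisk /dev/sdb < sda.layout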

"Tim" also from HantsLUG suggested that I replace all the Maxtor drives with Seagate ones. It's not clear how that would have fixed my immediate problem and it would have been costly.

"Seanie" suggested nuking it all and recreating from backups. I have backups but they are all over the place due to perhaps foolishly building a fileserver bigger than I can comfortably backup. As a result this would take a long time, so it was a last resort really.

Simon Amor suggested using the 120GB Seagate for something else and buying a 160GB drive instead. That would have taken a few days longer, and with 215GB of data running off a degraded RAID-5 I wanted things sorted out as soon as possible. I also intend to upgrade this server anyway, perhaps as soon as Christmas.

Resolution

I chose the cfdisk solution, and everything is up and working redundantly again.
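
With the partitions sorted out, the rebuild itself boils down to a couple of mdadm commands, roughly like this (assuming sdb2 and sdb3 were recreated as planned above):

# add the new sdb3 to the degraded RAID-5; md resyncs it in the background
mdadm /dev/md2 --add /dev/sdb3

# recreate the scratch RAID-0 -- md doesn't mind that sdb2 is a bit smaller
mdadm --create /dev/md1 --level=0 --raid-devices=4 /dev/sd[abcd]2

# watch the resync progress
cat /proc/mdstat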