Linux software RAID hot swap disk replacement

One of BitFolk’s servers in the US has had first one and then two dead disks for quite some time. It has a 4 disk software RAID-10, so by pure luck it was still running. Obviously as soon as a disk breaks you really should replace it, preferably with a hot spare. I was very lucky that the second disk failure wasn’t from the same half of the RAID-10 (resulting in downtime and restore from backup). There’s no customer data or customer-facing services on this machine though, so I let it slide for far too long.

Yesterday morning Graham was visiting the datacenter and kindly agreed to replace the disks for me. As it happens I don’t have that much experience of software RAID since the production machines I’ve worked on tend to have hardware RAID and the home ones tend not to be hot swap. It didn’t go entirely smoothly, but I think it was my fault.

The server chassis doesn’t support drive identification (e.g. by turning a light on) so I had to generate some disk activity so that Graham could see which drive lights weren’t blinking. It was easy enough for him to spot that slots 0 and 1 were still blinking away with slots 2 and 3 dead. I checked /proc/mdstat to ensure that those disks weren’t still present in any of the arrays. If they had been then I would have done:

$ sudo mdadm /dev/mdX --fail /dev/sdbN --remove /dev/sdbN

to fail and then remove each one (a device has to be marked failed before mdadm will let you remove it).
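For the identification step, a crude trick is to read from each disk in turn so that only the live drives' activity lights blink. A sketch, assuming the same four device names as my setup:

```shell
# Read 100MiB from each disk in turn so its activity LED blinks.
# A dead disk will either error out straight away or stay dark.
# iflag=direct bypasses the page cache so the read really hits the disk.
for disk in sda sdb sdc sdd; do
    echo "Blinking $disk..."
    sudo dd if=/dev/$disk of=/dev/null bs=1M count=100 iflag=direct
    sleep 2
done
```

Pausing between disks gives whoever is at the rack time to note which light was flickering.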

They weren’t present, so I gave Graham the go-ahead to pull the hot swap drive trays out.

At first the server didn’t notice anything. I thought this was bad as I would like it to notice! This was confirmed to be bad when all disk IO blocked and the load went through the roof.

I think what I had forgotten to do was to remove the devices from the SCSI subsystem as described in this article. So for me, it would have been something like:

$ for disk in sd{a,b,c,d}; do echo -n "$disk: "; ls -d /sys/block/$disk/device/scsi_device*; done
sda: /sys/block/sda/device/scsi_device:0:0:0:0
sdb: /sys/block/sdb/device/scsi_device:0:0:1:0
sdc: /sys/block/sdc/device/scsi_device:1:0:0:0
sdd: /sys/block/sdd/device/scsi_device:1:0:1:0

From /proc/mdstat I knew it was sdb and sdd that were broken. I think I should have done:

$ sudo sh -c 'echo "scsi remove-single-device 0 0 1 0" > /proc/scsi/scsi'
$ sudo sh -c 'echo "scsi remove-single-device 1 0 1 0" > /proc/scsi/scsi'

Anyway, at the time what I had was a largely unresponsive server. I used Magic Sysrq to sync, mount filesystems read-only and then reboot. In Cernio’s console this would normally be “~b” to send a break, but Xen uses “ctrl-o“. So that was ctrl-o s to sync, ctrl-o u to remount read-only and then ctrl-o b to reboot the system.
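If you still have a working shell rather than just the serial console, the same Magic SysRq actions can be triggered by writing the key letters to /proc (a sketch; note that the final "b" reboots immediately, skipping shutdown scripts, so only run it when you mean it):

```shell
# Enable Magic SysRq if it isn't already
echo 1 | sudo tee /proc/sys/kernel/sysrq
# s = sync, u = remount all filesystems read-only, b = reboot immediately
echo s | sudo tee /proc/sysrq-trigger
echo u | sudo tee /proc/sysrq-trigger
echo b | sudo tee /proc/sysrq-trigger
```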

Graham had by then taken the dead disks out of the caddies, replaced them with new ones, re-inserted them and powered the server back on.

Happily it did come back up fine, and I then set about adding the new disks to the arrays.

I’d already been forewarned that the new disks had 488397168 sectors whereas the existing ones had 490234752 — both described as 250GB of course! A difference of roughly 897MiB, despite them both being from the same manufacturer, from the same range even. By not adding a swap partition on the two new disks, they were just about big enough for everything else.
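The mismatch works out like this, with the usual 512-byte sectors:

```shell
# Size difference between the old and new drives, in MiB
old=490234752   # sectors on the original drives
new=488397168   # sectors on the replacements
echo $(( (old - new) * 512 / 1024 / 1024 ))   # prints 897
```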

$ sudo mdadm --add /dev/md1 /dev/sdb1
mdadm: Cannot open /dev/sdb1: Device or resource busy

Oh dear!

After lengthy googling, this article gave me a clue.

$ sudo multipath -l
SATA_WDC_WD2500SD-01WD-WCAL72844661dm-1 ATA,WDC WD2500SD-01K
[size=233G][features=0][hwhandler=0]
\_ round-robin 0 [prio=0][active]
 \_ 1:0:1:0 sdd 8:48  [active][undef]
SATA_WDC_WD2500SD-01WD-WCAL72802716dm-0 ATA,WDC WD2500SD-01K
[size=233G][features=0][hwhandler=0]
\_ round-robin 0 [prio=0][active]
 \_ 0:0:1:0 sdb 8:16  [active][undef]

There’s my disks!

Stopping the multipath daemon didn’t help. Running multipath -F, which flushes the existing multipath device maps, did help.
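Put together, the sequence was something like this (the init script name is from my Debian-era setup and may differ on other distributions):

```shell
# Stop the daemon so it doesn't recreate the maps behind our back...
sudo /etc/init.d/multipathd stop
# ...then flush the existing multipath device maps
sudo multipath -F
# -l should now list nothing, freeing sdb and sdd for mdadm
sudo multipath -l
```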

The usual

$ sudo mdadm --add /dev/md1 /dev/sdb1
$ sudo mdadm --add /dev/md1 /dev/sdd1
$ sudo mdadm --add /dev/md3 /dev/sdb3
$ sudo mdadm --add /dev/md3 /dev/sdd3
$ sudo mdadm --add /dev/md5 /dev/sdb5
$ sudo mdadm --add /dev/md5 /dev/sdd5

worked fine after that.
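After the --add, the kernel rebuilds the mirrors in the background. Progress can be watched with something like (array names as per my setup):

```shell
# One-off snapshot of all arrays, including resync progress bars
cat /proc/mdstat
# Per-array detail, including rebuild percentage and state
sudo mdadm --detail /dev/md1
# Or refresh every couple of seconds until the resync finishes
watch -n2 cat /proc/mdstat
```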

I hope that was useful to someone. I’ll be practising it some more on some spare hardware here to see if the fiddling with /proc/scsi/scsi really does work.

Update:

Dominic (author of the linked article about dm-multipath) says:

I think there’s also a “remove” or “delete” file you can echo to in the /sys device directory, bit more friendly than talking to /proc/scsi/scsi.

and provides this snippet for multipath.conf which should disable multipath:

# Blacklist all devices by default. Remove this to enable multipathing
# on the default devices.
blacklist {
        devnode "*"
}
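The sysfs route Dominic mentions would, I believe, have looked something like this for my two dead disks (the per-device delete file exists on reasonably recent 2.6 kernels):

```shell
# Tell the kernel to drop each dead disk before pulling it --
# rather friendlier than poking /proc/scsi/scsi directly
echo 1 | sudo tee /sys/block/sdb/device/delete
echo 1 | sudo tee /sys/block/sdd/device/delete
```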
