btrfs compression wins

Some quite good btrfs compression results from my backup hosts (which back up customer data).

Type       Perc     Disk Usage   Uncompressed Referenced
TOTAL       64%       68G         105G         1.2T
none       100%       24G          24G         434G
zlib        54%       43G          80G         797G
Type       Perc     Disk Usage   Uncompressed Referenced
TOTAL       74%       91G         123G         992G
none       100%       59G          59G         599G
lzo         50%       32G          63G         393G
Type       Perc     Disk Usage   Uncompressed Referenced
TOTAL       73%       16G          22G         459G
none       100%       12G          12G         269G
lzo         40%      4.1G          10G         190G
Type       Perc     Disk Usage   Uncompressed Referenced
TOTAL       71%      105G         148G         1.9T
none       100%       70G          70G         910G
zlib        40%       24G          60G         1.0T
lzo         58%       10G          17G          17G

So that’s 398G that takes up 280G, a 29.6% reduction.

The “none” type is incompressible files such as media that’s already compressed. I started off with lzo compression but I’m switching to zlib now as it compresses more and this data is rarely accessed so I’m not too concerned about performance. I need newer kernels on these before I can try zstd.

I’ve had serious concerns about btrfs before based on issues I’ve had using it at home, but these were mostly around multiple device usage. Here they get a single block device that has redundancy underneath so the only remotely interesting thing that btrfs is doing here is the compression.

Might try some offline deduplication next.

Another disappointing btrfs experience

I’ve been using btrfs on my home fileserver for about 4½ years. I am not entirely happy with it and kind of wish I never did it; I will certainly not be introducing it anywhere else. I’m also pretty lazy though, which probably explains why I haven’t ripped it out and replaced it with something else yet.

I’ve had a few problems with it over the years. To be fair I’ve never lost any data; it’s really the availability aspects of it which I feel just aren’t ready yet. When I use multiple storage devices it’s generally to increase availability. I don’t expect device failure to stop me doing what I need to do, at least for small amounts of device failure.

Unfortunately btrfs has consistently not lived up to these expectations. Almost every single-disk failure I’ve had in the past has resulted in an “outage” of some sort. As this is just our data, at home, it may be strange to think of it as an outage, but that’s what it is. Our data became unavailable in some way for some period of time.

This time around, one of the drives started throwing up “Currently unreadable (pending)” and “Offline uncorrectable” sectors a few days ago. That means that there’s areas of the drive that it cannot read. Initially there were just a small number, and a scrub came back clean so that suggested the problem sectors were at that time outside of any filesystem.

In a more critical setting I’d have spare drives available and would just swap them, but for home use I’m usually comfortable with forcing the drive to reallocate these by forcing a write, before ordering a replacement if the problem doesn’t go away. Worst case, I have backups.

After a day or so though, the number of problem sectors was increasing and it was obvious the drive was going to die fairly soon. I ordered a replacement. About 6 hours before the replacement arrived the drive completely stopped responding.

Now, this drive was at the time one of five in the btrfs filesystem, and the filesystem has a raid1 storage policy so there should have been no issue with one device going missing. But apparently there was a problem. btrfs sits spewing the kernel log with errors about lost writes to a device that’s no longer there; the filesystem goes read-only.

The replacement drive arrives, but with the filesystem read-only I can’t add it. I can’t even unmount the filesystem (says it is busy but lsof doesn’t see any users). Nope, I had to reboot the fileserver, at which point the filesystem wouldn’t mount at all because you have to give it the degraded mount option if you want it to mount with any devices missing.

Add the replacement drive, btrfs device remove missing /path/to/fs to kick off a remove of the dead device. Things are at least up and running read-write while this is going on. In fact it’s still going on, because there was 1.2TiB of data on the dead device and reconstructing it is painfully slow. As I write this we’re now about 9 hours in and there’s still about 421GiB to go.

So, it’s not terrible. No data was lost (probably). A short outage due to a required reboot. But it is kind of disappointing and not really how I want to be spending my time just because a single HDD slipped its mortal coil. I am massively thankful that the operating system of that fileserver is still on four other HDDs on ext4+lvm+md and never give me any trouble. Otherwise I’d have to be booting into a rescue OS to fix this sort of thing. When the thing you’re glad of is that you didn’t use a filesystem, that isn’t a great advert for that filesystem.

I should probably try to find some time to play (again) with ZFS-on-Linux. I did actually give it a go last year but got bogged down trying to compare its performance against btrfs and ext4+lvm+md using fio, which proved quite difficult to do, and I moved on to other things.

One of the things that initially attracted me to btrfs is the possibility of using a mish-mash of differently-sized drives. Due to BitFolk constantly replacing hardware I have in my possession plenty of HDDs of differing sizes that are individually perfectly serviceable, but would be awkward to try to match up into identical sizes for conventional RAID arrays. Over the years of this btrfs filesystem it had started out with mostly 250G drives and just before this failure it was 1x 1TB, 3x 2TB and 1x 3TB.

I had thought that ZFS requires every device to be the same capacity (i.e. it would only use the smallest capacity) but I’ve since been informed that ZFS will just use the capacity of the smallest device in the vdev. So assuming mirror vdevs, I’d just need to pair the drives up (or accept that the capacity will be that of the smaller of the two).

That doesn’t seem too onerous at all, when considering the advantages that ZFS would bring. I’m most interested in the self-healing (checksums) and the storage tiering (through using faster devices like SSDs for L2ARC and ZIL). btrfs doesn’t have a good solution for tiering yet, unless you are insane and want to play with bcache(fs).

So, yeah, should stop being lazy and crack on with ZFS again. In my copious free time.

Home file server / disk enclosure troubles

On March 26th the file server here at home started resetting one of its ATA links over and over, in a couple of bursts each hour. Like this:

Mar 26 13:38:34 specialbrew kernel: [5137674.573170] ata5.03: exception Emask 0x0 SAct 0xc0 SErr 0x400000 action 0x6
Mar 26 13:38:34 specialbrew kernel: [5137674.579164] ata5.03: irq_stat 0x00060002, device error via SDB FIS
Mar 26 13:38:34 specialbrew kernel: [5137674.585127] ata5.03: SError: { Handshk }
Mar 26 13:38:34 specialbrew kernel: [5137674.591016] ata5.03: failed command: WRITE FPDMA QUEUED
Mar 26 13:38:34 specialbrew kernel: [5137674.596962] ata5.03: cmd 61/80:30:80:79:10/00:00:14:00:00/40 tag 6 ncq 65536 out
Mar 26 13:38:34 specialbrew kernel: [5137674.596962]          res 41/84:00:9f:79:10/00:00:14:00:00/40 Emask 0x410 (ATA bus error) <F>
Mar 26 13:38:34 specialbrew kernel: [5137674.609065] ata5.03: status: { DRDY ERR }
Mar 26 13:38:34 specialbrew kernel: [5137674.615131] ata5.03: error: { ICRC ABRT }
Mar 26 13:38:34 specialbrew kernel: [5137674.621135] ata5.03: failed command: WRITE FPDMA QUEUED
Mar 26 13:38:34 specialbrew kernel: [5137674.627123] ata5.03: cmd 61/80:38:00:7a:10/00:00:14:00:00/40 tag 7 ncq 65536 out
Mar 26 13:38:34 specialbrew kernel: [5137674.627123]          res 98/d7:00:00:00:00/00:00:00:70:98/00 Emask 0x2 (HSM violation)
Mar 26 13:38:34 specialbrew kernel: [5137674.639241] ata5.03: status: { Busy }
Mar 26 13:38:34 specialbrew kernel: [5137674.645349] ata5.03: error: { ICRC UNC AMNF IDNF ABRT }
Mar 26 13:38:34 specialbrew kernel: [5137674.651591] ata5.03: hard resetting link
Mar 26 13:38:34 specialbrew kernel: [5137674.976944] ata5.03: SATA link up 3.0 Gbps (SStatus 123 SControl 320)
Mar 26 13:38:34 specialbrew kernel: [5137674.984592] ata5.03: configured for UDMA/100
Mar 26 13:38:34 specialbrew kernel: [5137674.990852] ata5: EH complete

Or this:

Mar 26 13:53:36 specialbrew kernel: [5138576.767785] ata5.02: exception Emask 0x0 SAct 0x100000 SErr 0x400000 action 0x6
Mar 26 13:53:36 specialbrew kernel: [5138576.774595] ata5.02: irq_stat 0x00060002, device error via SDB FIS
Mar 26 13:53:36 specialbrew kernel: [5138576.781330] ata5.02: SError: { Handshk }
Mar 26 13:53:36 specialbrew kernel: [5138576.788067] ata5.02: failed command: WRITE FPDMA QUEUED
Mar 26 13:53:36 specialbrew kernel: [5138576.794853] ata5.02: cmd 61/80:a0:00:20:a8/00:00:25:00:00/40 tag 20 ncq 65536 out
Mar 26 13:53:36 specialbrew kernel: [5138576.794853]          res 41/84:80:00:20:a8/00:00:25:00:00/00 Emask 0x410 (ATA bus error) <F>
Mar 26 13:53:36 specialbrew kernel: [5138576.808592] ata5.02: status: { DRDY ERR }
Mar 26 13:53:36 specialbrew kernel: [5138576.815527] ata5.02: error: { ICRC ABRT }
Mar 26 13:53:36 specialbrew kernel: [5138576.822504] ata5.02: hard resetting link
Mar 26 13:53:36 specialbrew kernel: [5138577.145789] ata5.02: SATA link up 3.0 Gbps (SStatus 123 SControl 320)
Mar 26 13:53:36 specialbrew kernel: [5138577.154616] ata5.02: configured for UDMA/100
Mar 26 13:53:36 specialbrew kernel: [5138577.161849] ata5: EH complete
Mar 26 13:53:38 specialbrew kernel: [5138578.402262] ata5.03: failed command: WRITE FPDMA QUEUED
Mar 26 13:53:38 specialbrew kernel: [5138578.409516] ata5.03: cmd 61/d8:f0:a0:5b:11/00:00:00:00:00/40 tag 30 ncq 110592 out
Mar 26 13:53:38 specialbrew kernel: [5138578.409516]          res 99/d7:00:00:00:00/00:00:00:e0:99/00 Emask 0x2 (HSM violation)
Mar 26 13:53:38 specialbrew kernel: [5138578.424086] ata5.03: status: { Busy }
Mar 26 13:53:38 specialbrew kernel: [5138578.431389] ata5.03: error: { ICRC UNC AMNF IDNF ABRT }
Mar 26 13:53:38 specialbrew kernel: [5138578.438850] ata5.03: hard resetting link
Mar 26 13:53:38 specialbrew kernel: [5138578.765827] ata5.03: SATA link up 3.0 Gbps (SStatus 123 SControl 320)
Mar 26 13:53:38 specialbrew kernel: [5138578.774446] ata5.03: configured for UDMA/100
Mar 26 13:53:38 specialbrew kernel: [5138578.781795] ata5: EH complete

Despite the masses of link resets a couple of times an hour it all seemed to be working still, and it was a few days before I looked into it further.

The hardware is an HP Microserver with a Sans Digital 8-bay SATA disk enclosure connected over 2x eSATA. The Microserver’s four internal SATA slots are used for something else; these messages relate to the two eSATA links (ata5 and ata6) to the disk enclosure. There’s seven disks in there, six of which form a btrfs filesystem which holds the majority of files we use here at home (yes, I’ve heard about ZFSonLinux. No, I’m not interested in using it yet).

I have never been too happy with that Sans Digital enclosure:

  • Half the LEDs on the front have never worked. So you only get link lights for one SATA link.
  • As a more general example of the above, the mechanical build quality just seems bad. It’s really flimsy.
  • It’s pretty slow to detect disks and spews errors in the dmesg as it does.
  • It’s done this thing before where it resets the SATA link over and over, but the problem mysteriously went away when everything was power-cycled.

So anyway, plan of action:

  1. Try rebooting everything.
  2. If no joy, buy a new enclosure from somewhere and try to make it not another one of those.

It was 30th March before I got around to giving that a go. There was a short period of terror where the Microserver abruptly declared

No boot device. Insert boot floppy.

but amusingly this turned out to be because the BIOS was set to prefer booting off a USB disk, and I had left my Sansa Clip plugged in to charge it. It was trying to boot off of an mp3 player.

Sadly once we did actually get past GruB things were not much happier. SATA links resetting pretty much continuously, to the point were it couldn’t even get to the end of Linux attaching the disks as SCSI devices. Unplugging the eSATA cables entirely left the Microserver happy to boot with no issues other than all my data being missing!

A process of elimination (take all disks out, boot, put disks back in one by one and boot each time until everything complains) allowed me to identify one particular disk that was really upsetting things. I was able to boot without that disk inserted, and I could mount the btrfs filesystem in degraded mode as it’s a RAID-1 for both data and metadata.

At this point I decided to push my luck and try inserting that one problematic disk again. As soon as I did it all went crazy again, and the link resets caused a bunch more devices to be ejected by Linux. I was left with a btrfs filesystem with three remaining devices that had mounted itself read-only due to read errors.

So, is it the disk or is the the enclosure?

I don’t have a spare enclosure so it’s rather difficult to stress test that, but bearing in mind that I’m not too happy with it anyway I decided to buy a new enclosure anyway and investigate the existing one later. This time I decided to try this thing, which appears to be a CFi-B7886CM. No I’ve never heard of them either but I doubt it is any different to what Sans Digital buy and slap their branding on.

It’s quite a bit cheaper than the Sans Digital unit, though obviously it only has one eSATA link. I’m not too bothered by that as I doubt that any four of my disks were exceeding 3Gbps or whatever, so yeah, whatever, I’ll put six disks on one link. It might even be a benefit if I want to add another one of these enclosures later, as there’s no more room in the Microserver for more eSATA cards.

While that was on order I got things back up and running with the five disks and had the btrfs filesystem running degraded (yes, I do have backups). I attached that suspect disk to another machine via a USB dock and was able to repeatedly read it all, perform SMART long tests on it without error, etc.

The disk caddies are almost identical, and are interchangeable.
The disk caddies are almost identical, and are interchangeable.

The new enclosure arrived today and my first impressions of it are fairly positive. In terms of appearance and dimensions it’s almost identical to the Sans Digital one. In fact the disk caddies look identical bar the colour of the plastic handle, and are interchangeable, which saved me the hassle of having to unscrew 28 screws (seven disks, four per disk) and screw them in again. It definitely feels bit more sturdy though.

On boot for some reason I expected the btrfs filesystem to not automatically mount. It had been running with a device missing for nearly two days, and in that state you have to do

# mount -odegraded /srv/tank

to get it to mount, so I was thinking that even with all disks inserted it would decide that the out-of-date disk was unacceptable and require manual intervention to mount it. Well, I was wrong about that. It automatically mounted it and then immediately started complaining:

Apr  1 14:46:14 specialbrew kernel: [  161.210994] BTRFS info (device sdj): csum failed ino 259 off 12288 csum 1424469249 expected csum 1836639932
Apr  1 14:46:14 specialbrew kernel: [  161.211321] BTRFS info (device sdj): csum failed ino 259 off 24576 csum 3532125205 expected csum 1877817807
Apr  1 14:46:19 specialbrew kernel: [  165.447070] verify_parent_transid: 9 callbacks suppressed
Apr  1 14:46:19 specialbrew kernel: [  165.457108] parent transid verify failed on 3438446813184 wanted 1144554 found 1141678
Apr  1 14:46:19 specialbrew kernel: [  165.468310] repair_io_failure: 16 callbacks suppressed
Apr  1 14:46:19 specialbrew kernel: [  165.478409] BTRFS: read error corrected: ino 1 off 3438446813184 (dev /dev/sdk sector 1525564240)
Apr  1 14:46:19 specialbrew kernel: [  165.491461] parent transid verify failed on 3438446821376 wanted 1144554 found 1141678
Apr  1 14:46:19 specialbrew kernel: [  165.501892] BTRFS: read error corrected: ino 1 off 3438446821376 (dev /dev/sdk sector 1525564256)
Apr  1 14:46:19 specialbrew kernel: [  165.512262] parent transid verify failed on 3438336446464 wanted 1144475 found 1141571
Apr  1 14:46:19 specialbrew kernel: [  165.522746] BTRFS: read error corrected: ino 1 off 3438336446464 (dev /dev/sdk sector 1525348680)
Apr  1 14:46:34 specialbrew kernel: [  180.411201] parent transid verify failed on 3892014911488 wanted 1144992 found 1142134
Apr  1 14:46:34 specialbrew kernel: [  180.422607] BTRFS: read error corrected: ino 1 off 3892014911488 (dev /dev/sdk sector 150856)
Apr  1 14:46:34 specialbrew kernel: [  180.442075] parent transid verify failed on 3892014706688 wanted 1144992 found 1142133
Apr  1 14:46:34 specialbrew kernel: [  180.453157] BTRFS: read error corrected: ino 1 off 3892014706688 (dev /dev/sdk sector 150456)
Apr  1 14:46:34 specialbrew kernel: [  180.475472] parent transid verify failed on 3892014710784 wanted 1144992 found 1142133
Apr  1 14:46:34 specialbrew kernel: [  180.486688] BTRFS: read error corrected: ino 1 off 3892014710784 (dev /dev/sdk sector 150464)
Apr  1 14:46:34 specialbrew kernel: [  180.508824] parent transid verify failed on 3892014714880 wanted 1144992 found 1142133
Apr  1 14:46:34 specialbrew kernel: [  180.519368] BTRFS: read error corrected: ino 1 off 3892014714880 (dev /dev/sdk sector 150472)
Apr  1 14:47:22 specialbrew kernel: [  229.019490] parent transid verify failed on 3438634471424 wanted 1144757 found 1141901
Apr  1 14:47:22 specialbrew kernel: [  229.056008] BTRFS: read error corrected: ino 1 off 3438634471424 (dev /dev/sdk sector 1525930760)
It's nice to have all the blinkenlights blinking.
It’s nice to have all the blinkenlights blinking.

This is apparently what you would normally expect from btrfs fixing the out-of-date metadata from the other copy.

Everything seems okay; there are no read errors and no more link resets, and a cursory inspection of important data is matching up with checksums from last good backup. Oh, and all the SATA link/activity LEDs work!

I have to now:

  • do a btrfs scrub to make sure all the data on sdk is correct.
  • investigate the Sans Digital enclosure to see if I can replicate its brokenness.
While we're at it, is there a brand of label tape that doesn't dry up and peel off?
While we’re at it, is there a brand of label tape that doesn’t dry up and peel off?

Electronically I can’t tell if it’s any different to the old enclosure (aside from only having one SATA link, not two).

Comparing the dmesg between “SCSI subsystem initialized” and the last SCSI disk being attached, for the old and the new enclosure I see a few interesting differences. The new one is a little bit faster, and has fewer weird errors (failed to IDENTIFY, etc.).

Old:

[    5.818389] ata5.15: Port Multiplier 1.1, 0x1095:0x3726 r23, 6 ports, feat 0x1/0x9

New:

[    8.116807] ata6.15: Port Multiplier 1.2, 0x197b:0x575f r0, 15 ports, feat 0x5/0xf

So the new one supports Port Multiplier v1.2 as opposed to v1.1. Presumably that is good news. I wonder what feat 0x1/0x9 vs feat 0x5/0xf means?

With the old enclosure the SATA links seem to come up as a mixture of 1.5 and 3.0Gbps, and multiple times as the links are reset:

[    3.510120] ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
[    3.510209] ata3: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[    3.510290] ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
[    3.510359] ata4: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
[    5.818090] ata5: SATA link up 3.0 Gbps (SStatus 123 SControl 0)
[    6.138331] ata5.00: SATA link up 3.0 Gbps (SStatus 123 SControl 320)
[    6.458331] ata5.01: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[    6.778330] ata5.02: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[    7.738323] ata5.05: SATA link up 1.5 Gbps (SStatus 113 SControl 320)
[   10.082066] ata5.15: SATA link up 3.0 Gbps (SStatus 123 SControl 0)
[   10.402304] ata5.00: SATA link up 3.0 Gbps (SStatus 123 SControl 320)
[   11.778298] ata5.01: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[   12.098296] ata5.02: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[   13.058294] ata5.05: SATA link up 1.5 Gbps (SStatus 113 SControl 320)
[   15.406036] ata5.15: SATA link up 3.0 Gbps (SStatus 123 SControl 0)
[   15.726279] ata5.00: SATA link up 3.0 Gbps (SStatus 123 SControl 320)
[   16.046275] ata5.01: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[   17.418267] ata5.02: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[   18.378264] ata5.05: SATA link up 1.5 Gbps (SStatus 113 SControl 320)
[   20.586006] ata6: SATA link up 3.0 Gbps (SStatus 123 SControl 0)
[   20.906281] ata6.00: SATA link up 3.0 Gbps (SStatus 123 SControl 320)
[   21.226248] ata6.01: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[   21.866245] ata6.03: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[   22.506241] ata6.05: SATA link up 1.5 Gbps (SStatus 113 SControl 320)
[   24.869982] ata6.15: SATA link up 3.0 Gbps (SStatus 123 SControl 0)
[   25.190226] ata6.00: SATA link up 3.0 Gbps (SStatus 123 SControl 320)
[   26.546216] ata6.01: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[   27.186212] ata6.03: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[   27.826212] ata6.05: SATA link up 1.5 Gbps (SStatus 113 SControl 320)
[   30.169952] ata6.15: SATA link up 3.0 Gbps (SStatus 123 SControl 0)
[   30.490216] ata6.00: SATA link up 3.0 Gbps (SStatus 123 SControl 320)
[   31.866187] ata6.01: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[   32.506184] ata6.03: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[   33.146180] ata6.05: SATA link up 1.5 Gbps (SStatus 113 SControl 320)

The new one, they’re all 6.0Gbps (although the eSATA link itself seems to only be 3Gbps; not sure why that would be but it’s no worse anyway):

[    3.509782] ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
[    3.513788] ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
[    3.513878] ata4: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
[    3.513965] ata3: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[    7.961748] ata6: SATA link up 3.0 Gbps (SStatus 123 SControl 0)
[    8.454063] ata6.00: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
[    8.822059] ata6.01: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
[    9.174057] ata6.02: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
[    9.510064] ata6.03: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
[    9.846061] ata6.04: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
[   10.182059] ata6.05: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
[   10.518057] ata6.06: SATA link up 6.0 Gbps (SStatus 133 SControl 300)

Hmm, probably should investigate why /dev/sd[a-d] all have their write cache disabled.

Update 2nd April:

The scrub seems to have gone okay:

scrub status for 472ee2b3-4dc3-4fc1-80bc-5ba967069ceb
scrub device /dev/sdh (id 2) history
        scrub started at Wed Apr  1 20:05:58 2015 and finished after 14642 seconds
        total bytes scrubbed: 383.42GiB with 0 errors
scrub device /dev/sdg (id 3) history
        scrub started at Wed Apr  1 20:05:58 2015 and finished after 14504 seconds
        total bytes scrubbed: 382.62GiB with 0 errors
scrub device /dev/sdf (id 4) history
        scrub started at Wed Apr  1 20:05:58 2015 and finished after 14436 seconds
        total bytes scrubbed: 383.00GiB with 0 errors
scrub device /dev/sdk (id 5) history
        scrub started at Wed Apr  1 20:05:58 2015 and finished after 21156 seconds
        total bytes scrubbed: 1.13TiB with 14530 errors
        error details: verify=10909 csum=3621
        corrected errors: 14530, uncorrectable errors: 0, unverified errors: 0
scrub device /dev/sdj (id 6) history
        scrub started at Wed Apr  1 20:05:58 2015 and finished after 5693 seconds
        total bytes scrubbed: 119.42GiB with 0 errors
scrub device /dev/sde (id 7) history
        scrub started at Wed Apr  1 20:05:58 2015 and finished after 5282 seconds
        total bytes scrubbed: 114.45GiB with 0 errors

Those dmesg excerpts.

Old enclosure:

[    3.010635] SCSI subsystem initialized
[    3.013325] libata version 3.00 loaded.
[    3.014937] ahci 0000:00:11.0: version 3.0
[    3.015245] ahci 0000:00:11.0: irq 42 for MSI/MSI-X
[    3.015350] ahci 0000:00:11.0: AHCI 0001.0200 32 slots 4 ports 3 Gbps 0xf impl SATA mode
[    3.015413] ahci 0000:00:11.0: flags: 64bit ncq sntf ilck pm led clo pmp pio slum part 
[    3.016613] scsi0 : ahci
[    3.017073] scsi1 : ahci
[    3.017527] scsi2 : ahci
[    3.017891] scsi3 : ahci
[    3.018071] ata1: SATA max UDMA/133 abar m1024@0xfe5ffc00 port 0xfe5ffd00 irq 42
[    3.018123] ata2: SATA max UDMA/133 abar m1024@0xfe5ffc00 port 0xfe5ffd80 irq 42
[    3.018174] ata3: SATA max UDMA/133 abar m1024@0xfe5ffc00 port 0xfe5ffe00 irq 42
[    3.018224] ata4: SATA max UDMA/133 abar m1024@0xfe5ffc00 port 0xfe5ffe80 irq 42
[    3.510120] ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
[    3.510209] ata3: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[    3.510290] ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
[    3.510359] ata4: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
[    3.511393] ata3.00: ATA-8: ST3320418AS, CC38, max UDMA/133
[    3.511446] ata3.00: 625142448 sectors, multi 0: LBA48 NCQ (depth 31/32)
[    3.512865] ata3.00: configured for UDMA/133
[    3.550857] ata2.00: ATA-7: ST3320620AS, 3.AAC, max UDMA/133
[    3.550910] ata2.00: 625134827 sectors, multi 0: LBA48 NCQ (depth 31/32)
[    3.552655] ata4.00: ATA-7: ST3320620AS, 3.AAC, max UDMA/133
[    3.552708] ata4.00: 625134827 sectors, multi 0: LBA48 NCQ (depth 31/32)
[    3.555002] ata1.00: ATA-7: ST3320620AS, 3.AAC, max UDMA/133
[    3.555055] ata1.00: 625134827 sectors, multi 0: LBA48 NCQ (depth 31/32)
[    3.609166] ata2.00: configured for UDMA/133
[    3.610965] ata4.00: configured for UDMA/133
[    3.613304] ata1.00: configured for UDMA/133
[    3.613707] scsi 0:0:0:0: Direct-Access     ATA      ST3320620AS      C    PQ: 0 ANSI: 5
[    3.614932] scsi 1:0:0:0: Direct-Access     ATA      ST3320620AS      C    PQ: 0 ANSI: 5
[    3.615974] scsi 2:0:0:0: Direct-Access     ATA      ST3320418AS      CC38 PQ: 0 ANSI: 5
[    3.616662] scsi 3:0:0:0: Direct-Access     ATA      ST3320620AS      C    PQ: 0 ANSI: 5
[    3.619964] sata_sil24 0000:02:00.0: version 1.1
[    3.620839] scsi4 : sata_sil24
[    3.621229] scsi5 : sata_sil24
[    3.621391] ata5: SATA max UDMA/100 host m128@0xfe8ffc00 port 0xfe8f8000 irq 16
[    3.621446] ata6: SATA max UDMA/100 host m128@0xfe8ffc00 port 0xfe8fa000 irq 16
[    3.914252] Switched to clocksource tsc
[    5.818090] ata5: SATA link up 3.0 Gbps (SStatus 123 SControl 0)
[    5.818389] ata5.15: Port Multiplier 1.1, 0x1095:0x3726 r23, 6 ports, feat 0x1/0x9
[    5.820283] ata5.00: hard resetting link
[    6.138331] ata5.00: SATA link up 3.0 Gbps (SStatus 123 SControl 320)
[    6.138411] ata5.01: hard resetting link
[    6.458331] ata5.01: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[    6.458412] ata5.02: hard resetting link
[    6.778330] ata5.02: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[    6.778409] ata5.03: hard resetting link
[    7.098325] ata5.03: SATA link down (SStatus 0 SControl 320)
[    7.098427] ata5.04: hard resetting link
[    7.418325] ata5.04: SATA link down (SStatus 0 SControl 320)
[    7.418427] ata5.05: hard resetting link
[    7.738323] ata5.05: SATA link up 1.5 Gbps (SStatus 113 SControl 320)
[    7.739205] ata5.00: ATA-8: ST2000DL003-9VT166, CC3C, max UDMA/133
[    7.739259] ata5.00: 3907029168 sectors, multi 16: LBA48 NCQ (depth 31/32)
[    7.740177] ata5.00: configured for UDMA/100
[    7.850059] ata5.01: failed to IDENTIFY (I/O error, err_mask=0x11)
[    7.850116] ata5.15: hard resetting link
[    7.850161] ata5: controller in dubious state, performing PORT_RST
[   10.082066] ata5.15: SATA link up 3.0 Gbps (SStatus 123 SControl 0)
[   10.082368] ata5.00: hard resetting link
[   10.402304] ata5.00: SATA link up 3.0 Gbps (SStatus 123 SControl 320)
[   11.458039] ata5.01: hard resetting link
[   11.778298] ata5.01: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[   11.778378] ata5.02: hard resetting link
[   12.098296] ata5.02: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[   12.738033] ata5.05: hard resetting link
[   13.058294] ata5.05: SATA link up 1.5 Gbps (SStatus 113 SControl 320)
[   13.060046] ata5.00: configured for UDMA/100
[   13.061547] ata5.01: ATA-8: ST500DM002-1BD142, KC45, max UDMA/133
[   13.061600] ata5.01: 976773168 sectors, multi 0: LBA48 NCQ (depth 31/32)
[   13.063284] ata5.01: configured for UDMA/100
[   13.174029] ata5.02: failed to IDENTIFY (I/O error, err_mask=0x11)
[   13.174086] ata5.15: hard resetting link
[   13.174131] ata5: controller in dubious state, performing PORT_RST
[   15.406036] ata5.15: SATA link up 3.0 Gbps (SStatus 123 SControl 0)
[   15.406340] ata5.00: hard resetting link
[   15.726279] ata5.00: SATA link up 3.0 Gbps (SStatus 123 SControl 320)
[   15.726359] ata5.01: hard resetting link
[   16.046275] ata5.01: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[   17.098013] ata5.02: hard resetting link
[   17.418267] ata5.02: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[   17.418347] ata5.03: hard resetting link
[   17.738266] ata5.03: SATA link down (SStatus 0 SControl 320)
[   17.738371] ata5.04: hard resetting link
[   18.058262] ata5.04: SATA link down (SStatus 0 SControl 320)
[   18.058364] ata5.05: hard resetting link
[   18.378264] ata5.05: SATA link up 1.5 Gbps (SStatus 113 SControl 320)
[   18.380014] ata5.00: configured for UDMA/100
[   18.383049] ata5.01: configured for UDMA/100
[   18.383981] ata5.02: ATA-8: ST2000DL003-9VT166, CC3C, max UDMA/133
[   18.384034] ata5.02: 3907029168 sectors, multi 0: LBA48 NCQ (depth 31/32)
[   18.385025] ata5.02: configured for UDMA/100
[   18.385136] ata5: EH complete
[   18.385504] scsi 4:0:0:0: Direct-Access     ATA      ST2000DL003-9VT1 CC3C PQ: 0 ANSI: 5
[   18.386442] scsi 4:1:0:0: Direct-Access     ATA      ST500DM002-1BD14 KC45 PQ: 0 ANSI: 5
[   18.387336] scsi 4:2:0:0: Direct-Access     ATA      ST2000DL003-9VT1 CC3C PQ: 0 ANSI: 5
[   20.586006] ata6: SATA link up 3.0 Gbps (SStatus 123 SControl 0)
[   20.586306] ata6.15: Port Multiplier 1.1, 0x1095:0x3726 r23, 6 ports, feat 0x1/0x9
[   20.588209] ata6.00: hard resetting link
[   20.906281] ata6.00: SATA link up 3.0 Gbps (SStatus 123 SControl 320)
[   20.906362] ata6.01: hard resetting link
[   21.226248] ata6.01: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[   21.226328] ata6.02: hard resetting link
[   21.546245] ata6.02: SATA link down (SStatus 0 SControl 320)
[   21.546348] ata6.03: hard resetting link
[   21.866245] ata6.03: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[   21.866326] ata6.04: hard resetting link
[   22.186243] ata6.04: SATA link down (SStatus 0 SControl 320)
[   22.186345] ata6.05: hard resetting link
[   22.506241] ata6.05: SATA link up 1.5 Gbps (SStatus 113 SControl 320)
[   22.507044] ata6.00: ATA-8: ST2000DM001-1CH164, CC26, max UDMA/133
[   22.507097] ata6.00: 3907029168 sectors, multi 16: LBA48 NCQ (depth 31/32)
[   22.526916] ata6.00: configured for UDMA/100
[   22.637976] ata6.01: failed to IDENTIFY (I/O error, err_mask=0x11)
[   22.638033] ata6.15: hard resetting link
[   22.638077] ata6: controller in dubious state, performing PORT_RST
[   24.869982] ata6.15: SATA link up 3.0 Gbps (SStatus 123 SControl 0)
[   24.870284] ata6.00: hard resetting link
[   25.190226] ata6.00: SATA link up 3.0 Gbps (SStatus 123 SControl 320)
[   26.225956] ata6.01: hard resetting link
[   26.546216] ata6.01: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[   26.865951] ata6.03: hard resetting link
[   27.186212] ata6.03: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[   27.505948] ata6.05: hard resetting link
[   27.826212] ata6.05: SATA link up 1.5 Gbps (SStatus 113 SControl 320)
[   27.828088] ata6.00: configured for UDMA/100
[   27.937960] ata6.01: failed to IDENTIFY (I/O error, err_mask=0x11)
[   27.938018] ata6.15: hard resetting link
[   27.938064] ata6: controller in dubious state, performing PORT_RST
[   30.169952] ata6.15: SATA link up 3.0 Gbps (SStatus 123 SControl 0)
[   30.170255] ata6.00: hard resetting link
[   30.490216] ata6.00: SATA link up 3.0 Gbps (SStatus 123 SControl 320)
[   31.545926] ata6.01: hard resetting link
[   31.866187] ata6.01: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[   31.866267] ata6.02: hard resetting link
[   32.186183] ata6.02: SATA link down (SStatus 0 SControl 320)
[   32.186286] ata6.03: hard resetting link
[   32.506184] ata6.03: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[   32.506265] ata6.04: hard resetting link
[   32.826180] ata6.04: SATA link down (SStatus 0 SControl 320)
[   32.826283] ata6.05: hard resetting link
[   33.146180] ata6.05: SATA link up 1.5 Gbps (SStatus 113 SControl 320)
[   33.148097] ata6.00: configured for UDMA/100
[   33.149175] ata6.01: ATA-8: ST2000DM001-1CH164, CC26, max UDMA/133
[   33.149232] ata6.01: 3907029168 sectors, multi 0: LBA48 NCQ (depth 31/32)
[   33.150077] ata6.01: configured for UDMA/100
[   33.150673] ata6.03: ATA-9: WDC WD30EZRX-00D8PB0, 80.00A80, max UDMA/133
[   33.151349] ata6.03: 5860533168 sectors, multi 0: LBA48 NCQ (depth 31/32)
[   33.151980] ata6.03: configured for UDMA/100
[   33.152091] ata6: EH complete
[   33.152449] scsi 5:0:0:0: Direct-Access     ATA      ST2000DM001-1CH1 CC26 PQ: 0 ANSI: 5
[   33.153351] scsi 5:1:0:0: Direct-Access     ATA      ST2000DM001-1CH1 CC26 PQ: 0 ANSI: 5
[   33.154294] scsi 5:3:0:0: Direct-Access     ATA      WDC WD30EZRX-00D 0A80 PQ: 0 ANSI: 5
[   33.159340] udevd[79]: starting version 175
[   33.279358] pps_core: LinuxPPS API ver. 1 registered
[   33.279447] pps_core: Software ver. 5.3.6 - Copyright 2005-2007 Rodolfo Giometti <giometti@linux.it>
[   33.281013] PTP clock support registered
[   33.288070] tg3.c:v3.137 (May 11, 2014)
[   33.305489] ACPI: bus type USB registered
[   33.305628] usbcore: registered new interface driver usbfs
[   33.305692] usbcore: registered new interface driver hub
[   33.334492] sd 2:0:0:0: [sdc] 625142448 512-byte logical blocks: (320 GB/298 GiB)
[   33.334600] sd 1:0:0:0: [sdb] 625134827 512-byte logical blocks: (320 GB/298 GiB)
[   33.334663] sd 4:0:0:0: [sde] 3907029168 512-byte logical blocks: (2.00 TB/1.81 TiB)
[   33.334713] sd 4:0:0:0: [sde] 4096-byte physical blocks
[   33.334769] sd 3:0:0:0: [sdd] 625134827 512-byte logical blocks: (320 GB/298 GiB)
[   33.334851] sd 0:0:0:0: [sda] 625134827 512-byte logical blocks: (320 GB/298 GiB)
[   33.335019] sd 1:0:0:0: [sdb] Write Protect is off
[   33.335071] sd 1:0:0:0: [sdb] Mode Sense: 00 3a 00 00
[   33.335654] sd 3:0:0:0: [sdd] Write Protect is off
[   33.335738] sd 3:0:0:0: [sdd] Mode Sense: 00 3a 00 00
[   33.335805] sd 0:0:0:0: [sda] Write Protect is off
[   33.335851] sd 0:0:0:0: [sda] Mode Sense: 00 3a 00 00
[   33.335931] sd 2:0:0:0: [sdc] Write Protect is off
[   33.335981] sd 2:0:0:0: [sdc] Mode Sense: 00 3a 00 00
[   33.335997] sd 4:0:0:0: [sde] Write Protect is off
[   33.336043] sd 4:0:0:0: [sde] Mode Sense: 00 3a 00 00
[   33.336072] sd 1:0:0:0: [sdb] Write cache: disabled, read cache: enabled, doesn't support DPO or FUA
[   33.336272] sd 0:0:0:0: [sda] Write cache: disabled, read cache: enabled, doesn't support DPO or FUA
[   33.336430] sd 4:0:0:0: [sde] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[   33.336549] sd 3:0:0:0: [sdd] Write cache: disabled, read cache: enabled, doesn't support DPO or FUA
[   33.336657] sd 2:0:0:0: [sdc] Write cache: disabled, read cache: enabled, doesn't support DPO or FUA
[   33.338151] usbcore: registered new device driver usb
[   33.338392] sd 4:1:0:0: [sdf] 976773168 512-byte logical blocks: (500 GB/465 GiB)
[   33.338448] sd 4:1:0:0: [sdf] 4096-byte physical blocks
[   33.338556] sd 4:1:0:0: [sdf] Write Protect is off
[   33.338602] sd 4:1:0:0: [sdf] Mode Sense: 00 3a 00 00
[   33.338631] sd 4:1:0:0: [sdf] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[   33.340427] ehci_hcd: USB 2.0 'Enhanced' Host Controller (EHCI) Driver
[   33.340790] ehci-pci: EHCI PCI platform driver
[   33.341526] ohci_hcd: USB 1.1 'Open' Host Controller (OHCI) Driver
[   33.341963] ohci-pci: OHCI PCI platform driver
[   33.343818]  sdf: unknown partition table
[   33.344181] sd 4:1:0:0: [sdf] Attached SCSI disk
[   33.344271] sd 4:2:0:0: [sdg] 3907029168 512-byte logical blocks: (2.00 TB/1.81 TiB)
[   33.344322] sd 4:2:0:0: [sdg] 4096-byte physical blocks
[   33.344424] sd 4:2:0:0: [sdg] Write Protect is off
[   33.344470] sd 4:2:0:0: [sdg] Mode Sense: 00 3a 00 00
[   33.344497] sd 4:2:0:0: [sdg] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[   33.348760]  sdc: sdc1 sdc2 sdc3
[   33.348907] scsi6 : pata_atiixp
[   33.349487] sd 2:0:0:0: [sdc] Attached SCSI disk
[   33.349615] sd 5:0:0:0: [sdh] 3907029168 512-byte logical blocks: (2.00 TB/1.81 TiB)
[   33.349674] sd 5:0:0:0: [sdh] 4096-byte physical blocks
[   33.349777] sd 5:0:0:0: [sdh] Write Protect is off
[   33.349892] sd 5:0:0:0: [sdh] Mode Sense: 00 3a 00 00
[   33.349928] sd 5:0:0:0: [sdh] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[   33.351467] sd 5:1:0:0: [sdi] 3907029168 512-byte logical blocks: (2.00 TB/1.81 TiB)
[   33.351562] sd 5:1:0:0: [sdi] 4096-byte physical blocks
[   33.351659] sd 5:1:0:0: [sdi] Write Protect is off
[   33.351707] sd 5:1:0:0: [sdi] Mode Sense: 00 3a 00 00
[   33.351731] sd 5:1:0:0: [sdi] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[   33.354852]  sdb: sdb1 sdb2 sdb3
[   33.355394] sd 1:0:0:0: [sdb] Attached SCSI disk
[   33.355517] sd 5:3:0:0: [sdj] 5860533168 512-byte logical blocks: (3.00 TB/2.72 TiB)
[   33.355570] sd 5:3:0:0: [sdj] 4096-byte physical blocks
[   33.355687] sd 5:3:0:0: [sdj] Write Protect is off
[   33.355746] sd 5:3:0:0: [sdj] Mode Sense: 00 3a 00 00
[   33.355774] sd 5:3:0:0: [sdj] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[   33.356077]  sdd: sdd1 sdd2 sdd3
[   33.356565] sd 3:0:0:0: [sdd] Attached SCSI disk
[   33.356772]  sde: unknown partition table
[   33.357024] sd 4:0:0:0: [sde] Attached SCSI disk
[   33.358014]  sda: sda1 sda2 sda3
[   33.358470] sd 0:0:0:0: [sda] Attached SCSI disk
[   33.360879] sd 0:0:0:0: Attached scsi generic sg0 type 0
[   33.361384] sd 1:0:0:0: Attached scsi generic sg1 type 0
[   33.361524] sd 2:0:0:0: Attached scsi generic sg2 type 0
[   33.362055] sd 3:0:0:0: Attached scsi generic sg3 type 0
[   33.362192] random: nonblocking pool is initialized
[   33.362992]  sdg: unknown partition table
[   33.363064] sd 4:0:0:0: Attached scsi generic sg4 type 0
[   33.363217] sd 4:1:0:0: Attached scsi generic sg5 type 0
[   33.363425] sd 4:2:0:0: [sdg] Attached SCSI disk
[   33.363684] sd 4:2:0:0: Attached scsi generic sg6 type 0
[   33.364163] sd 5:0:0:0: Attached scsi generic sg7 type 0
[   33.364262] scsi7 : pata_atiixp
[   33.364414] ata7: PATA max UDMA/100 cmd 0x1f0 ctl 0x3f6 bmdma 0xff00 irq 14
[   33.364482] ata8: PATA max UDMA/100 cmd 0x170 ctl 0x376 bmdma 0xff08 irq 15
[   33.364779] sd 5:1:0:0: Attached scsi generic sg8 type 0
[   33.364816] QUIRK: Enable AMD PLL fix
[   33.364856] ehci-pci 0000:00:12.2: EHCI Host Controller
[   33.364916] ehci-pci 0000:00:12.2: new USB bus registered, assigned bus number 1
[   33.364973] ehci-pci 0000:00:12.2: applying AMD SB700/SB800/Hudson-2/3 EHCI dummy qh workaround
[   33.365035] ehci-pci 0000:00:12.2: debug port 1
[   33.365143] ehci-pci 0000:00:12.2: irq 17, io mem 0xfe5ff800
[   33.365218] sd 5:3:0:0: Attached scsi generic sg9 type 0
[   33.372701] tg3 0000:03:00.0 eth0: Tigon3 [partno(BCM95723) rev 5784100] (PCI Express) MAC address d4:85:64:c9:e6:24
[   33.372767] tg3 0000:03:00.0 eth0: attached PHY is 5784 (10/100/1000Base-T Ethernet) (WireSpeed[1], EEE[0])
[   33.372820] tg3 0000:03:00.0 eth0: RXcsums[1] LinkChgREG[0] MIirq[0] ASF[0] TSOcap[1]
[   33.372870] tg3 0000:03:00.0 eth0: dma_rwctrl[76180000] dma_mask[64-bit]
[   33.373201]  sdh: unknown partition table
[   33.373555] sd 5:0:0:0: [sdh] Attached SCSI disk
[   33.373944] ehci-pci 0000:00:12.2: USB 2.0 started, EHCI 1.00
[   33.374071] usb usb1: New USB device found, idVendor=1d6b, idProduct=0002
[   33.374119] usb usb1: New USB device strings: Mfr=3, Product=2, SerialNumber=1
[   33.374169] usb usb1: Product: EHCI Host Controller
[   33.374214] usb usb1: Manufacturer: Linux 3.16.0-0.bpo.4-amd64 ehci_hcd
[   33.374260] usb usb1: SerialNumber: 0000:00:12.2
[   33.374691] hub 1-0:1.0: USB hub found
[   33.374781] hub 1-0:1.0: 5 ports detected
[   33.375334] ehci-pci 0000:00:13.2: EHCI Host Controller
[   33.375390] ehci-pci 0000:00:13.2: new USB bus registered, assigned bus number 2
[   33.375449] ehci-pci 0000:00:13.2: applying AMD SB700/SB800/Hudson-2/3 EHCI dummy qh workaround
[   33.375512] ehci-pci 0000:00:13.2: debug port 1
[   33.375607] ehci-pci 0000:00:13.2: irq 17, io mem 0xfe5ff400
[   33.376656]  sdi: unknown partition table
[   33.377076] sd 5:1:0:0: [sdi] Attached SCSI disk
[   33.385906] ehci-pci 0000:00:13.2: USB 2.0 started, EHCI 1.00
[   33.386167] usb usb2: New USB device found, idVendor=1d6b, idProduct=0002
[   33.386220] usb usb2: New USB device strings: Mfr=3, Product=2, SerialNumber=1
[   33.386270] usb usb2: Product: EHCI Host Controller
[   33.386316] usb usb2: Manufacturer: Linux 3.16.0-0.bpo.4-amd64 ehci_hcd
[   33.386362] usb usb2: SerialNumber: 0000:00:13.2
[   33.387005] hub 2-0:1.0: USB hub found
[   33.387265] hub 2-0:1.0: 5 ports detected
[   33.387879] ehci-pci 0000:00:16.2: EHCI Host Controller
[   33.387933] ehci-pci 0000:00:16.2: new USB bus registered, assigned bus number 3
[   33.387986] ehci-pci 0000:00:16.2: applying AMD SB700/SB800/Hudson-2/3 EHCI dummy qh workaround
[   33.388049] ehci-pci 0000:00:16.2: debug port 1
[   33.388139] ehci-pci 0000:00:16.2: irq 17, io mem 0xfe5ff000
[   33.397878] ehci-pci 0000:00:16.2: USB 2.0 started, EHCI 1.00
[   33.398037] usb usb3: New USB device found, idVendor=1d6b, idProduct=0002
[   33.398090] usb usb3: New USB device strings: Mfr=3, Product=2, SerialNumber=1
[   33.398142] usb usb3: Product: EHCI Host Controller
[   33.398188] usb usb3: Manufacturer: Linux 3.16.0-0.bpo.4-amd64 ehci_hcd
[   33.398234] usb usb3: SerialNumber: 0000:00:16.2
[   33.398551] hub 3-0:1.0: USB hub found
[   33.398615] hub 3-0:1.0: 4 ports detected
[   33.399512] ohci-pci 0000:00:12.0: OHCI PCI host controller
[   33.399567] ohci-pci 0000:00:12.0: new USB bus registered, assigned bus number 4
[   33.399666] ohci-pci 0000:00:12.0: irq 18, io mem 0xfe5fe000
[   33.458012] usb usb4: New USB device found, idVendor=1d6b, idProduct=0001
[   33.458069] usb usb4: New USB device strings: Mfr=3, Product=2, SerialNumber=1
[   33.458122] usb usb4: Product: OHCI PCI host controller
[   33.458167] usb usb4: Manufacturer: Linux 3.16.0-0.bpo.4-amd64 ohci_hcd
[   33.458214] usb usb4: SerialNumber: 0000:00:12.0
[   33.458798] hub 4-0:1.0: USB hub found
[   33.458864] hub 4-0:1.0: 5 ports detected
[   33.459403] ohci-pci 0000:00:13.0: OHCI PCI host controller
[   33.460079] ohci-pci 0000:00:13.0: new USB bus registered, assigned bus number 5
[   33.460173] ohci-pci 0000:00:13.0: irq 18, io mem 0xfe5fd000
[   33.517983] usb usb5: New USB device found, idVendor=1d6b, idProduct=0001
[   33.518039] usb usb5: New USB device strings: Mfr=3, Product=2, SerialNumber=1
[   33.518093] usb usb5: Product: OHCI PCI host controller
[   33.518142] usb usb5: Manufacturer: Linux 3.16.0-0.bpo.4-amd64 ohci_hcd
[   33.518212] usb usb5: SerialNumber: 0000:00:13.0
[   33.519228] hub 5-0:1.0: USB hub found
[   33.519291] hub 5-0:1.0: 5 ports detected
[   33.519871] ohci-pci 0000:00:16.0: OHCI PCI host controller
[   33.519946] ohci-pci 0000:00:16.0: new USB bus registered, assigned bus number 6
[   33.520048] ohci-pci 0000:00:16.0: irq 18, io mem 0xfe5fc000
[   33.577996] usb usb6: New USB device found, idVendor=1d6b, idProduct=0001
[   33.578054] usb usb6: New USB device strings: Mfr=3, Product=2, SerialNumber=1
[   33.578105] usb usb6: Product: OHCI PCI host controller
[   33.578151] usb usb6: Manufacturer: Linux 3.16.0-0.bpo.4-amd64 ohci_hcd
[   33.578198] usb usb6: SerialNumber: 0000:00:16.0
[   33.578570] hub 6-0:1.0: USB hub found
[   33.578635] hub 6-0:1.0: 4 ports detected
[   34.261707]  sdj: unknown partition table
[   34.262244] sd 5:3:0:0: [sdj] Attached SCSI disk

And with the new enclosure:

[    3.014000] SCSI subsystem initialized
[    3.016739] libata version 3.00 loaded.
[    3.018309] ahci 0000:00:11.0: version 3.0
[    3.018588] ahci 0000:00:11.0: irq 42 for MSI/MSI-X
[    3.018684] ahci 0000:00:11.0: AHCI 0001.0200 32 slots 4 ports 3 Gbps 0xf impl SATA mode
[    3.018742] ahci 0000:00:11.0: flags: 64bit ncq sntf ilck pm led clo pmp pio slum part 
[    3.019841] scsi0 : ahci
[    3.020241] scsi1 : ahci
[    3.020705] scsi2 : ahci
[    3.021119] scsi3 : ahci
[    3.021317] ata1: SATA max UDMA/133 abar m1024@0xfe5ffc00 port 0xfe5ffd00 irq 42
[    3.021372] ata2: SATA max UDMA/133 abar m1024@0xfe5ffc00 port 0xfe5ffd80 irq 42
[    3.021427] ata3: SATA max UDMA/133 abar m1024@0xfe5ffc00 port 0xfe5ffe00 irq 42
[    3.021479] ata4: SATA max UDMA/133 abar m1024@0xfe5ffc00 port 0xfe5ffe80 irq 42
[    3.509782] ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
[    3.513788] ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
[    3.513878] ata4: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
[    3.513965] ata3: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[    3.515180] ata3.00: ATA-8: ST3320418AS, CC38, max UDMA/133
[    3.515233] ata3.00: 625142448 sectors, multi 0: LBA48 NCQ (depth 31/32)
[    3.516651] ata3.00: configured for UDMA/133
[    3.554413] ata1.00: ATA-7: ST3320620AS, 3.AAC, max UDMA/133
[    3.554466] ata1.00: 625134827 sectors, multi 0: LBA48 NCQ (depth 31/32)
[    3.554593] ata2.00: ATA-7: ST3320620AS, 3.AAC, max UDMA/133
[    3.554639] ata2.00: 625134827 sectors, multi 0: LBA48 NCQ (depth 31/32)
[    3.557219] ata4.00: ATA-7: ST3320620AS, 3.AAC, max UDMA/133
[    3.557272] ata4.00: 625134827 sectors, multi 0: LBA48 NCQ (depth 31/32)
[    3.612719] ata1.00: configured for UDMA/133
[    3.612949] ata2.00: configured for UDMA/133
[    3.613000] scsi 0:0:0:0: Direct-Access     ATA      ST3320620AS      C    PQ: 0 ANSI: 5
[    3.613707] scsi 1:0:0:0: Direct-Access     ATA      ST3320620AS      C    PQ: 0 ANSI: 5
[    3.614554] scsi 2:0:0:0: Direct-Access     ATA      ST3320418AS      CC38 PQ: 0 ANSI: 5
[    3.615496] ata4.00: configured for UDMA/133
[    3.615739] scsi 3:0:0:0: Direct-Access     ATA      ST3320620AS      C    PQ: 0 ANSI: 5
[    3.619004] sata_sil24 0000:02:00.0: version 1.1
[    3.619834] scsi4 : sata_sil24
[    3.620237] scsi5 : sata_sil24
[    3.620414] ata5: SATA max UDMA/100 host m128@0xfe8ffc00 port 0xfe8f8000 irq 16
[    3.620468] ata6: SATA max UDMA/100 host m128@0xfe8ffc00 port 0xfe8fa000 irq 16
[    3.921925] Switched to clocksource tsc
[    5.697760] ata5: SATA link down (SStatus 0 SControl 0)
[    7.961748] ata6: SATA link up 3.0 Gbps (SStatus 123 SControl 0)
[    8.116807] ata6.15: Port Multiplier 1.2, 0x197b:0x575f r0, 15 ports, feat 0x5/0xf
[    8.119113] ata6.00: hard resetting link
[    8.454063] ata6.00: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
[    8.454167] ata6.01: hard resetting link
[    8.822059] ata6.01: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
[    8.822162] ata6.02: hard resetting link
[    9.174057] ata6.02: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
[    9.174160] ata6.03: hard resetting link
[    9.510064] ata6.03: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
[    9.510168] ata6.04: hard resetting link
[    9.846061] ata6.04: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
[    9.846165] ata6.05: hard resetting link
[   10.182059] ata6.05: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
[   10.182164] ata6.06: hard resetting link
[   10.518057] ata6.06: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
[   10.518162] ata6.07: hard resetting link
[   10.838099] ata6.07: SATA link down (SStatus 0 SControl 320)
[   10.838219] ata6.08: hard resetting link
[   11.158095] ata6.08: SATA link down (SStatus 0 SControl 320)
[   11.158213] ata6.09: hard resetting link
[   11.478093] ata6.09: SATA link down (SStatus 0 SControl 320)
[   11.478212] ata6.10: hard resetting link
[   11.798094] ata6.10: SATA link down (SStatus 0 SControl 320)
[   11.798214] ata6.11: hard resetting link
[   12.118088] ata6.11: SATA link down (SStatus 0 SControl 320)
[   12.118208] ata6.12: hard resetting link
[   12.438091] ata6.12: SATA link down (SStatus 0 SControl 320)
[   12.438209] ata6.13: hard resetting link
[   12.758087] ata6.13: SATA link down (SStatus 0 SControl 320)
[   12.758206] ata6.14: hard resetting link
[   13.078082] ata6.14: SATA link down (SStatus 0 SControl 320)
[   13.078746] ata6.00: ATA-9: WDC WD30EZRX-00D8PB0, 80.00A80, max UDMA/133
[   13.078802] ata6.00: 5860533168 sectors, multi 16: LBA48 NCQ (depth 31/32)
[   13.079454] ata6.00: configured for UDMA/100
[   13.083304] ata6.01: ATA-8: ST2000DM001-1CH164, CC26, max UDMA/133
[   13.083357] ata6.01: 3907029168 sectors, multi 0: LBA48 NCQ (depth 31/32)
[   17.852068] ata6.01: configured for UDMA/100
[   17.855904] ata6.02: ATA-8: ST2000DM001-1CH164, CC26, max UDMA/133
[   17.855957] ata6.02: 3907029168 sectors, multi 0: LBA48 NCQ (depth 31/32)
[   22.590353] ata6.02: configured for UDMA/100
[   22.596782] ata6.03: ATA-8: ST2000DL003-9VT166, CC3C, max UDMA/133
[   22.596838] ata6.03: 3907029168 sectors, multi 0: LBA48 NCQ (depth 31/32)
[   22.614183] ata6.03: configured for UDMA/100
[   22.616206] ata6.04: ATA-8: ST500DM002-1BD142, KC45, max UDMA/133
[   22.616259] ata6.04: 976773168 sectors, multi 0: LBA48 NCQ (depth 31/32)
[   22.640259] ata6.04: configured for UDMA/100
[   22.646640] ata6.05: ATA-8: ST2000DL003-9VT166, CC3C, max UDMA/133
[   22.646693] ata6.05: 3907029168 sectors, multi 0: LBA48 NCQ (depth 31/32)
[   22.664004] ata6.05: configured for UDMA/100
[   22.664550] ata6.06: ATA-9: WDC WD30EZRX-00D8PB0, 80.00A80, max UDMA/133
[   22.664603] ata6.06: 5860533168 sectors, multi 0: LBA48 NCQ (depth 31/32)
[   22.665186] ata6.06: configured for UDMA/100
[   22.665437] ata6: EH complete
[   22.665865] scsi 5:0:0:0: Direct-Access     ATA      WDC WD30EZRX-00D 0A80 PQ: 0 ANSI: 5
[   22.666964] scsi 5:1:0:0: Direct-Access     ATA      ST2000DM001-1CH1 CC26 PQ: 0 ANSI: 5
[   22.668009] scsi 5:2:0:0: Direct-Access     ATA      ST2000DM001-1CH1 CC26 PQ: 0 ANSI: 5
[   22.668589] scsi 5:3:0:0: Direct-Access     ATA      ST2000DL003-9VT1 CC3C PQ: 0 ANSI: 5
[   22.669046] scsi 5:4:0:0: Direct-Access     ATA      ST500DM002-1BD14 KC45 PQ: 0 ANSI: 5
[   22.669519] scsi 5:5:0:0: Direct-Access     ATA      ST2000DL003-9VT1 CC3C PQ: 0 ANSI: 5
[   22.669972] scsi 5:6:0:0: Direct-Access     ATA      WDC WD30EZRX-00D 0A80 PQ: 0 ANSI: 5
[   22.674753] udevd[79]: starting version 175
[   22.718011] pps_core: LinuxPPS API ver. 1 registered
[   22.718099] pps_core: Software ver. 5.3.6 - Copyright 2005-2007 Rodolfo Giometti <giometti@linux.it>
[   22.718474] PTP clock support registered
[   22.723563] tg3.c:v3.137 (May 11, 2014)
[   22.779806] tg3 0000:03:00.0 eth0: Tigon3 [partno(BCM95723) rev 5784100] (PCI Express) MAC address d4:85:64:c9:e6:24
[   22.779903] tg3 0000:03:00.0 eth0: attached PHY is 5784 (10/100/1000Base-T Ethernet) (WireSpeed[1], EEE[0])
[   22.779956] tg3 0000:03:00.0 eth0: RXcsums[1] LinkChgREG[0] MIirq[0] ASF[0] TSOcap[1]
[   22.780009] tg3 0000:03:00.0 eth0: dma_rwctrl[76180000] dma_mask[64-bit]
[   22.781303] ACPI: bus type USB registered
[   22.781448] usbcore: registered new interface driver usbfs
[   22.781515] usbcore: registered new interface driver hub
[   22.808386] usbcore: registered new device driver usb
[   22.809764] ehci_hcd: USB 2.0 'Enhanced' Host Controller (EHCI) Driver
[   22.810453] ohci_hcd: USB 1.1 'Open' Host Controller (OHCI) Driver
[   22.810686] ehci-pci: EHCI PCI platform driver
[   22.810872] ohci-pci: OHCI PCI platform driver
[   22.823167] scsi6 : pata_atiixp
[   22.832627] sd 1:0:0:0: [sdb] 625134827 512-byte logical blocks: (320 GB/298 GiB)
[   22.832785] sd 1:0:0:0: [sdb] Write Protect is off
[   22.832834] sd 1:0:0:0: [sdb] Mode Sense: 00 3a 00 00
[   22.832863] sd 1:0:0:0: [sdb] Write cache: disabled, read cache: enabled, doesn't support DPO or FUA
[   22.833349] sd 2:0:0:0: [sdc] 625142448 512-byte logical blocks: (320 GB/298 GiB)
[   22.833402] sd 5:0:0:0: [sde] 5860533168 512-byte logical blocks: (3.00 TB/2.72 TiB)
[   22.833404] sd 5:0:0:0: [sde] 4096-byte physical blocks
[   22.833417] sd 3:0:0:0: [sdd] 625134827 512-byte logical blocks: (320 GB/298 GiB)
[   22.833504] sd 5:0:0:0: [sde] Write Protect is off
[   22.833507] sd 5:0:0:0: [sde] Mode Sense: 00 3a 00 00
[   22.833517] sd 3:0:0:0: [sdd] Write Protect is off
[   22.833520] sd 3:0:0:0: [sdd] Mode Sense: 00 3a 00 00
[   22.833564] sd 5:0:0:0: [sde] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[   22.833642] sd 3:0:0:0: [sdd] Write cache: disabled, read cache: enabled, doesn't support DPO or FUA
[   22.833777] sd 0:0:0:0: [sda] 625134827 512-byte logical blocks: (320 GB/298 GiB)
[   22.834163] sd 0:0:0:0: [sda] Write Protect is off
[   22.834212] sd 0:0:0:0: [sda] Mode Sense: 00 3a 00 00
[   22.834241] sd 0:0:0:0: [sda] Write cache: disabled, read cache: enabled, doesn't support DPO or FUA
[   22.835054] sd 2:0:0:0: [sdc] Write Protect is off
[   22.835101] sd 2:0:0:0: [sdc] Mode Sense: 00 3a 00 00
[   22.835130] sd 2:0:0:0: [sdc] Write cache: disabled, read cache: enabled, doesn't support DPO or FUA
[   22.845520]  sdc: sdc1 sdc2 sdc3
[   22.846184] sd 2:0:0:0: [sdc] Attached SCSI disk
[   22.846305] sd 5:1:0:0: [sdf] 3907029168 512-byte logical blocks: (2.00 TB/1.81 TiB)
[   22.846356] sd 5:1:0:0: [sdf] 4096-byte physical blocks
[   22.846455] sd 5:1:0:0: [sdf] Write Protect is off
[   22.846501] sd 5:1:0:0: [sdf] Mode Sense: 00 3a 00 00
[   22.846528] sd 5:1:0:0: [sdf] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[   22.850755] sd 5:2:0:0: [sdg] 3907029168 512-byte logical blocks: (2.00 TB/1.81 TiB)
[   22.850849] sd 5:2:0:0: [sdg] 4096-byte physical blocks
[   22.850953] sd 5:2:0:0: [sdg] Write Protect is off
[   22.850999] sd 5:2:0:0: [sdg] Mode Sense: 00 3a 00 00
[   22.851026] sd 5:2:0:0: [sdg] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[   22.853444]  sda: sda1 sda2 sda3
[   22.854098]  sdb: sdb1 sdb2 sdb3
[   22.854559] sd 1:0:0:0: [sdb] Attached SCSI disk
[   22.854682] sd 5:3:0:0: [sdh] 3907029168 512-byte logical blocks: (2.00 TB/1.81 TiB)
[   22.854738] sd 5:3:0:0: [sdh] 4096-byte physical blocks
[   22.854832] sd 0:0:0:0: [sda] Attached SCSI disk
[   22.854955] sd 5:4:0:0: [sdi] 976773168 512-byte logical blocks: (500 GB/465 GiB)
[   22.855008] sd 5:4:0:0: [sdi] 4096-byte physical blocks
[   22.855069] sd 5:3:0:0: [sdh] Write Protect is off
[   22.855118] sd 5:3:0:0: [sdh] Mode Sense: 00 3a 00 00
[   22.855174] sd 5:3:0:0: [sdh] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[   22.855303]  sdf: unknown partition table
[   22.855502] sd 5:4:0:0: [sdi] Write Protect is off
[   22.855554] sd 5:4:0:0: [sdi] Mode Sense: 00 3a 00 00
[   22.855666] sd 5:4:0:0: [sdi] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[   22.855884] sd 5:1:0:0: [sdf] Attached SCSI disk
[   22.856103] sd 5:5:0:0: [sdj] 3907029168 512-byte logical blocks: (2.00 TB/1.81 TiB)
[   22.856154] sd 5:5:0:0: [sdj] 4096-byte physical blocks
[   22.856354]  sdd: sdd1 sdd2 sdd3
[   22.856791] sd 3:0:0:0: [sdd] Attached SCSI disk
[   22.856903] sd 5:6:0:0: [sdk] 5860533168 512-byte logical blocks: (3.00 TB/2.72 TiB)
[   22.856954] sd 5:6:0:0: [sdk] 4096-byte physical blocks
[   22.857051] sd 5:6:0:0: [sdk] Write Protect is off
[   22.857097] sd 5:6:0:0: [sdk] Mode Sense: 00 3a 00 00
[   22.857123] sd 5:6:0:0: [sdk] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[   22.857453] sd 5:5:0:0: [sdj] Write Protect is off
[   22.857504] sd 5:5:0:0: [sdj] Mode Sense: 00 3a 00 00
[   22.857542] sd 5:5:0:0: [sdj] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[   22.861699] sd 0:0:0:0: Attached scsi generic sg0 type 0
[   22.864256] scsi7 : pata_atiixp
[   22.864427] ata7: PATA max UDMA/100 cmd 0x1f0 ctl 0x3f6 bmdma 0xff00 irq 14
[   22.864479] ata8: PATA max UDMA/100 cmd 0x170 ctl 0x376 bmdma 0xff08 irq 15
[   22.864823] QUIRK: Enable AMD PLL fix
[   22.864866] ehci-pci 0000:00:12.2: EHCI Host Controller
[   22.864921] ehci-pci 0000:00:12.2: new USB bus registered, assigned bus number 1
[   22.864979] ehci-pci 0000:00:12.2: applying AMD SB700/SB800/Hudson-2/3 EHCI dummy qh workaround
[   22.865041] ehci-pci 0000:00:12.2: debug port 1
[   22.865148] ehci-pci 0000:00:12.2: irq 17, io mem 0xfe5ff800
[   22.865331]  sdg: unknown partition table
[   22.865675] sd 5:2:0:0: [sdg] Attached SCSI disk
[   22.868644] sd 1:0:0:0: Attached scsi generic sg1 type 0
[   22.869290] sd 2:0:0:0: Attached scsi generic sg2 type 0
[   22.869628] sd 3:0:0:0: Attached scsi generic sg3 type 0
[   22.871473] sd 5:0:0:0: Attached scsi generic sg4 type 0
[   22.872143] sd 5:1:0:0: Attached scsi generic sg5 type 0
[   22.872544] sd 5:2:0:0: Attached scsi generic sg6 type 0
[   22.872889] sd 5:3:0:0: Attached scsi generic sg7 type 0
[   22.873228] sd 5:4:0:0: Attached scsi generic sg8 type 0
[   22.873605] ehci-pci 0000:00:12.2: USB 2.0 started, EHCI 1.00
[   22.873738] usb usb1: New USB device found, idVendor=1d6b, idProduct=0002
[   22.873786] usb usb1: New USB device strings: Mfr=3, Product=2, SerialNumber=1
[   22.873836] usb usb1: Product: EHCI Host Controller
[   22.873881] usb usb1: Manufacturer: Linux 3.16.0-0.bpo.4-amd64 ehci_hcd
[   22.873927] usb usb1: SerialNumber: 0000:00:12.2
[   22.874176] hub 1-0:1.0: USB hub found
[   22.874237] hub 1-0:1.0: 5 ports detected
[   22.874799] ehci-pci 0000:00:13.2: EHCI Host Controller
[   22.874853] ehci-pci 0000:00:13.2: new USB bus registered, assigned bus number 2
[   22.874908] ehci-pci 0000:00:13.2: applying AMD SB700/SB800/Hudson-2/3 EHCI dummy qh workaround
[   22.874970] ehci-pci 0000:00:13.2: debug port 1
[   22.875060] ehci-pci 0000:00:13.2: irq 17, io mem 0xfe5ff400
[   22.875304] sd 5:5:0:0: Attached scsi generic sg9 type 0
[   22.875734] sd 5:6:0:0: Attached scsi generic sg10 type 0
[   22.885688] ehci-pci 0000:00:13.2: USB 2.0 started, EHCI 1.00
[   22.885832] usb usb2: New USB device found, idVendor=1d6b, idProduct=0002
[   22.885885] usb usb2: New USB device strings: Mfr=3, Product=2, SerialNumber=1
[   22.885935] usb usb2: Product: EHCI Host Controller
[   22.885981] usb usb2: Manufacturer: Linux 3.16.0-0.bpo.4-amd64 ehci_hcd
[   22.886028] usb usb2: SerialNumber: 0000:00:13.2
[   22.886291] hub 2-0:1.0: USB hub found
[   22.886350] hub 2-0:1.0: 5 ports detected
[   22.886918] ehci-pci 0000:00:16.2: EHCI Host Controller
[   22.886972] ehci-pci 0000:00:16.2: new USB bus registered, assigned bus number 3
[   22.887028] ehci-pci 0000:00:16.2: applying AMD SB700/SB800/Hudson-2/3 EHCI dummy qh workaround
[   22.887090] ehci-pci 0000:00:16.2: debug port 1
[   22.887185] ehci-pci 0000:00:16.2: irq 17, io mem 0xfe5ff000
[   22.897569] ehci-pci 0000:00:16.2: USB 2.0 started, EHCI 1.00
[   22.897730] usb usb3: New USB device found, idVendor=1d6b, idProduct=0002
[   22.897783] usb usb3: New USB device strings: Mfr=3, Product=2, SerialNumber=1
[   22.897836] usb usb3: Product: EHCI Host Controller
[   22.897883] usb usb3: Manufacturer: Linux 3.16.0-0.bpo.4-amd64 ehci_hcd
[   22.897932] usb usb3: SerialNumber: 0000:00:16.2
[   22.898214] hub 3-0:1.0: USB hub found
[   22.898272] hub 3-0:1.0: 4 ports detected
[   22.898847] ohci-pci 0000:00:12.0: OHCI PCI host controller
[   22.898914] ohci-pci 0000:00:12.0: new USB bus registered, assigned bus number 4
[   22.899017] ohci-pci 0000:00:12.0: irq 18, io mem 0xfe5fe000
[   22.957744] usb usb4: New USB device found, idVendor=1d6b, idProduct=0001
[   22.957823] usb usb4: New USB device strings: Mfr=3, Product=2, SerialNumber=1
[   22.957874] usb usb4: Product: OHCI PCI host controller
[   22.957920] usb usb4: Manufacturer: Linux 3.16.0-0.bpo.4-amd64 ohci_hcd
[   22.957966] usb usb4: SerialNumber: 0000:00:12.0
[   22.958434] hub 4-0:1.0: USB hub found
[   22.958495] hub 4-0:1.0: 5 ports detected
[   22.959085] ohci-pci 0000:00:13.0: OHCI PCI host controller
[   22.959142] ohci-pci 0000:00:13.0: new USB bus registered, assigned bus number 5
[   22.959237] ohci-pci 0000:00:13.0: irq 18, io mem 0xfe5fd000
[   22.972654] random: nonblocking pool is initialized
[   23.018494] usb usb5: New USB device found, idVendor=1d6b, idProduct=0001
[   23.018554] usb usb5: New USB device strings: Mfr=3, Product=2, SerialNumber=1
[   23.018605] usb usb5: Product: OHCI PCI host controller
[   23.018650] usb usb5: Manufacturer: Linux 3.16.0-0.bpo.4-amd64 ohci_hcd
[   23.018698] usb usb5: SerialNumber: 0000:00:13.0
[   23.019240] hub 5-0:1.0: USB hub found
[   23.019342] hub 5-0:1.0: 5 ports detected
[   23.019877] ohci-pci 0000:00:16.0: OHCI PCI host controller
[   23.019935] ohci-pci 0000:00:16.0: new USB bus registered, assigned bus number 6
[   23.020023] ohci-pci 0000:00:16.0: irq 18, io mem 0xfe5fc000
[   23.081765] usb usb6: New USB device found, idVendor=1d6b, idProduct=0001
[   23.081821] usb usb6: New USB device strings: Mfr=3, Product=2, SerialNumber=1
[   23.081871] usb usb6: Product: OHCI PCI host controller
[   23.081917] usb usb6: Manufacturer: Linux 3.16.0-0.bpo.4-amd64 ohci_hcd
[   23.081962] usb usb6: SerialNumber: 0000:00:16.0
[   23.082344] hub 6-0:1.0: USB hub found
[   23.082408] hub 6-0:1.0: 4 ports detected
[   23.785682] usb 4-2: new low-speed USB device number 2 using ohci-pci
[   23.967768] usb 4-2: New USB device found, idVendor=04d9, idProduct=1503
[   23.967824] usb 4-2: New USB device strings: Mfr=1, Product=2, SerialNumber=0
[   23.967871] usb 4-2: Product: USB Keyboard
[   23.967915] usb 4-2: Manufacturer:  
[   23.973497] hidraw: raw HID events driver (C) Jiri Kosina
[   24.003112] usbcore: registered new interface driver usbhid
[   24.003200] usbhid: USB HID core driver
[   24.006819] input:   USB Keyboard as /devices/pci0000:00/0000:00:12.0/usb4/4-2/4-2:1.0/0003:04D9:1503.0001/input/input0
[   24.006968] hid-generic 0003:04D9:1503.0001: input,hidraw0: USB HID v1.10 Keyboard [  USB Keyboard] on usb-0000:00:12.0-2/input0
[   24.011935] input:   USB Keyboard as /devices/pci0000:00/0000:00:12.0/usb4/4-2/4-2:1.1/0003:04D9:1503.0002/input/input1
[   24.012130] hid-generic 0003:04D9:1503.0002: input,hidraw1: USB HID v1.10 Device [  USB Keyboard] on usb-0000:00:12.0-2/input1
[   24.233688] usb 4-5: new full-speed USB device number 3 using ohci-pci
[   24.400767] usb 4-5: New USB device found, idVendor=067b, idProduct=2303
[   24.400821] usb 4-5: New USB device strings: Mfr=1, Product=2, SerialNumber=0
[   24.400869] usb 4-5: Product: USB-Serial Controller
[   24.400914] usb 4-5: Manufacturer: Prolific Technology Inc.
[   30.567918]  sdh: unknown partition table
[   30.568200]  sdj: unknown partition table
[   30.568393]  sdi: unknown partition table
[   30.569631] sd 5:4:0:0: [sdi] Attached SCSI disk
[   30.570218]  sde: unknown partition table
[   30.570481]  sdk: unknown partition table
[   30.571689] sd 5:5:0:0: [sdj] Attached SCSI disk
[   30.571875] sd 5:6:0:0: [sdk] Attached SCSI disk
[   30.572689] sd 5:3:0:0: [sdh] Attached SCSI disk
[   30.572778] sd 5:0:0:0: [sde] Attached SCSI disk

What’s my btrfs doing? And how do I recover from it?

I’ve been experimenting with btrfs on my home file server for a while. Yes, I know it’s not particularly bleeding edge or anything any more but I’m quite conservative even for just my household’s stuff as restoring from backup would be quite tedious.

Briefly, the btrfs volume was initially across four 2TB disks in RAID10 for data and metadata. At a later date I also added a 500G disk but had never rebalanced so that had no data on it.

$ sudo btrfs filesystem show
Label: 'tank'  uuid: 472ee2b3-4dc3-4fc1-80bc-5ba967069ceb
        Total devices 5 FS bytes used 1.08TB
        devid    1 size 1.82TB used 555.03GB path /dev/sdh
        devid    3 size 1.82TB used 555.03GB path /dev/sdi
        devid    4 size 1.82TB used 555.03GB path /dev/sdj
        devid    5 size 465.76GB used 0.00 path /dev/sdk
        devid    2 size 1.82TB used 555.03GB path /dev/sdg
 
Btrfs v0.20-rc1-358-g194aa4a
$ sudo btrfs filesystem df /srv/tank
Data, RAID10: total=1.08TB, used=1.08TB
System, RAID10: total=64.00MB, used=128.00KB
System: total=4.00MB, used=0.00
Metadata, RAID10: total=2.52GB, used=1.34GB

Yesterday, one of the disks started misbehaving:

Aug  7 12:17:32 specialbrew kernel: [5392685.363089] ata5.00: failed to read SCR 1 (Emask=0x40)
Aug  7 12:17:32 specialbrew kernel: [5392685.369272] ata5.01: failed to read SCR 1 (Emask=0x40)
Aug  7 12:17:32 specialbrew kernel: [5392685.375651] ata5.02: failed to read SCR 1 (Emask=0x40)
Aug  7 12:17:32 specialbrew kernel: [5392685.381796] ata5.03: failed to read SCR 1 (Emask=0x40)
Aug  7 12:17:32 specialbrew kernel: [5392685.388082] ata5.04: failed to read SCR 1 (Emask=0x40)
Aug  7 12:17:32 specialbrew kernel: [5392685.394213] ata5.05: failed to read SCR 1 (Emask=0x40)
Aug  7 12:17:32 specialbrew kernel: [5392685.400213] ata5.15: exception Emask 0x100 SAct 0x0 SErr 0x0 action 0x6 frozen
Aug  7 12:17:32 specialbrew kernel: [5392685.406556] ata5.15: irq_stat 0x00060002, PMP DMA CS errata
Aug  7 12:17:32 specialbrew kernel: [5392685.412787] ata5.00: exception Emask 0x100 SAct 0x0 SErr 0x0 action 0x6 frozen
Aug  7 12:17:32 specialbrew kernel: [5392685.419286] ata5.00: failed command: WRITE DMA
Aug  7 12:17:32 specialbrew kernel: [5392685.425504] ata5.00: cmd ca/00:08:56:06:a1/00:00:00:00:00/e0 tag 1 dma 4096 out
Aug  7 12:17:32 specialbrew kernel: [5392685.425504]          res 9a/d7:00:00:00:00/00:00:00:10:9a/00 Emask 0x2 (HSM violation)
Aug  7 12:17:32 specialbrew kernel: [5392685.438350] ata5.00: status: { Busy }
Aug  7 12:17:32 specialbrew kernel: [5392685.444592] ata5.00: error: { ICRC UNC IDNF ABRT }
Aug  7 12:17:32 specialbrew kernel: [5392685.451016] ata5.01: exception Emask 0x100 SAct 0x0 SErr 0x0 action 0x6 frozen
Aug  7 12:17:32 specialbrew kernel: [5392685.457334] ata5.01: failed command: WRITE DMA
Aug  7 12:17:32 specialbrew kernel: [5392685.463784] ata5.01: cmd ca/00:18:de:67:9c/00:00:00:00:00/e0 tag 0 dma 12288 out
Aug  7 12:17:32 specialbrew kernel: [5392685.463784]          res 9a/d7:00:00:00:00/00:00:00:00:9a/00 Emask 0x2 (HSM violation)
.
.
(lots more of that)
.
.
Aug  7 12:17:53 specialbrew kernel: [5392706.325072] btrfs: bdev /dev/sdh errs: wr 9, rd 0, flush 0, corrupt 0, gen 0
Aug  7 12:17:53 specialbrew kernel: [5392706.325228] btrfs: bdev /dev/sdh errs: wr 10, rd 0, flush 0, corrupt 0, gen 0
Aug  7 12:17:53 specialbrew kernel: [5392706.339976] sd 4:3:0:0: [sdh] Stopping disk
Aug  7 12:17:53 specialbrew kernel: [5392706.346436] sd 4:3:0:0: [sdh] START_STOP FAILED
Aug  7 12:17:53 specialbrew kernel: [5392706.352944] sd 4:3:0:0: [sdh]  
Aug  7 12:17:53 specialbrew kernel: [5392706.356489] end_request: I/O error, dev sdh, sector 0
Aug  7 12:17:53 specialbrew kernel: [5392706.365413] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
Aug  7 12:17:53 specialbrew kernel: [5392706.475838] lost page write due to I/O error on /dev/sdh
Aug  7 12:17:53 specialbrew kernel: [5392706.482266] lost page write due to I/O error on /dev/sdh
Aug  7 12:17:53 specialbrew kernel: [5392706.488496] lost page write due to I/O error on /dev/sdh

After that point, /dev/sdh no longer existed on the system.

Okay, so then I told btrfs to forget about that device:

$ sudo btrfs device delete missing /srv/tank
$ sudo btrfs filesystem show
Label: 'tank'  uuid: 472ee2b3-4dc3-4fc1-80bc-5ba967069ceb
        Total devices 5 FS bytes used 1.08TB
        devid    3 size 1.82TB used 555.03GB path /dev/sdi
        devid    4 size 1.82TB used 555.03GB path /dev/sdj
        devid    5 size 465.76GB used 0.00 path /dev/sdk
        devid    2 size 1.82TB used 555.03GB path /dev/sdg
        *** Some devices missing
 
Btrfs v0.20-rc1-358-g194aa4a

Apart from the obvious fact that a device was then missing, things seemed happier at this point. I decided to pull the disk and re-insert it to see if it still gave errors (it’s in a hot swap chassis). After plugging the disk back in it pops up as /dev/sdl and rejoins the volume:

$ sudo btrfs filesystem show
Label: 'tank'  uuid: 472ee2b3-4dc3-4fc1-80bc-5ba967069ceb
        Total devices 5 FS bytes used 1.08TB
        devid    1 size 1.82TB used 555.04GB path /dev/sdl
        devid    3 size 1.82TB used 555.03GB path /dev/sdi
        devid    4 size 1.82TB used 555.03GB path /dev/sdj
        devid    5 size 465.76GB used 0.00 path /dev/sdk
        devid    2 size 1.82TB used 555.03GB path /dev/sdg
 
Btrfs v0.20-rc1-358-g194aa4a

…but the disk is still very unhappy:

Aug  7 17:46:46 specialbrew kernel: [5412439.946138] sd 4:3:0:0: [sdl] 3907029168 512-byte logical blocks: (2.00 TB/1.81 TiB)
Aug  7 17:46:46 specialbrew kernel: [5412439.946142] sd 4:3:0:0: [sdl] 4096-byte physical blocks
Aug  7 17:46:46 specialbrew kernel: [5412439.946247] sd 4:3:0:0: [sdl] Write Protect is off
Aug  7 17:46:46 specialbrew kernel: [5412439.946252] sd 4:3:0:0: [sdl] Mode Sense: 00 3a 00 00
Aug  7 17:46:46 specialbrew kernel: [5412439.946294] sd 4:3:0:0: [sdl] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
Aug  7 17:46:46 specialbrew kernel: [5412439.952286]  sdl: unknown partition table
Aug  7 17:46:46 specialbrew kernel: [5412439.990436] sd 4:3:0:0: [sdl] Attached SCSI disk
Aug  7 17:46:47 specialbrew kernel: [5412440.471412] btrfs: device label tank devid 1 transid 504721 /dev/sdl
Aug  7 17:47:17 specialbrew kernel: [5412470.408079] btrfs: bdev /dev/sdl errs: wr 7464, rd 0, flush 332, corrupt 0, gen 0
Aug  7 17:47:17 specialbrew kernel: [5412470.415931] lost page write due to I/O error on /dev/sdl

Okay. So by then I was prepared to accept that this disk was toast and I just wanted it gone. How to achieve this?

Given that data was still being read off this disk okay (confirmed by dd, iostat), I thought maybe the clever thing to do would be to tell btrfs to delete this disk while it was still part of the volume.

According to the documentation this would rebalance data off of the device to the other devices (still plenty of capacity available for two copies of everything even with one disk missing). That way the period of time where there was a risk of double disk failure leading to data loss would be avoided.

$ sudo btrfs device delete /dev/sdl /srv/tank

*twiddle thumbs*

Nope, still going.

Hmm, what is it doing?

$ sudo btrfs filesystem show
Label: 'tank'  uuid: 472ee2b3-4dc3-4fc1-80bc-5ba967069ceb
        Total devices 5 FS bytes used 1.08TB
        devid    1 size 1.82TB used 555.04GB path /dev/sdl
        devid    3 size 1.82TB used 556.03GB path /dev/sdi
        devid    4 size 1.82TB used 556.03GB path /dev/sdj
        devid    5 size 465.76GB used 26.00GB path /dev/sdk
        devid    2 size 1.82TB used 556.03GB path /dev/sdg

Seems that it’s written 26GB of data to sdk (previously unused), and a little to some of the others. I’ll guess that it’s using sdk to rebalance onto, and doing so at a rate of about 1GB per minute. So in around 555 minutes this should finish and sdl will be removed, and I can eject the disk and later insert a good one?

Well, it’s now quite a few hours later and sdk is now full, but the btrfs device delete still hasn’t finished, and in fact iostat believes that writes are still taking place to all disks in the volume apart from sdl:

$ sudo iostat -x -d 5 sd{g,i,j,k,l}
Linux 3.13-0.bpo.1-amd64 (specialbrew.localnet)         08/08/14        _x86_64_        (2 CPU)
 
Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
sdg               6.50     0.89    2.49    1.60    54.30   136.42    93.31     0.43  105.19   73.77  154.12   1.63   0.67
sdk               0.00     0.79    0.00    0.89     0.02    97.93   218.89     0.08   91.43    5.69   91.79   5.70   0.51
sdj               2.26     1.10    0.79    1.38    65.45   136.39   185.57     0.19   86.94   46.38  110.20   5.17   1.12
sdi               8.27     1.34    3.39    1.21    88.11   136.39    97.55     0.60  130.79   46.89  365.87   2.72   1.25
sdl               0.24     0.00    0.01    0.00     1.00     0.00   255.37     0.00    1.40    1.40    0.00   1.08   0.00
 
Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
sdg               0.00     0.00    0.00   87.20     0.00  4202.40    96.39     0.64    7.39    0.00    7.39   4.43  38.64
sdk               0.00     0.20    0.00  102.40     0.00  3701.60    72.30     2.40   23.38    0.00   23.38   8.63  88.40
sdj               0.00     0.00    0.00   87.20     0.00  4202.40    96.39     0.98   11.28    0.00   11.28   5.20  45.36
sdi               0.00     0.20    0.00  118.00     0.00  4200.80    71.20     1.21   10.24    0.00   10.24   4.45  52.56
sdl               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
$ sudo btrfs filesystem show
Label: 'tank'  uuid: 472ee2b3-4dc3-4fc1-80bc-5ba967069ceb
        Total devices 5 FS bytes used 1.08TB
        devid    1 size 1.82TB used 555.04GB path /dev/sdl
        devid    3 size 1.82TB used 555.29GB path /dev/sdi
        devid    4 size 1.82TB used 555.29GB path /dev/sdj
        devid    5 size 465.76GB used 465.76GB path /dev/sdk
        devid    2 size 1.82TB used 555.29GB path /dev/sdg
 
Btrfs v0.20-rc1-358-g194aa4a

Worse still, btrfs thinks it’s out of space:

$ touch foo
touch: cannot touch `foo': No space left on device
$ df -h .
Filesystem      Size  Used Avail Use% Mounted on
-               7.8T  2.2T  5.2T  30% /srv/tank/backups

So, that’s a bit alarming.

I don’t think that btrfs device delete is ever going to finish. I think what I probably should have done is just forcibly yanked sdl and then done btrfs device delete missing, and put up with the window of possible double disk failure.

But what’s done is done and now I need to recover from this.

Should I ctrl-c the btrfs device delete? If I do that and the machine is still responsive, should I then yank sdl?

I have one spare disk slot into which I could place the new disk when it arrives, without rebooting or interrupting anything. I assume that will then register as sdm and I could add it to the btrfs volume. Would the rebalancing then start using that and complete, thus allowing me to yank sdl?

Some input from anyone who’s actually been through this would be appreciated!

Update 2014-08-12 ^

It’s all okay again now. Here’s a quick summary for those who just want to know what I did:

  • Asked for some advice from Hugo, who knows a lot more about btrfs than me!
  • Found I could not ctrl-c the device delete and had to reboot.
  • Discovered I could mount the volume with -oro,degraded,recovery, i.e. read-only. It couldn’t be mounted read-write at this stage.
  • Took a complete local backup of the 1.08TiB of data via the read-only mount onto one of the new 3TB disks that had arrived on the Friday.
  • Made a bug report against the Linux kernel for the fact that mount -odegraded,recovery would go into deadlock.
  • Compiled the latest mainline kernel from source using the instructions in the Debian Linux Kernel Handbook. After booting into it mount -odegraded,recovery worked and I had a read-write volume again.
  • Compiled a new btrfs-tools.
  • Inserted one of the new 3TB disks and did a btrfs replace start /dev/sdj /dev/sdl /srv/tank in order to replace the smallest 500GB device (sdj) with the new 3TB device (sdl).
  • Once that was complete, did btrfs filesystem resize 5:max /srv/tank in order to let btrfs know to use the entirety of the device with id 5 (sdl, the new 3TB disk).
  • Did a btrfs balance start -v -dconvert=raid1,soft -mconvert=raid1,soft /srv/tank to convert everything from RAID-10 to RAID-1 so as to be more flexible in future with different-sized devices.
  • Finally btrfs device delete missing /srv/tank to return the volume to non-degraded state.
$ sudo btrfs filesystem show
Label: 'tank'  uuid: 472ee2b3-4dc3-4fc1-80bc-5ba967069ceb
        Total devices 4 FS bytes used 1.09TiB
        devid    2 size 1.82TiB used 372.03GiB path /dev/sdg
        devid    3 size 1.82TiB used 373.00GiB path /dev/sdh
        devid    4 size 1.82TiB used 372.00GiB path /dev/sdi
        devid    5 size 2.73TiB used 1.09TiB path /dev/sdl
 
Btrfs v3.14.2

A more detailed account of the escapade follows, with some chat logs between Hugo and I thrown in to help people’s web searching.

A plan is hatched ^

<grifferz> according to iostat it's writing quite a lot to all four
           disks, and doing no reading at all
<grifferz> but it is also constantly saying
<grifferz> Aug  8 06:48:28 specialbrew kernel: [5459343.262187] btrfs:
           bdev /dev/sdl errs: wr 122021062, rd 0, flush 74622, corrupt
           0, gen 0
<darkling> OK, reading further, I don't think you'll be able to ^C the
           dev delete.
<darkling> So at this point, it's probably a forcible reboot (as polite
           as you can make it, but...)
<darkling> Take the dead disk out before the OS has a chance to see it.
<grifferz> if I waited and did nothing at all until the new disk
           arrives, if I insert it and add it to the volume do you think
           it will recover?
<darkling> This is then the point at which you run into the other
           problem, which is that you've got a small disk in there with
           4 devices on a RAID-10.
<grifferz> if adding the new disk would allow the dev delete to
           complete, presumably I could then do another dev delete for
           the 500G disk
<darkling> No, dev delete is going to fall over on the corrupt sections
           of the device.
<darkling> I wouldn't recommend using it in this case (unless it's dev
           delete missing)
<grifferz> so you would suggest to reboot, yank sdl, hopefully get up
           and running with a missing device, do dev delete missing,
           insert replacement disk, rebalance?
<darkling> It's kind of a known problem. We probably need a "device
           shoot-in-the-head" for cases where the data can't be
           recovered from a device.
<darkling> Yes.
<darkling> With the small device in the array, it might pay to do the
           dev delete missing *after* adding the new disk.
<grifferz> what problems is the 500G disk going to cause me?
<grifferz> apart from this one that I am having now I suppose :)
<darkling> Well, RAID-10 requires four devices, and will write to all
           four equally.
<darkling> So the array fills up when the smallest device is full.
<darkling> (If you have 4 devices)
<darkling> Have a play with http://carfax.org.uk/btrfs-usage/ to see
           the effects.
<grifferz> is that why it now thinks it is full because I had four 2T
           disks and a 500G one and I tried to delete one of the 2T
           ones?
<darkling> Yes.
<grifferz> ah, it's a shame it couldn't warn me of that, and also a
           shame that if I added a new 2T one (which I can probably do
           today) it won't fix itself
<darkling> I generally recommend using RAID-1 rather than RAID-10 if you
           have unequal-sized disks. It behaves rather better for space
           usage.
<grifferz> I bet I can't convert RAID-10 to RAID-1 can I? :)
<darkling> Of course you can. :)
<darkling> btrfs balance start -dconvert=raid1,soft
           -mconvert=raid1,soft /
<grifferz> oh, that's handy. I saw balance had dconvert and mconvert to
           raid1 but I thought that would only be from no redundancy
<darkling> No, it's free conversion between any RAID level.
<grifferz> nice
<grifferz> well, thanks for that, at least I have some sort of plan now.
           I may be in touch again if reboot ends up with a volume that
           won't mount! :)

Disaster! ^

In which it doesn’t mount, and then it only mounts read-only.

fuuuuuuuuuuuuuuuuuuuuuu

<grifferz> oh dear, I hit a problem! after boot it won't mount:
<grifferz> # mount /srv/tank
<grifferz> Aug  8 19:05:37 specialbrew kernel: [  426.358894] BTRFS:
           device label tank devid 5 transid 798058 /dev/sdj
<grifferz> Aug  8 19:05:37 specialbrew kernel: [  426.372031] BTRFS
           info (device sdj): disk space caching is enabled
<grifferz> Aug  8 19:05:37 specialbrew kernel: [  426.379825] BTRFS:
           failed to read the system array on sdj
<grifferz> Aug  8 19:05:37 specialbrew kernel: [  426.403095] BTRFS:
           open_ctree failed
<grifferz> mount: wrong fs type, bad option, bad superblock on
           /dev/sdj,
<grifferz> googling around but it seems like quite a generic message
<darkling> Was sdj the device that failed earlier?
<grifferz> no it was sdl (which used to be sdh)
<darkling> OK.
<grifferz> # btrfs fi sh
<grifferz> Label: 'tank'  uuid: 472ee2b3-4dc3-4fc1-80bc-5ba967069ceb
<grifferz>  Total devices 5 FS bytes used 1.08TB
<grifferz>  devid    5 size 465.76GB used 465.76GB path /dev/sdj
<grifferz>  devid    3 size 1.82TB used 555.29GB path /dev/sdh
<grifferz>  devid    4 size 1.82TB used 555.29GB path /dev/sdi
<grifferz>  devid    2 size 1.82TB used 555.29GB path /dev/sdg
<grifferz>  *** Some devices missing
<grifferz> (now)
<grifferz> perhaps ask it to do it via one of the other disks as sdj
           is now the small one?
<darkling> Yeah.
<darkling> Just what I was going to suggest. :)
<grifferz> even when specifying another disk it still says "failed to
           read the system array on sdj"
<darkling> But, with that error, it's not looking very happy. :(
<darkling> What kernel was this on?
<grifferz> it was on 3.13-0 from debian wheezy backports but since I
           rebooted it booted into 3.14-0.bpo.2-amd64
<grifferz> I can try going back to 3.13-0
<darkling> 3.14's probably better to stay with.
<darkling> Just checking it wasn't something antique.
<grifferz> I could also plug that failing disk back in and remove sdj.
           it probably still has enough life to be read from
<darkling> Well, first, what does btrfs check say about the FS?
<darkling> Also try each drive, with -s1 or -s2
<grifferz> check running on sdj, hasn't immediately aborted…
<darkling> Ooh, OK, that's good.
# btrfs check /dev/sdj
Aug  8 19:13:15 specialbrew kernel: [  884.840987] BTRFS: device label tank devid 2 transid 798058 /dev/sdg
Aug  8 19:13:15 specialbrew kernel: [  885.058896] BTRFS: device label tank devid 4 transid 798058 /dev/sdi
Aug  8 19:13:15 specialbrew kernel: [  885.091042] BTRFS: device label tank devid 3 transid 798058 /dev/sdh
Aug  8 19:13:15 specialbrew kernel: [  885.097790] BTRFS: device label tank devid 5 transid 798058 /dev/sdj
Aug  8 19:13:15 specialbrew kernel: [  885.129491] BTRFS: device label tank devid 2 transid 798058 /dev/sdg
Aug  8 19:13:15 specialbrew kernel: [  885.137456] BTRFS: device label tank devid 4 transid 798058 /dev/sdi
Aug  8 19:13:15 specialbrew kernel: [  885.145731] BTRFS: device label tank devid 3 transid 798058 /dev/sdh
Aug  8 19:13:16 specialbrew kernel: [  885.151907] BTRFS: device label tank devid 5 transid 798058 /dev/sdj
warning, device 1 is missing
warning, device 1 is missing
warning devid 1 not found already
Checking filesystem on /dev/sdj
UUID: 472ee2b3-4dc3-4fc1-80bc-5ba967069ceb
checking extents
checking free space cache
checking fs roots
checking csums
checking root refs
found 49947638987 bytes used err is 0
total csum bytes: 1160389912
total tree bytes: 1439944704
total fs tree bytes: 150958080
total extent tree bytes: 55762944
btree space waste bytes: 69500665
file data blocks allocated: 1570420359168
 referenced 1568123219968
Btrfs v0.20-rc1-358-g194aa4a
<grifferz> it doesn't seem to have complained. shall I give mounting
           another try, or fsck again from another disk?
<darkling> Hmm. Odd that it's complaining about the system array, then.
<darkling> That check you just did is read-only, so it won't have
           changed anything.
<grifferz> doing the fsck with another device gives identical output
<grifferz> and no, I still can't mount it
<darkling> Oooh, hang on.
<darkling> Try with -odegraded
<grifferz> # mount -odegraded /srv/tank
<grifferz> Aug  8 19:20:58 specialbrew kernel: [ 1347.388182] BTRFS:
           device label tank devid 5 transid 798058 /dev/sdj
<grifferz> Aug  8 19:20:58 specialbrew kernel: [ 1347.628728] BTRFS
           info (device sdj): allowing degraded mounts
<grifferz> Aug  8 19:20:58 specialbrew kernel: [ 1347.633978] BTRFS
           info (device sdj): disk space caching is enabled
<grifferz> Aug  8 19:20:58 specialbrew kernel: [ 1347.725065] BTRFS:
           bdev (null) errs: wr 122025014, rd 0, flush 293476, corrupt
           0, gen 0
<grifferz> Aug  8 19:20:58 specialbrew kernel: [ 1347.730473] BTRFS:
           bdev /dev/sdg errs: wr 3, rd 8, flush 0, corrupt 0, gen 0
<grifferz> prompt not returned yet
<darkling> OK, that's probably good.
<grifferz> bit worrying it says it has an error on another disk though!
<darkling> Those are cumulative over the lifetime of the FS.
<darkling> Wouldn't worry about it too much.
<grifferz> right okay, some of those happened the other day when the
           whole bus was resetting
<grifferz> prompt still not returned :(
<darkling> Bugger...
<grifferz> yeah iostat's not showing any disk activity although the
           rest of the system still works
<darkling> Anything in syslog?
<grifferz> no that was the extent of the syslog messages, except for a
           hung task warning just now but that is for the mount and
           for btrs-transactiblah
<darkling> How about -oro,recovery,degraded?
<darkling> You'll probably have to reboot first, though.
<grifferz> I can't ctrl-c that mount so should I try that in another
           window or reboot and try it?
<grifferz> probably best to reboot I suppose
<grifferz> I suspect the problem's here though:
<grifferz> Aug  8 19:26:33 specialbrew kernel: [ 1682.538282]
           [<ffffffffa02f1610>] ? open_ctree+0x20a0/0x20a0 [btrfs]
<darkling> Yeah, open_ctree is a horrible giant 1000-line function.
<darkling> Almost every mount problem shows up in there, because
           that's where it's used.
<grifferz> hey that appears to have worked!
<darkling> Cool.
<grifferz> but it doesn't say anything useful in the syslog
<grifferz> so I worry that trying it normally will still fail
<darkling> Now unmount and try the same thing without the ro option.
<darkling> Once that works, you'll have to use -odegraded to mount the
           degraded FS until the new disk arrives,
<darkling> or simply balance to RAID-1 immediately, and then balance
           again when you get the new disk.
<grifferz> that mount command hasn't returned :(
<darkling> That's -odegraded,recovery ?
<grifferz> I think I will put the new disk in and take a copy of all my
           data from the read-only mount
<grifferz> and yes that is correct
<darkling> OK, might be worth doing one or both of upgrading to 3.16
           and reporting to bugzilla.kernel.org
<darkling> You could also take a btrfs-image -c9 -t4 of the filesystem
           (while not mounted), just in case someone (josef) wants to
           look at it.

A bug report was duly filed.

A new kernel, and some success. ^

It’s been a long time since I bothered to compile a kernel. I remember it as being quite tedious. Happily the procedure is now really quite easy. It basically amounted to:

$ wget -qO - https://www.kernel.org/pub/linux/kernel/v3.x/linux-3.16.tar.xz | xzcat | tar xvf -
$ cd linux-3.16
$ cp /boot/config-3.14-0.bpo.2-amd64 .config
$ make oldconfig
(press Return a lot)
$ make deb-pkg
# dpkg -i ../linux-image-3.16.0_3.16.0-1_amd64.deb

That procedure is documented in the Debian Linux Kernel Handbook.

I wasn’t expecting this to make any difference, but it did! When booted into this kernel I was able to do:

# mount -odegraded,recovery /srv/tank
# umount /srv/tank
# mount -odegraded /srv/tank

and end up with a read-write, working volume.

There were no interesting syslog messages.

Thankfully from this point on the volume was fully read-write, so even though a fair bit of work was still needed I could put it back in service and no further reboots were required.

<grifferz> oh, that's interesting. after a short delay, mounting
           -orecovery,degraded on 3.16 does actually work. it appears!
<darkling> \o/
<grifferz> do I need to unmount it and remount it with just -odegraded 
           now?
<darkling> Yes, that should work.
<grifferz> and then I can put the new disk in, add it to the volume,
           rebalance it, remove the small 500G disk, convert to raid-1?
<darkling> A bit faster to use btrfs dev replace to switch out the
           small device for the new one.
<darkling> Then btrfs dev resize n:max /mountpoint for the n that's
           the new device.
<darkling> Then restripe to RAID-1.
<grifferz> right, great, it's mounted with just -odegraded
<grifferz> so: 1) insert new disk, 2) "dev replace" the 500G one for
           this new device?
<darkling> Yes.
<darkling> That will leave the new device with an FS size of 500G, so
           you need to resize it.
<darkling> (Same idea as resizing the partition but not the ext* FS on
           it)
<darkling> The resize should take a few ms. :)
<grifferz> I don't seem to have a "btrfs device replace" command. do I
           need to build a new btrfs-progs?
<darkling> What --version is it?
<darkling> (Probably build a new one, yes)
<grifferz> Btrfs v0.20-rc1-358-g194aa4a
<darkling> Yeah, that's old enough you're mising some features.
<grifferz> ah, it's not "btrfs device replace" it's just "btrfs
           replace …" I've built v3.14.2 now

So that was:

$ sudo btrfs replace start /dev/sdj /dev/sdl /srv/tank

after carefully confirming that /dev/sdj really was the 500G disk and /dev/sdl really was the new 3TB disk I just inserted (several of the device names change throughout this post as disks are ejected and inserted!).

<darkling> Oh, OK. Seems like a slightly odd place to put it. :(
<darkling> The userspace tools are a bit of a mess, from a UI point of
           view.
<darkling> I'm currently struggling with several problems with btrfs
           sub list, for example.
<grifferz> heh: $ sudo btrfs replace status /srv/tank
<grifferz> 0.4% done, 0 write errs, 0 uncorr. read errs
<darkling> Look on the bright side: it's way faster than two balances.
<grifferz> won't this still leave me with a volume that it thinks has a
           device missing though?
<darkling> Yes, but if you're going to remove the small device, this is
           still probably the fastest approach.
<grifferz> after it's finished with the replace and I've done the
           resize, will a "device delete" of the small one leave it
           satisfied?
<darkling> Once the replace has finished, the small device should no
           longer be a part of the FS at all.
<grifferz> oh yeah
<grifferz> surely it should be happy at that point then, with 4 equal
           sized devices?
<darkling> You might want to run wipefs or a shitload of /dev/zero
           with dd over it, just to make sure. (Bonus points for doing
           it from orbit. ;) )
<darkling> The replace is a byte-for-byte replacement of the device.
<darkling> So if you were degraded before that, you're degraded after
           it.
<grifferz> right but after the replace and resize then?
<darkling> The resize just tells the FS that there's more space it can
           use -- it's a trivial O(1) operation.
<grifferz> what will I need to do to make it happy that there aren't
           any missing devices then?
<darkling> An ordinary balance. (Or a balance with -dconvert=raid1 if
           you want to go that way)
<grifferz> I do ultimately. In which case do you think there is any
           reason to do the balances separately?
<darkling> No reason at all.
<grifferz> righto :)

The replace finishes:

Started on 11.Aug 20:52:05, finished on 11.Aug 22:29:54, 0 write errs, 0 uncorr. read errs

It turns out wipefs wasn’t necessary; I did it with -n anyway just to see if it would find anything, but it didn’t.

Time to do the balance/convert.

<grifferz> $ sudo btrfs balance start -v -dconvert=raid1,soft
           -mconvert=raid1,soft /srv/tank
<grifferz> Dumping filters: flags 0x7, state 0x0, force is off
<grifferz>   DATA (flags 0x300): converting, target=16, soft is on
<grifferz>   METADATA (flags 0x300): converting, target=16, soft is on
<grifferz>   SYSTEM (flags 0x300): converting, target=16, soft is on
<grifferz> fingers crossed :)
<grifferz> I am a bit concerned that syslog is mentioning sdj which is
           no longer part of the volume (it was the smallest disk)
<grifferz> Aug 11 22:45:23 specialbrew kernel: [10551.595830] BTRFS
           info (device sdj): found 18 extents
<grifferz> for example
<grifferz> and btrfs fi sh confirms that sdj is not there any more
<grifferz> well I think it is just confused because iostat does not
           think it's touching sdj any more
<grifferz> hah, balance/convert complete, but:
<grifferz> $ sudo btrfs fi sh
<grifferz> Label: 'tank'  uuid: 472ee2b3-4dc3-4fc1-80bc-5ba967069ceb
<grifferz>         Total devices 5 FS bytes used 1.09TiB
<grifferz>         devid    2 size 1.82TiB used 372.03GiB path /dev/sdg
<grifferz>         devid    3 size 1.82TiB used 373.00GiB path /dev/sdh
<grifferz>         devid    4 size 1.82TiB used 372.00GiB path /dev/sdi
<grifferz>         devid    5 size 2.73TiB used 1.09TiB path /dev/sdl
<grifferz>         *** Some devices missing
<grifferz> Btrfs v3.14.2
<grifferz> so now half my data is on sdl, the rest is split between
           three, and it still thinks something is missing!
<darkling> btrfs dev del missing /mountpoint
<grifferz> aha!
<darkling> And the way that the allocator works is to keep the amount
           of free space as even as possible -- that maximises the
           usage of the FS.
<grifferz> that was it :)
$ sudo btrfs filesystem show
Label: 'tank'  uuid: 472ee2b3-4dc3-4fc1-80bc-5ba967069ceb
        Total devices 4 FS bytes used 1.09TiB
        devid    2 size 1.82TiB used 372.03GiB path /dev/sdg
        devid    3 size 1.82TiB used 373.00GiB path /dev/sdh
        devid    4 size 1.82TiB used 372.00GiB path /dev/sdi
        devid    5 size 2.73TiB used 1.09TiB path /dev/sdl
 
Btrfs v3.14.2

Phew!

everything went better than expected