pvaneynd ([personal profile] pvaneynd) wrote2014-07-04 08:38 am

A tale of four disks

I had a long string of problems with our server at home... It all started when a disk in my RAID 10 array failed: one morning I noticed that the system had hung.
After a reload I saw:
Jun 30 01:52:42 frost kernel: [53418.854028] ata7: link is slow to respond, please be patient (ready=0)
Jun 30 01:52:46 frost kernel: [53423.502944] ata7: COMRESET failed (errno=-16)
Jun 30 01:52:46 frost kernel: [53423.503326] ata7: hard resetting link

Thinking "it's OK, I got RAID 10 just for this: to survive the loss of a disk", I waited for the filesystem to mount... and waited... and waited...
Once I got access, it turned out that mount was blocked in the kernel:
Jun 30 12:24:24 frost kernel: [26939.464996] mount           D ffff88041ed42150     0  6693   6203 0x00000004
Jun 30 12:24:24 frost kernel: [26939.465554]  ffff88041ed42150 0000000000000082 0000000000012d00 ffff88041ed42150
Jun 30 12:24:24 frost kernel: [26939.466121]  ffff8800a8b4bfd8 ffff880428135470 ffff88043e212d00 ffff88043e5e6ea8
Jun 30 12:24:24 frost kernel: [26939.466708]  ffff8800a8b4b8a0 ffffffff810b86fc 0000000000000002 000000000000000e
Jun 30 12:24:24 frost kernel: [26939.467276] Call Trace:
Jun 30 12:24:24 frost kernel: [26939.467838]  [] ? wait_on_page_read+0x32/0x32
Jun 30 12:24:24 frost kernel: [26939.468406]  [] ? io_schedule+0x53/0x70
Jun 30 12:24:24 frost kernel: [26939.468976]  [] ? sleep_on_page+0x5/0x8
Jun 30 12:24:24 frost kernel: [26939.469543]  [] ? __wait_on_bit+0x3e/0x70
Jun 30 12:24:24 frost kernel: [26939.470110]  [] ? wait_on_page_bit+0x80/0x8c
Jun 30 12:24:24 frost kernel: [26939.470713]  [] ? autoremove_wake_function+0x2a/0x2a
Jun 30 12:24:24 frost kernel: [26939.471285]  [] ? filemap_fdatawait_range+0xaa/0x102
Jun 30 12:24:24 frost kernel: [26939.471862]  [] ? filemap_fdatawait_range+0xaa/0x102
Jun 30 12:24:24 frost kernel: [26939.472442]  [] ? btrfs_wait_ordered_range+0x61/0xfd [btrfs]
Jun 30 12:24:24 frost kernel: [26939.473026]  [] ? __btrfs_write_out_cache+0x469/0x670 [btrfs]
Jun 30 12:24:24 frost kernel: [26939.473611]  [] ? btrfs_write_out_cache+0x7e/0xaf [btrfs]
Jun 30 12:24:24 frost kernel: [26939.474197]  [] ? btrfs_write_dirty_block_groups+0x4b2/0x4ef [btrfs]
Jun 30 12:24:24 frost kernel: [26939.474811]  [] ? commit_cowonly_roots+0x167/0x20c [btrfs]
Jun 30 12:24:24 frost kernel: [26939.475408]  [] ? btrfs_commit_transaction+0x416/0x874 [btrfs]
Jun 30 12:24:24 frost kernel: [26939.476005]  [] ? btrfs_recover_log_trees+0x2ad/0x308 [btrfs]
Jun 30 12:24:24 frost kernel: [26939.476596]  [] ? replay_one_extent+0x4a8/0x4a8 [btrfs]
Jun 30 12:24:24 frost kernel: [26939.477188]  [] ? open_ctree+0x16ca/0x1a68 [btrfs]
Jun 30 12:24:24 frost kernel: [26939.477781]  [] ? btrfs_mount+0x34f/0x6ce [btrfs]
Jun 30 12:24:24 frost kernel: [26939.478372]  [] ? cpumask_next+0x16/0x17
Jun 30 12:24:24 frost kernel: [26939.478968]  [] ? alloc_pages_current+0xc0/0xda
Jun 30 12:24:24 frost kernel: [26939.479568]  [] ? mount_fs+0x60/0x141
Jun 30 12:24:24 frost kernel: [26939.480167]  [] ? vfs_kern_mount+0x60/0xd5
Jun 30 12:24:24 frost kernel: [26939.480765]  [] ? do_mount+0x700/0x7e6
Jun 30 12:24:24 frost kernel: [26939.481356]  [] ? SyS_mount+0x7e/0xb7
Jun 30 12:24:24 frost kernel: [26939.481940]  [] ? system_call_fastpath+0x1a/0x1f

Turns out that if you have lost a disk, you need to mount the btrfs filesystem with the option "-o degraded"; otherwise it will just hang.
After this the filesystem mounted. So I went and bought a new disk and tried to replace the broken disk with it. This failed with errors like
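For reference, the degraded mount looks roughly like this; the device name and mountpoint are placeholders, not my actual setup:

```shell
# Mount a btrfs array that has lost a device; without "degraded"
# the mount will block, as described above.
mount -o degraded /dev/sdb1 /mnt/data

# Until the array is repaired you can also put the option in
# /etc/fstab, e.g.:
# UUID=<fs-uuid>  /mnt/data  btrfs  degraded  0  0
```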
parent transid verify failed on 9089576292352 wanted 1335517 found 1295746

or
BTRFS error (device sda1): unable to find ref byte nr 9124686925824 parent 0 root 10028  owner 1 offset 0
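The replacement attempts were along these lines; the devid and device names are illustrative guesses, not the exact commands I ran:

```shell
# Replace the dead device (devid 4 here is a placeholder) with
# the new disk, copying data from the surviving mirrors:
btrfs replace start 4 /dev/sde1 /mnt/data

# Or the fallback: add the new disk, then drop the missing one.
btrfs device add /dev/sde1 /mnt/data
btrfs device delete missing /mnt/data
```

Both routes died with the "parent transid verify failed" / "unable to find ref" errors shown above.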

With my 3.15 kernel it OOPSed; with the standard Debian 3.14 kernel it sort of ran. I investigated the problem in more detail and, after a bit of fiddling, could restart the failed disk. Still, I could neither replace the disk nor add the new disk to the broken array.
A scrub says that all is well, while a btrfsck says there are errors it cannot fix (!!). The irony is that all the data seems to be there and works. So I had all my data on a filesystem with unfixable, festering corruption...
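The contradictory diagnostics came from the two standard checks, roughly (paths are placeholders):

```shell
# Online scrub: verifies checksums on the mounted filesystem
# and, in my case, reported no problems.
btrfs scrub start /mnt/data
btrfs scrub status /mnt/data

# Offline check against the unmounted device: read-only by
# default, it reports errors without attempting repairs --
# and here it found errors it could not fix.
btrfs check /dev/sdb1
```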
So in the end I fixed it with the help of reliable tools like rsync and the magical:
root@frost:~# zfs list
NAME             USED  AVAIL  REFER  MOUNTPOINT
mypool          2.42T  2.04T   136K  none
mypool/Media    2.28T  2.04T  2.28T  /Media
mypool/Storage  52.9G  2.04T  51.2G  /Storage
mypool/home     92.5G  2.04T  92.5G  /home

root@frost:~# zpool status
  pool: mypool
 state: ONLINE
  scan: resilvered 2.42T in 7h45m with 0 errors on Fri Jul  4 03:08:43 2014
config:

        NAME                                          STATE     READ WRITE CKSUM
        mypool                                        ONLINE       0     0     0
          mirror-0                                    ONLINE       0     0     0
            ata-WDC_WD30EFRX-68EUZN0_WD-WMC4N2670271  ONLINE       0     0     0
            ata-WDC_WD30EFRX-68EUZN0_WD-WMC4N1783424  ONLINE       0     0     0
          mirror-1                                    ONLINE       0     0     0
            ata-WDC_WD30EFRX-68EUZN0_WD-WMC4N0839245  ONLINE       0     0     0
            ata-WDC_WD20EARX-00PASB0_WD-WCAZAE074784  ONLINE       0     0     0

errors: No known data errors

ZFS on Linux for the win!

[personal profile] rbarclay 2014-07-05 09:49 pm (UTC)(link)
So, you failed with one rather new, and rather untried, technology, and thus moved on to another rather new (on Linux) and rather untried (on Linux) technology.

I think I'm gonna entrust my actual keep-worthy data to mdadm+ext3/4 for another year. Or three ;)