Entry tags:
- life
- opensource
- storage
- zfs
A tale of four disks
I've had a long string of problems with our server at home...
It all started when a disk in my RAID 10 array failed; in the morning I noticed that the system had hung.
After a reboot I saw:
Jun 30 01:52:42 frost kernel: [53418.854028] ata7: link is slow to respond, please be patient (ready=0)
Jun 30 01:52:46 frost kernel: [53423.502944] ata7: COMRESET failed (errno=-16)
Jun 30 01:52:46 frost kernel: [53423.503326] ata7: hard resetting link
Thinking "It's OK, I got RAID 10 exactly for this: to survive the loss of a disk", I waited for the filesystem to mount... and waited... and waited...
Once I got access, it turned out that mount was blocked in the kernel:
Jun 30 12:24:24 frost kernel: [26939.464996] mount  D ffff88041ed42150  0  6693  6203 0x00000004
Jun 30 12:24:24 frost kernel: [26939.465554]  ffff88041ed42150 0000000000000082 0000000000012d00 ffff88041ed42150
Jun 30 12:24:24 frost kernel: [26939.466121]  ffff8800a8b4bfd8 ffff880428135470 ffff88043e212d00 ffff88043e5e6ea8
Jun 30 12:24:24 frost kernel: [26939.466708]  ffff8800a8b4b8a0 ffffffff810b86fc 0000000000000002 000000000000000e
Jun 30 12:24:24 frost kernel: [26939.467276] Call Trace:
Jun 30 12:24:24 frost kernel: [26939.467838]  [] ? wait_on_page_read+0x32/0x32
Jun 30 12:24:24 frost kernel: [26939.468406]  [] ? io_schedule+0x53/0x70
Jun 30 12:24:24 frost kernel: [26939.468976]  [] ? sleep_on_page+0x5/0x8
Jun 30 12:24:24 frost kernel: [26939.469543]  [] ? __wait_on_bit+0x3e/0x70
Jun 30 12:24:24 frost kernel: [26939.470110]  [] ? wait_on_page_bit+0x80/0x8c
Jun 30 12:24:24 frost kernel: [26939.470713]  [] ? autoremove_wake_function+0x2a/0x2a
Jun 30 12:24:24 frost kernel: [26939.471285]  [] ? filemap_fdatawait_range+0xaa/0x102
Jun 30 12:24:24 frost kernel: [26939.471862]  [] ? filemap_fdatawait_range+0xaa/0x102
Jun 30 12:24:24 frost kernel: [26939.472442]  [] ? btrfs_wait_ordered_range+0x61/0xfd [btrfs]
Jun 30 12:24:24 frost kernel: [26939.473026]  [] ? __btrfs_write_out_cache+0x469/0x670 [btrfs]
Jun 30 12:24:24 frost kernel: [26939.473611]  [] ? btrfs_write_out_cache+0x7e/0xaf [btrfs]
Jun 30 12:24:24 frost kernel: [26939.474197]  [] ? btrfs_write_dirty_block_groups+0x4b2/0x4ef [btrfs]
Jun 30 12:24:24 frost kernel: [26939.474811]  [] ? commit_cowonly_roots+0x167/0x20c [btrfs]
Jun 30 12:24:24 frost kernel: [26939.475408]  [] ? btrfs_commit_transaction+0x416/0x874 [btrfs]
Jun 30 12:24:24 frost kernel: [26939.476005]  [] ? btrfs_recover_log_trees+0x2ad/0x308 [btrfs]
Jun 30 12:24:24 frost kernel: [26939.476596]  [] ? replay_one_extent+0x4a8/0x4a8 [btrfs]
Jun 30 12:24:24 frost kernel: [26939.477188]  [] ? open_ctree+0x16ca/0x1a68 [btrfs]
Jun 30 12:24:24 frost kernel: [26939.477781]  [] ? btrfs_mount+0x34f/0x6ce [btrfs]
Jun 30 12:24:24 frost kernel: [26939.478372]  [] ? cpumask_next+0x16/0x17
Jun 30 12:24:24 frost kernel: [26939.478968]  [] ? alloc_pages_current+0xc0/0xda
Jun 30 12:24:24 frost kernel: [26939.479568]  [] ? mount_fs+0x60/0x141
Jun 30 12:24:24 frost kernel: [26939.480167]  [] ? vfs_kern_mount+0x60/0xd5
Jun 30 12:24:24 frost kernel: [26939.480765]  [] ? do_mount+0x700/0x7e6
Jun 30 12:24:24 frost kernel: [26939.481356]  [] ? SyS_mount+0x7e/0xb7
Jun 30 12:24:24 frost kernel: [26939.481940]  [] ? system_call_fastpath+0x1a/0x1f
It turns out that if you have lost a disk, you need to mount the btrfs filesystem with the option "-o degraded"; otherwise it will just hang.
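For reference, the degraded mount is just a normal mount with one extra option (the device and mount point here are placeholders, not necessarily the ones I used):

# mount a btrfs array that has lost a member; append ",ro" if you only want to pull data off it
mount -o degraded /dev/sdb1 /mnt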
After this the filesystem mounted. So I went and bought a new disk and tried to replace the broken one with it. This failed with errors like
parent transid verify failed on 9089576292352 wanted 1335517 found 1295746
or
BTRFS error (device sda1): unable to find ref byte nr 9124686925824 parent 0 root 10028 owner 1 offset 0
With my 3.15 kernel it OOPSed; with the stock Debian 3.14 kernel it sort of ran. I investigated the problem in more detail and after a bit of fiddling managed to bring the failed disk back online. Still, I could not replace the disk, nor could I add the new disk to the broken array.
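For the record, what I was trying was along these lines (the devid of the missing disk and the device names here are illustrative, not my exact commands):

# replace the missing device, identified by its devid, with the new disk
btrfs replace start 2 /dev/sdd /mnt
# or: add the new disk to the array, then drop the missing one
btrfs device add /dev/sdd /mnt
btrfs device delete missing /mnt

Neither route got anywhere.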
A scrub tells you that all is well, while btrfsck tells you that there are errors it cannot fix (!!). The irony is that all the data seems to be there and working. So I've got all my data on a filesystem with unfixable, festering corruption...
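To be concrete, that contradiction comes from commands roughly like these (device name illustrative; btrfsck wants the filesystem unmounted):

btrfs scrub start /mnt     # kick off the online check
btrfs scrub status /mnt    # ...which reports that everything is fine
btrfsck /dev/sdb1          # offline check: errors it cannot repair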
So in the end I fixed it with the help of reliable tools like rsync and the magical:
root@frost:~# zfs list
NAME             USED  AVAIL  REFER  MOUNTPOINT
mypool          2.42T  2.04T   136K  none
mypool/Media    2.28T  2.04T  2.28T  /Media
mypool/Storage  52.9G  2.04T  51.2G  /Storage
mypool/home     92.5G  2.04T  92.5G  /home
root@frost:~# zpool status
  pool: mypool
 state: ONLINE
  scan: resilvered 2.42T in 7h45m with 0 errors on Fri Jul 4 03:08:43 2014
config:

        NAME                                          STATE     READ WRITE CKSUM
        mypool                                        ONLINE       0     0     0
          mirror-0                                    ONLINE       0     0     0
            ata-WDC_WD30EFRX-68EUZN0_WD-WMC4N2670271  ONLINE       0     0     0
            ata-WDC_WD30EFRX-68EUZN0_WD-WMC4N1783424  ONLINE       0     0     0
          mirror-1                                    ONLINE       0     0     0
            ata-WDC_WD30EFRX-68EUZN0_WD-WMC4N0839245  ONLINE       0     0     0
            ata-WDC_WD20EARX-00PASB0_WD-WCAZAE074784  ONLINE       0     0     0

errors: No known data errors
ZFS on Linux for the win!
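For completeness, the migration boils down to something like the following. This is a reconstruction rather than a transcript of what I actually ran: the disk IDs are the ones visible in zpool status above, the rsync source path is a placeholder, and juggling four disks between two filesystems obviously took a few more steps than this.

# two mirrored pairs: roughly the ZFS equivalent of RAID 10; the root dataset gets no mountpoint
zpool create -m none mypool \
    mirror /dev/disk/by-id/ata-WDC_WD30EFRX-68EUZN0_WD-WMC4N2670271 /dev/disk/by-id/ata-WDC_WD30EFRX-68EUZN0_WD-WMC4N1783424 \
    mirror /dev/disk/by-id/ata-WDC_WD30EFRX-68EUZN0_WD-WMC4N0839245 /dev/disk/by-id/ata-WDC_WD20EARX-00PASB0_WD-WCAZAE074784
zfs create -o mountpoint=/Media mypool/Media
zfs create -o mountpoint=/Storage mypool/Storage
zfs create -o mountpoint=/home mypool/home
# haul the data over from the degraded btrfs mount, keeping hard links, ACLs and xattrs
rsync -aHAX /mnt/old-btrfs/Media/ /Media/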
no subject
I think I'm gonna entrust my actual keep-worthy data to mdadm+ext3/4 for another year. Or three ;)