* Recover from "couldn't read tree root"?
@ 2021-06-20 20:37 Nathan Dehnel
2021-06-20 21:09 ` Chris Murphy
2021-06-20 21:19 ` Chris Murphy
0 siblings, 2 replies; 8+ messages in thread
From: Nathan Dehnel @ 2021-06-20 20:37 UTC
To: Btrfs BTRFS
A machine failed to boot, so I tried to mount its root partition from
systemrescuecd, which failed:
[ 5404.240019] BTRFS info (device bcache3): disk space caching is enabled
[ 5404.240022] BTRFS info (device bcache3): has skinny extents
[ 5404.243195] BTRFS error (device bcache3): parent transid verify
failed on 3004631449600 wanted 1420882 found 1420435
[ 5404.243279] BTRFS error (device bcache3): parent transid verify
failed on 3004631449600 wanted 1420882 found 1420435
[ 5404.243362] BTRFS error (device bcache3): parent transid verify
failed on 3004631449600 wanted 1420882 found 1420435
[ 5404.243432] BTRFS error (device bcache3): parent transid verify
failed on 3004631449600 wanted 1420882 found 1420435
[ 5404.243435] BTRFS warning (device bcache3): couldn't read tree root
[ 5404.244114] BTRFS error (device bcache3): open_ctree failed
btrfs rescue super-recover -v /dev/bcache0 returned this:
parent transid verify failed on 3004631449600 wanted 1420882 found 1420435
parent transid verify failed on 3004631449600 wanted 1420882 found 1420435
parent transid verify failed on 3004631449600 wanted 1420882 found 1420435
parent transid verify failed on 3004631449600 wanted 1420882 found 1420435
parent transid verify failed on 3004631449600 wanted 1420882 found 1420435
Ignoring transid failure
ERROR: could not setup extent tree
Failed to recover bad superblocks
uname -a:
Linux sysrescue 5.10.34-1-lts #1 SMP Sun, 02 May 2021 12:41:09 +0000
x86_64 GNU/Linux
btrfs --version:
btrfs-progs v5.10.1
btrfs fi show:
Label: none uuid: 76189222-b60d-4402-a7ff-141f057e8574
Total devices 10 FS bytes used 1.50TiB
devid 1 size 931.51GiB used 311.03GiB path /dev/bcache3
devid 2 size 931.51GiB used 311.00GiB path /dev/bcache2
devid 3 size 931.51GiB used 311.00GiB path /dev/bcache1
devid 4 size 931.51GiB used 311.00GiB path /dev/bcache0
devid 5 size 931.51GiB used 311.00GiB path /dev/bcache4
devid 6 size 931.51GiB used 311.00GiB path /dev/bcache8
devid 7 size 931.51GiB used 311.00GiB path /dev/bcache6
devid 8 size 931.51GiB used 311.03GiB path /dev/bcache9
devid 9 size 931.51GiB used 311.03GiB path /dev/bcache7
devid 10 size 931.51GiB used 311.03GiB path /dev/bcache5
Is this filesystem recoverable?
(Sorry, re-sending because I forgot to add a subject)
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: Recover from "couldn't read tree root"?
2021-06-20 20:37 Recover from "couldn't read tree root"? Nathan Dehnel
@ 2021-06-20 21:09 ` Chris Murphy
2021-06-20 21:31 ` Nathan Dehnel
2021-06-20 21:19 ` Chris Murphy
1 sibling, 1 reply; 8+ messages in thread
From: Chris Murphy @ 2021-06-20 21:09 UTC
To: Nathan Dehnel; +Cc: Btrfs BTRFS
On Sun, Jun 20, 2021 at 2:38 PM Nathan Dehnel <ncdehnel@gmail.com> wrote:
>
> A machine failed to boot, so I tried to mount its root partition from
> systemrescuecd, which failed:
>
> [ 5404.240019] BTRFS info (device bcache3): disk space caching is enabled
> [ 5404.240022] BTRFS info (device bcache3): has skinny extents
> [ 5404.243195] BTRFS error (device bcache3): parent transid verify
> failed on 3004631449600 wanted 1420882 found 1420435
> [ 5404.243279] BTRFS error (device bcache3): parent transid verify
> failed on 3004631449600 wanted 1420882 found 1420435
> [ 5404.243362] BTRFS error (device bcache3): parent transid verify
> failed on 3004631449600 wanted 1420882 found 1420435
> [ 5404.243432] BTRFS error (device bcache3): parent transid verify
> failed on 3004631449600 wanted 1420882 found 1420435
> [ 5404.243435] BTRFS warning (device bcache3): couldn't read tree root
> [ 5404.244114] BTRFS error (device bcache3): open_ctree failed
>
> btrfs rescue super-recover -v /dev/bcache0 returned this:
>
> parent transid verify failed on 3004631449600 wanted 1420882 found 1420435
> parent transid verify failed on 3004631449600 wanted 1420882 found 1420435
> parent transid verify failed on 3004631449600 wanted 1420882 found 1420435
> parent transid verify failed on 3004631449600 wanted 1420882 found 1420435
> parent transid verify failed on 3004631449600 wanted 1420882 found 1420435
> Ignoring transid failure
> ERROR: could not setup extent tree
> Failed to recover bad superblocks
>
> uname -a:
>
> Linux sysrescue 5.10.34-1-lts #1 SMP Sun, 02 May 2021 12:41:09 +0000
> x86_64 GNU/Linux
>
> btrfs --version:
>
> btrfs-progs v5.10.1
>
> btrfs fi show:
>
> Label: none uuid: 76189222-b60d-4402-a7ff-141f057e8574
> Total devices 10 FS bytes used 1.50TiB
> devid 1 size 931.51GiB used 311.03GiB path /dev/bcache3
> devid 2 size 931.51GiB used 311.00GiB path /dev/bcache2
> devid 3 size 931.51GiB used 311.00GiB path /dev/bcache1
> devid 4 size 931.51GiB used 311.00GiB path /dev/bcache0
> devid 5 size 931.51GiB used 311.00GiB path /dev/bcache4
> devid 6 size 931.51GiB used 311.00GiB path /dev/bcache8
> devid 7 size 931.51GiB used 311.00GiB path /dev/bcache6
> devid 8 size 931.51GiB used 311.03GiB path /dev/bcache9
> devid 9 size 931.51GiB used 311.03GiB path /dev/bcache7
> devid 10 size 931.51GiB used 311.03GiB path /dev/bcache5
>
> Is this filesystem recoverable?
> (Sorry, re-sending because I forgot to add a subject)
Definitely don't write any irreversible changes, such as a repair
attempt, to anything until you understand what went wrong, or it'll
make recovery harder or impossible.
Was bcache in writeback or writethrough mode?
What's the configuration? Can you supply something like
lsblk -o NAME,FSTYPE,SIZE,FSUSE%,MOUNTPOINT,UUID,MIN-IO,SCHED,DISC-GRAN,MODEL
--
Chris Murphy
* Re: Recover from "couldn't read tree root"?
2021-06-20 20:37 Recover from "couldn't read tree root"? Nathan Dehnel
2021-06-20 21:09 ` Chris Murphy
@ 2021-06-20 21:19 ` Chris Murphy
2021-06-20 21:48 ` Nathan Dehnel
1 sibling, 1 reply; 8+ messages in thread
From: Chris Murphy @ 2021-06-20 21:19 UTC
To: Nathan Dehnel; +Cc: Btrfs BTRFS
On Sun, Jun 20, 2021 at 2:38 PM Nathan Dehnel <ncdehnel@gmail.com> wrote:
>
> A machine failed to boot, so I tried to mount its root partition from
> systemrescuecd, which failed:
>
> [ 5404.240019] BTRFS info (device bcache3): disk space caching is enabled
> [ 5404.240022] BTRFS info (device bcache3): has skinny extents
> [ 5404.243195] BTRFS error (device bcache3): parent transid verify
> failed on 3004631449600 wanted 1420882 found 1420435
> [ 5404.243279] BTRFS error (device bcache3): parent transid verify
> failed on 3004631449600 wanted 1420882 found 1420435
> [ 5404.243362] BTRFS error (device bcache3): parent transid verify
> failed on 3004631449600 wanted 1420882 found 1420435
> [ 5404.243432] BTRFS error (device bcache3): parent transid verify
> failed on 3004631449600 wanted 1420882 found 1420435
> [ 5404.243435] BTRFS warning (device bcache3): couldn't read tree root
> [ 5404.244114] BTRFS error (device bcache3): open_ctree failed
This is generally bad, and means some lower layer did something wrong,
such as getting write order incorrect, i.e. failing to properly honor
flush/fua. Recovery can be difficult and take a while.
https://btrfs.wiki.kernel.org/index.php/Problem_FAQ#parent_transid_verify_failed
I suggest searching logs since the last time this file system was
working, because the above error is indicating a problem that's
already happened and what we need to know is what happened, if
possible. Something like this:
journalctl --since=-5d -k -o short-monotonic --no-hostname | grep
"Linux version\| ata\|bcache\|Btrfs\|BTRFS\|] hd\| scsi\| sd\| sdhci\|
mmc\| nvme\| usb\| vd"
> btrfs rescue super-recover -v /dev/bcache0 returned this:
>
> parent transid verify failed on 3004631449600 wanted 1420882 found 1420435
> parent transid verify failed on 3004631449600 wanted 1420882 found 1420435
> parent transid verify failed on 3004631449600 wanted 1420882 found 1420435
> parent transid verify failed on 3004631449600 wanted 1420882 found 1420435
> parent transid verify failed on 3004631449600 wanted 1420882 found 1420435
> Ignoring transid failure
> ERROR: could not setup extent tree
> Failed to recover bad superblocks
OK, something is really wrong if you're not able to see a single
superblock on any of the bcache devices. Every member device has 3
superblocks, given the sizes you've provided. For there to not be a
single one is a spectacular failure, as if the bcache cache device
isn't returning correct information for any of them. So I'm gonna
guess a single shared SSD, which is a single point of failure, and
it's spitting out garbage or zeros. But I'm not even close to a bcache
expert so you might want to ask bcache developers how to figure out
what state bcache is in and whether and how to safely decouple it from
the backing drives so that you can engage in recovery attempts.
If bcache mode is write through, there's a chance the backing drives
have valid btrfs metadata, and it's just that on read the SSD is
returning bogus information.
--
Chris Murphy
* Re: Recover from "couldn't read tree root"?
2021-06-20 21:09 ` Chris Murphy
@ 2021-06-20 21:31 ` Nathan Dehnel
2021-06-20 22:19 ` Chris Murphy
2021-06-20 22:53 ` Chris Murphy
0 siblings, 2 replies; 8+ messages in thread
From: Nathan Dehnel @ 2021-06-20 21:31 UTC
To: Chris Murphy; +Cc: Btrfs BTRFS
>Was bcache in write back or write through mode?
Writeback.
>What's the configuration?
NAME FSTYPE SIZE FSUSE% MOUNTPOINT
UUID MIN-IO SCHED DISC-GRAN
MODEL
loop0 squashfs 655.6M 100% /run/archiso/sfs/airootfs
512 mq-deadline 0B
sda 238.5G
512 mq-deadline 512B
C300-CTFDDAC256MAG
├─sda1 2M
512 mq-deadline 512B
├─sda2 linux_raid_member 512M
325a2f12-18b8-27f7-2f81-f554a9b0fccc 512 mq-deadline 512B
│ └─md126 vfat 511.9M
EF35-0411 512 512B
└─sda3 linux_raid_member 16G
93ed641f-394b-2122-7525-b3311aaac6b8 512 mq-deadline 512B
└─md125 swap 16G
9ea84fb7-8bd7-4a0e-91fe-398790643066 1048576 512B
sdb 232.9G
512 mq-deadline 512B
Samsung_SSD_850_EVO_250GB
├─sdb1 2M
512 mq-deadline 512B
├─sdb2 linux_raid_member 512M
325a2f12-18b8-27f7-2f81-f554a9b0fccc 512 mq-deadline 512B
│ └─md126 vfat 511.9M
EF35-0411 512 512B
└─sdb3 linux_raid_member 16G
93ed641f-394b-2122-7525-b3311aaac6b8 512 mq-deadline 512B
└─md125 swap 16G
9ea84fb7-8bd7-4a0e-91fe-398790643066 1048576 512B
sdc btrfs 931.5G
12bcde5c-b3ae-4fa6-8e17-0a4b564f1ba1 512 mq-deadline 0B
WDC_WD1002FAEX-00Z3A0
└─sdc1 bcache 931.5G
f34b26ea-8229-4f3f-bdc5-29c5fe16eaae 512 mq-deadline 0B
└─bcache0 btrfs 931.5G
76189222-b60d-4402-a7ff-141f057e8574 512 512B
sdd btrfs 931.5G
12bcde5c-b3ae-4fa6-8e17-0a4b564f1ba1 512 mq-deadline 0B
WDC_WD1002FAEX-00Z3A0
└─sdd1 bcache 931.5G
beb25260-1b36-473f-93c4-7ef016a62f44 512 mq-deadline 0B
└─bcache1 btrfs 931.5G
76189222-b60d-4402-a7ff-141f057e8574 512 512B
sde btrfs 931.5G
12bcde5c-b3ae-4fa6-8e17-0a4b564f1ba1 4096 mq-deadline 0B
WDC_WD1003FZEX-00MK2A0
└─sde1 bcache 931.5G
21b55c83-c951-4e4f-affc-0b9bf54c8783 4096 mq-deadline 0B
└─bcache2 btrfs 931.5G
76189222-b60d-4402-a7ff-141f057e8574 512 512B
sdf btrfs 931.5G
12bcde5c-b3ae-4fa6-8e17-0a4b564f1ba1 512 mq-deadline 0B
WDC_WD1002FAEX-00Z3A0
└─sdf1 bcache 931.5G
d4d2b9d6-077d-4328-b2cd-14f6db259955 512 mq-deadline 0B
└─bcache3 btrfs 931.5G
76189222-b60d-4402-a7ff-141f057e8574 512 512B
sdg btrfs 931.5G
12bcde5c-b3ae-4fa6-8e17-0a4b564f1ba1 512 mq-deadline 0B
ST1000NM0011
└─sdg1 bcache 931.5G
a8513a01-c6be-4bec-b3f9-a5797225d304 512 mq-deadline 0B
└─bcache4 btrfs 931.5G
76189222-b60d-4402-a7ff-141f057e8574 512 512B
sdh 931.5G
512 mq-deadline 0B
WDC_WD1002FAEX-00Z3A0
└─sdh1 bcache 931.5G
ffeacab7-ff42-453c-b012-58b119236fa5 512 mq-deadline 0B
└─bcache5 btrfs 931.5G
76189222-b60d-4402-a7ff-141f057e8574 512 512B
sdi btrfs 931.5G
12bcde5c-b3ae-4fa6-8e17-0a4b564f1ba1 512 mq-deadline 0B
WDC_WD1002FAEX-00Y9A0
└─sdi1 bcache 931.5G
f3f4d706-7d73-4b48-a5b3-9802fc0de978 512 mq-deadline 0B
└─bcache6 btrfs 931.5G
76189222-b60d-4402-a7ff-141f057e8574 512 512B
sdj btrfs 931.5G
12bcde5c-b3ae-4fa6-8e17-0a4b564f1ba1 4096 mq-deadline 0B
WDC_WD1003FZEX-00MK2A0
└─sdj1 bcache 931.5G
64d10dda-4ac2-44d4-941a-362ccb5ddbba 4096 mq-deadline 0B
└─bcache7 btrfs 931.5G
76189222-b60d-4402-a7ff-141f057e8574 512 512B
sdk btrfs 931.5G
12bcde5c-b3ae-4fa6-8e17-0a4b564f1ba1 512 mq-deadline 0B
WDC_WD1002FAEX-00Y9A0
└─sdk1 bcache 931.5G
c3ddc718-f700-4360-82c9-7db76114e3f6 512 mq-deadline 0B
└─bcache8 btrfs 931.5G
76189222-b60d-4402-a7ff-141f057e8574 512 512B
sdl btrfs 931.5G
12bcde5c-b3ae-4fa6-8e17-0a4b564f1ba1 512 mq-deadline 0B
WDC_WD1002FAEX-00Z3A0
└─sdl1 bcache 931.5G
2bf5ac80-cdf6-4c0c-9434-bcdc4626abff 512 mq-deadline 0B
└─bcache9 btrfs 931.5G
76189222-b60d-4402-a7ff-141f057e8574 512 512B
sdm iso9660 14.9G
2021-05-08-11-22-02-00 512 mq-deadline 0B
USB_2.0_FD
├─sdm1 iso9660 717M 100% /run/archiso/bootmnt
2021-05-08-11-22-02-00 512 mq-deadline 0B
└─sdm2 vfat 1.4M
0A52-44A0 512 mq-deadline 0B
nvme0n1 linux_raid_member 13.4G
4703551c-4570-b6c8-7dda-991b93b99c9a 512 none 512B
INTEL MEMPEK1W016GA
└─md127 bcache 13.4G
dfda7dc0-07a4-40bf-b5b8-e3458c181ce4 16384 512B
├─bcache0 btrfs 931.5G
76189222-b60d-4402-a7ff-141f057e8574 512 512B
├─bcache1 btrfs 931.5G
76189222-b60d-4402-a7ff-141f057e8574 512 512B
├─bcache2 btrfs 931.5G
76189222-b60d-4402-a7ff-141f057e8574 512 512B
├─bcache3 btrfs 931.5G
76189222-b60d-4402-a7ff-141f057e8574 512 512B
├─bcache4 btrfs 931.5G
76189222-b60d-4402-a7ff-141f057e8574 512 512B
├─bcache5 btrfs 931.5G
76189222-b60d-4402-a7ff-141f057e8574 512 512B
├─bcache6 btrfs 931.5G
76189222-b60d-4402-a7ff-141f057e8574 512 512B
├─bcache7 btrfs 931.5G
76189222-b60d-4402-a7ff-141f057e8574 512 512B
├─bcache8 btrfs 931.5G
76189222-b60d-4402-a7ff-141f057e8574 512 512B
└─bcache9 btrfs 931.5G
76189222-b60d-4402-a7ff-141f057e8574 512 512B
nvme1n1 linux_raid_member 13.4G
4703551c-4570-b6c8-7dda-991b93b99c9a 512 none 512B
INTEL MEMPEK1W016GA
└─md127 bcache 13.4G
dfda7dc0-07a4-40bf-b5b8-e3458c181ce4 16384 512B
├─bcache0 btrfs 931.5G
76189222-b60d-4402-a7ff-141f057e8574 512 512B
├─bcache1 btrfs 931.5G
76189222-b60d-4402-a7ff-141f057e8574 512 512B
├─bcache2 btrfs 931.5G
76189222-b60d-4402-a7ff-141f057e8574 512 512B
├─bcache3 btrfs 931.5G
76189222-b60d-4402-a7ff-141f057e8574 512 512B
├─bcache4 btrfs 931.5G
76189222-b60d-4402-a7ff-141f057e8574 512 512B
├─bcache5 btrfs 931.5G
76189222-b60d-4402-a7ff-141f057e8574 512 512B
├─bcache6 btrfs 931.5G
76189222-b60d-4402-a7ff-141f057e8574 512 512B
├─bcache7 btrfs 931.5G
76189222-b60d-4402-a7ff-141f057e8574 512 512B
├─bcache8 btrfs 931.5G
76189222-b60d-4402-a7ff-141f057e8574 512 512B
└─bcache9 btrfs 931.5G
76189222-b60d-4402-a7ff-141f057e8574 512 512B
On Sun, Jun 20, 2021 at 9:09 PM Chris Murphy <lists@colorremedies.com> wrote:
>
> On Sun, Jun 20, 2021 at 2:38 PM Nathan Dehnel <ncdehnel@gmail.com> wrote:
> >
> > A machine failed to boot, so I tried to mount its root partition from
> > systemrescuecd, which failed:
> >
> > [ 5404.240019] BTRFS info (device bcache3): disk space caching is enabled
> > [ 5404.240022] BTRFS info (device bcache3): has skinny extents
> > [ 5404.243195] BTRFS error (device bcache3): parent transid verify
> > failed on 3004631449600 wanted 1420882 found 1420435
> > [ 5404.243279] BTRFS error (device bcache3): parent transid verify
> > failed on 3004631449600 wanted 1420882 found 1420435
> > [ 5404.243362] BTRFS error (device bcache3): parent transid verify
> > failed on 3004631449600 wanted 1420882 found 1420435
> > [ 5404.243432] BTRFS error (device bcache3): parent transid verify
> > failed on 3004631449600 wanted 1420882 found 1420435
> > [ 5404.243435] BTRFS warning (device bcache3): couldn't read tree root
> > [ 5404.244114] BTRFS error (device bcache3): open_ctree failed
> >
> > btrfs rescue super-recover -v /dev/bcache0 returned this:
> >
> > parent transid verify failed on 3004631449600 wanted 1420882 found 1420435
> > parent transid verify failed on 3004631449600 wanted 1420882 found 1420435
> > parent transid verify failed on 3004631449600 wanted 1420882 found 1420435
> > parent transid verify failed on 3004631449600 wanted 1420882 found 1420435
> > parent transid verify failed on 3004631449600 wanted 1420882 found 1420435
> > Ignoring transid failure
> > ERROR: could not setup extent tree
> > Failed to recover bad superblocks
> >
> > uname -a:
> >
> > Linux sysrescue 5.10.34-1-lts #1 SMP Sun, 02 May 2021 12:41:09 +0000
> > x86_64 GNU/Linux
> >
> > btrfs --version:
> >
> > btrfs-progs v5.10.1
> >
> > btrfs fi show:
> >
> > Label: none uuid: 76189222-b60d-4402-a7ff-141f057e8574
> > Total devices 10 FS bytes used 1.50TiB
> > devid 1 size 931.51GiB used 311.03GiB path /dev/bcache3
> > devid 2 size 931.51GiB used 311.00GiB path /dev/bcache2
> > devid 3 size 931.51GiB used 311.00GiB path /dev/bcache1
> > devid 4 size 931.51GiB used 311.00GiB path /dev/bcache0
> > devid 5 size 931.51GiB used 311.00GiB path /dev/bcache4
> > devid 6 size 931.51GiB used 311.00GiB path /dev/bcache8
> > devid 7 size 931.51GiB used 311.00GiB path /dev/bcache6
> > devid 8 size 931.51GiB used 311.03GiB path /dev/bcache9
> > devid 9 size 931.51GiB used 311.03GiB path /dev/bcache7
> > devid 10 size 931.51GiB used 311.03GiB path /dev/bcache5
> >
> > Is this filesystem recoverable?
> > (Sorry, re-sending because I forgot to add a subject)
>
> Definitely don't write any irreversible changes, such as a repair
> attempt, to anything until you understand what went wrong, or it'll
> make recovery harder or impossible.
>
> Was bcache in writeback or writethrough mode?
>
> What's the configuration? Can you supply something like
>
> lsblk -o NAME,FSTYPE,SIZE,FSUSE%,MOUNTPOINT,UUID,MIN-IO,SCHED,DISC-GRAN,MODEL
>
>
>
> --
> Chris Murphy
* Re: Recover from "couldn't read tree root"?
2021-06-20 21:19 ` Chris Murphy
@ 2021-06-20 21:48 ` Nathan Dehnel
0 siblings, 0 replies; 8+ messages in thread
From: Nathan Dehnel @ 2021-06-20 21:48 UTC
To: Chris Murphy; +Cc: Btrfs BTRFS
>I suggest searching logs since the last time this file system was
working, because the above error is indicating a problem that's
already happened and what we need to know is what happened, if
possible. Something like this:
>journalctl --since=-5d -k -o short-monotonic --no-hostname | grep
"Linux version\| ata\|bcache\|Btrfs\|BTRFS\|] hd\| scsi\| sd\| sdhci\|
mmc\| nvme\| usb\| vd"
Unfortunately I put my journal logs in a different subvolume so they
wouldn't bloat my snapshots, so they weren't included in my backups.
>So I'm gonna guess a single shared SSD, which is a single point of failure, and
it's spitting out garbage or zeros.
It's 2 SSDs in mdraid RAID10.
>But I'm not even close to a bcache expert so you might want to ask bcache developers how to figure out
what state bcache is in and whether and how to safely decouple it from
the backing drives so that you can engage in recovery attempts.
They didn't respond the last couple of times I've asked a question on
their irc or mailing list.
On Sun, Jun 20, 2021 at 9:19 PM Chris Murphy <lists@colorremedies.com> wrote:
>
> On Sun, Jun 20, 2021 at 2:38 PM Nathan Dehnel <ncdehnel@gmail.com> wrote:
> >
> > A machine failed to boot, so I tried to mount its root partition from
> > systemrescuecd, which failed:
> >
> > [ 5404.240019] BTRFS info (device bcache3): disk space caching is enabled
> > [ 5404.240022] BTRFS info (device bcache3): has skinny extents
> > [ 5404.243195] BTRFS error (device bcache3): parent transid verify
> > failed on 3004631449600 wanted 1420882 found 1420435
> > [ 5404.243279] BTRFS error (device bcache3): parent transid verify
> > failed on 3004631449600 wanted 1420882 found 1420435
> > [ 5404.243362] BTRFS error (device bcache3): parent transid verify
> > failed on 3004631449600 wanted 1420882 found 1420435
> > [ 5404.243432] BTRFS error (device bcache3): parent transid verify
> > failed on 3004631449600 wanted 1420882 found 1420435
> > [ 5404.243435] BTRFS warning (device bcache3): couldn't read tree root
> > [ 5404.244114] BTRFS error (device bcache3): open_ctree failed
>
> This is generally bad, and means some lower layer did something wrong,
> such as getting write order incorrect, i.e. failing to properly honor
> flush/fua. Recovery can be difficult and take a while.
> https://btrfs.wiki.kernel.org/index.php/Problem_FAQ#parent_transid_verify_failed
>
> I suggest searching logs since the last time this file system was
> working, because the above error is indicating a problem that's
> already happened and what we need to know is what happened, if
> possible. Something like this:
>
> journalctl --since=-5d -k -o short-monotonic --no-hostname | grep
> "Linux version\| ata\|bcache\|Btrfs\|BTRFS\|] hd\| scsi\| sd\| sdhci\|
> mmc\| nvme\| usb\| vd"
>
>
>
> > btrfs rescue super-recover -v /dev/bcache0 returned this:
> >
> > parent transid verify failed on 3004631449600 wanted 1420882 found 1420435
> > parent transid verify failed on 3004631449600 wanted 1420882 found 1420435
> > parent transid verify failed on 3004631449600 wanted 1420882 found 1420435
> > parent transid verify failed on 3004631449600 wanted 1420882 found 1420435
> > parent transid verify failed on 3004631449600 wanted 1420882 found 1420435
> > Ignoring transid failure
> > ERROR: could not setup extent tree
> > Failed to recover bad superblocks
>
> OK something is really wrong if you're not able to see a single
> superblock on any of the bcache devices. Every member device has 3
> super blocks, given the sizes you've provided. For there to not be a
> single one is a spectacular failure as if the bcache cache device
> isn't returning correct information for any of them. So I'm gonna
> guess a single shared SSD, which is a single point of failure, and
> it's spitting out garbage or zeros. But I'm not even close to a bcache
> expert so you might want to ask bcache developers how to figure out
> what state bcache is in and whether and how to safely decouple it from
> the backing drives so that you can engage in recovery attempts.
>
> If bcache mode is write through, there's a chance the backing drives
> have valid btrfs metadata, and it's just that on read the SSD is
> returning bogus information.
>
>
>
>
>
> --
> Chris Murphy
* Re: Recover from "couldn't read tree root"?
2021-06-20 21:31 ` Nathan Dehnel
@ 2021-06-20 22:19 ` Chris Murphy
2021-06-20 22:53 ` Chris Murphy
1 sibling, 0 replies; 8+ messages in thread
From: Chris Murphy @ 2021-06-20 22:19 UTC
To: Nathan Dehnel; +Cc: Chris Murphy, Btrfs BTRFS
The two Intel MEMPEK1W016GA's are in raid10, but you aren't really
protected unless the drive reports a discrete read error. Only in that
case would the md driver know to use the mirror copy. While it
certainly should sooner report a read error than return zeros or
garbage, this is the situation we're in with SSDs. Is that what's
happening? *shrug* Needs more investigation. But it's at either the
bcache or mdadm level, near as I can tell.
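To see whether md itself thinks anything is wrong, a first
non-destructive step might look something like this (a sketch;
md127 and the two NVMe member names are assumed from the lsblk
output above):

```shell
# Read-only inspection of the raid10 cache array; nothing here writes.
cat /proc/mdstat
mdadm --detail /dev/md127

# Compare per-member metadata: differing event counts or a bad
# checksum can point at the stale or corrupt member.
mdadm --examine /dev/nvme0n1 /dev/nvme1n1
```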
Was there a crash or power failure while using this array by any chance?
Chris Murphy
* Re: Recover from "couldn't read tree root"?
2021-06-20 21:31 ` Nathan Dehnel
2021-06-20 22:19 ` Chris Murphy
@ 2021-06-20 22:53 ` Chris Murphy
2021-06-22 3:26 ` Nathan Dehnel
1 sibling, 1 reply; 8+ messages in thread
From: Chris Murphy @ 2021-06-20 22:53 UTC
To: Nathan Dehnel; +Cc: Chris Murphy, Btrfs BTRFS
On Sun, Jun 20, 2021 at 3:31 PM Nathan Dehnel <ncdehnel@gmail.com> wrote:
>
> >Was bcache in write back or write through mode?
> Writeback.
OK, that's bad in this configuration because it means all the writes go
to the SSD and could be there for minutes, hours, days, or longer.
That means it's even possible the current supers are only on the SSDs,
as well as other critical btrfs metadata.
My best guess now is to assume one of the SSDs is bad and spewing
garbage or zeros. Assemble the array degraded with just one SSD, and
see if you can mount. If not, then it's the other SSD you
need to assemble degraded. There's a way to set a drive manually as
faulty so it won't assemble; I also thought of using sysfs but on my
own system, /sys/block/nvme0n1/device/delete does not exist like it
does for SATA SSDs.
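As a sketch of that degraded assembly (untested; device names are
assumed from the lsblk output, and nothing should be using the array
when you stop it):

```shell
# Stop the currently assembled cache array.
mdadm --stop /dev/md127

# Assemble degraded with only the first NVMe member; --run starts the
# raid10 array even though a member is missing.
mdadm --assemble --run /dev/md127 /dev/nvme0n1

# If the btrfs still won't mount, swap members and try the other SSD.
mdadm --stop /dev/md127
mdadm --assemble --run /dev/md127 /dev/nvme1n1
```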
Next you have to wrestle with this dilemma. If you pick the bad SSD,
you don't want bcache flushing anything from it to your HDDs or it'll
just corrupt them, right? If you pick the good SSD, you actually do
want bcache to flush it all to the drives, so they're in a good state
and you can optionally decouple the SSD entirely so that you're left
with just the individual drives again.
I think you might want to use 'blockdev --setro' on all the block
devices, SSD and HDD, to prevent any changes. You might get some
complaints from bcache if it can't write to HDDs or even to the SSDs,
so that might look like you've picked the bad SSD. But the real test
is if you can mount the btrfs. Try that with 'mount -o
ro,nologreplay,usebackuproot' and if you can at least get that far and
do some basic navigation, that's probably the good SSD. If you still
get mount failure, it's probably the bad one.
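Putting those two steps together, roughly (a sketch; the device list
is taken from the lsblk output above, and setting a whole disk
read-only may not propagate through every layer, so double-check
each device):

```shell
# Mark the HDDs, the md cache array, and the bcache devices read-only.
for dev in /dev/sd[c-l] /dev/md127 /dev/bcache{0..9}; do
    blockdev --setro "$dev"
done

# Cautious read-only mount: skip log replay, and fall back to an older
# tree root if the newest one can't be read.
mount -o ro,nologreplay,usebackuproot /dev/bcache0 /mnt
```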
If you get a successful ro mount, I'd take advantage of it and back up
anything important. Just get it out now. And then you can try it all
again with everything read-write, but with the bad SSD still disabled
and the md array assembled degraded with the good SSD, and see if you
can mount read-write again. You need to be read-write at the block device
layer to get bcache to flush SSD state to the drives, which I think is
done by setting the mode to writethrough and then waiting until
bcache/state is clean. HDDs need to be writable but btrfs doesn't need
to be mounted for this.
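The flush step might look like this (a sketch using the standard
bcache sysfs attributes; the bcache0..bcache9 device names are taken
from the lsblk output above):

```shell
# Switch every bcache device to writethrough so dirty data drains to
# the backing HDDs.
for i in $(seq 0 9); do
    echo writethrough > /sys/block/bcache$i/bcache/cache_mode
done

# Wait until each device reports a clean cache state.
for i in $(seq 0 9); do
    until grep -q clean /sys/block/bcache$i/bcache/state; do
        sleep 5
    done
done
```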
The other possibility is that there is some bad data on both SSDs, in
which case it fails and chances are the btrfs is toast.
--
Chris Murphy
* Re: Recover from "couldn't read tree root"?
2021-06-20 22:53 ` Chris Murphy
@ 2021-06-22 3:26 ` Nathan Dehnel
0 siblings, 0 replies; 8+ messages in thread
From: Nathan Dehnel @ 2021-06-22 3:26 UTC
To: Chris Murphy; +Cc: Btrfs BTRFS
I couldn't figure out how to salvage the filesystem, so I wiped it. Oh well.
On Sun, Jun 20, 2021 at 5:53 PM Chris Murphy <lists@colorremedies.com> wrote:
>
> On Sun, Jun 20, 2021 at 3:31 PM Nathan Dehnel <ncdehnel@gmail.com> wrote:
> >
> > >Was bcache in write back or write through mode?
> > Writeback.
>
> Ok that's bad in this configuration because it means all the writes go
> to the SSD and could be there for minutes, hours, days, or longer.
> That means it's even possible the current supers are only on the SSDs,
> as well as other critical btrfs metadata.
>
> My best guess now is to assume one of the drives is bad and spewing
> garbage or zeros. And assemble the array degraded with just one SSD
> drive, and see if you can mount. If not, then it's the other SSD you
> need to assemble degraded. There's a way to set a drive manually as
> faulty so it won't assemble; I also thought of using sysfs but on my
> own system, /sys/block/nvme0n1/device/delete does not exist like it
> does for SATA SSDs.
>
> Next you have to wrestle with this dilemma. If you pick the bad SSD,
> you don't want bcache flushing anything from it to your HDDs or it'll
> just corrupt them, right? If you pick the good SSD, you actually do
> want bcache to flush it all to the drives, so they're in a good state
> and you can optionally decouple the SSD entirely so that you're left
> with just the individual drives again.
>
> I think you might want to use 'blockdev --setro' on all the block
> devices, SSD and HDD, to prevent any changes. You might get some
> complaints from bcache if it can't write to HDDs or even to the SSDs,
> so that might look like you've picked the bad SSD. But the real test
> is if you can mount the btrfs. Try that with 'mount -o
> ro,nologreplay,usebackuproot' and if you can at least get that far and
> do some basic navigation, that's probably the good SSD. If you still
> get mount failure, it's probably the bad one.
>
> If you get a successful ro mount, I'd take advantage of it and back up
> anything important. Just get it out now. And then you can try it all
> again with everything read-write, but with the bad SSD still disabled
> and the md array assembled degraded with the good SSD, and see if you
> can mount read-write again. You need to be read-write at the block device
> layer to get bcache to flush SSD state to the drives, which I think is
> done by setting the mode to writethrough and then waiting until
> bcache/state is clean. HDDs need to be writable but btrfs doesn't need
> to be mounted for this.
>
> The other possibility is that there is some bad data on both SSDs, in
> which case it fails and chances are the btrfs is toast.
>
>
> --
> Chris Murphy
end of thread, other threads:[~2021-06-22 3:26 UTC | newest]
Thread overview: 8+ messages
-- links below jump to the message on this page --
2021-06-20 20:37 Recover from "couldn't read tree root"? Nathan Dehnel
2021-06-20 21:09 ` Chris Murphy
2021-06-20 21:31 ` Nathan Dehnel
2021-06-20 22:19 ` Chris Murphy
2021-06-20 22:53 ` Chris Murphy
2021-06-22 3:26 ` Nathan Dehnel
2021-06-20 21:19 ` Chris Murphy
2021-06-20 21:48 ` Nathan Dehnel