All the mail mirrored from lore.kernel.org
 help / color / mirror / Atom feed
From: Qu Wenruo <quwenruo.btrfs@gmx.com>
To: Sachi King <nakato@nakato.io>, u-boot@lists.denx.de
Cc: kabel@kernel.org, wqu@suse.com, linux-btrfs@vger.kernel.org
Subject: Re: BTRFS use-after-free bug at free_extent_buffer_internal
Date: Mon, 22 Apr 2024 16:45:50 +0930	[thread overview]
Message-ID: <995dd168-8e1e-4b99-896c-ece4dc88d6e9@gmx.com> (raw)
In-Reply-To: <3281192.oiGErgHkdL@youmu>



在 2024/4/22 16:07, Sachi King 写道:
> Hi,
>
> I've hit a bug with u-boot on my BTRFS filesystem, and I'm fairly certain
> it's a bug and not a corruption issue.
>
> A bit of history on the filesystem.  It is a fairly new filesystem as it was
> being used to give me access to test a wayland application on a
> Raspberry Pi.  The filesystem was about 3 days old when I hit the bug, and
> I'm fairly certain it never had an unclean shutdown.  I have checked the
> filesystem with "btrfs check" which has found no errors.  The filesystem
> mounts on Linux and is functional.
>
>> # btrfs check --check-data-csum /dev/sda2
>> Opening filesystem to check...
>> Checking filesystem on /dev/sda2
>> UUID: 18db6211-ac36-42c1-a22f-5e15e1486e0d
>> [1/7] checking root items
>> [2/7] checking extents
>> [3/7] checking free space tree
>> [4/7] checking fs roots
>> [5/7] checking csums against data
>> [6/7] checking root refs
>> [7/7] checking quota groups skipped (not enabled on this FS)
>> found 5070573568 bytes used, no error found
>> total csum bytes: 4451620
>> total tree bytes: 370458624
>> total fs tree bytes: 353124352
>> total extent tree bytes: 10010624
>> btree space waste bytes: 62303284
>> file data blocks allocated: 6786519040
>>   referenced 6328619008

Since btrfs check reports no error, the fs must be valid.

But considering how new it is, it may be related to some new features
not properly implemented in Uboot.

Is it possible to provide the whole binary dump of the fs?

>
>
> I've made an image of the filesystem so I could reproduce the bug in an
> environment that doesn't require the physical SBC, and have reproduced
> the issue using the head of the master branch with "qemu-x86_64_defconfig".
>
> My testing qemu was produced with the following:
>> # make qemu-x86_64_defconfig
>> # cat << EOF >> .config
>> CONFIG_AUTOBOOT=y
>> CONFIG_BOOTDELAY=1
>> CONFIG_USE_BOOTCOMMAND=y
>> CONFIG_BOOTSTD_DEFAULTS=y
>> CONFIG_BOOTSTD_FULL=y
>> CONFIG_CMD_BOOTFLOW_FULL=y
>> CONFIG_BOOTCOMMAND="bootflow scan -lb"
>> CONFIG_ENV_IS_NOWHERE=y
>> CONFIG_LZ4=y
>> CONFIG_BZIP2=y
>> CONFIG_ZSTD=y
>> CONFIG_FS_BTRFS=y
>> CONFIG_CMD_BTRFS=y
>> CONFIG_GZIP=y
>> CONFIG_DEVICE_TREE_INCLUDES="bootstd.dtsi"
>> EOF
>> # make -j24
>
> bootstd.dtsi is placed at "arch/x86/dts/bootstd.dtsi" and contains:
>> / {
>>          bootstd {
>>                  compatible = "u-boot,boot-std";
>>                  filename-prefixes = "/@boot/", "/boot/", "/";
>>                  bootdev-order = "scsi";
>>                  extlinux {
>>                          compatible = "u-boot,extlinux";
>>                  };
>>          };
>> };
>
>
> The VM was run with
>> qemu-system-x86_64 -bios u-boot.rom -nographic -M q35 -action reboot=shutdown -drive file=/mnt/dbg/u-boot-debug.img
>
> The error message I recive on boot is
>> BUG at fs/btrfs/extent-io.c:629/free_extent_buffer_internal()!
>> BUG!
>> resetting ...
>
>
> I added a print statement to free_extent_buffer_internal that prints the
> start address of the extent_buffer as I'm not sure what to be looking for
> here.  This print statement is before the decrement.
>> printf("free_extent_buffer_internal: eb->start[%llx] eb->refs[%i]\n", eb->start, eb->refs);
>
> The last message before the crash reported eb->start to be "0", with 0 refs.
>> free_extent_buffer_internal: eb->start[0] eb->refs[0]
>
> The extent at 0 struck me as odd, so I tried commenting out the freeing, by
> removing the call to free_extent_buffer_final, and this resulted in bootflow
> succeeding and showing me the boot menu, which suprised me.
> I expected to see the bug reproduce itself, with refs being zero, but eb->start
> pointing somewhere valid, but I instead got a valid address with refs at 2.
>
> I'm assuming that the order free_extent_buffer_internal is called is
> deterministic, so by counting the print outputs the line that prior held
> the extent_buffer with a 0 start was replaced with:
>> free_extent_buffer_internal: eb->start[249c000] eb->refs[2]
>
> Interestingly, as can be seen in the full logs with my included print
> messages, 249c000 is being used just before this, with a ref count of
> 2.  249c000 does appear to reach a point where it should have been freed
> in the past, before it gets used again as seen in both logs.
>
> The failing boot log:
>> U-Boot SPL 2024.04-00949-g1dd659fd62-dirty (Apr 22 2024 - 11:32:37 +1000)
>> Trying to boot from SPI
>> Jumping to 64-bit U-Boot: Note many features are missing
>>
>>
>> U-Boot 2024.04-00949-g1dd659fd62-dirty (Apr 22 2024 - 11:32:37 +1000)
>>
>> CPU:   QEMU Virtual CPU version 2.5+
>> DRAM:  128 MiB
>> Core:  13 devices, 13 uclasses, devicetree: separate
>> Loading Environment from nowhere... OK
>> Model: QEMU x86 (I440FX)
>> Net:   e1000: 00:00:00:00:00:00
>>
>> Error: e1000#0 No valid MAC address found.
>>        eth_initialize() No ethernet found.
>>
>>
>> Hit any key to stop autoboot:  0
>> Scanning for bootflows in all bootdevs
>> Seq  Method       State   Uclass    Part  Name                      Filename
>> ---  -----------  ------  --------  ----  ------------------------  ----------------
>> scanning bus for devices...
>> Target spinup took 0 ms.
>> SATA link 1 timeout.
>> Target spinup took 0 ms.
>> SATA link 3 timeout.
>> SATA link 4 timeout.
>> SATA link 5 timeout.
>> AHCI 0001.0000 32 slots 6 ports 1.5 Gbps 0x3f impl SATA mode
>> flags: 64bit ncq only
>> free_extent_buffer_internal: eb->start[10000] eb->refs[1]
>> free_extent_buffer_internal: eb->start[1548000] eb->refs[1]
>> free_extent_buffer_internal: eb->start[150c000] eb->refs[1]
>> free_extent_buffer_internal: eb->start[154c000] eb->refs[1]
>> free_extent_buffer_internal: eb->start[1544000] eb->refs[2]
>> free_extent_buffer_internal: eb->start[28f4000] eb->refs[2]
>> free_extent_buffer_internal: eb->start[28f4000] eb->refs[2]
>> free_extent_buffer_internal: eb->start[28f4000] eb->refs[2]
>> free_extent_buffer_internal: eb->start[2490000] eb->refs[1]
>> free_extent_buffer_internal: eb->start[6de0000] eb->refs[1]
>> free_extent_buffer_internal: eb->start[28f4000] eb->refs[1]
>> free_extent_buffer_internal: eb->start[1544000] eb->refs[1]
>>    Device 0: (0:0) Vendor: ATA Prod.: QEMU HARDDISK Rev: 2.5+
>>              Type: Hard Disk
>>              Capacity: 58680.0 MB = 57.3 GB (120176640 x 512)
>> timeout exit!
>> Scanning bootdev 'ahci_scsi.id0lun0.bootdev':
>> free_extent_buffer_internal: eb->start[10000] eb->refs[1]
>> free_extent_buffer_internal: eb->start[1548000] eb->refs[1]
>> free_extent_buffer_internal: eb->start[150c000] eb->refs[1]
>> free_extent_buffer_internal: eb->start[154c000] eb->refs[1]
>> free_extent_buffer_internal: eb->start[1544000] eb->refs[2]
>> free_extent_buffer_internal: eb->start[28f4000] eb->refs[2]
>> free_extent_buffer_internal: eb->start[28f4000] eb->refs[2]
>> free_extent_buffer_internal: eb->start[28f4000] eb->refs[2]
>> free_extent_buffer_internal: eb->start[2490000] eb->refs[2]
>> free_extent_buffer_internal: eb->start[28f4000] eb->refs[2]
>> free_extent_buffer_internal: eb->start[24a0000] eb->refs[1]
>> free_extent_buffer_internal: eb->start[249c000] eb->refs[2]
>> free_extent_buffer_internal: eb->start[24a0000] eb->refs[1]
>> free_extent_buffer_internal: eb->start[249c000] eb->refs[2]
>> free_extent_buffer_internal: eb->start[24a4000] eb->refs[1]
>> free_extent_buffer_internal: eb->start[249c000] eb->refs[2]
>> free_extent_buffer_internal: eb->start[2490000] eb->refs[1]
>> free_extent_buffer_internal: eb->start[249c000] eb->refs[1]
>> free_extent_buffer_internal: eb->start[6de0000] eb->refs[1]
>> free_extent_buffer_internal: eb->start[28f4000] eb->refs[1]
>> free_extent_buffer_internal: eb->start[1544000] eb->refs[1]
>> free_extent_buffer_internal: eb->start[10000] eb->refs[1]
>> free_extent_buffer_internal: eb->start[1548000] eb->refs[1]
>> free_extent_buffer_internal: eb->start[150c000] eb->refs[1]
>> free_extent_buffer_internal: eb->start[154c000] eb->refs[1]
>> free_extent_buffer_internal: eb->start[1544000] eb->refs[2]
>> free_extent_buffer_internal: eb->start[28f4000] eb->refs[2]
>> free_extent_buffer_internal: eb->start[28f4000] eb->refs[2]
>> free_extent_buffer_internal: eb->start[28f4000] eb->refs[2]
>> free_extent_buffer_internal: eb->start[2490000] eb->refs[2]
>> free_extent_buffer_internal: eb->start[28f4000] eb->refs[2]
>> free_extent_buffer_internal: eb->start[24a0000] eb->refs[1]
>> free_extent_buffer_internal: eb->start[249c000] eb->refs[2]
>> free_extent_buffer_internal: eb->start[24a0000] eb->refs[1]
>> free_extent_buffer_internal: eb->start[249c000] eb->refs[2]
>> free_extent_buffer_internal: eb->start[2490000] eb->refs[2]
>> free_extent_buffer_internal: eb->start[24a0000] eb->refs[1]
>> free_extent_buffer_internal: eb->start[249c000] eb->refs[2]
>> free_extent_buffer_internal: eb->start[24a0000] eb->refs[1]
>> free_extent_buffer_internal: eb->start[249c000] eb->refs[2]
>> free_extent_buffer_internal: eb->start[24a4000] eb->refs[1]
>> free_extent_buffer_internal: eb->start[249c000] eb->refs[2]
>> free_extent_buffer_internal: eb->start[24a4000] eb->refs[1]
>> free_extent_buffer_internal: eb->start[249c000] eb->refs[2]
>> free_extent_buffer_internal: eb->start[24a4000] eb->refs[1]
>> free_extent_buffer_internal: eb->start[0] eb->refs[0]
>> BUG at fs/btrfs/extent-io.c:626/free_extent_buffer_internal()!

The eb[0] seems very weird.
Even for superblock reading, we got eb at 64K, and every eb should have
at least one ref at creation time.

So this indeed looks like the direct cause.
But without a full call trace (is it possible inside U-boot runtime?), I
do not have an immediate clue.

Thanks,
Qu

>> BUG!
>> resetting ...
>
> The succeeding log with the removed freeing:
>> U-Boot SPL 2024.04-00949-g1dd659fd62-dirty (Apr 22 2024 - 11:38:25 +1000)
>> Trying to boot from SPI
>> Jumping to 64-bit U-Boot: Note many features are missing
>>
>>
>> U-Boot 2024.04-00949-g1dd659fd62-dirty (Apr 22 2024 - 11:38:25 +1000)
>>
>> CPU:   QEMU Virtual CPU version 2.5+
>> DRAM:  128 MiB
>> Core:  13 devices, 13 uclasses, devicetree: separate
>> Loading Environment from nowhere... OK
>> Model: QEMU x86 (I440FX)
>> Net:   e1000: 00:00:00:00:00:00
>>
>> Error: e1000#0 No valid MAC address found.
>>        eth_initialize() No ethernet found.
>>
>>
>> Hit any key to stop autoboot:  0
>> Scanning for bootflows in all bootdevs
>> Seq  Method       State   Uclass    Part  Name                      Filename
>> ---  -----------  ------  --------  ----  ------------------------  ----------------
>> scanning bus for devices...
>> Target spinup took 0 ms.
>> SATA link 1 timeout.
>> Target spinup took 0 ms.
>> SATA link 3 timeout.
>> SATA link 4 timeout.
>> SATA link 5 timeout.
>> AHCI 0001.0000 32 slots 6 ports 1.5 Gbps 0x3f impl SATA mode
>> flags: 64bit ncq only
>> free_extent_buffer_internal: eb->start[10000] eb->refs[1]
>> free_extent_buffer_internal: eb->start[1548000] eb->refs[1]
>> free_extent_buffer_internal: eb->start[150c000] eb->refs[1]
>> free_extent_buffer_internal: eb->start[154c000] eb->refs[1]
>> free_extent_buffer_internal: eb->start[1544000] eb->refs[2]
>> free_extent_buffer_internal: eb->start[28f4000] eb->refs[2]
>> free_extent_buffer_internal: eb->start[28f4000] eb->refs[2]
>> free_extent_buffer_internal: eb->start[28f4000] eb->refs[2]
>> free_extent_buffer_internal: eb->start[2490000] eb->refs[1]
>> free_extent_buffer_internal: eb->start[6de0000] eb->refs[1]
>> free_extent_buffer_internal: eb->start[28f4000] eb->refs[1]
>> free_extent_buffer_internal: eb->start[1544000] eb->refs[1]
>>    Device 0: (0:0) Vendor: ATA Prod.: QEMU HARDDISK Rev: 2.5+
>>              Type: Hard Disk
>>              Capacity: 58680.0 MB = 57.3 GB (120176640 x 512)
>> timeout exit!
>> Scanning bootdev 'ahci_scsi.id0lun0.bootdev':
>> free_extent_buffer_internal: eb->start[10000] eb->refs[1]
>> free_extent_buffer_internal: eb->start[1548000] eb->refs[1]
>> free_extent_buffer_internal: eb->start[150c000] eb->refs[1]
>> free_extent_buffer_internal: eb->start[154c000] eb->refs[1]
>> free_extent_buffer_internal: eb->start[1544000] eb->refs[2]
>> free_extent_buffer_internal: eb->start[28f4000] eb->refs[2]
>> free_extent_buffer_internal: eb->start[28f4000] eb->refs[2]
>> free_extent_buffer_internal: eb->start[28f4000] eb->refs[2]
>> free_extent_buffer_internal: eb->start[2490000] eb->refs[2]
>> free_extent_buffer_internal: eb->start[28f4000] eb->refs[2]
>> free_extent_buffer_internal: eb->start[24a0000] eb->refs[1]
>> free_extent_buffer_internal: eb->start[249c000] eb->refs[2]
>> free_extent_buffer_internal: eb->start[24a0000] eb->refs[1]
>> free_extent_buffer_internal: eb->start[249c000] eb->refs[2]
>> free_extent_buffer_internal: eb->start[24a4000] eb->refs[1]
>> free_extent_buffer_internal: eb->start[249c000] eb->refs[2]
>> free_extent_buffer_internal: eb->start[2490000] eb->refs[1]
>> free_extent_buffer_internal: eb->start[249c000] eb->refs[1]
>> free_extent_buffer_internal: eb->start[6de0000] eb->refs[1]
>> free_extent_buffer_internal: eb->start[28f4000] eb->refs[1]
>> free_extent_buffer_internal: eb->start[1544000] eb->refs[1]
>> free_extent_buffer_internal: eb->start[10000] eb->refs[1]
>> free_extent_buffer_internal: eb->start[1548000] eb->refs[1]
>> free_extent_buffer_internal: eb->start[150c000] eb->refs[1]
>> free_extent_buffer_internal: eb->start[154c000] eb->refs[1]
>> free_extent_buffer_internal: eb->start[1544000] eb->refs[2]
>> free_extent_buffer_internal: eb->start[28f4000] eb->refs[2]
>> free_extent_buffer_internal: eb->start[28f4000] eb->refs[2]
>> free_extent_buffer_internal: eb->start[28f4000] eb->refs[2]
>> free_extent_buffer_internal: eb->start[2490000] eb->refs[2]
>> free_extent_buffer_internal: eb->start[28f4000] eb->refs[2]
>> free_extent_buffer_internal: eb->start[24a0000] eb->refs[1]
>> free_extent_buffer_internal: eb->start[249c000] eb->refs[2]
>> free_extent_buffer_internal: eb->start[24a0000] eb->refs[1]
>> free_extent_buffer_internal: eb->start[249c000] eb->refs[2]
>> free_extent_buffer_internal: eb->start[2490000] eb->refs[2]
>> free_extent_buffer_internal: eb->start[24a0000] eb->refs[1]
>> free_extent_buffer_internal: eb->start[249c000] eb->refs[2]
>> free_extent_buffer_internal: eb->start[24a0000] eb->refs[1]
>> free_extent_buffer_internal: eb->start[249c000] eb->refs[2]
>> free_extent_buffer_internal: eb->start[24a4000] eb->refs[1]
>> free_extent_buffer_internal: eb->start[249c000] eb->refs[2]
>> free_extent_buffer_internal: eb->start[24a4000] eb->refs[1]
>> free_extent_buffer_internal: eb->start[249c000] eb->refs[2]
>> free_extent_buffer_internal: eb->start[24a4000] eb->refs[1]
>> free_extent_buffer_internal: eb->start[249c000] eb->refs[2]
>> free_extent_buffer_internal: eb->start[2490000] eb->refs[1]
>> free_extent_buffer_internal: eb->start[6de0000] eb->refs[1]
>> free_extent_buffer_internal: eb->start[28f4000] eb->refs[1]
>> free_extent_buffer_internal: eb->start[1544000] eb->refs[1]
>>    0  extlinux     ready   scsi         2  ahci_scsi.id0lun0.bootdev /@boot/extlinux/extlinux.conf
>> ** Booting bootflow 'ahci_scsi.id0lun0.bootdev.part_2' with extlinux
>> ------------------------------------------------------------
>> 1:      NixOS - Default
>> 2:      NixOS - Configuration 5 (2024-03-04 11:00 - 24.05.20240205.faf912b)
>> 3:      NixOS - Configuration 4 (2024-03-02 15:05 - 24.05.20240205.faf912b)
>> 4:      NixOS - Configuration 3 (2024-03-02 14:26 - 24.05.20240205.faf912b)
>> 5:      NixOS - Configuration 2 (2024-03-02 14:06 - 24.05.20240205.faf912b)
>> 6:      NixOS - Configuration 1 (1970-01-01 10:00 - 24.05.20240205.faf912b)
>> Enter choice: 1:        NixOS - Default
>> Retrieving file: /@boot/extlinux/../nixos/gq8jsgxnhq2wvsjrni3cjw1wb5540wjw-linux-6.1.63-stable_20231123-Image
>> free_extent_buffer_internal: eb->start[10000] eb->refs[1]
>
>
> I'm not sure where to go from here with the bug, so I'm hoping for some
> help in tracking it down so it can be fixed.
>
> Thanks,
> Sachi
>
>
>

  reply	other threads:[~2024-04-22  7:16 UTC|newest]

Thread overview: 4+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-04-22  6:37 BTRFS use-after-free bug at free_extent_buffer_internal Sachi King
2024-04-22  7:15 ` Qu Wenruo [this message]
2024-04-22  8:55   ` Qu Wenruo
2024-04-23  4:53   ` Sachi King

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=995dd168-8e1e-4b99-896c-ece4dc88d6e9@gmx.com \
    --to=quwenruo.btrfs@gmx.com \
    --cc=kabel@kernel.org \
    --cc=linux-btrfs@vger.kernel.org \
    --cc=nakato@nakato.io \
    --cc=u-boot@lists.denx.de \
    --cc=wqu@suse.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.