From: Brian Foster <bfoster@redhat.com>
To: linux-xfs@vger.kernel.org
Subject: [BUG] generic/475 recovery failure(s)
Date: Thu, 10 Jun 2021 11:14:32 -0400
Message-ID: <YMIsWJ0Cb2ot/UjG@bfoster>
Hi all,
I'm seeing what looks like at least one new generic/475 failure on
current for-next. (I've also seen one related to an attr buffer that
seems to be older and harder to reproduce.) The test devices are a
couple of ~15GB LVM devices formatted with mkfs defaults. I'm still
trying to establish reproducibility, but so far a failure seems to
occur fairly reliably within ~30 iterations.
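
For anyone who wants to try reproducing this, a loop along the following lines could count iterations until the first failure. This is only a minimal sketch: it assumes an fstests checkout in which `./check generic/475` runs the test and exits non-zero on failure, and the helper names `run_until_failure` and `run_generic_475` are hypothetical.

```python
import subprocess
from typing import Callable, Optional


def run_until_failure(run_once: Callable[[], bool],
                      max_iters: int = 30) -> Optional[int]:
    """Call run_once() up to max_iters times.

    Returns the 1-based iteration on which it first returned False
    (a failure), or None if every iteration passed.
    """
    for i in range(1, max_iters + 1):
        if not run_once():
            return i
    return None


def run_generic_475(fstests_dir: str = ".") -> bool:
    """Run generic/475 once; True means the test passed.

    Assumption: ./check exits with a non-zero status when the test fails.
    """
    result = subprocess.run(["./check", "generic/475"], cwd=fstests_dir)
    return result.returncode == 0
```

Calling `run_until_failure(run_generic_475, max_iters=30)` from the fstests directory would then report the failing iteration, if any. The runner is passed in as a callable so the loop itself can be exercised without a test machine.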
The first [1] looks like a log recovery failure while processing an
EFI. The second variant [2] looks like it passes log recovery, but then
fails the mount in the COW extent cleanup stage due to a refcountbt
problem. I've also seen one that looks like the same free space
corruption error as [1], but triggered via the COW recovery codepath in
[2], so these could very well be related. A snippet of the dmesg output
for each failed mount is appended below.
Brian
[1]
...
XFS (dm-5): Mounting V5 Filesystem
XFS (dm-5): Starting recovery (logdev: internal)
XFS (dm-5): Internal error ltbno + ltlen > bno at line 1940 of file fs/xfs/libxfs/xfs_alloc.c. Caller xfs_free_ag_extent+0x586/0xa00 [xfs]
CPU: 75 PID: 207978 Comm: mount Tainted: G W I 5.13.0-rc4 #64
Hardware name: Dell Inc. PowerEdge R740/01KPX8, BIOS 1.6.11 11/20/2018
Call Trace:
dump_stack+0x7f/0xa1
xfs_corruption_error+0x81/0x90 [xfs]
? xfs_free_ag_extent+0x586/0xa00 [xfs]
xfs_free_ag_extent+0x5ba/0xa00 [xfs]
? xfs_free_ag_extent+0x586/0xa00 [xfs]
__xfs_free_extent+0xed/0x210 [xfs]
xfs_trans_free_extent+0x55/0x180 [xfs]
xfs_efi_item_recover+0x11b/0x170 [xfs]
xlog_recover_process_intents+0xc5/0x3c0 [xfs]
? xfs_iget+0x7c0/0x10b0 [xfs]
xlog_recover_finish+0x19/0xb0 [xfs]
xfs_log_mount_finish+0x55/0x150 [xfs]
xfs_mountfs+0x552/0x960 [xfs]
xfs_fs_fill_super+0x3af/0x7d0 [xfs]
? xfs_fs_put_super+0xa0/0xa0 [xfs]
get_tree_bdev+0x17f/0x280
vfs_get_tree+0x28/0xc0
? capable+0x3a/0x60
path_mount+0x433/0xb60
__x64_sys_mount+0xe3/0x120
do_syscall_64+0x40/0x80
entry_SYSCALL_64_after_hwframe+0x44/0xae
RIP: 0033:0x7f457b46e19e
Code: 48 8b 0d dd 1c 0c 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 49 89 ca b8 a5 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d aa 1c 0c 00 f7 d8 64 89 01 48
RSP: 002b:00007ffec1895aa8 EFLAGS: 00000246 ORIG_RAX: 00000000000000a5
RAX: ffffffffffffffda RBX: 00007ffec1895c20 RCX: 00007f457b46e19e
RDX: 0000562eaa1bb8b0 RSI: 0000562eaa1bb610 RDI: 0000562eaa1ba4e0
RBP: 0000562eaa1b95c0 R08: 0000000000000000 R09: 00007f457b530a60
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
R13: 0000562eaa1ba4e0 R14: 0000562eaa1bb8b0 R15: 0000562eaa1b95c0
XFS (dm-5): Corruption detected. Unmount and run xfs_repair
XFS (dm-5): Internal error xfs_trans_cancel at line 955 of file fs/xfs/xfs_trans.c. Caller xfs_efi_item_recover+0x12d/0x170 [xfs]
CPU: 75 PID: 207978 Comm: mount Tainted: G W I 5.13.0-rc4 #64
Hardware name: Dell Inc. PowerEdge R740/01KPX8, BIOS 1.6.11 11/20/2018
Call Trace:
dump_stack+0x7f/0xa1
xfs_trans_cancel+0x1a1/0x1f0 [xfs]
xfs_efi_item_recover+0x12d/0x170 [xfs]
xlog_recover_process_intents+0xc5/0x3c0 [xfs]
? xfs_iget+0x7c0/0x10b0 [xfs]
xlog_recover_finish+0x19/0xb0 [xfs]
xfs_log_mount_finish+0x55/0x150 [xfs]
xfs_mountfs+0x552/0x960 [xfs]
xfs_fs_fill_super+0x3af/0x7d0 [xfs]
? xfs_fs_put_super+0xa0/0xa0 [xfs]
get_tree_bdev+0x17f/0x280
vfs_get_tree+0x28/0xc0
? capable+0x3a/0x60
path_mount+0x433/0xb60
__x64_sys_mount+0xe3/0x120
do_syscall_64+0x40/0x80
entry_SYSCALL_64_after_hwframe+0x44/0xae
RIP: 0033:0x7f457b46e19e
Code: 48 8b 0d dd 1c 0c 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 49 89 ca b8 a5 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d aa 1c 0c 00 f7 d8 64 89 01 48
RSP: 002b:00007ffec1895aa8 EFLAGS: 00000246 ORIG_RAX: 00000000000000a5
RAX: ffffffffffffffda RBX: 00007ffec1895c20 RCX: 00007f457b46e19e
RDX: 0000562eaa1bb8b0 RSI: 0000562eaa1bb610 RDI: 0000562eaa1ba4e0
RBP: 0000562eaa1b95c0 R08: 0000000000000000 R09: 00007f457b530a60
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
R13: 0000562eaa1ba4e0 R14: 0000562eaa1bb8b0 R15: 0000562eaa1b95c0
XFS (dm-5): xfs_do_force_shutdown(0x8) called from line 956 of file fs/xfs/xfs_trans.c. Return address = ffffffffc0a9aa4a
XFS (dm-5): Corruption of in-memory data detected. Shutting down filesystem
XFS (dm-5): Please unmount the filesystem and rectify the problem(s)
XFS (dm-5): Failed to recover intents
XFS (dm-5): log mount finish failed
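
The condition reported in [1], `ltbno + ltlen > bno`, says that the free extent found to the left of the extent being freed runs past its start block, i.e. the space being freed already appears (at least partly) free. The following is a hedged sketch of that relationship, not the kernel code: `classify_left_neighbor` is a hypothetical helper, and the three-way split is an illustration inferred from the condition in the message.

```python
def classify_left_neighbor(ltbno: int, ltlen: int, bno: int) -> str:
    """Relate a left free extent [ltbno, ltbno + ltlen) to block bno,
    the first block of the extent being freed.

    "separate"   - the left extent ends before bno (no merge possible)
    "contiguous" - the left extent ends exactly at bno (can be merged)
    "overlap"    - the left extent extends past bno; this is the
                   `ltbno + ltlen > bno` case flagged as corruption
    """
    end = ltbno + ltlen
    if end < bno:
        return "separate"
    if end == bno:
        return "contiguous"
    return "overlap"
```

For example, a left extent starting at block 10 with length 11 overlaps an extent freed at block 20, which would correspond to the error in the trace above.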
[2]
...
XFS (dm-5): Mounting V5 Filesystem
XFS (dm-5): Starting recovery (logdev: internal)
XFS (dm-5): Ending recovery (logdev: internal)
XFS: Assertion failed: 0, file: fs/xfs/libxfs/xfs_btree.c, line: 1588
------------[ cut here ]------------
WARNING: CPU: 73 PID: 189091 at fs/xfs/xfs_message.c:112 assfail+0x25/0x28 [xfs]
Modules linked in: rfkill dm_service_time dm_multipath scsi_dh_rdac scsi_dh_emc scsi_dh_alua rpcrdma sunrpc rdma_ucm ib_srpt ib_isert iscsi_target_mod target_core_mod ib_iser libiscsi scsi_transport_iscsi rdma_cm ib_umad iw_cm ib_ipoib intel_rapl_msr ib_cm intel_rapl_common isst_if_common mlx5_ib ib_uverbs skx_edac nfit ib_core libnvdimm x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm mlx5_core ipmi_ssif iTCO_wdt irqbypass intel_pmc_bxt rapl iTCO_vendor_support intel_cstate intel_uncore psample mei_me tg3 acpi_ipmi mlxfw wmi_bmof i2c_i801 pcspkr pci_hyperv_intf mei lpc_ich intel_pch_thermal i2c_smbus ipmi_si ipmi_devintf ipmi_msghandler acpi_power_meter fuse zram ip_tables xfs lpfc mgag200 drm_kms_helper nvmet_fc nvmet cec nvme_fc crct10dif_pclmul drm nvme_fabrics crc32_pclmul crc32c_intel nvme_core ghash_clmulni_intel scsi_transport_fc megaraid_sas i2c_algo_bit wmi
CPU: 73 PID: 189091 Comm: mount Tainted: G W I 5.13.0-rc4 #64
Hardware name: Dell Inc. PowerEdge R740/01KPX8, BIOS 1.6.11 11/20/2018
RIP: 0010:assfail+0x25/0x28 [xfs]
Code: ff ff 0f 0b c3 0f 1f 44 00 00 41 89 c8 48 89 d1 48 89 f2 48 c7 c6 18 c9 af c0 e8 cf fa ff ff 80 3d 01 cc 0a 00 00 74 02 0f 0b <0f> 0b c3 48 8d 45 10 48 89 e2 4c 89 e6 48 89 1c 24 48 89 44 24 18
RSP: 0018:ffffb00069057b78 EFLAGS: 00010246
RAX: 00000000ffffffea RBX: ffff9186c6b55880 RCX: 0000000000000000
RDX: 00000000ffffffc0 RSI: 0000000000000000 RDI: ffffffffc0aedee4
RBP: ffffb00069057c98 R08: 0000000000000000 R09: 000000000000000a
R10: 000000000000000a R11: f000000000000000 R12: 0000000000000000
R13: 00000000ffffff8b R14: ffffb00069057c70 R15: 0000000000000001
FS: 00007ff3505eec40(0000) GS:ffff91b5bfd00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007ff350254000 CR3: 00000030f045c001 CR4: 00000000007706e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
PKRU: 55555554
Call Trace:
xfs_btree_increment+0x27a/0x3d0 [xfs]
? xfs_refcount_still_have_space+0xb0/0xb0 [xfs]
? xfs_refcount_still_have_space+0xb0/0xb0 [xfs]
xfs_btree_simple_query_range+0x133/0x1d0 [xfs]
? xfs_trans_read_buf_map+0x23f/0x5b0 [xfs]
? xfs_refcount_still_have_space+0xb0/0xb0 [xfs]
xfs_btree_query_range+0xf6/0x110 [xfs]
? kmem_cache_alloc+0x247/0x2d0
? xfs_refcountbt_init_common+0x2b/0xa0 [xfs]
xfs_refcount_recover_cow_leftovers+0x105/0x390 [xfs]
? trace_hardirqs_on+0x1b/0xd0
? lock_acquire+0x15d/0x380
xfs_reflink_recover_cow+0x43/0xa0 [xfs]
xfs_mountfs+0x5e5/0x960 [xfs]
xfs_fs_fill_super+0x3af/0x7d0 [xfs]
? xfs_fs_put_super+0xa0/0xa0 [xfs]
get_tree_bdev+0x17f/0x280
vfs_get_tree+0x28/0xc0
? capable+0x3a/0x60
path_mount+0x433/0xb60
__x64_sys_mount+0xe3/0x120
do_syscall_64+0x40/0x80
entry_SYSCALL_64_after_hwframe+0x44/0xae
RIP: 0033:0x7ff35082119e
Code: 48 8b 0d dd 1c 0c 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 49 89 ca b8 a5 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d aa 1c 0c 00 f7 d8 64 89 01 48
RSP: 002b:00007ffebc43ea98 EFLAGS: 00000246 ORIG_RAX: 00000000000000a5
RAX: ffffffffffffffda RBX: 00007ffebc43ec10 RCX: 00007ff35082119e
RDX: 000055ba504c98b0 RSI: 000055ba504c9610 RDI: 000055ba504c84e0
RBP: 000055ba504c75c0 R08: 0000000000000000 R09: 00007ff3508e3a60
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
R13: 000055ba504c84e0 R14: 000055ba504c98b0 R15: 000055ba504c75c0
irq event stamp: 0
hardirqs last enabled at (0): [<0000000000000000>] 0x0
hardirqs last disabled at (0): [<ffffffff9e0da3f4>] copy_process+0x754/0x1d00
softirqs last enabled at (0): [<ffffffff9e0da3f4>] copy_process+0x754/0x1d00
softirqs last disabled at (0): [<0000000000000000>] 0x0
---[ end trace 3975c06460f0a3d7 ]---
XFS (dm-5): Error -117 recovering leftover CoW allocations.
XFS (dm-5): xfs_do_force_shutdown(0x8) called from line 917 of file fs/xfs/xfs_mount.c. Return address = ffffffffc0a904e5
XFS (dm-5): Corruption of in-memory data detected. Shutting down filesystem
XFS (dm-5): Please unmount the filesystem and rectify the problem(s)
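
For reference when reading the "Error -117" line in [2]: on Linux, errno 117 is EUCLEAN, which XFS uses as EFSCORRUPTED. A small sketch for decoding such lines follows. The helper `decode_xfs_error` and the table `XFS_ERRNO_NAMES` are hypothetical, and the table covers only the two aliases listed, with the numeric values hardcoded for Linux.

```python
import re
from typing import Optional

# Linux errno values that XFS aliases:
# EFSCORRUPTED is EUCLEAN (117), EFSBADCRC is EBADMSG (74).
XFS_ERRNO_NAMES = {117: "EFSCORRUPTED", 74: "EFSBADCRC"}


def decode_xfs_error(line: str) -> Optional[str]:
    """Return a symbolic name for the 'Error <n>' code in a dmesg line,
    or None if the line carries no such code."""
    m = re.search(r"Error (-?\d+)", line)
    if m is None:
        return None
    code = abs(int(m.group(1)))
    # Fall back to the raw number for codes not in the table.
    return XFS_ERRNO_NAMES.get(code, f"errno {code}")
```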