All the mail mirrored from lore.kernel.org
* ext3_valid_block_bitmap: Invalid block bitmap in 2.6.25rc in memory
@ 2008-04-12 20:57 Andi Kleen
  2008-04-14 14:50 ` Mingming Cao
  0 siblings, 1 reply; 5+ messages in thread
From: Andi Kleen @ 2008-04-12 20:57 UTC
  To: linux-ext4


FYI, a system here running various 2.6.25rc kernels (latest up to rc7-git6)
with longer uptimes suddenly decided to fsck one of its file systems
due to an error after reboot.

The error causing this was:

kernel: EXT3-fs error (device dm-0): ext3_valid_block_bitmap: Invalid block bitmap - block_group = 285, block = 9338882

detected by the 2.6.25rc7-git6 kernel.

I don't see any ill effects from it, and fsck didn't find anything wrong,
so it must have been something spurious in memory only (or fsck
fails to check for this condition, but that is hard to imagine).

The system never showed anything like this on earlier kernel versions.

-Andi


* Re: ext3_valid_block_bitmap: Invalid block bitmap in 2.6.25rc in memory
  2008-04-12 20:57 ext3_valid_block_bitmap: Invalid block bitmap in 2.6.25rc in memory Andi Kleen
@ 2008-04-14 14:50 ` Mingming Cao
  2008-04-14 23:40   ` Andreas Dilger
  0 siblings, 1 reply; 5+ messages in thread
From: Mingming Cao @ 2008-04-14 14:50 UTC
  To: Andi Kleen; +Cc: linux-ext4

On Sat, 2008-04-12 at 22:57 +0200, Andi Kleen wrote:
> FYI, a system here running various 2.6.25rc kernels (latest up to rc7-git6)
> with longer uptimes suddenly decided to fsck one of its file systems
> due to an error after reboot.
> 
> The error causing this was:
> 
> kernel: EXT3-fs error (device dm-0): ext3_valid_block_bitmap: Invalid block bitmap - block_group = 285, block = 9338882
> 
> detected by the 2.6.25rc7-git6 kernel.
> 
> I don't see any ill effects from it and fsck didn't find anything wrong
> so it must have been something spurious in memory only (or fsck
> fails to check for this condition, but that is hard to imagine) 
> 

ext3_valid_block_bitmap() checks whether the block bitmap or inode
bitmap block is marked as "used" in the block group bitmap, to prevent
allocating blocks from these system metadata blocks. The error message
seems to indicate that some of the block group metadata is corrupted,
but I don't know why fsck doesn't catch this, Andreas?

Mingming
> The system never showed anything like this on earlier kernel versions.
> 
> -Andi



* Re: ext3_valid_block_bitmap: Invalid block bitmap in 2.6.25rc in memory
  2008-04-14 14:50 ` Mingming Cao
@ 2008-04-14 23:40   ` Andreas Dilger
  2008-04-15  8:47     ` Aneesh Kumar K.V
  0 siblings, 1 reply; 5+ messages in thread
From: Andreas Dilger @ 2008-04-14 23:40 UTC
  To: Mingming Cao; +Cc: Andi Kleen, linux-ext4

On Apr 14, 2008  07:50 -0700, Mingming Cao wrote:
> On Sat, 2008-04-12 at 22:57 +0200, Andi Kleen wrote:
> > FYI, a system here running various 2.6.25rc kernels (latest up to rc7-git6)
> > with longer uptimes suddenly decided to fsck one of its file systems
> > due to an error after reboot.
> > 
> > The error causing this was:
> > 
> > kernel: EXT3-fs error (device dm-0): ext3_valid_block_bitmap: Invalid block bitmap - block_group = 285, block = 9338882
> > 
> > detected by the 2.6.25rc7-git6 kernel.
> > 
> > I don't see any ill effects from it and fsck didn't find anything wrong
> > so it must have been something spurious in memory only (or fsck
> > fails to check for this condition, but that is hard to imagine) 
> 
> The ext3_valid_block_bitmap() is to check whether the block or inode
> bitmap block is marked as "used" in the block group bitmap, to prevent
> allocating blocks from these system meta data blocks.

Right.

> The error message seems to indicate that some of the block group
> metadata is corrupted, but I don't know why fsck doesn't catch this, Andreas?

It might have been corrupted on read (e.g. bad cable, or bad/wrong
data read from disk the first time).

The message itself isn't very useful though.  It should report what it
thinks is wrong with the bitmap (e.g. whether block/inode bitmaps are
unallocated, which/how many itable blocks are unallocated).

> Mingming
> > The system never showed anything like this on earlier kernel versions.

This is a new check, to catch allocation bitmap corruption before it
causes the corruption to spread into the rest of the filesystem by
double-allocating blocks, etc.  Having a checksum would also be good,
but even then memory corruption can lead to a valid checksum of bad
data in memory so a validity check is still useful for such important
and rarely-read data.

Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.



* Re: ext3_valid_block_bitmap: Invalid block bitmap in 2.6.25rc in memory
  2008-04-14 23:40   ` Andreas Dilger
@ 2008-04-15  8:47     ` Aneesh Kumar K.V
  2008-04-15 10:04       ` Andreas Dilger
  0 siblings, 1 reply; 5+ messages in thread
From: Aneesh Kumar K.V @ 2008-04-15  8:47 UTC
  To: Andreas Dilger; +Cc: Mingming Cao, Andi Kleen, linux-ext4

On Mon, Apr 14, 2008 at 05:40:59PM -0600, Andreas Dilger wrote:
> On Apr 14, 2008  07:50 -0700, Mingming Cao wrote:
> > On Sat, 2008-04-12 at 22:57 +0200, Andi Kleen wrote:
> > > FYI, a system here running various 2.6.25rc kernels (latest up to rc7-git6)
> > > with longer uptimes suddenly decided to fsck one of its file systems
> > > due to an error after reboot.
> > > 
> > > The error causing this was:
> > > 
> > > kernel: EXT3-fs error (device dm-0): ext3_valid_block_bitmap: Invalid block bitmap - block_group = 285, block = 9338882
> > > 
> > > detected by the 2.6.25rc7-git6 kernel.
> > > 
> > > I don't see any ill effects from it and fsck didn't find anything wrong
> > > so it must have been something spurious in memory only (or fsck
> > > fails to check for this condition, but that is hard to imagine) 
> > 
> > The ext3_valid_block_bitmap() is to check whether the block or inode
> > bitmap block is marked as "used" in the block group bitmap, to prevent
> > allocating blocks from these system meta data blocks.
> 
> Right.
> 
> > The error message seems to indicate that some of the block group
> > metadata is corrupted, but I don't know why fsck doesn't catch this, Andreas?
> 
> It might have been corrupted on read (e.g. bad cable, or bad/wrong
> data read from disk the first time).
> 
> The message itself isn't very useful though.  It should report what it
> thinks is wrong with the bitmap (e.g. whether block/inode bitmaps are
> unallocated, which/how many itable blocks are unallocated).
> 

debugfs should help to find these details, right?


-aneesh


* Re: ext3_valid_block_bitmap: Invalid block bitmap in 2.6.25rc in memory
  2008-04-15  8:47     ` Aneesh Kumar K.V
@ 2008-04-15 10:04       ` Andreas Dilger
  0 siblings, 0 replies; 5+ messages in thread
From: Andreas Dilger @ 2008-04-15 10:04 UTC
  To: Aneesh Kumar K.V; +Cc: Mingming Cao, Andi Kleen, linux-ext4

On Apr 15, 2008  14:17 +0530, Aneesh Kumar K.V wrote:
> On Mon, Apr 14, 2008 at 05:40:59PM -0600, Andreas Dilger wrote:
> > On Apr 14, 2008  07:50 -0700, Mingming Cao wrote:
> > > On Sat, 2008-04-12 at 22:57 +0200, Andi Kleen wrote:
> > > > FYI, a system here running various 2.6.25rc kernels (latest up to rc7-git6)
> > > > with longer uptimes suddenly decided to fsck one of its file systems
> > > > due to an error after reboot.
> > > > 
> > > > The error causing this was:
> > > > 
> > > > kernel: EXT3-fs error (device dm-0): ext3_valid_block_bitmap: Invalid block bitmap - block_group = 285, block = 9338882
> > > > 
> > > > detected by the 2.6.25rc7-git6 kernel.
> > > > 
> > > > I don't see any ill effects from it and fsck didn't find anything wrong
> > > > so it must have been something spurious in memory only (or fsck
> > > > fails to check for this condition, but that is hard to imagine) 
> > > 
> > > The ext3_valid_block_bitmap() is to check whether the block or inode
> > > bitmap block is marked as "used" in the block group bitmap, to prevent
> > > allocating blocks from these system meta data blocks.
> > 
> > Right.
> > 
> > > The error message seems to indicate that some of the block group
> > > metadata is corrupted, but I don't know why fsck doesn't catch this, Andreas?
> > 
> > It might have been corrupted on read (e.g. bad cable, or bad/wrong
> > data read from disk the first time).
> > 
> > The message itself isn't very useful though.  It should report what it
> > thinks is wrong with the bitmap (e.g. whether block/inode bitmaps are
> > unallocated, which/how many itable blocks are unallocated).
> 
> debugfs should help to find these details, right?

It isn't always possible to run debugfs on a customer system, and the
information would be lost after a reboot or an e2fsck.  The e2fsck might
even run automatically, when an errors=panic reboot is followed by the
boot-time e2fsck.

Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.


