LKML Archive mirror
* BUG:write data to degrade raid5
@ 2010-03-19 10:29 jin zhencheng
  2010-03-19 18:20 ` Joachim Otahal
                   ` (2 more replies)
  0 siblings, 3 replies; 9+ messages in thread
From: jin zhencheng @ 2010-03-19 10:29 UTC (permalink / raw)
  To: neilb, linux-raid, linux-kernel

Hi,

I am using kernel 2.6.26.2.

Here is what I do:

1. Create a RAID5 array:
mdadm -C /dev/md5 -l 5 -n 4 /dev/sda /dev/sdb /dev/sdc /dev/sdd
--metadata=1.0 --assume-clean

2. Start writing data to the array:
dd if=/dev/zero of=/dev/md5 bs=1M &

3. mdadm --manage /dev/md5 -f /dev/sda

4. mdadm --manage /dev/md5 -f /dev/sdb

After I fail the two disks, the kernel prints an oops and goes down.

Does anybody know why?

Is this an MD/RAID5 bug?

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: BUG:write data to degrade raid5
  2010-03-19 10:29 BUG:write data to degrade raid5 jin zhencheng
@ 2010-03-19 18:20 ` Joachim Otahal
       [not found]   ` <73e903671003191123h1b7e1196v336265842f1b29e5@mail.gmail.com>
  2010-03-20 15:50 ` Bill Davidsen
  2010-03-23  3:16 ` Neil Brown
  2 siblings, 1 reply; 9+ messages in thread
From: Joachim Otahal @ 2010-03-19 18:20 UTC (permalink / raw)
  To: jin zhencheng; +Cc: neilb, linux-raid, linux-kernel

jin zhencheng schrieb:
> Hi,
>
> I am using kernel 2.6.26.2.
>
> Here is what I do:
>
> 1. Create a RAID5 array:
> mdadm -C /dev/md5 -l 5 -n 4 /dev/sda /dev/sdb /dev/sdc /dev/sdd
> --metadata=1.0 --assume-clean
>
> 2. Start writing data to the array:
> dd if=/dev/zero of=/dev/md5 bs=1M &
>
> 3. mdadm --manage /dev/md5 -f /dev/sda
>
> 4. mdadm --manage /dev/md5 -f /dev/sdb
>
> After I fail the two disks, the kernel prints an oops and goes down.
>
> Does anybody know why?
>
> Is this an MD/RAID5 bug?

RAID5 can only tolerate ONE failed drive out of ALL its members. If you
want to be able to fail two drives you will have to use RAID6, or RAID5
with one hot spare (and give it time to rebuild before failing the
second drive). PLEASE read the documentation on RAID levels, for
example on Wikipedia.
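
For example (untested sketches only -- using the same four disks as in
your mail; see the mdadm man page for the options):

# RAID6: survives two simultaneous drive failures
mdadm -C /dev/md6 -l 6 -n 4 /dev/sda /dev/sdb /dev/sdc /dev/sdd

# RAID5 with one hot spare (-x 1): a single failure is rebuilt onto
# the spare automatically, but a second failure during the rebuild
# still kills the array
mdadm -C /dev/md5 -l 5 -n 3 -x 1 /dev/sda /dev/sdb /dev/sdc /dev/sdd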

Joachim Otahal


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: BUG:write data to degrade raid5
       [not found]   ` <73e903671003191123h1b7e1196v336265842f1b29e5@mail.gmail.com>
@ 2010-03-19 18:26     ` Kristleifur Daðason
  2010-03-19 18:37     ` Joachim Otahal
  1 sibling, 0 replies; 9+ messages in thread
From: Kristleifur Daðason @ 2010-03-19 18:26 UTC (permalink / raw)
  To: linux-raid, linux-kernel

On Fri, Mar 19, 2010 at 6:20 PM, Joachim Otahal <Jou@gmx.net> wrote:
> jin zhencheng schrieb:
>>
>> Hi,
>>
>> I am using kernel 2.6.26.2.
>>
>> Here is what I do:
>>
>> 1. Create a RAID5 array:
>> mdadm -C /dev/md5 -l 5 -n 4 /dev/sda /dev/sdb /dev/sdc /dev/sdd
>> --metadata=1.0 --assume-clean
>>
>> 2. Start writing data to the array:
>> dd if=/dev/zero of=/dev/md5 bs=1M &
>>
>> 3. mdadm --manage /dev/md5 -f /dev/sda
>>
>> 4. mdadm --manage /dev/md5 -f /dev/sdb
>>
>> After I fail the two disks, the kernel prints an oops and goes down.
>>
>> Does anybody know why?
>>
>> Is this an MD/RAID5 bug?
>>
>
> RAID5 can only tolerate ONE failed drive out of ALL its members. If you
> want to be able to fail two drives you will have to use RAID6, or RAID5
> with one hot spare (and give it time to rebuild before failing the
> second drive). PLEASE read the documentation on RAID levels, for
> example on Wikipedia.
>

That is true, but should we get a kernel oops and crash when two RAID5
drives are failed? (THAT part looks like a bug!)

Jin, can you try a newer kernel and a newer mdadm?
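
For example, to report exactly what you are running now (both commands
are standard):

uname -r         # running kernel version
mdadm --version  # installed mdadm version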

-- Kristleifur

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: BUG:write data to degrade raid5
       [not found]   ` <73e903671003191123h1b7e1196v336265842f1b29e5@mail.gmail.com>
  2010-03-19 18:26     ` Kristleifur Daðason
@ 2010-03-19 18:37     ` Joachim Otahal
  2010-03-21 10:29       ` jin zhencheng
  1 sibling, 1 reply; 9+ messages in thread
From: Joachim Otahal @ 2010-03-19 18:37 UTC (permalink / raw)
  To: Kristleifur Daðason; +Cc: jin zhencheng, neilb, linux-raid, linux-kernel

Kristleifur Daðason schrieb:
> On Fri, Mar 19, 2010 at 6:20 PM, Joachim Otahal <Jou@gmx.net> wrote:
>
>     jin zhencheng schrieb:
>
>         Hi,
>
>         I am using kernel 2.6.26.2.
>
>         Here is what I do:
>
>         1. Create a RAID5 array:
>         mdadm -C /dev/md5 -l 5 -n 4 /dev/sda /dev/sdb /dev/sdc /dev/sdd
>         --metadata=1.0 --assume-clean
>
>         2. Start writing data to the array:
>         dd if=/dev/zero of=/dev/md5 bs=1M &
>
>         3. mdadm --manage /dev/md5 -f /dev/sda
>
>         4. mdadm --manage /dev/md5 -f /dev/sdb
>
>         After I fail the two disks, the kernel prints an oops and
>         goes down.
>
>         Does anybody know why?
>
>         Is this an MD/RAID5 bug?
>
>
>     RAID5 can only tolerate ONE failed drive out of ALL its members.
>     If you want to be able to fail two drives you will have to use
>     RAID6, or RAID5 with one hot spare (and give it time to rebuild
>     before failing the second drive). PLEASE read the documentation
>     on RAID levels, for example on Wikipedia.
>
>
> That is true, but should we get a kernel oops and crash when two RAID5
> drives are failed? (THAT part looks like a bug!)
>
> Jin, can you try a newer kernel and a newer mdadm?
>
> -- Kristleifur
You are probably right.
My kernel version is "Debian 2.6.26-21lenny4", and I had no oopses
during my hot-plug testing on the hardware I run md on. I think it may
be the driver for his controller chips.

Jin:

Did you use whole drives for the test, or loopback files, or partitions
on the drives? I never did my hot-plug testing with whole drives in an
array, only with partitions.
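
For comparison, the partition-based variant would look something like
this (an untested sketch; the GPT label and partition bounds are just
my assumptions):

for d in sda sdb sdc sdd; do
    # one whole-disk partition per drive
    parted -s /dev/$d mklabel gpt mkpart primary 1MiB 100%
done
mdadm -C /dev/md5 -l 5 -n 4 --metadata=1.0 --assume-clean \
      /dev/sda1 /dev/sdb1 /dev/sdc1 /dev/sdd1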

Joachim Otahal


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: BUG:write data to degrade raid5
  2010-03-19 10:29 BUG:write data to degrade raid5 jin zhencheng
  2010-03-19 18:20 ` Joachim Otahal
@ 2010-03-20 15:50 ` Bill Davidsen
  2010-03-23  3:16 ` Neil Brown
  2 siblings, 0 replies; 9+ messages in thread
From: Bill Davidsen @ 2010-03-20 15:50 UTC (permalink / raw)
  To: jin zhencheng; +Cc: neilb, linux-raid, linux-kernel

jin zhencheng wrote:
> Hi,
>
> I am using kernel 2.6.26.2.
>
> Here is what I do:
>
> 1. Create a RAID5 array:
> mdadm -C /dev/md5 -l 5 -n 4 /dev/sda /dev/sdb /dev/sdc /dev/sdd
> --metadata=1.0 --assume-clean
>
> 2. Start writing data to the array:
> dd if=/dev/zero of=/dev/md5 bs=1M &
>
> 3. mdadm --manage /dev/md5 -f /dev/sda
>
> 4. mdadm --manage /dev/md5 -f /dev/sdb
>
> After I fail the two disks, the kernel prints an oops and goes down.
>
> Does anybody know why?
>
> Is this an MD/RAID5 bug?

I would usually say that any kernel OOPS is a bug, but in this case, 
what are you running your Linux on, given that you just trashed the 
first four drives? While it's possible to run off of other drives, you 
have to make an effort to configure Linux to do so.
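
For example, a quick way to check (standard commands):

df /              # which device holds the root filesystem
cat /proc/mdstat  # state of every md array on the system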

-- 
Bill Davidsen <davidsen@tmr.com>
  "We can't solve today's problems by using the same thinking we
   used in creating them." - Einstein


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: BUG:write data to degrade raid5
  2010-03-19 18:37     ` Joachim Otahal
@ 2010-03-21 10:29       ` jin zhencheng
  2010-03-21 13:04         ` Henrique de Moraes Holschuh
  0 siblings, 1 reply; 9+ messages in thread
From: jin zhencheng @ 2010-03-21 10:29 UTC (permalink / raw)
  To: Joachim Otahal; +Cc: Kristleifur Daðason, neilb, linux-raid, linux-kernel

Hi Joachim Otahal,

Thanks for your test on "Debian 2.6.26-21lenny4".
If you want to see the oops, you should keep writing to the RAID5
while you pull 2 disks out; then you may see the error.

I think that no matter what I do, even if I pull out all the disks,
the kernel should not oops.
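
Roughly like this (a sketch of my steps; the sleep time is arbitrary):

dd if=/dev/zero of=/dev/md5 bs=1M &   # keep writes in flight
sleep 10                              # let the write stream build up
mdadm --manage /dev/md5 -f /dev/sda   # fail the first disk mid-write
mdadm --manage /dev/md5 -f /dev/sdb   # fail the second; array is dead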


On Sat, Mar 20, 2010 at 2:37 AM, Joachim Otahal <Jou@gmx.net> wrote:
> Kristleifur Daðason schrieb:
>>
>> On Fri, Mar 19, 2010 at 6:20 PM, Joachim Otahal <Jou@gmx.net> wrote:
>>
>>    jin zhencheng schrieb:
>>
>>        Hi,
>>
>>        I am using kernel 2.6.26.2.
>>
>>        Here is what I do:
>>
>>        1. Create a RAID5 array:
>>        mdadm -C /dev/md5 -l 5 -n 4 /dev/sda /dev/sdb /dev/sdc /dev/sdd
>>        --metadata=1.0 --assume-clean
>>
>>        2. Start writing data to the array:
>>        dd if=/dev/zero of=/dev/md5 bs=1M &
>>
>>        3. mdadm --manage /dev/md5 -f /dev/sda
>>
>>        4. mdadm --manage /dev/md5 -f /dev/sdb
>>
>>        After I fail the two disks, the kernel prints an oops and
>>        goes down.
>>
>>        Does anybody know why?
>>
>>        Is this an MD/RAID5 bug?
>>
>>
>>    RAID5 can only tolerate ONE failed drive out of ALL its members.
>>    If you want to be able to fail two drives you will have to use
>>    RAID6, or RAID5 with one hot spare (and give it time to rebuild
>>    before failing the second drive). PLEASE read the documentation
>>    on RAID levels, for example on Wikipedia.
>>
>>
>> That is true, but should we get a kernel oops and crash when two RAID5
>> drives are failed? (THAT part looks like a bug!)
>>
>> Jin, can you try a newer kernel and a newer mdadm?
>>
>> -- Kristleifur
>
> You are probably right.
> My kernel version is "Debian 2.6.26-21lenny4", and I had no oopses
> during my hot-plug testing on the hardware I run md on. I think it may
> be the driver for his controller chips.
>
> Jin:
>
> Did you use whole drives for the test, or loopback files, or partitions
> on the drives? I never did my hot-plug testing with whole drives in an
> array, only with partitions.
>
> Joachim Otahal
>
>

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: BUG:write data to degrade raid5
  2010-03-21 10:29       ` jin zhencheng
@ 2010-03-21 13:04         ` Henrique de Moraes Holschuh
  2010-03-22  1:41           ` jin zhencheng
  0 siblings, 1 reply; 9+ messages in thread
From: Henrique de Moraes Holschuh @ 2010-03-21 13:04 UTC (permalink / raw)
  To: jin zhencheng
  Cc: Joachim Otahal, Kristleifur Daðason, neilb, linux-raid,
	linux-kernel

On Sun, 21 Mar 2010, jin zhencheng wrote:
> Thanks for your test on "Debian 2.6.26-21lenny4".
> If you want to see the oops, you should keep writing to the RAID5
> while you pull 2 disks out; then you may see the error.
>
> I think that no matter what I do, even if I pull out all the disks,
> the kernel should not oops.

You don't have swap (file or partition, doesn't matter) on that RAID, do
you?
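
You can check with, e.g.:

cat /proc/swaps   # lists every active swap file and partition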

-- 
  "One disk to rule them all, One disk to find them. One disk to bring
  them all and in the darkness grind them. In the Land of Redmond
  where the shadows lie." -- The Silicon Valley Tarot
  Henrique Holschuh

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: BUG:write data to degrade raid5
  2010-03-21 13:04         ` Henrique de Moraes Holschuh
@ 2010-03-22  1:41           ` jin zhencheng
  0 siblings, 0 replies; 9+ messages in thread
From: jin zhencheng @ 2010-03-22  1:41 UTC (permalink / raw)
  To: Henrique de Moraes Holschuh
  Cc: Joachim Otahal, Kristleifur Daðason, neilb, linux-raid,
	linux-kernel

I don't have swap on the RAID.

On Sun, Mar 21, 2010 at 9:04 PM, Henrique de Moraes Holschuh
<hmh@hmh.eng.br> wrote:
> On Sun, 21 Mar 2010, jin zhencheng wrote:
>> Thanks for your test on "Debian 2.6.26-21lenny4".
>> If you want to see the oops, you should keep writing to the RAID5
>> while you pull 2 disks out; then you may see the error.
>>
>> I think that no matter what I do, even if I pull out all the disks,
>> the kernel should not oops.
>
> You don't have swap (file or partition, doesn't matter) on that RAID, do
> you?
>
> --
>  "One disk to rule them all, One disk to find them. One disk to bring
>  them all and in the darkness grind them. In the Land of Redmond
>  where the shadows lie." -- The Silicon Valley Tarot
>  Henrique Holschuh
>

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: BUG:write data to degrade raid5
  2010-03-19 10:29 BUG:write data to degrade raid5 jin zhencheng
  2010-03-19 18:20 ` Joachim Otahal
  2010-03-20 15:50 ` Bill Davidsen
@ 2010-03-23  3:16 ` Neil Brown
  2 siblings, 0 replies; 9+ messages in thread
From: Neil Brown @ 2010-03-23  3:16 UTC (permalink / raw)
  To: jin zhencheng; +Cc: linux-raid, linux-kernel

On Fri, 19 Mar 2010 18:29:28 +0800
jin zhencheng <zhenchengjin@gmail.com> wrote:

> Hi,
>
> I am using kernel 2.6.26.2.
>
> Here is what I do:
>
> 1. Create a RAID5 array:
> mdadm -C /dev/md5 -l 5 -n 4 /dev/sda /dev/sdb /dev/sdc /dev/sdd
> --metadata=1.0 --assume-clean
>
> 2. Start writing data to the array:
> dd if=/dev/zero of=/dev/md5 bs=1M &
>
> 3. mdadm --manage /dev/md5 -f /dev/sda
>
> 4. mdadm --manage /dev/md5 -f /dev/sdb
>
> After I fail the two disks, the kernel prints an oops and goes down.
>
> Does anybody know why?
>
> Is this an MD/RAID5 bug?

Certainly this is a bug.
2.6.26 is quite old now - it is possible that the bug has already been fixed.
If you are able to post the oops message - possibly use a digital camera to
get a photograph - then I can probably explain what is happening and whether
it has been fixed.
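
If a photograph is awkward, netconsole may also be able to capture the
oops over the network -- a rough sketch with a placeholder IP; see
Documentation/networking/netconsole.txt for the exact syntax:

# on the failing machine: stream kernel messages to a second machine
modprobe netconsole netconsole=@/eth0,@192.168.1.100/

# on the second machine: listen on netconsole's default UDP port
nc -u -l 6666    # traditional netcat wants: nc -u -l -p 6666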

Thanks,
NeilBrown


^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2010-03-23  3:16 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2010-03-19 10:29 BUG:write data to degrade raid5 jin zhencheng
2010-03-19 18:20 ` Joachim Otahal
     [not found]   ` <73e903671003191123h1b7e1196v336265842f1b29e5@mail.gmail.com>
2010-03-19 18:26     ` Kristleifur Daðason
2010-03-19 18:37     ` Joachim Otahal
2010-03-21 10:29       ` jin zhencheng
2010-03-21 13:04         ` Henrique de Moraes Holschuh
2010-03-22  1:41           ` jin zhencheng
2010-03-20 15:50 ` Bill Davidsen
2010-03-23  3:16 ` Neil Brown
