* Problem re-shaping RAID6
@ 2010-06-13 19:15 Jérôme Poulin
2010-06-13 23:15 ` Neil Brown
0 siblings, 1 reply; 7+ messages in thread
From: Jérôme Poulin @ 2010-06-13 19:15 UTC
To: linux-raid
I had problems reshaping my RAID6 down 1 disk today and found a
problem in Grow.c:
diff -udpr mdadm-3.1.2/Grow.c mdadm-3.1.2-critical-section/Grow.c
--- mdadm-3.1.2/Grow.c 2010-03-09 23:31:39.000000000 -0500
+++ mdadm-3.1.2-critical-section/Grow.c 2010-06-13 14:57:44.000000000 -0400
@@ -497,7 +497,7 @@ int Grow_reshape(char *devname, int fd,
int rv = 0;
struct supertype *st;
- int nchunk, ochunk;
+ unsigned long nchunk, ochunk;
int nlayout, olayout;
int ndisks, odisks;
int ndata, odata;
After changing this I was able to re-shape the array; it seems the value was
overflowing. Before the fix, I got this message:
root@billsshack:~/mdadm-3.1.2/ > ./mdadm --grow /dev/md0
--raid-devices=6 --backup-file=/mnt/data1/md-raid6-grow-backup.bak
mdadm: Need to backup 4503599627350016K of critical section..
mdadm: /dev/md0: Something wrong - reshape aborted
Now it works:
root@billsshack:~/mdadm-3.1.2/ > ./mdadm --grow /dev/md0
--raid-devices=6 --backup-file=/mnt/data1/md-raid6-grow-backup.bak
mdadm: Need to backup 20480K of critical section..
root@billsshack:~/mdadm-3.1.2/ > cat /proc/mdstat
Personalities : [raid0] [raid1] [raid10] [raid6] [raid5] [raid4]
[faulty] [linear]
md0 : active raid6 dm-6[0] dm-8[5] dm-5[4] dm-3[3] dm-4[2] dm-7[1]
7814041600 blocks super 0.91 level 6, 1024k chunk, algorithm 2
[6/5] [UUUUUU]
[>....................] reshape = 0.0% (33792/1953510400)
finish=1926.0min speed=16896K/sec
Here's a nice output of GDB:
Breakpoint 1, Grow_reshape (devname=0x7fffffffe390 "/dev/md0",
fd=5, quiet=0,
backup_file=0x7fffffffe3b8
"/mnt/data1/md-raid6-grow-backup.bak", size=1953510400, level=65534,
layout_str=0x0, chunksize=0, raid_disks=6) at Grow.c:939
939 blocks = ochunk/512 * nchunk/512 * odata
* ndata / a;
(gdb) p ochunk/512
$9 = 2048
(gdb) p ochunk/512 * nchunk/512
$10 = -4194304
(gdb) p nchunk
$11 = 1048576
(gdb) p ochunk
$12 = 1048576
(gdb) p ochunk/512
$13 = 2048
(gdb) p nchunk/512
$14 = 2048
(gdb) p 2048*2048
$15 = 4194304
(gdb) p 2048*2048*odata
$16 = 20971520
(gdb) p 2048*2048*odata*ndata
$17 = 83886080
(gdb) p 2048*2048*odata*ndata/a
$18 = 40960
(gdb) p ochunk/512 * nchunk/512 * odata * ndata / a
$19 = 9007199254700032
(gdb) p ochunk/512 * nchunk/512
$20 = -4194304
(gdb)
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: Problem re-shaping RAID6
2010-06-13 19:15 Problem re-shaping RAID6 Jérôme Poulin
@ 2010-06-13 23:15 ` Neil Brown
2010-06-14 10:47 ` Nagilum
[not found] ` <4C1BBE6E.5090802@tmr.com>
0 siblings, 2 replies; 7+ messages in thread
From: Neil Brown @ 2010-06-13 23:15 UTC
To: Jérôme Poulin; +Cc: linux-raid
On Sun, 13 Jun 2010 15:15:08 -0400
Jérôme Poulin <jeromepoulin@gmail.com> wrote:
> I had problems reshaping my RAID6 down 1 disk today and found a
> problem in Grow.c:
>
> diff -udpr mdadm-3.1.2/Grow.c mdadm-3.1.2-critical-section/Grow.c
> --- mdadm-3.1.2/Grow.c 2010-03-09 23:31:39.000000000 -0500
> +++ mdadm-3.1.2-critical-section/Grow.c 2010-06-13 14:57:44.000000000 -0400
> @@ -497,7 +497,7 @@ int Grow_reshape(char *devname, int fd,
> int rv = 0;
> struct supertype *st;
>
> - int nchunk, ochunk;
> + unsigned long nchunk, ochunk;
> int nlayout, olayout;
> int ndisks, odisks;
> int ndata, odata;
>
>
> After changing this I was able to re-shape the array; it seems the value was
> overflowing. Before the fix, I got this message:
> root@billsshack:~/mdadm-3.1.2/ > ./mdadm --grow /dev/md0
> --raid-devices=6 --backup-file=/mnt/data1/md-raid6-grow-backup.bak
> mdadm: Need to backup 4503599627350016K of critical section..
> mdadm: /dev/md0: Something wrong - reshape aborted
>
> Now it works:
> root@billsshack:~/mdadm-3.1.2/ > ./mdadm --grow /dev/md0
> --raid-devices=6 --backup-file=/mnt/data1/md-raid6-grow-backup.bak
> mdadm: Need to backup 20480K of critical section..
> root@billsshack:~/mdadm-3.1.2/ > cat /proc/mdstat
> Personalities : [raid0] [raid1] [raid10] [raid6] [raid5] [raid4]
> [faulty] [linear]
> md0 : active raid6 dm-6[0] dm-8[5] dm-5[4] dm-3[3] dm-4[2] dm-7[1]
> 7814041600 blocks super 0.91 level 6, 1024k chunk, algorithm 2
> [6/5] [UUUUUU]
> [>....................] reshape = 0.0% (33792/1953510400)
> finish=1926.0min speed=16896K/sec
>
>
> Here's a nice output of GDB:
> Breakpoint 1, Grow_reshape (devname=0x7fffffffe390 "/dev/md0",
> fd=5, quiet=0,
> backup_file=0x7fffffffe3b8
> "/mnt/data1/md-raid6-grow-backup.bak", size=1953510400, level=65534,
> layout_str=0x0, chunksize=0, raid_disks=6) at Grow.c:939
> 939 blocks = ochunk/512 * nchunk/512 * odata
> * ndata / a;
> (gdb) p ochunk/512
> $9 = 2048
> (gdb) p ochunk/512 * nchunk/512
> $10 = -4194304
Thanks.
I fixed this bug a slightly different way
- blocks = ochunk/512 * nchunk/512 * odata * ndata / a;
+ blocks = (ochunk/512) * (nchunk/512) * odata * ndata / a;
See
http://neil.brown.name/git?p=mdadm;a=commitdiff;h=200871adf9e15d5ad985f28c349fd89c386ef48a
I guess it is time to release a 3.1.3...
Thanks,
NeilBrown
> (gdb) p nchunk
> $11 = 1048576
> (gdb) p ochunk
> $12 = 1048576
> (gdb) p ochunk/512
> $13 = 2048
> (gdb) p nchunk/512
> $14 = 2048
> (gdb) p 2048*2048
> $15 = 4194304
> (gdb) p 2048*2048*odata
> $16 = 20971520
> (gdb) p 2048*2048*odata*ndata
> $17 = 83886080
> (gdb) p 2048*2048*odata*ndata/a
> $18 = 40960
> (gdb) p ochunk/512 * nchunk/512 * odata * ndata / a
> $19 = 9007199254700032
> (gdb) p ochunk/512 * nchunk/512
> $20 = -4194304
> (gdb)
> --
> To unsubscribe from this list: send the line "unsubscribe linux-raid" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
* Re: Problem re-shaping RAID6
2010-06-13 23:15 ` Neil Brown
@ 2010-06-14 10:47 ` Nagilum
2010-06-17 5:47 ` Neil Brown
[not found] ` <4C1BBE6E.5090802@tmr.com>
1 sibling, 1 reply; 7+ messages in thread
From: Nagilum @ 2010-06-14 10:47 UTC
To: Neil Brown; +Cc: linux-raid
----- Message from neilb@suse.de ---------
Date: Mon, 14 Jun 2010 09:15:18 +1000
From: Neil Brown <neilb@suse.de>
Subject: Re: Problem re-shaping RAID6
To: Jérôme Poulin <jeromepoulin@gmail.com>
Cc: linux-raid <linux-raid@vger.kernel.org>
> Thanks.
> I fixed this bug a slightly different way
>
> - blocks = ochunk/512 * nchunk/512 * odata * ndata / a;
> + blocks = (ochunk/512) * (nchunk/512) * odata * ndata / a;
>
>
> See
> http://neil.brown.name/git?p=mdadm;a=commitdiff;h=200871adf9e15d5ad985f28c349fd89c386ef48a
Those static numbers always make my nose wrinkle.
Don't we have the blocksize somewhere already? I'm also concerned what
happens when true 4k sectors are used..
========================================================================
# _ __ _ __ http://www.nagilum.org/ \n icq://69646724 #
# / |/ /__ ____ _(_) /_ ____ _ nagilum@nagilum.org \n +491776461165 #
# / / _ `/ _ `/ / / // / ' \ Amiga (68k/PPC): AOS/NetBSD/Linux #
# /_/|_/\_,_/\_, /_/_/\_,_/_/_/_/ Mac (PPC): MacOS-X / NetBSD /Linux #
# /___/ x86: FreeBSD/Linux/Solaris/Win2k ARM9: EPOC EV6 #
========================================================================
----------------------------------------------------------------
cakebox.homeunix.net - all the machine one needs..
* Re: Problem re-shaping RAID6
2010-06-14 10:47 ` Nagilum
@ 2010-06-17 5:47 ` Neil Brown
2010-06-17 8:20 ` Michael Evans
2010-06-18 10:53 ` Nagilum
0 siblings, 2 replies; 7+ messages in thread
From: Neil Brown @ 2010-06-17 5:47 UTC
To: Nagilum; +Cc: linux-raid
On Mon, 14 Jun 2010 12:47:23 +0200
Nagilum <nagilum@nagilum.org> wrote:
>
> ----- Message from neilb@suse.de ---------
> Date: Mon, 14 Jun 2010 09:15:18 +1000
> From: Neil Brown <neilb@suse.de>
> Subject: Re: Problem re-shaping RAID6
> To: Jérôme Poulin <jeromepoulin@gmail.com>
> Cc: linux-raid <linux-raid@vger.kernel.org>
>
>
> > Thanks.
> > I fixed this bug a slightly different way
> >
> > - blocks = ochunk/512 * nchunk/512 * odata * ndata / a;
> > + blocks = (ochunk/512) * (nchunk/512) * odata * ndata / a;
> >
> >
> > See
> > http://neil.brown.name/git?p=mdadm;a=commitdiff;h=200871adf9e15d5ad985f28c349fd89c386ef48a
>
> Those static numbers always make my nose wrinkle.
> Don't we have the blocksize somewhere already? I'm also concerned what
> happens when true 4k sectors are used..
>
A sector will always be 512 bytes to Linux, even when we have drives that can
only do IO in multiples of 8 sectors. Changing that would cause way too many
headaches.
Yes, I could possibly use a define for '512'. Sometimes that is
appropriate, but I think 512 is so clearly "bytes_per_sector" it would just
add an unnecessary level of indirection.
It is a question of taste really - no right answers.
Thanks,
NeilBrown
* Re: Problem re-shaping RAID6
2010-06-17 5:47 ` Neil Brown
@ 2010-06-17 8:20 ` Michael Evans
2010-06-18 10:53 ` Nagilum
1 sibling, 0 replies; 7+ messages in thread
From: Michael Evans @ 2010-06-17 8:20 UTC
To: Neil Brown; +Cc: Nagilum, linux-raid
On Wed, Jun 16, 2010 at 10:47 PM, Neil Brown <neilb@suse.de> wrote:
> On Mon, 14 Jun 2010 12:47:23 +0200
> Nagilum <nagilum@nagilum.org> wrote:
>
>>
>> ----- Message from neilb@suse.de ---------
>> Date: Mon, 14 Jun 2010 09:15:18 +1000
>> From: Neil Brown <neilb@suse.de>
>> Subject: Re: Problem re-shaping RAID6
>> To: Jérôme Poulin <jeromepoulin@gmail.com>
>> Cc: linux-raid <linux-raid@vger.kernel.org>
>>
>>
>> > Thanks.
>> > I fixed this bug a slightly different way
>> >
>> > - blocks = ochunk/512 * nchunk/512 * odata * ndata / a;
>> > + blocks = (ochunk/512) * (nchunk/512) * odata * ndata / a;
>> >
>> >
>> > See
>> > http://neil.brown.name/git?p=mdadm;a=commitdiff;h=200871adf9e15d5ad985f28c349fd89c386ef48a
>>
>> Those static numbers always make my nose wrinkle.
>> Don't we have the blocksize somewhere already? I'm also concerned what
>> happens when true 4k sectors are used..
>>
>
> A sector will always be 512 bytes to Linux, even when we have drives that can
> only do IO in multiples of 8 sectors. Changing that would cause way too many
> headaches.
>
> Yes, I could possibly use a define for '512'. Sometimes that is
> appropriate, but I think 512 is so clearly "bytes_per_sector" it would just
> add an unnecessary level of indirection.
> It is a question of taste really - no right answers.
>
> Thanks,
> NeilBrown
>
>
Slightly off topic, but wouldn't an 'always' like that be a valid
reason for starting a 2.7 or 3.0 branch, i.e. something where major
sections of code would /have/ to be changed even if interfaces for
other sections might remain static?
* Re: Problem re-shaping RAID6
2010-06-17 5:47 ` Neil Brown
2010-06-17 8:20 ` Michael Evans
@ 2010-06-18 10:53 ` Nagilum
1 sibling, 0 replies; 7+ messages in thread
From: Nagilum @ 2010-06-18 10:53 UTC
To: Neil Brown; +Cc: linux-raid
----- Message from neilb@suse.de ---------
> A sector will always be 512 bytes to Linux, even when we have drives that can
> only do IO in multiples of 8 sectors.
Oh, I didn't know that.
> Yes, I could possibly use a define for '512'. Sometimes that is
> appropriate, but I think 512 is so clearly "bytes_per_sector" it would just
> add an unnecessary level of indirection.
> It is a question of taste really - no right answers.
Yep, thanks for the explanation!
========================================================================
# _ __ _ __ http://www.nagilum.org/ \n icq://69646724 #
# / |/ /__ ____ _(_) /_ ____ _ nagilum@nagilum.org \n +491776461165 #
# / / _ `/ _ `/ / / // / ' \ Amiga (68k/PPC): AOS/NetBSD/Linux #
# /_/|_/\_,_/\_, /_/_/\_,_/_/_/_/ Mac (PPC): MacOS-X / NetBSD /Linux #
# /___/ x86: FreeBSD/Linux/Solaris/Win2k ARM9: EPOC EV6 #
========================================================================
----------------------------------------------------------------
cakebox.homeunix.net - all the machine one needs..
* Re: Problem re-shaping RAID6
[not found] ` <4C1BBE6E.5090802@tmr.com>
@ 2010-06-29 1:11 ` Neil Brown
0 siblings, 0 replies; 7+ messages in thread
From: Neil Brown @ 2010-06-29 1:11 UTC
To: Bill Davidsen; +Cc: Jérôme Poulin, linux-raid
On Fri, 18 Jun 2010 14:43:58 -0400
Bill Davidsen <davidsen@tmr.com> wrote:
> Neil Brown wrote:
> > On Sun, 13 Jun 2010 15:15:08 -0400
> > Jérôme Poulin <jeromepoulin@gmail.com> wrote:
> >
> >
> >> I had problems reshaping my RAID6 down 1 disk today and found a
> >> problem in Grow.c:
> >>
> >> diff -udpr mdadm-3.1.2/Grow.c mdadm-3.1.2-critical-section/Grow.c
> >> --- mdadm-3.1.2/Grow.c 2010-03-09 23:31:39.000000000 -0500
> >> +++ mdadm-3.1.2-critical-section/Grow.c 2010-06-13 14:57:44.000000000 -0400
> >> @@ -497,7 +497,7 @@ int Grow_reshape(char *devname, int fd,
> >> int rv = 0;
> >> struct supertype *st;
> >>
> >> - int nchunk, ochunk;
> >> + unsigned long nchunk, ochunk;
> >> int nlayout, olayout;
> >> int ndisks, odisks;
> >> int ndata, odata;
> >>
> >>
> >> After changing this I was able to re-shape the array; it seems the value was
> >> overflowing. Before the fix, I got this message:
> >> root@billsshack:~/mdadm-3.1.2/ > ./mdadm --grow /dev/md0
> >> --raid-devices=6 --backup-file=/mnt/data1/md-raid6-grow-backup.bak
> >> mdadm: Need to backup 4503599627350016K of critical section..
> >> mdadm: /dev/md0: Something wrong - reshape aborted
> >>
> >> Now it works:
> >> root@billsshack:~/mdadm-3.1.2/ > ./mdadm --grow /dev/md0
> >> --raid-devices=6 --backup-file=/mnt/data1/md-raid6-grow-backup.bak
> >> mdadm: Need to backup 20480K of critical section..
> >> root@billsshack:~/mdadm-3.1.2/ > cat /proc/mdstat
> >> Personalities : [raid0] [raid1] [raid10] [raid6] [raid5] [raid4]
> >> [faulty] [linear]
> >> md0 : active raid6 dm-6[0] dm-8[5] dm-5[4] dm-3[3] dm-4[2] dm-7[1]
> >> 7814041600 blocks super 0.91 level 6, 1024k chunk, algorithm 2
> >> [6/5] [UUUUUU]
> >> [>....................] reshape = 0.0% (33792/1953510400)
> >> finish=1926.0min speed=16896K/sec
> >>
> >>
> >> Here's a nice output of GDB:
> >> Breakpoint 1, Grow_reshape (devname=0x7fffffffe390 "/dev/md0",
> >> fd=5, quiet=0,
> >> backup_file=0x7fffffffe3b8
> >> "/mnt/data1/md-raid6-grow-backup.bak", size=1953510400, level=65534,
> >> layout_str=0x0, chunksize=0, raid_disks=6) at Grow.c:939
> >> 939 blocks = ochunk/512 * nchunk/512 * odata
> >> * ndata / a;
> >> (gdb) p ochunk/512
> >> $9 = 2048
> >> (gdb) p ochunk/512 * nchunk/512
> >> $10 = -4194304
> >>
> >
> >
> > Thanks.
> > I fixed this bug a slightly different way
> >
> > - blocks = ochunk/512 * nchunk/512 * odata * ndata / a;
> > + blocks = (ochunk/512) * (nchunk/512) * odata * ndata / a;
> >
> >
> >
> I don't have a current C standard handy, but I was on the original X3J11
> committee, and I'm fairly sure that even with that grouping, which may
> work fine for gcc, the standard allows the compiler to ignore what you
> did and do the optimization as long as the result "is the same as" doing
what you said, in math terms rather than in the real world. And since other
compilers (such as Intel C) are used to compile the kernel and may be in
the future, and since disks are getting larger and faster, I do prefer
the future-proofing of the original solution.
Seriously? It sounds pretty broken to allow any transformation on fixed-size
integers that would be valid on real numbers... (You did say "real world"...)
Even anything valid on integers is quite possibly not valid on fixed-size
numbers.
But if I were to change it I don't think I would make the variables bigger.
I would do something like
ochunk_sectors = ochunk / 512;
nchunk_sectors = nchunk / 512;
blocks = ochunk_sectors * nchunk_sectors * odata * ndata / a;
but that just makes it clearer to me that if the guarantees there are
different to the guarantees for what I originally proposed, then there is
something seriously wrong with the language.
Do you have any reference to any standard that I could look at and try to
understand?
Thanks.
NeilBrown
>
> > See
> > http://neil.brown.name/git?p=mdadm;a=commitdiff;h=200871adf9e15d5ad985f28c349fd89c386ef48a
> >
> > I guess it is time to release a 3.1.3...
> >
> > Thanks,
> > NeilBrown
> >
>
>
Thread overview: 7+ messages
2010-06-13 19:15 Problem re-shaping RAID6 Jérôme Poulin
2010-06-13 23:15 ` Neil Brown
2010-06-14 10:47 ` Nagilum
2010-06-17 5:47 ` Neil Brown
2010-06-17 8:20 ` Michael Evans
2010-06-18 10:53 ` Nagilum
[not found] ` <4C1BBE6E.5090802@tmr.com>
2010-06-29 1:11 ` Neil Brown