* Debian Squeeze raid 1 0
From: Rickard Svensson @ 2020-01-13 23:34 UTC
  To: linux-raid

Hi all

One disk in my raid 1 0 failed the other night.
It has been running for 8+ years on my server, a Debian Squeeze.
(And yes, I was just about to upgrade them; I had bought the new HDs and everything.)

I thought that I would be able to back up the data, but I got ext4
errors as well, and when I tried to repair that with fsck I got:
"
# fsck -n /dev/md0
fsck.ext4: Attempt to read block from filesystem resulted in short
read while trying to open /dev/md0
Could this be a zero-length partition?
"

So I am wondering if my mdadm RAID is okay.
The "State : [clean|active]" and "Array State : AA.." output is not so easy
to interpret. I have tried to read parts of the old threads, but at the same
time I am worried that more disks will fail... And I'm starting to get
really stressed :(

All the disks are the same type, and apparently they do not support SCT ERC,
which I was not aware of before.
/dev/sde2 seems to be gone.

"
cat /proc/mdstat
Personalities : [raid10]
md0 : active raid10 sda2[0] sde2[3](F) sdc2[2](F) sdb2[1]
      5840999424 blocks super 1.2 512K chunks 2 near-copies [4/2] [UU__]
"

"
smartctl -H -i -l scterc /dev/sda
smartctl 5.40 2010-07-12 r3124 [x86_64-unknown-linux-gnu] (local build)
Copyright (C) 2002-10 by Bruce Allen, http://smartmontools.sourceforge.net

=== START OF INFORMATION SECTION ===
Device Model:     WDC WD30EZRX-00MMMB0
Serial Number:    WD-WMAWZ0328886
Firmware Version: 80.00A80
User Capacity:    3 000 592 982 016 bytes
Device is:        Not in smartctl database [for details use: -P showall]
ATA Version is:   8
ATA Standard is:  Exact ATA specification draft version not indicated
Local Time is:    Mon Jan 13 23:53:27 2020 CET
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

Warning: device does not support SCT Error Recovery Control command
"


"
/dev/md0:
        Version : 1.2
  Creation Time : Tue Oct  9 23:30:23 2012
     Raid Level : raid10
     Array Size : 5840999424 (5570.41 GiB 5981.18 GB)
  Used Dev Size : 2920499712 (2785.21 GiB 2990.59 GB)
   Raid Devices : 4
  Total Devices : 4
    Persistence : Superblock is persistent

    Update Time : Mon Jan 13 22:47:33 2020
          State : clean, FAILED
 Active Devices : 2
Working Devices : 2
 Failed Devices : 2
  Spare Devices : 0

         Layout : near=2
     Chunk Size : 512K

           Name : ttserv:0  (local to host ttserv)
           UUID : cb5bfe7a:3806324c:3c1e7030:e6267102
         Events : 2860

    Number   Major   Minor   RaidDevice State
       0       8        2        0      active sync   /dev/sda2
       1       8       18        1      active sync   /dev/sdb2
       2       0        0        2      removed
       3       0        0        3      removed

       2       8       34        -      faulty spare   /dev/sdc2
       3       8       66        -      faulty spare
"

"
/dev/sda2:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x0
     Array UUID : cb5bfe7a:3806324c:3c1e7030:e6267102
           Name : ttserv:0  (local to host ttserv)
  Creation Time : Tue Oct  9 23:30:23 2012
     Raid Level : raid10
   Raid Devices : 4

 Avail Dev Size : 5840999786 (2785.21 GiB 2990.59 GB)
     Array Size : 11681998848 (5570.41 GiB 5981.18 GB)
  Used Dev Size : 5840999424 (2785.21 GiB 2990.59 GB)
    Data Offset : 2048 sectors
   Super Offset : 8 sectors
          State : clean
    Device UUID : f474ad64:6bb236d3:9f69f55c:eb9b8c27

    Update Time : Mon Jan 13 22:47:33 2020
       Checksum : 5c2fce09 - correct
         Events : 2860

         Layout : near=2
     Chunk Size : 512K

   Device Role : Active device 0
   Array State : AA.. ('A' == active, '.' == missing)
/dev/sdb2:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x0
     Array UUID : cb5bfe7a:3806324c:3c1e7030:e6267102
           Name : ttserv:0  (local to host ttserv)
  Creation Time : Tue Oct  9 23:30:23 2012
     Raid Level : raid10
   Raid Devices : 4

 Avail Dev Size : 5840999786 (2785.21 GiB 2990.59 GB)
     Array Size : 11681998848 (5570.41 GiB 5981.18 GB)
  Used Dev Size : 5840999424 (2785.21 GiB 2990.59 GB)
    Data Offset : 2048 sectors
   Super Offset : 8 sectors
          State : clean
    Device UUID : 1a67153d:8d15019a:349926d5:e22dd321

    Update Time : Mon Jan 13 22:47:33 2020
       Checksum : 6dd9e4de - correct
         Events : 2860

         Layout : near=2
     Chunk Size : 512K

   Device Role : Active device 1
   Array State : AA.. ('A' == active, '.' == missing)
/dev/sdc2:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x0
     Array UUID : cb5bfe7a:3806324c:3c1e7030:e6267102
           Name : ttserv:0  (local to host ttserv)
  Creation Time : Tue Oct  9 23:30:23 2012
     Raid Level : raid10
   Raid Devices : 4

 Avail Dev Size : 5840999786 (2785.21 GiB 2990.59 GB)
     Array Size : 11681998848 (5570.41 GiB 5981.18 GB)
  Used Dev Size : 5840999424 (2785.21 GiB 2990.59 GB)
    Data Offset : 2048 sectors
   Super Offset : 8 sectors
          State : active
    Device UUID : 23079ae3:c67969c2:13299e27:8ca3cf7f

    Update Time : Sun Jan 12 00:11:05 2020
       Checksum : ed375eb5 - correct
         Events : 2719

         Layout : near=2
     Chunk Size : 512K

   Device Role : Active device 2
   Array State : AAA. ('A' == active, '.' == missing)
"


I really hope someone can help me!

/Rickard


* Re: Debian Squeeze raid 1 0
From: Wols Lists @ 2020-01-14  7:23 UTC
  To: Rickard Svensson, linux-raid

On 13/01/20 23:34, Rickard Svensson wrote:
> Hi all
> 
> One disk in my raid 1 0 failed the other night.
> It has been running for 8+ years on my server, a Debian Squeeze.

8 years old, Debian Squeeze - what version of mdadm is that?

> (And yes, I was just about to upgrade them; I had bought the new HDs and everything.)
> 
Great. First things first, ddrescue all drives on to the new ones! I
think recovering your data won't be too hard, so you might as well back up
the data on to your new drives and then recover from those copies.
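
Something like this is the sort of thing I mean - one pass per drive,
with a map file so you can resume or retry later (the device names here
are made up, triple-check yours before running anything):

  # old failing drive -> new drive, logging progress and bad areas to sdc.map
  ddrescue -f /dev/sdc /dev/sdX /root/sdc.map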

> I thought that I would be able to back up the data, but I got ext4
> errors as well, and when I tried to repair that with fsck I got:
> "
> # fsck -n /dev/md0
> fsck.ext4: Attempt to read block from filesystem resulted in short
> read while trying to open /dev/md0
> Could this be a zero-length partition?
> "
My fu isn't good here but I strongly suspect the read failed with an
"array not running" problem ...
> 
> So I am wondering if my mdadm RAID is okay.
> The "State : [clean|active]" and "Array State : AA.." output is not so easy
> to interpret. I have tried to read parts of the old threads, but at the same
> time I am worried that more disks will fail... And I'm starting to get
> really stressed :(

All the more reason to ddrescue your disks ...
> 
> All the disks are the same type, and apparently they do not support SCT ERC,
> which I was not aware of before.
> /dev/sde2 seems to be gone.
> 
Can you check the drive in another system? Is it the drive, or is it a
controller issue?

The fact that the three event counts we have are near-identical is a
good sign. The worry is that sde2 disappeared a long time ago - have you
been monitoring the system? If you ddrescue it will it give an event
count almost the same as the others? If it does, that makes me suspect a
controller issue has knocked two drives out, one of which has recovered
and the other hasn't ...
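
If you want to compare them quickly, something along these lines should
do it (standard mdadm and grep, just adjust the partition names to yours):

  # print each member's last update time and event count
  mdadm --examine /dev/sda2 /dev/sdb2 /dev/sdc2 | grep -E '^/dev|Update Time|Events'
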
> "
> cat /proc/mdstat
> Personalities : [raid10]
> md0 : active raid10 sda2[0] sde2[3](F) sdc2[2](F) sdb2[1]
>       5840999424 blocks super 1.2 512K chunks 2 near-copies [4/2] [UU__]
> "

<snip>
> 
> 
> I really hope someone can help me!
> 
https://raid.wiki.kernel.org/index.php/Linux_Raid#When_Things_Go_Wrogn

Note that when it says "use the latest version of mdadm" it means it - I
suspect your version may be well out-of-date.

Give us a bit more information, especially the version of mdadm you're
using. See if you can ddrescue /dev/sde, and what that tells us, and I
strongly suspect a forced assembly of (copies of) your surviving disks
will recover almost everything.
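
By "forced assembly" I mean something roughly like this, run against the
copies rather than the original drives (made-up device names again, and
only once you have good backups):

  mdadm --stop /dev/md0             # if anything has auto-assembled already
  mdadm --assemble --force /dev/md0 /dev/sdX2 /dev/sdY2 /dev/sdZ2
  mdadm --detail /dev/md0           # check it came up degraded but usable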

Cheers,
Wol


* Re: Debian Squeeze raid 1 0
From: Gandalf Corvotempesta @ 2020-01-14  8:08 UTC
  To: Wols Lists; +Cc: Rickard Svensson, Linux RAID Mailing List

On Tue, 14 Jan 2020 at 08:24, Wols Lists
<antlists@youngman.org.uk> wrote:
> The fact that the three event counts we have are near-identical is a
> good sign. The worry is that sde2 disappeared a long time ago - have you
> been monitoring the system? If you ddrescue it will it give an event
> count almost the same as the others? If it does, that makes me suspect a
> controller issue has knocked two drives out, one of which has recovered
> and the other hasn't ...

But as far as I can see, two drives are out of a RAID1+0 array, so the array is gone.
I think we can only hope that these two disks were in different
mirrors (but this isn't easily readable from the mdadm output; I think
I sent an email asking for better output some days ago).

That's why I really, really, really hate anything with redundancy < 2,
like a standard RAID1+0.
Only 3-way mirrors or RAID6 here.


* Re: Debian Squeeze raid 1 0
From: Rickard Svensson @ 2020-01-14 23:11 UTC
  To: Wols Lists, linux-raid

Hi, I'm very grateful for all the help!

The Debian 6  mdadm version is:
mdadm - v3.1.4 - 31st August 2010

I have avoided doing much with the server...
And the server is still running; I did not want to stop it... But
should I stop it now?

Attached below is a summary from the log. /dev/sde died on the 9th, but came
back as /dev/sdf  ???
And on the 12th /dev/sdc died, and the morning after I discovered the problem.
What I've done since then is only:
* Remounted the drive as read-only
* Unmounted the ext4 filesystem, to run fsck
And that's when I realized it might be even worse.


My idea is to make a ddrescue copy of the problem disks, and then in a
new Debian 10 with a new mdadm, try to start the RAID on the new
copies..?

Yes, backing up via ddrescue sounds right.
BTW, is it gddrescue? The ddrescue package in Debian 10 seems to be a
Windows rescue program.

I'm changing to RAID 1 on the server later on; I have two new 10 TB
drives, so not the same setup.
But I have a 6 TB drive, which I intend to use for this rescue.

A question about the copy: is it possible to copy to a different
partition, for example copy sdc2 TO (new 6 TB disk) sdx1, and then
sde2 TO (same new disk!) sdx2...
And mdadm should (with some luck) be able to put them together into the same
md0 device? In other words, will a copy of a partition look the same to mdadm
as the original?
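
To be clear, what I mean is roughly this (made-up names, I have not run
anything yet, and each half must of course be at least as big as the
source partition):

  parted /dev/sdx mklabel gpt                # the new 6 TB rescue disk
  parted /dev/sdx mkpart rescue1 0% 50%      # target for the sdc2 copy
  parted /dev/sdx mkpart rescue2 50% 100%    # target for the sde2 copy
  ddrescue -f /dev/sdc2 /dev/sdx1 sdc2.map
  ddrescue -f /dev/sde2 /dev/sdx2 sde2.map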


"
Jan  9 00:14:44 ttserv kernel: [1503608.349157] sd 5:0:0:0: rejecting
I/O to offline device
Jan  9 00:14:44 ttserv kernel: [1503608.349199] sd 5:0:0:0: [sde]
killing request
Jan  9 00:14:44 ttserv kernel: [1503608.349225] sd 5:0:0:0: rejecting
I/O to offline device
Jan  9 00:14:44 ttserv kernel: [1503608.349232] ata6: hard resetting link
Jan  9 00:14:44 ttserv kernel: [1503608.349279] sd 5:0:0:0: rejecting
I/O to offline device
Jan  9 00:14:44 ttserv kernel: [1503608.349317] end_request: I/O
error, dev sde, sector 19531293
Jan  9 00:14:44 ttserv kernel: [1503608.349359] md: super_written gets
error=-5, uptodate=0
Jan  9 00:14:44 ttserv kernel: [1503608.349366] raid10: Disk failure
on sde2, disabling device.
Jan  9 00:14:44 ttserv kernel: [1503608.349368] raid10: Operation
continuing on 3 devices.
..
Jan  9 00:14:53 ttserv kernel: [1503617.083592] sd 5:0:0:0: [sdf]
5860533168 512-byte logical blocks: (3.00 TB/2.72 TiB)
Jan  9 00:14:53 ttserv kernel: [1503617.083599] sd 5:0:0:0: [sdf]
4096-byte physical blocks
Jan  9 00:14:53 ttserv kernel: [1503617.083745] sd 5:0:0:0: [sdf]
Write Protect is off
Jan  9 00:14:53 ttserv kernel: [1503617.083752] sd 5:0:0:0: [sdf] Mode
Sense: 00 3a 00 00
Jan  9 00:14:53 ttserv kernel: [1503617.083816] sd 5:0:0:0: [sdf]
Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
Jan  9 00:14:54 ttserv kernel: [1503617.084192]  sdf: sdf1 sdf2
Jan  9 00:14:54 ttserv kernel: [1503618.637572] sd 5:0:0:0: [sdf]
Attached SCSI disk
Jan  9 00:14:55 ttserv kernel: [1503619.097302] RAID10 conf printout:
Jan  9 00:14:55 ttserv kernel: [1503619.097310]  --- wd:3 rd:4
Jan  9 00:14:55 ttserv kernel: [1503619.097318]  disk 0, wo:0, o:1, dev:sda2
Jan  9 00:14:55 ttserv kernel: [1503619.097325]  disk 1, wo:0, o:1, dev:sdb2
Jan  9 00:14:55 ttserv kernel: [1503619.097331]  disk 2, wo:0, o:1, dev:sdc2
Jan  9 00:14:55 ttserv kernel: [1503619.097338]  disk 3, wo:1, o:0, dev:sde2
Jan  9 00:14:55 ttserv kernel: [1503619.140524] RAID10 conf printout:
Jan  9 00:14:55 ttserv kernel: [1503619.140530]  --- wd:3 rd:4
Jan  9 00:14:55 ttserv kernel: [1503619.140537]  disk 0, wo:0, o:1, dev:sda2
Jan  9 00:14:55 ttserv kernel: [1503619.140542]  disk 1, wo:0, o:1, dev:sdb2
Jan  9 00:14:55 ttserv kernel: [1503619.140547]  disk 2, wo:0, o:1, dev:sdc2
...
Jan 12 00:11:16 ttserv kernel: [1762600.077809] sd 2:0:0:0: [sdc]
Unhandled sense code
Jan 12 00:11:16 ttserv kernel: [1762600.077813] sd 2:0:0:0: [sdc]
Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
Jan 12 00:11:16 ttserv kernel: [1762600.077820] sd 2:0:0:0: [sdc]
Sense Key : Medium Error [current] [descriptor]
Jan 12 00:11:16 ttserv kernel: [1762600.077828] Descriptor sense data
with sense descriptors (in hex):
Jan 12 00:11:16 ttserv kernel: [1762600.077832]         72 03 11 04 00
00 00 0c 00 0a 80 00 00 00 00 00
Jan 12 00:11:16 ttserv kernel: [1762600.077849]         7c 4b fd c8
Jan 12 00:11:16 ttserv kernel: [1762600.077856] sd 2:0:0:0: [sdc] Add.
Sense: Unrecovered read error - auto reallocate failed
Jan 12 00:11:16 ttserv kernel: [1762600.077865] sd 2:0:0:0: [sdc] CDB:
Read(10): 28 00 7c 4b fd c5 00 00 08 00
Jan 12 00:11:16 ttserv kernel: [1762600.077881] end_request: I/O
error, dev sdc, sector 2085354952
Jan 12 00:11:16 ttserv kernel: [1762600.077924] raid10: sdc2:
rescheduling sector 4131643312
...
Jan 12 00:11:32 ttserv kernel: [1762616.114440] sd 2:0:0:0: [sdc]
Unhandled sense code
Jan 12 00:11:32 ttserv kernel: [1762616.114445] sd 2:0:0:0: [sdc]
Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
Jan 12 00:11:32 ttserv kernel: [1762616.114451] sd 2:0:0:0: [sdc]
Sense Key : Medium Error [current] [descriptor]
Jan 12 00:11:32 ttserv kernel: [1762616.114460] Descriptor sense data
with sense descriptors (in hex):
Jan 12 00:11:32 ttserv kernel: [1762616.114464]         72 03 11 04 00
00 00 0c 00 0a 80 00 00 00 00 00
Jan 12 00:11:32 ttserv kernel: [1762616.114481]         7c 4b fd c8
Jan 12 00:11:32 ttserv kernel: [1762616.114488] sd 2:0:0:0: [sdc] Add.
Sense: Unrecovered read error - auto reallocate failed
Jan 12 00:11:32 ttserv kernel: [1762616.114497] sd 2:0:0:0: [sdc] CDB:
Read(10): 28 00 7c 4b fd c5 00 00 08 00
Jan 12 00:11:32 ttserv kernel: [1762616.114513] end_request: I/O
error, dev sdc, sector 2085354952
Jan 12 00:11:32 ttserv kernel: [1762616.114575] raid10: Disk failure
on sdc2, disabling device.
Jan 12 00:11:32 ttserv kernel: [1762616.114579] raid10: Operation
continuing on 2 devices.
Jan 12 00:11:32 ttserv kernel: [1762616.114584] ata3: EH complete
Jan 12 00:11:32 ttserv kernel: [1762616.114692] raid10: sdc:
unrecoverable I/O read error for block 4131643312
Jan 12 00:11:32 ttserv kernel: [1762616.557061] RAID10 conf printout:
Jan 12 00:11:32 ttserv kernel: [1762616.557068]  --- wd:2 rd:4
Jan 12 00:11:32 ttserv kernel: [1762616.557075]  disk 0, wo:0, o:1, dev:sda2
Jan 12 00:11:32 ttserv kernel: [1762616.557080]  disk 1, wo:0, o:1, dev:sdb2
Jan 12 00:11:32 ttserv kernel: [1762616.557084]  disk 2, wo:1, o:0, dev:sdc2
Jan 12 00:11:32 ttserv kernel: [1762616.573021] RAID10 conf printout:
Jan 12 00:11:32 ttserv kernel: [1762616.573026]  --- wd:2 rd:4
Jan 12 00:11:32 ttserv kernel: [1762616.573032]  disk 0, wo:0, o:1, dev:sda2
Jan 12 00:11:32 ttserv kernel: [1762616.573037]  disk 1, wo:0, o:1, dev:sdb2
Jan 12 00:11:32 ttserv kernel: [1762616.573549] Buffer I/O error on
device md0, logical block 518896617
Jan 12 00:11:32 ttserv kernel: [1762616.573611] lost page write due to
I/O error on md0
Jan 12 00:11:37 ttserv kernel: [1762621.834664] JBD2: Detected IO
errors while flushing file data on md0-8
Jan 12 00:12:02 ttserv kernel: [1762647.011386] Buffer I/O error on
device md0, logical block 517494752
Jan 12 00:12:02 ttserv kernel: [1762647.011429] lost page write due to
I/O error on md0
"


* Re: Debian Squeeze raid 1 0
From: Wols Lists @ 2020-01-15  0:06 UTC
  To: Rickard Svensson, linux-raid

On 14/01/20 23:11, Rickard Svensson wrote:
> Hi, I'm very grateful for all the help!
> 
> The Debian 6  mdadm version is:
> mdadm - v3.1.4 - 31st August 2010

Mmmmm ... yes using the new mdadm is very much advisable ...
> 
> I have avoided doing much with the server...
> And the server is still running; I did not want to stop it... But
> should I stop it now?

Have you got an esata port? Can you hook up the replacement drive(s) to
it? That sounds like a good plan. You could use USB, but that's probably
going to be a LOT slower.

I can understand not wanting to shut the server down.
> 
> Attached below is a summary from the log. /dev/sde died on the 9th, but came
> back as /dev/sdf  ???
> And on the 12th /dev/sdc died, and the morning after I discovered the problem.
> What I've done since then is only:
> * Remounted the drive as read-only
> * Unmounted the ext4 filesystem, to run fsck
> And that's when I realized it might be even worse.
> 
Well, so long as nothing has written to the drives, and you can recover
a copy, then you should be okay ... cross fingers ...
> 
> My idea is to make a ddrescue copy of the problem disks, and then in a
> new Debian 10 with a new mdadm, try to start the RAID on the new
> copies..?

Yup
> 
> Yes, backing up via ddrescue sounds right.
> BTW, is it gddrescue? The ddrescue package in Debian 10 seems to be a
> Windows rescue program.
> 
Never heard of gddrescue. ddrescue is supposed to be a drop-in
replacement for dd, just that it doesn't error out on read failures and
has a large repertoire of tricks to try and get round errors if it can.

> I'm changing to RAID 1 on the server later on; I have two new 10 TB
> drives, so not the same setup.
> But I have a 6 TB drive, which I intend to use for this rescue.
> 
> A question about the copy: is it possible to copy to a different
> partition, for example copy sdc2 TO (new 6 TB disk) sdx1, and then
> sde2 TO (same new disk!) sdx2...

Not a problem - raid (and linux in general) doesn't care about where the
data is, it just expects to be given a block device. It'll just slow
things down a bit. Hopefully with multiple heads per drive, not too
much, but I don't know in detail how these things work.

> And mdadm should (with some luck) be able to put them together into the same
> md0 device? In other words, will a copy of a partition look the same to mdadm
> as the original?
> 
Probably /dev/md126. At the end of the day, you shouldn't care. All you
want to do is assemble the array, see what it gives you as the array
device, and mount that. That should give your ext filesystem back. Run a
"no modify" fsck over it, and if it looks pretty clean (there might be a
little bit of corruption) try mounting it ro and looking it over for
problems.
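
In other words, something like this on the copies (the array device name
here is a guess - use whatever /proc/mdstat actually shows you):

  mdadm --assemble --force --scan   # or list the member partitions explicitly
  cat /proc/mdstat                  # see what device it appeared as (md0, md126, ...)
  fsck.ext4 -n /dev/md126           # "no modify" check, changes nothing on disk
  mount -o ro /dev/md126 /mnt       # then have a look around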

When you move over to your 10TB drives (will that be a straight raid-1?)
look at dm-integrity (warning - it's experimental with raid but seems
solid for raid-1). And look at using named, not numbered, arrays. My
raid 1's are called /dev/root, /dev/home, and /dev/var.

(Fixed number raid arrays are deprecated - it counts down from 126 by
default now.)
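
For example, when you build the new mirror something along these lines
gives you a named array instead of a numbered one (just a sketch -
adjust the devices and options to what you actually want):

  mdadm --create /dev/md/home --level=1 --raid-devices=2 /dev/sdX1 /dev/sdY1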

Cheers,
Wol


* Re: Debian Squeeze raid 1 0
From: Rickard Svensson @ 2020-01-16 10:41 UTC
  To: Wols Lists; +Cc: linux-raid

Hi, thanks again :)

The server is shut down, and I will copy the two broken disks.
I think (and hope) there has been only a small amount of writing since the
problems occurred.

An unexpected problem with the naming of ddrescue: the apt-get package
in Debian is called gddrescue, but the program itself is called ddrescue.
I became uncertain because you mention that it works like dd, but it
doesn't use  if=foo of=bar  like regular dd?
Anyway, the program  ddrescue --help  refers to the homepage
http://www.gnu.org/software/ddrescue/ddrescue.html  which I assume is
the right one..?
And there are a lot of options - any tips on special ones I should use?
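
My plan right now is roughly this, with the positional
infile/outfile/mapfile arguments, if I have understood the manual correctly:

  ddrescue -f -n /dev/sdc /dev/sdx sdc.map      # quick first pass, skip the bad areas
  ddrescue -f -d -r3 /dev/sdc /dev/sdx sdc.map  # then retry the bad areas with direct reads

Does that look sane?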

I also wonder if it is right to let mdadm try to recover from all four
disks, when the first one stopped working three days before I
discovered it.
Isn't it better to just use three disks: the two disks that are OK,
and the last disk, which got too many errors the night before I
discovered everything?

Otherwise, you have confirmed/clarified that everything seems to work
the way I hoped.
And I will read up on all the news in mdadm, and dm-integrity sounds
interesting. Thanks!

Cheers Rickard


* Re: Debian Squeeze raid 1 0
From: Rickard Svensson @ 2020-01-28 21:57 UTC
  To: Wols Lists; +Cc: linux-raid

Hi again Wol and more :)

Now I have succeeded, I hope.
This has taken some time, and with the disk that was broken I had to make
several attempts with ddrescue.

I have only copied 3 of the 4 disks, not the one that failed
three days earlier,
because I assumed it contained old data.
Or am I thinking wrong - is it better to add it too?

Otherwise, I guess I need to start the RAID, or something.
Do I need a --force maybe? I just don't want to do anything wrong
after all this :)
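
What I was thinking of running is something like this (on the copies,
of course), but I will wait for a go-ahead first:

  mdadm --stop /dev/md0
  mdadm --assemble --force /dev/md0 /dev/sdb1 /dev/sdb2 /dev/sda1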


Debian 10   --  mdadm - v4.1 - 2018-10-01

# cat /proc/mdstat
Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5]
[raid4] [raid10]
md0 : inactive sda1[2](S) sdb1[0](S) sdb2[1](S)
      8761499679 blocks super 1.2
unused devices: <none>


# mdadm --detail /dev/md0
/dev/md0:
           Version : 1.2
        Raid Level : raid0
     Total Devices : 3
       Persistence : Superblock is persistent

             State : inactive
   Working Devices : 3

              Name : ttserv:0
              UUID : cb5bfe7a:3806324c:3c1e7030:e6267102
            Events : 2719

    Number   Major   Minor   RaidDevice

       -       8        1        -        /dev/sda1
       -       8       18        -        /dev/sdb2
       -       8       17        -        /dev/sdb1


# mdadm --examine /dev/sda1
/dev/sda1:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x0
     Array UUID : cb5bfe7a:3806324c:3c1e7030:e6267102
           Name : ttserv:0
  Creation Time : Tue Oct  9 23:30:23 2012
     Raid Level : raid10
   Raid Devices : 4

 Avail Dev Size : 5840999786 (2785.21 GiB 2990.59 GB)
     Array Size : 5840999424 (5570.41 GiB 5981.18 GB)
  Used Dev Size : 5840999424 (2785.21 GiB 2990.59 GB)
    Data Offset : 2048 sectors
   Super Offset : 8 sectors
   Unused Space : before=1968 sectors, after=20538368 sectors
          State : active
    Device UUID : 23079ae3:c67969c2:13299e27:8ca3cf7f

    Update Time : Sun Jan 12 00:11:05 2020
       Checksum : ed375eb5 - correct
         Events : 2719

         Layout : near=2
     Chunk Size : 512K

   Device Role : Active device 2
   Array State : AAA. ('A' == active, '.' == missing, 'R' == replacing)



# mdadm --examine /dev/sdb1
/dev/sdb1:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x0
     Array UUID : cb5bfe7a:3806324c:3c1e7030:e6267102
           Name : ttserv:0
  Creation Time : Tue Oct  9 23:30:23 2012
     Raid Level : raid10
   Raid Devices : 4

 Avail Dev Size : 5840999786 (2785.21 GiB 2990.59 GB)
     Array Size : 5840999424 (5570.41 GiB 5981.18 GB)
  Used Dev Size : 5840999424 (2785.21 GiB 2990.59 GB)
    Data Offset : 2048 sectors
   Super Offset : 8 sectors
   Unused Space : before=1968 sectors, after=19520512 sectors
          State : clean
    Device UUID : f474ad64:6bb236d3:9f69f55c:eb9b8c27

    Update Time : Tue Jan 14 22:49:49 2020
       Checksum : 5c312015 - correct
         Events : 2864

         Layout : near=2
     Chunk Size : 512K

   Device Role : Active device 0
   Array State : AA.. ('A' == active, '.' == missing, 'R' == replacing)


# mdadm --examine /dev/sdb2
/dev/sdb2:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x0
     Array UUID : cb5bfe7a:3806324c:3c1e7030:e6267102
           Name : ttserv:0
  Creation Time : Tue Oct  9 23:30:23 2012
     Raid Level : raid10
   Raid Devices : 4

 Avail Dev Size : 5840999786 (2785.21 GiB 2990.59 GB)
     Array Size : 5840999424 (5570.41 GiB 5981.18 GB)
  Used Dev Size : 5840999424 (2785.21 GiB 2990.59 GB)
    Data Offset : 2048 sectors
   Super Offset : 8 sectors
   Unused Space : before=1968 sectors, after=19519631 sectors
          State : clean
    Device UUID : 1a67153d:8d15019a:349926d5:e22dd321

    Update Time : Tue Jan 14 22:49:49 2020
       Checksum : 6ddb36ea - correct
         Events : 2864

         Layout : near=2
     Chunk Size : 512K

   Device Role : Active device 1
   Array State : AA.. ('A' == active, '.' == missing, 'R' == replacing)



Cheers Rickard



