Lustre-devel archive mirror
From: Backer via lustre-devel <lustre-devel@lists.lustre.org>
To: lustre-devel@lists.lustre.org
Subject: [lustre-devel] Strange behavior with tunefs.lustre. Is this a bug?
Date: Tue, 23 Jan 2024 12:02:44 -0500
Message-ID: <CAPq+oAL+kCR6aeDVjqa5t2L7nPrVYfyxACu_i47v4_uDEpY_2Q@mail.gmail.com>



I am seeing strange behavior with tunefs.lustre. After changing the failover node
and trying to mount an OST, I get the following error:

The target service's index is already in use. (/dev/sdd)


After hitting the error above and performing --writeconf once, I can repeat these
steps (see below) any number of times, on any OSS, without needing --writeconf again.


This is an effort to mount an OST on a new OSS. I simplified the steps and can now
reproduce the behavior (see below) consistently. Could anyone help me understand
what is going on?

[root@OSS-2 opc]# lctl list_nids
10.99.101.18@tcp1
[root@OSS-2 opc]#


[root@OSS-2 opc]# mkfs.lustre --reformat --ost --fsname="testfs" --index="64" --mgsnode "10.99.101.6@tcp1" --mgsnode "10.99.101.7@tcp1" --servicenode "10.99.101.18@tcp1" "/dev/sdd"

   Permanent disk data:
Target:     testfs:OST0040
Index:      64
Lustre FS:  testfs
Mount type: ldiskfs
Flags:      0x1062
              (OST first_time update no_primnode )
Persistent mount opts: ,errors=remount-ro
Parameters:  mgsnode=10.99.101.6@tcp1:10.99.101.7@tcp1 failover.node=10.99.101.18@tcp1

device size = 51200MB
formatting backing filesystem ldiskfs on /dev/sdd
target name   testfs:OST0040
kilobytes     52428800
options        -J size=1024 -I 512 -i 69905 -q -O extents,uninit_bg,mmp,dir_nlink,quota,project,huge_file,^fast_commit,flex_bg -G 256 -E resize="4290772992",lazy_journal_init="0",lazy_itable_init="0" -F
mkfs_cmd = mke2fs -j -b 4096 -L testfs:OST0040 -J size=1024 -I 512 -i 69905 -q -O extents,uninit_bg,mmp,dir_nlink,quota,project,huge_file,^fast_commit,flex_bg -G 256 -E resize="4290772992",lazy_journal_init="0",lazy_itable_init="0" -F /dev/sdd 52428800k
Writing CONFIGS/mountdata

[root@OSS-2 opc]# tunefs.lustre --dryrun /dev/sdd
checking for existing Lustre data: found

   Read previous values:
Target:     testfs-OST0040
Index:      64
Lustre FS:  testfs
Mount type: ldiskfs
Flags:      0x1062
              (OST first_time update no_primnode )
Persistent mount opts: ,errors=remount-ro
Parameters:  mgsnode=10.99.101.6@tcp1:10.99.101.7@tcp1 failover.node=10.99.101.18@tcp1

   Permanent disk data:
Target:     testfs:OST0040
Index:      64
Lustre FS:  testfs
Mount type: ldiskfs
Flags:      0x1062
              (OST first_time update no_primnode )
Persistent mount opts: ,errors=remount-ro
Parameters:  mgsnode=10.99.101.6@tcp1:10.99.101.7@tcp1 failover.node=10.99.101.18@tcp1

exiting before disk write.
[root@OSS-2 opc]#


[root@OSS-2 opc]# tunefs.lustre --erase-param failover.node --servicenode 10.99.101.18@tcp1 /dev/sdd
checking for existing Lustre data: found

   Read previous values:
Target:     testfs-OST0040
Index:      64
Lustre FS:  testfs
Mount type: ldiskfs
Flags:      0x1062
              (OST first_time update no_primnode )
Persistent mount opts: ,errors=remount-ro
Parameters:  mgsnode=10.99.101.6@tcp1:10.99.101.7@tcp1 failover.node=10.99.101.18@tcp1

   Permanent disk data:
Target:     testfs:OST0040
Index:      64
Lustre FS:  testfs
Mount type: ldiskfs
Flags:      0x1062
              (OST first_time update no_primnode )
Persistent mount opts: ,errors=remount-ro
Parameters:  mgsnode=10.99.101.6@tcp1:10.99.101.7@tcp1 failover.node=10.99.101.18@tcp1

Writing CONFIGS/mountdata


[root@OSS-2 opc]# mkdir /testfs-OST0040
[root@OSS-2 opc]# mount -t lustre /dev/sdd /testfs-OST0040
mount.lustre: increased '/sys/devices/platform/host5/session3/target5:0:0/5:0:0:1/block/sdd/queue/max_sectors_kb' from 1024 to 16384
[root@OSS-2 opc]#


[root@OSS-2 opc]# tunefs.lustre --dryrun /dev/sdd
checking for existing Lustre data: found

   Read previous values:
Target:     testfs-OST0040
Index:      64
Lustre FS:  testfs
Mount type: ldiskfs
Flags:      0x1002
              (OST no_primnode )
Persistent mount opts: ,errors=remount-ro
Parameters:  mgsnode=10.99.101.6@tcp1:10.99.101.7@tcp1 failover.node=10.99.101.18@tcp1

   Permanent disk data:
Target:     testfs-OST0040
Index:      64
Lustre FS:  testfs
Mount type: ldiskfs
Flags:      0x1002
              (OST no_primnode )
Persistent mount opts: ,errors=remount-ro
Parameters:  mgsnode=10.99.101.6@tcp1:10.99.101.7@tcp1 failover.node=10.99.101.18@tcp1

exiting before disk write.
[root@OSS-2 opc]#



Going over to OSS-3 and trying to mount the OST there. At this stage OSS-2 is
completely powered off.



[root@OSS-3 opc]# lctl list_nids
10.99.101.19@tcp1
[root@OSS-3 opc]#


The parameters look the same as on OSS-2.


[root@OSS-3 opc]# tunefs.lustre --dryrun /dev/sdd
checking for existing Lustre data: found

   Read previous values:
Target:     testfs-OST0040
Index:      64
Lustre FS:  testfs
Mount type: ldiskfs
Flags:      0x1002
              (OST no_primnode )
Persistent mount opts: ,errors=remount-ro
Parameters:  mgsnode=10.99.101.6@tcp1:10.99.101.7@tcp1 failover.node=10.99.101.18@tcp1

   Permanent disk data:
Target:     testfs-OST0040
Index:      64
Lustre FS:  testfs
Mount type: ldiskfs
Flags:      0x1002
              (OST no_primnode )
Persistent mount opts: ,errors=remount-ro
Parameters:  mgsnode=10.99.101.6@tcp1:10.99.101.7@tcp1 failover.node=10.99.101.18@tcp1

exiting before disk write.
[root@OSS-3 opc]#


Changing the failover node to the current node.


[root@OSS-3 opc]# tunefs.lustre --erase-param failover.node --servicenode 10.99.101.19@tcp1 /dev/sdd
checking for existing Lustre data: found

   Read previous values:
Target:     testfs-OST0040
Index:      64
Lustre FS:  testfs
Mount type: ldiskfs
Flags:      0x1002
              (OST no_primnode )
Persistent mount opts: ,errors=remount-ro
Parameters:  mgsnode=10.99.101.6@tcp1:10.99.101.7@tcp1 failover.node=10.99.101.18@tcp1

   Permanent disk data:
Target:     testfs-OST0040
Index:      64
Lustre FS:  testfs
Mount type: ldiskfs
Flags:      0x1042
              (OST update no_primnode )
Persistent mount opts: ,errors=remount-ro
Parameters:  mgsnode=10.99.101.6@tcp1:10.99.101.7@tcp1 failover.node=10.99.101.19@tcp1

<waits here for the MMP (multi-mount protection) timeout>

After the write completes, for some reason this OST is marked with the 'first_time'
flag in the next command: Flags becomes 0x1062 instead of the 0x1042 just written
(comparing the transcripts, the extra 0x0020 bit is the one labelled first_time).
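For anyone reproducing this: a way to watch why tunefs.lustre blocks at this point
is to look at the MMP fields of the backing filesystem. This is not part of my
original transcript and assumes the e2fsprogs on the OSS can read the ldiskfs
superblock, but something along these lines should report the MMP block number and
update interval that drive the wait:

[root@OSS-3 opc]# dumpe2fs -h /dev/sdd 2>/dev/null | grep -i mmp

The mmp feature comes from the '-O ...,mmp,...' option in the mkfs output above, so
the wait itself is expected; the surprising part is the flag change that follows.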

[root@OSS-3 opc]# tunefs.lustre --dryrun /dev/sdd
checking for existing Lustre data: found

   Read previous values:
Target:     testfs-OST0040
Index:      64
Lustre FS:  testfs
Mount type: ldiskfs
Flags:      0x1062
              (OST first_time update no_primnode )
Persistent mount opts: ,errors=remount-ro
Parameters:  mgsnode=10.99.101.6@tcp1:10.99.101.7@tcp1 failover.node=10.99.101.19@tcp1

   Permanent disk data:
Target:     testfs:OST0040
Index:      64
Lustre FS:  testfs
Mount type: ldiskfs
Flags:      0x1062
              (OST first_time update no_primnode )
Persistent mount opts: ,errors=remount-ro
Parameters:  mgsnode=10.99.101.6@tcp1:10.99.101.7@tcp1 failover.node=10.99.101.19@tcp1

exiting before disk write.
[root@OSS-3 opc]#




The mount fails here because the OST is marked as first_time, but it is not actually
a first-time target: it was already mounted from OSS-2, and the MGS knows about it.
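One way to double-check that the MGS already knows about this target (not something
I ran in this transcript; the command is taken from the lctl documentation, so treat
it as a sketch) would be to dump the client configuration log on the MGS node:

[root@MGS ~]# lctl --device MGS llog_print testfs-client

If that log already contains add/setup records for OST0040, the target is clearly
not a first-time one.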

[root@OSS-3 opc]# mkdir /testfs-OST0040
[root@OSS-3 opc]# mount -t lustre /dev/sdd /testfs-OST0040
mount.lustre: mount /dev/sdd at /testfs-OST0040 failed: Address already in use
The target service's index is already in use. (/dev/sdd)
[root@OSS-3 opc]#


From here, if I do tunefs.lustre with --writeconf, it works. Once that is done, I
can repeat the above experiment any number of times on any server and it works as
expected, without using --writeconf. (FYI: --writeconf is documented as a dangerous
command.)
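Concretely, the workaround amounts to something like the following on the OSS that
fails to mount (a sketch only; I did not capture the exact transcript for this step):

[root@OSS-3 opc]# tunefs.lustre --writeconf /dev/sdd
[root@OSS-3 opc]# mount -t lustre /dev/sdd /testfs-OST0040

This tells the MGS to regenerate the target's configuration log at the next mount,
which is presumably why the duplicate-index check no longer triggers, but given the
manual's warnings about --writeconf I would prefer to understand the root cause.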
