Target-devel archive mirror
 help / color / mirror / Atom feed
From: Maurizio Lombardi <mlombard@redhat.com>
To: Mike Christie <michael.christie@oracle.com>
Cc: martin.petersen@oracle.com, serapheim.dimitro@delphix.com,
	target-devel@vger.kernel.org
Subject: Re: [PATCH 1/3] target: iscsi: fix hang in the iSCSI login code
Date: Tue, 21 Mar 2023 18:13:42 +0100	[thread overview]
Message-ID: <CAFL455m+Kt9fsojo5sYTt8wMmDQRr2TRvOO=+_0BXMeOK3X_Rw@mail.gmail.com> (raw)
In-Reply-To: <afb25e86-3b1a-3b19-f257-e748d0900005@oracle.com>

út 14. 3. 2023 v 22:23 odesílatel Mike Christie
<michael.christie@oracle.com> napsal:
>
> On 3/14/23 6:09 AM, Maurizio Lombardi wrote:
> > út 14. 3. 2023 v 0:52 odesílatel Mike Christie
> > <michael.christie@oracle.com> napsal:
> >>
> >>> +     case TCP_CLOSE:
> >>> +             pr_debug("__iscsi_target_sk_check_close: socket closing,"
> >>>                       "returning TRUE\n");
> >>
> >> Don't need to break up a string. We do it a lot in the lio code, but we've
> >> been trying not to in new code.
> >>
> >>> +             /*
> >>> +              * Restart the login timer to prevent the
> >>> +              * login process from getting stuck if the initiator
> >>
> >> I would fix up the formatting so the first line is longer.
> >
> > Ok
> >
> >>> @@ -1358,6 +1348,9 @@ int iscsi_target_start_negotiation(
> >>>               set_bit(LOGIN_FLAGS_INITIAL_PDU, &conn->login_flags);
> >>>               write_unlock_bh(&sk->sk_callback_lock);
> >>>       }
> >>> +
> >>> +     iscsit_start_login_timer(conn);
> >>
> >> At this time, we have the np->np_login_timer running right?
> >
> > Yes.
> >
> >>
> >> Don't we only need to start this new timer when we know there are
> >> more PDUs and the connection is good (iscsi_target_do_login returns
> >> 0 and iscsi_target_sk_check_and_clear returns 0)?
> >
> > The moment iscsi_target_sk_check_and_clear() clears the
> > LOGIN_FLAGS_INITIAL_PDU flag
> > and returns 0, the login worker may be already running.
> > If we start the timer after the call to
> > iscsi_target_sk_check_and_clear(), we could have a race condition:
> >
> > 1) login_work runs and reschedules itself non-stop because
> > LOGIN_FLAGS_INITIAL_PDU is set
> > 2) login kthread calls  iscsi_target_sk_check_and_clear() and clears
> > LOGIN_FLAGS_INITIAL_PDU
> > 3) login work runs and completes the login
> > 4) login kthread starts the timer
> > 5) No one stops the timer, it fires and kills the connection despite
> > the fact the login was successful.
> >
> > I could however replace this code:
> >
> > ret = iscsi_target_do_login(conn, login);
> >  if (!ret && iscsi_target_sk_check_and_clear(conn, LOGIN_FLAGS_INITIAL_PDU))
> >            ret = -1;
> >
> > with the following, if you like it more:
> >
> > ret = iscsi_target_do_login(conn, login);
> > if (!ret) {
> >       iscsit_start_login_timer(conn);
> >       if (iscsi_target_sk_check_and_clear(conn, LOGIN_FLAGS_INITIAL_PDU)) {
> >            iscsit_stop_login_timer(conn);
> >            ret = -1;
> >       }
> > }
>
> Ah yeah, I wasn't thinking specifically about this race when I wrote the
> above comment. With the combined timer below, I was thinking this is handled
> when you set/check the login_kworker thread.
>
> >
> >>
> >> I think you can just kill np timer and only use the login_timer for
> >> both cases. So I mean set the thread to kill as the login one and start
> >> this login_timer in __iscsi_target_login_thread where we used to call
> >> iscsi_start_login_thread_timer. You would then mod the timer when we
> >> transition from iscsi_target_start_negotiation to waiting for the next
> >> PDU.
> >>
> >
> > Yes, maybe, but I would need to find a way to detect if conn->login_kworker
> > is pointing to the login thread or to the login_work's thread, because
> > the np_login_timer is supposed to clear the ISCSI_TF_RUNNING flag.
> >
> > maybe something like this:
> >
> > if (conn->login_kworker == conn->tpg_np->tpg_np->np_thread) {
> >      spin_lock_bh(&np->np_thread_lock);
> >      if (!(np->np_login_timer_flags & ISCSI_TF_STOP))
> >            np->np_login_timer_flags &= ~ISCSI_TF_RUNNING;
> >      spin_unlock_bh(&np->np_thread_lock);
> > }
>
> We don't need any of the np_login_timer_flags code if we are using your per
> conn login_timer do we?

Ah yes you're right, I was just confused by all those
"ISCSI_TF_RUNNING/STOP" flags used all
over the place, then I realized that every timer has its own flags.

>
> For the new timer:
> - We are adding one per conn timer.
> - We use that for both the initial pdu and later ones.
> - The timeout function, sends a signal if there is a thread set or does whatever
> we figure out below for the case where there is no thread (we don't do any
> np_login_timer_flags stuff).
> - We probably don't need to do both the signal and whatever we decide below.
> Or, we need to check some of the LOGIN_FLAGS since for example we don't
> need to queue the login_work and set LOGIN_FLAGS_CLOSED if LOGIN_FLAGS_INITIAL_PDU
> is set.
> - The iscsi_start_login_timer function handles setting the login_kworker thread.

Ok I have a V2 almost ready, I'm testing it right now.

Maurizio


  reply	other threads:[~2023-03-21 17:16 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-03-10 10:04 [PATCH 0/3] Fix possible hangs and race conditions during login Maurizio Lombardi
2023-03-10 10:04 ` [PATCH 1/3] target: iscsi: fix hang in the iSCSI login code Maurizio Lombardi
2023-03-10 19:53   ` Mike Christie
2023-03-13  9:06     ` Maurizio Lombardi
2023-03-13 23:52   ` Mike Christie
2023-03-14 11:09     ` Maurizio Lombardi
2023-03-14 21:23       ` Mike Christie
2023-03-21 17:13         ` Maurizio Lombardi [this message]
2023-03-10 10:04 ` [PATCH 2/3] target: iscsi: remove unused transport_timer Maurizio Lombardi
2023-03-10 10:04 ` [PATCH 3/3] target: iscsi: prevent login threads from racing between each other Maurizio Lombardi

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAFL455m+Kt9fsojo5sYTt8wMmDQRr2TRvOO=+_0BXMeOK3X_Rw@mail.gmail.com' \
    --to=mlombard@redhat.com \
    --cc=martin.petersen@oracle.com \
    --cc=michael.christie@oracle.com \
    --cc=serapheim.dimitro@delphix.com \
    --cc=target-devel@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).