cluster-devel.redhat.com archive mirror
 help / color / mirror / Atom feed
From: Alexander Aring <aahringo@redhat.com>
To: Jeff Layton <jlayton@kernel.org>
Cc: cluster-devel@redhat.com, linux-nfs@vger.kernel.org,
	neilb@suse.de, Dai.Ngo@oracle.com, tom@talpey.com,
	kolga@netapp.com, chuck.lever@oracle.com, anna@kernel.org,
	trond.myklebust@hammerspace.com
Subject: Re: [Cluster-devel] [RFC v6.5-rc2 2/3] fs: lockd: fix race in async lock request handling
Date: Thu, 10 Aug 2023 17:00:49 -0400	[thread overview]
Message-ID: <CAK-6q+gBnuQ9LkfM04SCtOkRsqjjQLseh+wZ5B_K4Sxmr2FQsw@mail.gmail.com> (raw)
In-Reply-To: <44d48d5a78159bcf8d52d3213ac6d684e148ebfd.camel@kernel.org>

Hi,

On Fri, Jul 21, 2023 at 12:43 PM Jeff Layton <jlayton@kernel.org> wrote:
>
> On Fri, 2023-07-21 at 09:09 -0400, Alexander Aring wrote:
> > Hi,
> >
> > On Thu, Jul 20, 2023 at 8:58 AM Alexander Aring <aahringo@redhat.com> wrote:
> > >
> > > This patch fixes a race in async lock request handling between adding
> > > the relevant struct nlm_block to nlm_blocked list after the request was
> > > sent by vfs_lock_file() and nlmsvc_grant_deferred() does a lookup of the
> > > nlm_block in the nlm_blocked list. It could be that the async request is
> > > completed before the nlm_block was added to the list. This would end
> > > in a -ENOENT and a kernel log message of "lockd: grant for unknown
> > > block".
> > >
> > > To solve this issue we add the nlm_block before the vfs_lock_file() call
> > > to be sure it has been added when a possible nlmsvc_grant_deferred() is
> > > called. If the vfs_lock_file() results in an case when it wouldn't be
> > > added to nlm_blocked list, the nlm_block struct will be removed from
> > > this list again.
> > >
> > > Signed-off-by: Alexander Aring <aahringo@redhat.com>
> > > ---
> > >  fs/lockd/svclock.c          | 80 +++++++++++++++++++++++++++----------
> > >  include/linux/lockd/lockd.h |  1 +
> > >  2 files changed, 60 insertions(+), 21 deletions(-)
> > >
> > > diff --git a/fs/lockd/svclock.c b/fs/lockd/svclock.c
> > > index 28abec5c451d..62ef27a69a9e 100644
> > > --- a/fs/lockd/svclock.c
> > > +++ b/fs/lockd/svclock.c
> > > @@ -297,6 +297,8 @@ static void nlmsvc_free_block(struct kref *kref)
> > >
> > >         dprintk("lockd: freeing block %p...\n", block);
> > >
> > > +       WARN_ON_ONCE(block->b_flags & B_PENDING_CALLBACK);
> > > +
> > >         /* Remove block from file's list of blocks */
> > >         list_del_init(&block->b_flist);
> > >         mutex_unlock(&file->f_mutex);
> > > @@ -543,6 +545,12 @@ nlmsvc_lock(struct svc_rqst *rqstp, struct nlm_file *file,
> > >                 goto out;
> > >         }
> > >
> > > +       if (block->b_flags & B_PENDING_CALLBACK)
> > > +               goto pending_request;
> > > +
> > > +       /* Append to list of blocked */
> > > +       nlmsvc_insert_block(block, NLM_NEVER);
> > > +
> > >         if (!wait)
> > >                 lock->fl.fl_flags &= ~FL_SLEEP;
> > >         mode = lock_to_openmode(&lock->fl);
> > > @@ -552,9 +560,13 @@ nlmsvc_lock(struct svc_rqst *rqstp, struct nlm_file *file,
> > >         dprintk("lockd: vfs_lock_file returned %d\n", error);
> > >         switch (error) {
> > >                 case 0:
> > > +                       nlmsvc_remove_block(block);
> >
> > reacting here with nlmsvc_remove_block() assumes that the block was
> > not being added to the nlm_blocked list before nlmsvc_insert_block()
> > was called. I am not sure if this is always the case here.
> >
> > Does somebody see a problem with that?
> >
>
> The scenario is: we have a block on the list already and now another
> lock request comes in for the same thing: the client decided to just re-
> poll for the lock. That's a plausible scenario. I think the Linux NLM
> client will poll for locks periodically.
>
> In this case though, the lock request was granted by the filesystem, so
> this is likely racing with (and winning vs.) a lm_grant callback. Given
> that the client decided to repoll for it, we're probably safe to just
> dequeue the block and respond here, and not worry about sending a grant
> callback.
>
> Ditto for the other cases where the block is removed.
>

ok.

> > >                         ret = nlm_granted;
> > >                         goto out;
> > >                 case -EAGAIN:
> > > +                       if (!wait)
> > > +                               nlmsvc_remove_block(block);
>
> I was thinking that it would be best to not insert a block at all in the
> !wait case, but it looks like DLM just returns DEFERRED and almost
> always does a callback, even when it's not a blocking lock request?
>
> Anyway, I think we probably do have to handle this like you are.
>

I would prefer to have !wait blocked. We even don't do that in DLM, it
causes problems with cancellation as a cancellation will only do
something (at least in DLM) when there is a waiter that the lock
request waits to be granted, which is only being the case for wait
lock requests.

A !wait is only a trylock, the answer should be back being mostly
immediate and it also makes no sense for me to make them async,
because we have the same problems with cancellation/unlock which are
not being offered to be handled in an asynchronous way. As I said, the
answer should be back mostly immediately. We are somehow doing this
optimization for !wait lock requests only, but operations like unlock
are also being called by lockd and are not being handled
asynchronously. That means we probably don't care about this
optimization, it looks different on wait lock requests.

We should update the documentation and only do async lock requests on
wait only. Is this okay?

- Alex


  reply	other threads:[~2023-08-10 21:01 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-07-20 12:58 [Cluster-devel] [RFC v6.5-rc2 1/3] fs: lockd: nlm_blocked list race fixes Alexander Aring
2023-07-20 12:58 ` [Cluster-devel] [RFC v6.5-rc2 2/3] fs: lockd: fix race in async lock request handling Alexander Aring
2023-07-21 13:09   ` Alexander Aring
2023-07-21 16:43     ` Jeff Layton
2023-08-10 21:00       ` Alexander Aring [this message]
2023-07-21 15:45   ` Jeff Layton
2023-08-10 20:37     ` Alexander Aring
2023-07-20 12:58 ` [Cluster-devel] [RFC v6.5-rc2 3/3] fs: lockd: introduce safe async lock op Alexander Aring
2023-07-21 17:46   ` Jeff Layton
2023-08-10 20:24     ` Alexander Aring
2023-07-21 15:14 ` [Cluster-devel] [RFC v6.5-rc2 1/3] fs: lockd: nlm_blocked list race fixes Jeff Layton
2023-07-21 16:59 ` Chuck Lever

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CAK-6q+gBnuQ9LkfM04SCtOkRsqjjQLseh+wZ5B_K4Sxmr2FQsw@mail.gmail.com \
    --to=aahringo@redhat.com \
    --cc=Dai.Ngo@oracle.com \
    --cc=anna@kernel.org \
    --cc=chuck.lever@oracle.com \
    --cc=cluster-devel@redhat.com \
    --cc=jlayton@kernel.org \
    --cc=kolga@netapp.com \
    --cc=linux-nfs@vger.kernel.org \
    --cc=neilb@suse.de \
    --cc=tom@talpey.com \
    --cc=trond.myklebust@hammerspace.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).