Linux-Fsdevel Archive mirror
 help / color / mirror / Atom feed
From: Steve French <smfrench@gmail.com>
To: David Howells <dhowells@redhat.com>
Cc: Steve French <stfrench@microsoft.com>,
	Jeff Layton <jlayton@kernel.org>,
	 Enzo Matsumiya <ematsumiya@suse.de>,
	Christian Brauner <brauner@kernel.org>,
	netfs@lists.linux.dev,  v9fs@lists.linux.dev,
	linux-afs@lists.infradead.org,  linux-cifs@vger.kernel.org,
	linux-fsdevel@vger.kernel.org,  linux-kernel@vger.kernel.org
Subject: Re: [PATCH v2] netfs: Fix io_uring based write-through
Date: Fri, 24 May 2024 10:32:44 -0500	[thread overview]
Message-ID: <CAH2r5mvHZLLb_hJmXGvWbKDNKoLPwkXO5wFpOM0eTZkKvtV5Ug@mail.gmail.com> (raw)
In-Reply-To: <CAH2r5muWt2X55sVfk6Ngct-+c_SFepPzQdhUiZNQT+o_twiivw@mail.gmail.com>

Can add my Tested-by if you would like.  Hopefully this (and "netfs:
Fix setting of BDP_ASYNC from iocb flags") can be in mainline
relatively soon as they both fix some issues we hit with netfs testing
cifs.ko.   There are still a few more netfs bugs to work through but
this is progress.

On Tue, May 21, 2024 at 9:06 PM Steve French <smfrench@gmail.com> wrote:
>
> fixed minor checpatch warning (updated patch attached)
>
> On Tue, May 21, 2024 at 7:02 PM David Howells <dhowells@redhat.com> wrote:
> >
> > This can be triggered by mounting a cifs filesystem with a cache=strict
> > mount option and then, using the fsx program from xfstests, doing:
> >
> >         ltp/fsx -A -d -N 1000 -S 11463 -P /tmp /cifs-mount/foo \
> >           --replay-ops=gen112-fsxops
> >
> > Where gen112-fsxops holds:
> >
> >         fallocate 0x6be7 0x8fc5 0x377d3
> >         copy_range 0x9c71 0x77e8 0x2edaf 0x377d3
> >         write 0x2776d 0x8f65 0x377d3
> >
> > The problem is that netfs_io_request::len is being used for two purposes
> > and ends up getting set to the amount of data we transferred, not the
> > amount of data the caller asked to be transferred (for various reasons,
> > such as mmap'd writes, we might end up rounding out the data written to the
> > server to include the entire folio at each end).
> >
> > Fix this by keeping the amount we were asked to write in ->len and using
> > ->submitted to track what we issued ops for.  Then, when we come to calling
> > ->ki_complete(), ->len is the right size.
> >
> > This also required netfs_cleanup_dio_write() to change since we're no
> > longer advancing wreq->len.  Use wreq->transferred instead as we might have
> > done a short read and wreq->len must be set when setting up a direct write.
> >
> > With this, the generic/112 xfstest passes if cifs is forced to put all
> > non-DIO opens into write-through mode.
> >
> > Fixes: 288ace2f57c9 ("netfs: New writeback implementation")
> > Signed-off-by: David Howells <dhowells@redhat.com>
> > cc: Jeff Layton <jlayton@kernel.org>
> > cc: Steve French <stfrench@microsoft.com>
> > cc: Enzo Matsumiya <ematsumiya@suse.de>
> > cc: netfs@lists.linux.dev
> > cc: v9fs@lists.linux.dev
> > cc: linux-afs@lists.infradead.org
> > cc: linux-cifs@vger.kernel.org
> > cc: linux-fsdevel@vger.kernel.org
> > Link: https://lore.kernel.org/r/295086.1716298663@warthog.procyon.org.uk/ # v1
> > ---
> >  Changes
> >  =======
> >  ver #2)
> >   - Set wreq->len when doing direct writes.
> >
> >  fs/netfs/direct_write.c  |    5 +++--
> >  fs/netfs/write_collect.c |    7 ++++---
> >  fs/netfs/write_issue.c   |    2 +-
> >  3 files changed, 8 insertions(+), 6 deletions(-)
> >
> > diff --git a/fs/netfs/direct_write.c b/fs/netfs/direct_write.c
> > index 608ba6416919..93b41e121042 100644
> > --- a/fs/netfs/direct_write.c
> > +++ b/fs/netfs/direct_write.c
> > @@ -12,7 +12,7 @@
> >  static void netfs_cleanup_dio_write(struct netfs_io_request *wreq)
> >  {
> >         struct inode *inode = wreq->inode;
> > -       unsigned long long end = wreq->start + wreq->len;
> > +       unsigned long long end = wreq->start + wreq->transferred;
> >
> >         if (!wreq->error &&
> >             i_size_read(inode) < end) {
> > @@ -92,8 +92,9 @@ static ssize_t netfs_unbuffered_write_iter_locked(struct kiocb *iocb, struct iov
> >         __set_bit(NETFS_RREQ_UPLOAD_TO_SERVER, &wreq->flags);
> >         if (async)
> >                 wreq->iocb = iocb;
> > +       wreq->len = iov_iter_count(&wreq->io_iter);
> >         wreq->cleanup = netfs_cleanup_dio_write;
> > -       ret = netfs_unbuffered_write(wreq, is_sync_kiocb(iocb), iov_iter_count(&wreq->io_iter));
> > +       ret = netfs_unbuffered_write(wreq, is_sync_kiocb(iocb), wreq->len);
> >         if (ret < 0) {
> >                 _debug("begin = %zd", ret);
> >                 goto out;
> > diff --git a/fs/netfs/write_collect.c b/fs/netfs/write_collect.c
> > index 60112e4b2c5e..426cf87aaf2e 100644
> > --- a/fs/netfs/write_collect.c
> > +++ b/fs/netfs/write_collect.c
> > @@ -510,7 +510,7 @@ static void netfs_collect_write_results(struct netfs_io_request *wreq)
> >          * stream has a gap that can be jumped.
> >          */
> >         if (notes & SOME_EMPTY) {
> > -               unsigned long long jump_to = wreq->start + wreq->len;
> > +               unsigned long long jump_to = wreq->start + READ_ONCE(wreq->submitted);
> >
> >                 for (s = 0; s < NR_IO_STREAMS; s++) {
> >                         stream = &wreq->io_streams[s];
> > @@ -690,10 +690,11 @@ void netfs_write_collection_worker(struct work_struct *work)
> >         wake_up_bit(&wreq->flags, NETFS_RREQ_IN_PROGRESS);
> >
> >         if (wreq->iocb) {
> > -               wreq->iocb->ki_pos += wreq->transferred;
> > +               size_t written = min(wreq->transferred, wreq->len);
> > +               wreq->iocb->ki_pos += written;
> >                 if (wreq->iocb->ki_complete)
> >                         wreq->iocb->ki_complete(
> > -                               wreq->iocb, wreq->error ? wreq->error : wreq->transferred);
> > +                               wreq->iocb, wreq->error ? wreq->error : written);
> >                 wreq->iocb = VFS_PTR_POISON;
> >         }
> >
> > diff --git a/fs/netfs/write_issue.c b/fs/netfs/write_issue.c
> > index acbfd1f5ee9d..3aa86e268f40 100644
> > --- a/fs/netfs/write_issue.c
> > +++ b/fs/netfs/write_issue.c
> > @@ -254,7 +254,7 @@ static void netfs_issue_write(struct netfs_io_request *wreq,
> >         stream->construct = NULL;
> >
> >         if (subreq->start + subreq->len > wreq->start + wreq->submitted)
> > -               wreq->len = wreq->submitted = subreq->start + subreq->len - wreq->start;
> > +               WRITE_ONCE(wreq->submitted, subreq->start + subreq->len - wreq->start);
> >         netfs_do_issue_write(stream, subreq);
> >  }
> >
> >
>
>
> --
> Thanks,
>
> Steve



-- 
Thanks,

Steve

      reply	other threads:[~2024-05-24 15:32 UTC|newest]

Thread overview: 3+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-05-22  0:02 [PATCH v2] netfs: Fix io_uring based write-through David Howells
2024-05-22  2:06 ` Steve French
2024-05-24 15:32   ` Steve French [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CAH2r5mvHZLLb_hJmXGvWbKDNKoLPwkXO5wFpOM0eTZkKvtV5Ug@mail.gmail.com \
    --to=smfrench@gmail.com \
    --cc=brauner@kernel.org \
    --cc=dhowells@redhat.com \
    --cc=ematsumiya@suse.de \
    --cc=jlayton@kernel.org \
    --cc=linux-afs@lists.infradead.org \
    --cc=linux-cifs@vger.kernel.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=netfs@lists.linux.dev \
    --cc=stfrench@microsoft.com \
    --cc=v9fs@lists.linux.dev \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).