All the mail mirrored from lore.kernel.org
 help / color / mirror / Atom feed
From: Linus Torvalds <torvalds@linux-foundation.org>
To: Al Viro <viro@zeniv.linux.org.uk>
Cc: Andreas Gruenbacher <agruenba@redhat.com>,
	cluster-devel <cluster-devel@redhat.com>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	Jan Kara <jack@suse.cz>, Matthew Wilcox <willy@infradead.org>
Subject: Re: [RFC 5/9] iov_iter: Add iov_iter_fault_in_writeable()
Date: Sat, 12 Jun 2021 14:33:31 -0700	[thread overview]
Message-ID: <CAHk-=whcnziOWqVESWKJ6Y1_sG2S2AOa1vv5yKzUGs5gM7qYpQ@mail.gmail.com> (raw)
In-Reply-To: <YMUjQYtBCIxHvsYV@zeniv-ca.linux.org.uk>

On Sat, Jun 12, 2021 at 2:12 PM Al Viro <viro@zeniv.linux.org.uk> wrote:
>
> Actually, is there any good way to make sure that write fault is triggered
> _without_ modification of the data?  On x86 lock xadd (of 0, that is) would
> probably do it and some of the other architectures could probably get away
> with using cmpxchg and its relatives, but how reliable it is wrt always
> triggering a write fault if the page is currently read-only?

I wouldn't worry about the CPU optimizing a zero 'add' away (extra
work for no gain in any normal situation).

But any architecture using 'ldl/stc' to do atomics would do it in
software for at least cmpxchg (ie just abort after the "doesn't
match").

Even on x86, it's certainly _possible_ that a non-matching cmpxchg
might not have done the writability check, although I would find that
unlikely (ie I would expect it to do just one TLB lookup, and just one
permission check, whether it then writes or not).

And if the architecture does atomic operations using that ldl/stc
model, I could (again) see the software loop breaking out early
(before the stc) on the grounds of "value didn't change".

Although it's a lot less likely outside of cmpxchg. I suspect an "add
zero" would work just fine even on a ldl/stc model.

That said, reads are obviously much easier, and I'd probably prefer
the model for writes to be to not necessarily pre-fault anything at
all, but just write to user space with page faults disabled.

And then only if that fails do you do anything special. And at that
point, even walking the page tables by hand might be perfectly
acceptable - since we know it's going to fault anyway, and it might
actually be cheaper to just do it by hand with GUP or whatever.

          Linus

WARNING: multiple messages have this Message-ID (diff)
From: Linus Torvalds <torvalds@linux-foundation.org>
To: cluster-devel.redhat.com
Subject: [Cluster-devel] [RFC 5/9] iov_iter: Add iov_iter_fault_in_writeable()
Date: Sat, 12 Jun 2021 14:33:31 -0700	[thread overview]
Message-ID: <CAHk-=whcnziOWqVESWKJ6Y1_sG2S2AOa1vv5yKzUGs5gM7qYpQ@mail.gmail.com> (raw)
In-Reply-To: <YMUjQYtBCIxHvsYV@zeniv-ca.linux.org.uk>

On Sat, Jun 12, 2021 at 2:12 PM Al Viro <viro@zeniv.linux.org.uk> wrote:
>
> Actually, is there any good way to make sure that write fault is triggered
> _without_ modification of the data?  On x86 lock xadd (of 0, that is) would
> probably do it and some of the other architectures could probably get away
> with using cmpxchg and its relatives, but how reliable it is wrt always
> triggering a write fault if the page is currently read-only?

I wouldn't worry about the CPU optimizing a zero 'add' away (extra
work for no gain in any normal situation).

But any architecture using 'ldl/stc' to do atomics would do it in
software for at least cmpxchg (ie just abort after the "doesn't
match").

Even on x86, it's certainly _possible_ that a non-matching cmpxchg
might not have done the writability check, although I would find that
unlikely (ie I would expect it to do just one TLB lookup, and just one
permission check, whether it then writes or not).

And if the architecture does atomic operations using that ldl/stc
model, I could (again) see the software loop breaking out early
(before the stc) on the grounds of "value didn't change".

Although it's a lot less likely outside of cmpxchg. I suspect an "add
zero" would work just fine even on a ldl/stc model.

That said, reads are obviously much easier, and I'd probably prefer
the model for writes to be to not necessarily pre-fault anything at
all, but just write to user space with page faults disabled.

And then only if that fails do you do anything special. And at that
point, even walking the page tables by hand might be perfectly
acceptable - since we know it's going to fault anyway, and it might
actually be cheaper to just do it by hand with GUP or whatever.

          Linus



  reply	other threads:[~2021-06-12 21:34 UTC|newest]

Thread overview: 49+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-05-31 17:01 [RFC 0/9] gfs2: handle page faults during read and write Andreas Gruenbacher
2021-05-31 17:01 ` [Cluster-devel] " Andreas Gruenbacher
2021-05-31 17:01 ` [RFC 1/9] gfs2: Clean up the error handling in gfs2_page_mkwrite Andreas Gruenbacher
2021-05-31 17:01   ` [Cluster-devel] " Andreas Gruenbacher
2021-05-31 17:01 ` [RFC 2/9] gfs2: Add wrapper for iomap_file_buffered_write Andreas Gruenbacher
2021-05-31 17:01   ` [Cluster-devel] " Andreas Gruenbacher
2021-05-31 17:01 ` [RFC 3/9] gfs2: Add gfs2_holder_is_compatible helper Andreas Gruenbacher
2021-05-31 17:01   ` [Cluster-devel] " Andreas Gruenbacher
2021-05-31 17:01 ` [RFC 4/9] gfs2: Fix mmap + page fault deadlocks (part 1) Andreas Gruenbacher
2021-05-31 17:01   ` [Cluster-devel] " Andreas Gruenbacher
2021-06-01  6:00   ` Linus Torvalds
2021-06-01  6:00     ` [Cluster-devel] " Linus Torvalds
2021-06-02 11:16     ` Andreas Gruenbacher
2021-06-02 11:16       ` [Cluster-devel] " Andreas Gruenbacher
2021-06-11 16:25       ` Al Viro
2021-06-11 16:25         ` [Cluster-devel] " Al Viro
2021-06-12 21:05         ` Al Viro
2021-06-12 21:05           ` [Cluster-devel] " Al Viro
2021-06-12 21:35           ` Al Viro
2021-06-12 21:35             ` [Cluster-devel] " Al Viro
2021-06-13  8:44             ` Steven Whitehouse
2021-06-13  8:44               ` Steven Whitehouse
2021-05-31 17:01 ` [RFC 5/9] iov_iter: Add iov_iter_fault_in_writeable() Andreas Gruenbacher
2021-05-31 17:01   ` [Cluster-devel] " Andreas Gruenbacher
2021-05-31 17:12   ` Al Viro
2021-05-31 17:12     ` [Cluster-devel] " Al Viro
2021-06-12 21:12     ` Al Viro
2021-06-12 21:12       ` [Cluster-devel] " Al Viro
2021-06-12 21:33       ` Linus Torvalds [this message]
2021-06-12 21:33         ` Linus Torvalds
2021-06-12 21:47         ` Al Viro
2021-06-12 21:47           ` [Cluster-devel] " Al Viro
2021-06-12 23:17           ` Linus Torvalds
2021-06-12 23:17             ` [Cluster-devel] " Linus Torvalds
2021-06-12 23:38             ` Al Viro
2021-06-12 23:38               ` [Cluster-devel] " Al Viro
2021-05-31 17:01 ` [RFC 6/9] gfs2: Add wrappers for accessing journal_info Andreas Gruenbacher
2021-05-31 17:01   ` [Cluster-devel] " Andreas Gruenbacher
2021-05-31 17:01 ` [RFC 7/9] gfs2: Encode glock holding and retry flags in journal_info Andreas Gruenbacher
2021-05-31 17:01   ` [Cluster-devel] " Andreas Gruenbacher
2021-05-31 17:01 ` [RFC 8/9] gfs2: Add LM_FLAG_OUTER glock holder flag Andreas Gruenbacher
2021-05-31 17:01   ` [Cluster-devel] " Andreas Gruenbacher
2021-05-31 17:01 ` [RFC 9/9] gfs2: Fix mmap + page fault deadlocks (part 2) Andreas Gruenbacher
2021-05-31 17:01   ` [Cluster-devel] " Andreas Gruenbacher
2021-06-01  5:47   ` Linus Torvalds
2021-06-01  5:47     ` [Cluster-devel] " Linus Torvalds
2021-05-31 17:57 ` [Cluster-devel] [RFC 0/9] gfs2: handle page faults during read and write Linus Torvalds
2021-05-31 20:35   ` Andreas Gruenbacher
2021-05-31 20:35     ` [Cluster-devel] " Andreas Gruenbacher

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAHk-=whcnziOWqVESWKJ6Y1_sG2S2AOa1vv5yKzUGs5gM7qYpQ@mail.gmail.com' \
    --to=torvalds@linux-foundation.org \
    --cc=agruenba@redhat.com \
    --cc=cluster-devel@redhat.com \
    --cc=jack@suse.cz \
    --cc=linux-kernel@vger.kernel.org \
    --cc=viro@zeniv.linux.org.uk \
    --cc=willy@infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.