From: David Hildenbrand <david@redhat.com>
To: linux-kernel@vger.kernel.org
Cc: kvm@vger.kernel.org, linux-s390@vger.kernel.org,
David Hildenbrand <david@redhat.com>,
Heiko Carstens <hca@linux.ibm.com>,
Vasily Gorbik <gor@linux.ibm.com>,
Alexander Gordeev <agordeev@linux.ibm.com>,
Christian Borntraeger <borntraeger@linux.ibm.com>,
Sven Schnelle <svens@linux.ibm.com>,
Janosch Frank <frankja@linux.ibm.com>,
Claudio Imbrenda <imbrenda@linux.ibm.com>,
Gerald Schaefer <gerald.schaefer@linux.ibm.com>,
Matthew Wilcox <willy@infradead.org>,
Thomas Huth <thuth@redhat.com>
Subject: [PATCH v3 03/10] s390/uv: split large folios in gmap_make_secure()
Date: Wed, 8 May 2024 20:29:48 +0200 [thread overview]
Message-ID: <20240508182955.358628-4-david@redhat.com> (raw)
In-Reply-To: <20240508182955.358628-1-david@redhat.com>
While s390x makes sure to never have PMD-mapped THP in processes that use
KVM -- by remapping them using PTEs in
thp_split_walk_pmd_entry()->split_huge_pmd() -- there is still the
possibility of having PTE-mapped THPs (large folios) mapped into guest
memory.
This would happen if user space allocates memory before calling
KVM_CREATE_VM (which would call s390_enable_sie()). With upstream QEMU,
this currently doesn't happen, because guest memory is setup and
conditionally preallocated after KVM_CREATE_VM.
Could it happen with shmem/file-backed memory when another process
allocated memory in the pagecache? Likely, although currently not a
common setup.
Trying to split any PTE-mapped large folios sounds like the right and
future-proof thing to do here. So let's call split_folio() and handle the
return values accordingly.
Signed-off-by: David Hildenbrand <david@redhat.com>
---
arch/s390/kernel/uv.c | 31 +++++++++++++++++++++++++------
1 file changed, 25 insertions(+), 6 deletions(-)
diff --git a/arch/s390/kernel/uv.c b/arch/s390/kernel/uv.c
index 25fe28d189df..3c6d86e3e828 100644
--- a/arch/s390/kernel/uv.c
+++ b/arch/s390/kernel/uv.c
@@ -338,11 +338,10 @@ int gmap_make_secure(struct gmap *gmap, unsigned long gaddr, void *uvcb)
goto out;
if (pte_present(*ptep) && !(pte_val(*ptep) & _PAGE_INVALID) && pte_write(*ptep)) {
folio = page_folio(pte_page(*ptep));
- rc = -EINVAL;
- if (folio_test_large(folio))
- goto unlock;
rc = -EAGAIN;
- if (folio_trylock(folio)) {
+ if (folio_test_large(folio)) {
+ rc = -E2BIG;
+ } else if (folio_trylock(folio)) {
if (should_export_before_import(uvcb, gmap->mm))
uv_convert_from_secure(PFN_PHYS(folio_pfn(folio)));
rc = make_folio_secure(folio, uvcb);
@@ -353,15 +352,35 @@ int gmap_make_secure(struct gmap *gmap, unsigned long gaddr, void *uvcb)
* Once we drop the PTL, the folio may get unmapped and
* freed immediately. We need a temporary reference.
*/
- if (rc == -EAGAIN)
+ if (rc == -EAGAIN || rc == -E2BIG)
folio_get(folio);
}
-unlock:
pte_unmap_unlock(ptep, ptelock);
out:
mmap_read_unlock(gmap->mm);
switch (rc) {
+ case -E2BIG:
+ folio_lock(folio);
+ rc = split_folio(folio);
+ folio_unlock(folio);
+ folio_put(folio);
+
+ switch (rc) {
+ case 0:
+ /* Splitting succeeded, try again immediately. */
+ goto again;
+ case -EAGAIN:
+ /* Additional folio references. */
+ if (drain_lru(&drain_lru_called))
+ goto again;
+ return -EAGAIN;
+ case -EBUSY:
+ /* Unexpected race. */
+ return -EAGAIN;
+ }
+ WARN_ON_ONCE(1);
+ return -ENXIO;
case -EAGAIN:
/*
* If we are here because the UVC returned busy or partial
--
2.45.0
next prev parent reply other threads:[~2024-05-08 18:30 UTC|newest]
Thread overview: 13+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-05-08 18:29 [PATCH v3 00/10] s390: PG_arch_1+folio cleanups for uv+hugetlb David Hildenbrand
2024-05-08 18:29 ` [PATCH v3 01/10] s390/uv: don't call folio_wait_writeback() without a folio reference David Hildenbrand
2024-05-08 18:29 ` [PATCH v3 02/10] s390/uv: gmap_make_secure() cleanups for further changes David Hildenbrand
2024-05-08 18:29 ` David Hildenbrand [this message]
2024-05-08 18:29 ` [PATCH v3 04/10] s390/uv: convert PG_arch_1 users to only work on small folios David Hildenbrand
2024-05-08 18:29 ` [PATCH v3 05/10] s390/uv: update PG_arch_1 comment David Hildenbrand
2024-05-08 18:29 ` [PATCH v3 06/10] s390/uv: make uv_convert_from_secure() a static function David Hildenbrand
2024-05-08 18:29 ` [PATCH v3 07/10] s390/uv: convert uv_destroy_owned_page() to uv_destroy_(folio|pte)() David Hildenbrand
2024-05-08 18:29 ` [PATCH v3 08/10] s390/uv: convert uv_convert_owned_from_secure() to uv_convert_from_secure_(folio|pte)() David Hildenbrand
2024-05-08 18:29 ` [PATCH v3 09/10] s390/uv: implement HAVE_ARCH_MAKE_FOLIO_ACCESSIBLE David Hildenbrand
2024-05-08 18:29 ` [PATCH v3 10/10] s390/hugetlb: convert PG_arch_1 code to work on folio->flags David Hildenbrand
2024-05-09 15:04 ` [PATCH v3 00/10] s390: PG_arch_1+folio cleanups for uv+hugetlb Heiko Carstens
2024-05-09 17:26 ` David Hildenbrand
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20240508182955.358628-4-david@redhat.com \
--to=david@redhat.com \
--cc=agordeev@linux.ibm.com \
--cc=borntraeger@linux.ibm.com \
--cc=frankja@linux.ibm.com \
--cc=gerald.schaefer@linux.ibm.com \
--cc=gor@linux.ibm.com \
--cc=hca@linux.ibm.com \
--cc=imbrenda@linux.ibm.com \
--cc=kvm@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-s390@vger.kernel.org \
--cc=svens@linux.ibm.com \
--cc=thuth@redhat.com \
--cc=willy@infradead.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).