From: Paolo Bonzini <pbonzini@redhat.com>
To: David Matlack <dmatlack@google.com>
Cc: kvm@vger.kernel.org, kvm-ppc@vger.kernel.org,
Ben Gardon <bgardon@google.com>, Joerg Roedel <joro@8bytes.org>,
Jim Mattson <jmattson@google.com>,
Wanpeng Li <wanpengli@tencent.com>,
Vitaly Kuznetsov <vkuznets@redhat.com>,
Sean Christopherson <seanjc@google.com>,
Junaid Shahid <junaids@google.com>,
Andrew Jones <drjones@redhat.com>,
Paul Mackerras <paulus@ozlabs.org>,
Christian Borntraeger <borntraeger@de.ibm.com>,
Janosch Frank <frankja@linux.ibm.com>
Subject: Re: [PATCH v2 0/7] Improve gfn-to-memslot performance during page faults
Date: Thu, 05 Aug 2021 08:11:00 +0000
Message-ID: <8a795ab3-d504-b0fd-447c-12117fb598c1@redhat.com>
In-Reply-To: <20210804222844.1419481-1-dmatlack@google.com>
On 05/08/21 00:28, David Matlack wrote:
> This series improves the performance of gfn-to-memslot lookups during
> page faults. Ben Gardon originally identified this performance gap and
> sufficiently addressed it in Google's kernel by reading the memslot once
> at the beginning of the page fault and passing around the pointer.
>
> This series takes an alternative approach by introducing a per-vCPU
> cache of the least recently used memslot index. This avoids needing to
> binary search the existing memslots multiple times during a page fault.
> Unlike passing around the pointer, the cache has an additional benefit
> in that it speeds up gfn-to-memslot lookups *across* faults and during
> spte prefetching where the gfn changes.
>
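For readers skimming the cover letter: the cached lookup boils down to "check the vCPU's last used slot first, and only fall back to the binary search on a miss". A minimal, self-contained sketch of that idea (simplified types and names of my own, not the actual code from patch 3):

#include <stddef.h>

typedef unsigned long long gfn_t;

struct memslot {
	gfn_t base_gfn;
	unsigned long npages;
};

struct memslots {
	struct memslot *slots;	/* sorted by base_gfn, descending */
	int used_slots;
};

struct vcpu {
	struct memslots *slots;
	int last_used_slot;	/* cached index; may be stale */
};

static int slot_contains(const struct memslot *s, gfn_t gfn)
{
	return gfn >= s->base_gfn && gfn < s->base_gfn + s->npages;
}

/* O(1) when the cached index still matches, O(log n) binary search otherwise. */
static struct memslot *gfn_to_memslot(struct vcpu *vcpu, gfn_t gfn)
{
	struct memslots *ms = vcpu->slots;
	int idx = vcpu->last_used_slot;
	int lo = 0, hi = ms->used_slots - 1;

	if (idx >= 0 && idx < ms->used_slots &&
	    slot_contains(&ms->slots[idx], gfn))
		return &ms->slots[idx];

	while (lo <= hi) {
		int mid = lo + (hi - lo) / 2;

		if (slot_contains(&ms->slots[mid], gfn)) {
			vcpu->last_used_slot = mid;	/* refresh the cache */
			return &ms->slots[mid];
		}
		if (gfn < ms->slots[mid].base_gfn)
			lo = mid + 1;	/* descending sort: smaller gfns live at higher indices */
		else
			hi = mid - 1;
	}
	return NULL;
}

A stale cached index only costs the binary search the code was already doing, which is why this is cheap enough to keep per vCPU.
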
> This difference can be seen clearly when looking at the performance of
> fast_page_fault when multiple slots are in play:
>
> Metric | Baseline | Pass* | Cache**
> ----------------------------- | ------------ | -------- | ----------
> Iteration 2 dirty memory time | 2.8s | 1.6s | 0.30s
>
> * Pass: Lookup the memslot once per fault and pass it around.
> ** Cache: Cache the last used slot per vCPU (i.e. this series).
>
> (Collected via ./dirty_log_perf_test -v64 -x64)
>
> I plan to also send a follow-up series with a version of Ben's patches
> to pass the pointer to the memslot through the page fault handling code
> rather than looking it up multiple times. Even when applied on top of
> the cache series it has some performance improvements by avoiding a few
> extra memory accesses (mainly kvm->memslots[as_id] and
> slots->used_slots). But it will be a judgement call whether or not it's
> worth the code churn and complexity.
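The "pass the pointer" follow-up would roughly take the shape below (a hypothetical sketch reusing the types from the snippet earlier in this reply; the struct and helper names are illustrative and not from any posted patches): resolve the slot once at the top of the fault and hand it down, so the helpers never touch kvm->memslots[as_id] or slots->used_slots again.

struct page_fault {
	gfn_t gfn;
	struct memslot *slot;	/* resolved once, reused by every helper */
};

static int handle_page_fault(struct vcpu *vcpu, gfn_t gfn)
{
	struct page_fault fault = {
		.gfn = gfn,
		.slot = gfn_to_memslot(vcpu, gfn),	/* single lookup */
	};

	if (!fault.slot)
		return -1;	/* no slot backs this gfn */

	/*
	 * ... the rest of the fault path takes &fault and uses fault.slot
	 * directly instead of re-resolving the gfn ...
	 */
	return 0;
}
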
Queued, thanks.
Paolo
> v2:
> * Rename lru to last_used [Paolo]
> * Tree-wide replace search_memslots with __gfn_to_memslot [Paolo]
> * Avoid speculation when accessing slots->memslots [Paolo]
> * Refactor tdp_set_spte_atomic to leverage vcpu->last_used_slot [Paolo]
> * Add Paolo's Reviewed-by tags
> * Fix build failures in mmu_audit.c [kernel test robot]
>
> v1: https://lore.kernel.org/kvm/20210730223707.4083785-1-dmatlack@google.com/
>
> David Matlack (7):
> KVM: Rename lru_slot to last_used_slot
> KVM: Move last_used_slot logic out of search_memslots
> KVM: Cache the last used slot index per vCPU
> KVM: x86/mmu: Leverage vcpu->last_used_slot in
> tdp_mmu_map_handle_target_level
> KVM: x86/mmu: Leverage vcpu->last_used_slot for rmap_add and
> rmap_recycle
> KVM: x86/mmu: Rename __gfn_to_rmap to gfn_to_rmap
> KVM: selftests: Support multiple slots in dirty_log_perf_test
>
> arch/powerpc/kvm/book3s_64_vio.c | 2 +-
> arch/powerpc/kvm/book3s_64_vio_hv.c | 2 +-
> arch/s390/kvm/kvm-s390.c | 4 +-
> arch/x86/kvm/mmu/mmu.c | 54 +++++++------
> arch/x86/kvm/mmu/mmu_audit.c | 4 +-
> arch/x86/kvm/mmu/tdp_mmu.c | 42 +++++++---
> include/linux/kvm_host.h | 80 +++++++++++++++----
> .../selftests/kvm/access_tracking_perf_test.c | 2 +-
> .../selftests/kvm/demand_paging_test.c | 2 +-
> .../selftests/kvm/dirty_log_perf_test.c | 76 +++++++++++++++---
> .../selftests/kvm/include/perf_test_util.h | 2 +-
> .../selftests/kvm/lib/perf_test_util.c | 20 +++--
> .../kvm/memslot_modification_stress_test.c | 2 +-
> virt/kvm/kvm_main.c | 26 +++++-
> 14 files changed, 238 insertions(+), 80 deletions(-)
>