From: "Yang, Weijiang" <weijiang.yang@intel.com>
To: Sean Christopherson <seanjc@google.com>
Cc: <pbonzini@redhat.com>, <kvm@vger.kernel.org>,
<linux-kernel@vger.kernel.org>, <chao.gao@intel.com>,
<rick.p.edgecombe@intel.com>, <mlevitsk@redhat.com>,
<john.allen@amd.com>, Aaron Lewis <aaronlewis@google.com>,
Jim Mattson <jmattson@google.com>,
Oliver Upton <oupton@google.com>,
Mingwei Zhang <mizhang@google.com>
Subject: Re: [PATCH v10 20/27] KVM: VMX: Emulate read and write to CET MSRs
Date: Wed, 13 Mar 2024 17:43:12 +0800
Message-ID: <9f820b96-0e4b-4cdc-93ff-f63aed829f0d@intel.com>
In-Reply-To: <ZfDdS8rtVtyEr0UR@google.com>
On 3/13/2024 6:55 AM, Sean Christopherson wrote:
> -non-KVM people, +Mingwei, Aaron, Oliver, and Jim
>
> On Sun, Feb 18, 2024, Yang Weijiang wrote:
>> 	case MSR_IA32_PERF_CAPABILITIES:
>> 		if (data && !vcpu_to_pmu(vcpu)->version)
>> 			return 1;
> Ha, perfect, this is already in the diff context.
>
>> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
>> index c0ed69353674..281c3fe728c5 100644
>> --- a/arch/x86/kvm/x86.c
>> +++ b/arch/x86/kvm/x86.c
>> @@ -1849,6 +1849,36 @@ bool kvm_msr_allowed(struct kvm_vcpu *vcpu, u32 index, u32 type)
>> }
>> EXPORT_SYMBOL_GPL(kvm_msr_allowed);
>>
>> +#define CET_US_RESERVED_BITS GENMASK(9, 6)
>> +#define CET_US_SHSTK_MASK_BITS GENMASK(1, 0)
>> +#define CET_US_IBT_MASK_BITS (GENMASK_ULL(5, 2) | GENMASK_ULL(63, 10))
>> +#define CET_US_LEGACY_BITMAP_BASE(data) ((data) >> 12)
>> +
>> +static bool is_set_cet_msr_allowed(struct kvm_vcpu *vcpu, u32 index, u64 data,
>> +				   bool host_initiated)
>> +{
> ...
>
>> +	/*
>> +	 * If KVM supports the MSR, i.e. has enumerated the MSR existence to
>> +	 * userspace, then userspace is allowed to write '0' irrespective of
>> +	 * whether or not the MSR is exposed to the guest.
>> +	 */
>> +	if (!host_initiated || data)
>> +		return false;
> ...
>
>> @@ -1951,6 +2017,20 @@ static int __kvm_get_msr(struct kvm_vcpu *vcpu, u32 index, u64 *data,
>> 		    !guest_cpuid_has(vcpu, X86_FEATURE_RDPID))
>> 			return 1;
>> 		break;
>> +	case MSR_IA32_U_CET:
>> +	case MSR_IA32_S_CET:
>> +		if (!guest_can_use(vcpu, X86_FEATURE_SHSTK) &&
>> +		    !guest_can_use(vcpu, X86_FEATURE_IBT))
>> +			return 1;
> As pointed out by Mingwei in a conversation about PERF_CAPABILITIES, rejecting
> host *reads* while allowing host writes of '0' is inconsistent. Which, while
> arguably par for the course for KVM's ABI, will likely result in the exact problem
> we're trying to avoid: killing userspace because it attempts to access an MSR KVM
> has said exists.
Thank you for pointing this out!
Agreed.
>
> PERF_CAPABILITIES has a similar, but opposite, problem where KVM returns a non-zero
> value on reads, but rejects that same non-zero value on write. PERF_CAPABILITIES
> is even more complicated because KVM stuffs a non-zero value at vCPU creation, but
> that's not really relevant to this discussion, just another data point for how
> messed up this all is.
>
> Also relevant to this discussion are KVM's PV MSRs, e.g. MSR_KVM_ASYNC_PF_ACK,
> as KVM rejects attempts to write '0' if the guest doesn't support the MSR, but
> if and only if userspace has enabled KVM_CAP_ENFORCE_PV_FEATURE_CPUID.
>
> Coming to the point, this mess is getting too hard to maintain, both from a code
> perspective and "what is KVM's ABI?" perspective.
>
> Rather than play whack-a-mole and inevitably end up with bugs and/or inconsistencies,
> what if we (a) return KVM_MSR_RET_INVALID when an MSR access is denied based on
> guest CPUID,
Can we define a new return value, KVM_MSR_RET_REJECTED, for this case, to distinguish it from KVM_MSR_RET_INVALID, which means the MSR index doesn't exist?
> (b) wrap userspace MSR accesses at the very top level and convert
> KVM_MSR_RET_INVALID to "success" when KVM reported the MSR as savable and userspace
> is reading or writing '0',
Yes, this limits the changes needed on the KVM side.
> and (c) drop all of the host_initiated checks that
> exist purely to exempt userspace access from guest CPUID checks.
>
> The only possible hiccup I can think of is that this could subtly break userspace
> that is setting CPUID _after_ MSRs, but my understanding is that we've agreed to
> draw a line and say that that's unsupported.
Yeah, it would mess things up.
> And I think it's low risk, because
> I don't see how code like this:
>
> 	case MSR_TSC_AUX:
> 		if (!kvm_is_supported_user_return_msr(MSR_TSC_AUX))
> 			return 1;
>
> 		if (!host_initiated &&
> 		    !guest_cpuid_has(vcpu, X86_FEATURE_RDTSCP) &&
> 		    !guest_cpuid_has(vcpu, X86_FEATURE_RDPID))
> 			return 1;
>
> 		if (guest_cpuid_is_intel(vcpu) && (data >> 32) != 0)
> 			return 1;
>
> can possibly work if userspace sets MSRs first. The RDTSCP/RDPID checks are
> exempt, but the vendor in guest CPUID would be '0', not Intel's magic string,
> and so setting MSRs before CPUID would fail, at least if the target vCPU model
> is Intel.
>
> P.S. I also want to rename KVM_MSR_RET_INVALID => KVM_MSR_RET_UNSUPPORTED, because
> I can never remember that "invalid" doesn't mean the value was invalid, it means
> the MSR index was invalid.
So do I :-)
>
> It'll take a few patches, but I believe we can end up with something like this:
>
> static bool kvm_is_msr_to_save(u32 msr_index)
> {
> 	unsigned int i;
>
> 	for (i = 0; i < num_msrs_to_save; i++) {
> 		if (msrs_to_save[i] == msr_index)
> 			return true;
> 	}
Should we also check the emulated_msrs list here, since KVM_GET_MSR_INDEX_LIST exposes it too?
>
> 	return false;
> }
>
> typedef int (*msr_uaccess_t)(struct kvm_vcpu *vcpu, u32 index, u64 *data,
> 			     bool host_initiated);
>
> static __always_inline int kvm_do_msr_uaccess(struct kvm_vcpu *vcpu, u32 msr,
> 					      u64 *data, bool host_initiated,
> 					      enum kvm_msr_access rw,
> 					      msr_uaccess_t msr_uaccess_fn)
> {
> 	const char *op = rw == MSR_TYPE_W ? "wrmsr" : "rdmsr";
> 	int ret;
>
> 	BUILD_BUG_ON(rw != MSR_TYPE_R && rw != MSR_TYPE_W);
>
> 	/*
> 	 * Zero the data on read failures to avoid leaking stack data to the
> 	 * guest and/or userspace, e.g. if the failure is ignored below.
> 	 */
> 	ret = msr_uaccess_fn(vcpu, msr, data, host_initiated);
> 	if (ret && rw == MSR_TYPE_R)
> 		*data = 0;
>
> 	if (ret != KVM_MSR_RET_UNSUPPORTED)
> 		return ret;
>
> 	/*
> 	 * Userspace is allowed to read MSRs, and write '0' to MSRs, that KVM
> 	 * reports as to-be-saved, even if an MSR isn't fully supported.
> 	 * Simply check that @data is '0', which covers both the write '0' case
> 	 * and all reads (in which case @data is zeroed on failure; see above).
> 	 */
> 	if (kvm_is_msr_to_save(msr) && !*data)
> 		return 0;
>
> 	if (!ignore_msrs) {
> 		kvm_debug_ratelimited("unhandled %s: 0x%x data 0x%llx\n",
> 				      op, msr, *data);
> 		return ret;
> 	}
>
> 	if (report_ignored_msrs)
> 		kvm_pr_unimpl("ignored %s: 0x%x data 0x%llx\n", op, msr, *data);
>
> 	return 0;
> }
The handling flow looks good to me. Thanks a lot!
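
To make the flow above concrete, here is a minimal user-space sketch of the proposed top-level wrapper. It is only a toy model under stated assumptions, not KVM code: struct toy_vcpu, toy_msr_access(), do_user_msr_access(), is_msr_reported(), the MSR_RET_* values and both MSR tables are hypothetical stand-ins, and the ignore_msrs/report_ignored_msrs handling is left out. It also folds in the suggestion to consult the emulated list in addition to the to-be-saved list.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical return codes; the values are illustrative, not KVM's. */
#define MSR_RET_OK		0
#define MSR_RET_FAULT		1	/* plays the role of KVM's "return 1" */
#define MSR_RET_UNSUPPORTED	2	/* access denied based on guest CPUID */

enum msr_access { MSR_READ, MSR_WRITE };

#define MSR_IA32_U_CET	0x6a0u
#define MSR_IA32_S_CET	0x6a2u
#define ARRAY_SIZE(a)	(sizeof(a) / sizeof((a)[0]))

/* Toy tables standing in for what KVM_GET_MSR_INDEX_LIST would report. */
static const uint32_t msrs_to_save[] = { MSR_IA32_U_CET, MSR_IA32_S_CET };
static const uint32_t emulated_msrs[] = { 0x3bu };

struct toy_vcpu {
	bool guest_has_cet;	/* stand-in for the guest CPUID check */
	uint64_t u_cet;
};

static bool in_list(const uint32_t *list, size_t n, uint32_t msr)
{
	for (size_t i = 0; i < n; i++) {
		if (list[i] == msr)
			return true;
	}
	return false;
}

/* Assumption: both lists count as "reported to userspace". */
static bool is_msr_reported(uint32_t msr)
{
	return in_list(msrs_to_save, ARRAY_SIZE(msrs_to_save), msr) ||
	       in_list(emulated_msrs, ARRAY_SIZE(emulated_msrs), msr);
}

/* Inner handler: no host_initiated exemption, only the CPUID-style check. */
static int toy_msr_access(struct toy_vcpu *vcpu, uint32_t msr, uint64_t *data,
			  enum msr_access rw)
{
	switch (msr) {
	case MSR_IA32_U_CET:
		if (!vcpu->guest_has_cet)
			return MSR_RET_UNSUPPORTED;
		if (rw == MSR_WRITE)
			vcpu->u_cet = *data;
		else
			*data = vcpu->u_cet;
		return MSR_RET_OK;
	default:
		return MSR_RET_UNSUPPORTED;
	}
}

/* Top-level wrapper for userspace accesses, mirroring the quoted flow. */
int do_user_msr_access(struct toy_vcpu *vcpu, uint32_t msr, uint64_t *data,
		       enum msr_access rw)
{
	int ret = toy_msr_access(vcpu, msr, data, rw);

	/* Zero the data on read failures so nothing stale is returned. */
	if (ret && rw == MSR_READ)
		*data = 0;

	if (ret != MSR_RET_UNSUPPORTED)
		return ret;

	/* Reads, and writes of '0', succeed for MSRs reported to userspace. */
	if (is_msr_reported(msr) && !*data)
		return MSR_RET_OK;

	return ret;
}
```

With guest_has_cet cleared, a read of MSR_IA32_U_CET through the wrapper returns success with the data zeroed, a write of '0' succeeds, and a write of a non-zero value is still rejected. An index that appears in neither table keeps failing with the "unsupported" code.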