LinuxPPC-Dev Archive mirror
 help / color / mirror / Atom feed
* [PATCH] KVM: PPC: Book3S HV nestedv2: Cancel pending HDEC exception
@ 2024-03-13  7:26 Vaibhav Jain
  2024-03-20 13:43 ` Nicholas Piggin
  0 siblings, 1 reply; 7+ messages in thread
From: Vaibhav Jain @ 2024-03-13  7:26 UTC (permalink / raw
  To: linuxppc-dev, kvm, kvm-ppc
  Cc: mikey, sbhat, amachhiw, Jordan Niethe, gautam, Nicholas Piggin,
	David.Laight, kconsul, Vaibhav Jain, Vaidyanathan Srinivasan

This reverts commit 180c6b072bf360b686e53d893d8dcf7dbbaec6bb ("KVM: PPC:
Book3S HV nestedv2: Do not cancel pending decrementer exception") which
prevented cancelling a pending HDEC exception for nestedv2 KVM guests. It
was done to avoid overhead of a H_GUEST_GET_STATE hcall to read the 'HDEC
expiry TB' register which was higher compared to handling extra decrementer
exceptions.

This overhead of reading 'HDEC expiry TB' register has been mitigated
recently by the L0 hypervisor(PowerVM) by putting the value of this
register in L2 guest-state output buffer on trap to L1. From there the
value of this register is cached, made available in kvmhv_run_single_vcpu()
to compare it against host(L1) timebase and cancel the pending hypervisor
decrementer exception if needed.

Fixes: 180c6b072bf3 ("KVM: PPC: Book3S HV nestedv2: Do not cancel pending decrementer exception")
Signed-off-by: Vaibhav Jain <vaibhav@linux.ibm.com>
---
 arch/powerpc/kvm/book3s_hv.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
index 0b921704da45..e47b954ce266 100644
--- a/arch/powerpc/kvm/book3s_hv.c
+++ b/arch/powerpc/kvm/book3s_hv.c
@@ -4856,7 +4856,7 @@ int kvmhv_run_single_vcpu(struct kvm_vcpu *vcpu, u64 time_limit,
 	 * entering a nested guest in which case the decrementer is now owned
 	 * by L2 and the L1 decrementer is provided in hdec_expires
 	 */
-	if (!kvmhv_is_nestedv2() && kvmppc_core_pending_dec(vcpu) &&
+	if (kvmppc_core_pending_dec(vcpu) &&
 			((tb < kvmppc_dec_expires_host_tb(vcpu)) ||
 			 (trap == BOOK3S_INTERRUPT_SYSCALL &&
 			  kvmppc_get_gpr(vcpu, 3) == H_ENTER_NESTED)))
-- 
2.44.0


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCH] KVM: PPC: Book3S HV nestedv2: Cancel pending HDEC exception
  2024-03-13  7:26 [PATCH] KVM: PPC: Book3S HV nestedv2: Cancel pending HDEC exception Vaibhav Jain
@ 2024-03-20 13:43 ` Nicholas Piggin
  2024-04-04  8:35   ` Linux regression tracking (Thorsten Leemhuis)
  0 siblings, 1 reply; 7+ messages in thread
From: Nicholas Piggin @ 2024-03-20 13:43 UTC (permalink / raw
  To: Vaibhav Jain, linuxppc-dev, kvm, kvm-ppc
  Cc: mikey, sbhat, amachhiw, gautam, David.Laight, kconsul,
	Jordan Niethe, Vaidyanathan Srinivasan

On Wed Mar 13, 2024 at 5:26 PM AEST, Vaibhav Jain wrote:
> This reverts commit 180c6b072bf360b686e53d893d8dcf7dbbaec6bb ("KVM: PPC:
> Book3S HV nestedv2: Do not cancel pending decrementer exception") which
> prevented cancelling a pending HDEC exception for nestedv2 KVM guests. It
> was done to avoid overhead of a H_GUEST_GET_STATE hcall to read the 'HDEC
> expiry TB' register which was higher compared to handling extra decrementer
> exceptions.
>
> This overhead of reading 'HDEC expiry TB' register has been mitigated
> recently by the L0 hypervisor(PowerVM) by putting the value of this
> register in L2 guest-state output buffer on trap to L1. From there the
> value of this register is cached, made available in kvmhv_run_single_vcpu()
> to compare it against host(L1) timebase and cancel the pending hypervisor
> decrementer exception if needed.

Ah, I figured out the problem here. Guest entry never clears the
queued dec, because it's level triggered on the DEC MSB so it
doesn't go away when it's delivered. So upstream code is indeed
buggy and I think I take the blame for suggesting this nestedv2
workaround.

I actually don't think that is necessary though, we could treat it
like other interrupts.  I think that would solve the problem without
having to test dec here.

I am wondering though, what workload slows down that this patch
was needed in the first place. We'd only get here after a cede
returns, then we'd dequeue the dec and stop having to GET_STATE
it here.

Thanks,
Nick

>
> Fixes: 180c6b072bf3 ("KVM: PPC: Book3S HV nestedv2: Do not cancel pending decrementer exception")
> Signed-off-by: Vaibhav Jain <vaibhav@linux.ibm.com>
> ---
>  arch/powerpc/kvm/book3s_hv.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
> index 0b921704da45..e47b954ce266 100644
> --- a/arch/powerpc/kvm/book3s_hv.c
> +++ b/arch/powerpc/kvm/book3s_hv.c
> @@ -4856,7 +4856,7 @@ int kvmhv_run_single_vcpu(struct kvm_vcpu *vcpu, u64 time_limit,
>  	 * entering a nested guest in which case the decrementer is now owned
>  	 * by L2 and the L1 decrementer is provided in hdec_expires
>  	 */
> -	if (!kvmhv_is_nestedv2() && kvmppc_core_pending_dec(vcpu) &&
> +	if (kvmppc_core_pending_dec(vcpu) &&
>  			((tb < kvmppc_dec_expires_host_tb(vcpu)) ||
>  			 (trap == BOOK3S_INTERRUPT_SYSCALL &&
>  			  kvmppc_get_gpr(vcpu, 3) == H_ENTER_NESTED)))


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] KVM: PPC: Book3S HV nestedv2: Cancel pending HDEC exception
  2024-03-20 13:43 ` Nicholas Piggin
@ 2024-04-04  8:35   ` Linux regression tracking (Thorsten Leemhuis)
  2024-04-05  3:20     ` Michael Ellerman
  0 siblings, 1 reply; 7+ messages in thread
From: Linux regression tracking (Thorsten Leemhuis) @ 2024-04-04  8:35 UTC (permalink / raw
  To: Nicholas Piggin, Vaibhav Jain, linuxppc-dev, kvm, kvm-ppc
  Cc: mikey, Linux kernel regressions list, sbhat, amachhiw, gautam,
	David.Laight, kconsul, Jordan Niethe, Vaidyanathan Srinivasan

Hi, Thorsten here, the Linux kernel's regression tracker. Top-posting
for once, to make this easily accessible to everyone.

Was this regression ever resolved? Doesn't look like it, but maybe I
just missed something.

Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
--
Everything you wanna know about Linux kernel regression tracking:
https://linux-regtracking.leemhuis.info/about/#tldr
If I did something stupid, please tell me, as explained on that page.

#regzbot poke

On 20.03.24 14:43, Nicholas Piggin wrote:
> On Wed Mar 13, 2024 at 5:26 PM AEST, Vaibhav Jain wrote:
>> This reverts commit 180c6b072bf360b686e53d893d8dcf7dbbaec6bb ("KVM: PPC:
>> Book3S HV nestedv2: Do not cancel pending decrementer exception") which
>> prevented cancelling a pending HDEC exception for nestedv2 KVM guests. It
>> was done to avoid overhead of a H_GUEST_GET_STATE hcall to read the 'HDEC
>> expiry TB' register which was higher compared to handling extra decrementer
>> exceptions.
>>
>> This overhead of reading 'HDEC expiry TB' register has been mitigated
>> recently by the L0 hypervisor(PowerVM) by putting the value of this
>> register in L2 guest-state output buffer on trap to L1. From there the
>> value of this register is cached, made available in kvmhv_run_single_vcpu()
>> to compare it against host(L1) timebase and cancel the pending hypervisor
>> decrementer exception if needed.
> 
> Ah, I figured out the problem here. Guest entry never clears the
> queued dec, because it's level triggered on the DEC MSB so it
> doesn't go away when it's delivered. So upstream code is indeed
> buggy and I think I take the blame for suggesting this nestedv2
> workaround.
> 
> I actually don't think that is necessary though, we could treat it
> like other interrupts.  I think that would solve the problem without
> having to test dec here.
> 
> I am wondering though, what workload slows down that this patch
> was needed in the first place. We'd only get here after a cede
> returns, then we'd dequeue the dec and stop having to GET_STATE
> it here.
> 
> Thanks,
> Nick
> 
>>
>> Fixes: 180c6b072bf3 ("KVM: PPC: Book3S HV nestedv2: Do not cancel pending decrementer exception")
>> Signed-off-by: Vaibhav Jain <vaibhav@linux.ibm.com>
>> ---
>>  arch/powerpc/kvm/book3s_hv.c | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
>> index 0b921704da45..e47b954ce266 100644
>> --- a/arch/powerpc/kvm/book3s_hv.c
>> +++ b/arch/powerpc/kvm/book3s_hv.c
>> @@ -4856,7 +4856,7 @@ int kvmhv_run_single_vcpu(struct kvm_vcpu *vcpu, u64 time_limit,
>>  	 * entering a nested guest in which case the decrementer is now owned
>>  	 * by L2 and the L1 decrementer is provided in hdec_expires
>>  	 */
>> -	if (!kvmhv_is_nestedv2() && kvmppc_core_pending_dec(vcpu) &&
>> +	if (kvmppc_core_pending_dec(vcpu) &&
>>  			((tb < kvmppc_dec_expires_host_tb(vcpu)) ||
>>  			 (trap == BOOK3S_INTERRUPT_SYSCALL &&
>>  			  kvmppc_get_gpr(vcpu, 3) == H_ENTER_NESTED)))
> 

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] KVM: PPC: Book3S HV nestedv2: Cancel pending HDEC exception
  2024-04-04  8:35   ` Linux regression tracking (Thorsten Leemhuis)
@ 2024-04-05  3:20     ` Michael Ellerman
  2024-04-05  5:54       ` Thorsten Leemhuis
  0 siblings, 1 reply; 7+ messages in thread
From: Michael Ellerman @ 2024-04-05  3:20 UTC (permalink / raw
  To: Linux regression tracking (Thorsten Leemhuis), Nicholas Piggin,
	Vaibhav Jain, linuxppc-dev, kvm, kvm-ppc
  Cc: mikey, Linux kernel regressions list, sbhat, amachhiw,
	Jordan Niethe, gautam, David.Laight, kconsul,
	Vaidyanathan Srinivasan

"Linux regression tracking (Thorsten Leemhuis)"
<regressions@leemhuis.info> writes:
> Hi, Thorsten here, the Linux kernel's regression tracker. Top-posting
> for once, to make this easily accessible to everyone.
>
> Was this regression ever resolved? Doesn't look like it, but maybe I
> just missed something.

I'm not sure how it ended up on the regression list. IMHO it's not
really a regression. It was an attempt at a performance optimisation,
which is no longer needed due to changes in (unreleased) firmware.

I haven't merged it because Nick's reply contained several questions for
Vaibhav, so I'm expecting either a reply to those or a new version of
the patch.

cheers

> Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
> --
> Everything you wanna know about Linux kernel regression tracking:
> https://linux-regtracking.leemhuis.info/about/#tldr
> If I did something stupid, please tell me, as explained on that page.
>
> #regzbot poke
>
> On 20.03.24 14:43, Nicholas Piggin wrote:
>> On Wed Mar 13, 2024 at 5:26 PM AEST, Vaibhav Jain wrote:
>>> This reverts commit 180c6b072bf360b686e53d893d8dcf7dbbaec6bb ("KVM: PPC:
>>> Book3S HV nestedv2: Do not cancel pending decrementer exception") which
>>> prevented cancelling a pending HDEC exception for nestedv2 KVM guests. It
>>> was done to avoid overhead of a H_GUEST_GET_STATE hcall to read the 'HDEC
>>> expiry TB' register which was higher compared to handling extra decrementer
>>> exceptions.
>>>
>>> This overhead of reading 'HDEC expiry TB' register has been mitigated
>>> recently by the L0 hypervisor(PowerVM) by putting the value of this
>>> register in L2 guest-state output buffer on trap to L1. From there the
>>> value of this register is cached, made available in kvmhv_run_single_vcpu()
>>> to compare it against host(L1) timebase and cancel the pending hypervisor
>>> decrementer exception if needed.
>> 
>> Ah, I figured out the problem here. Guest entry never clears the
>> queued dec, because it's level triggered on the DEC MSB so it
>> doesn't go away when it's delivered. So upstream code is indeed
>> buggy and I think I take the blame for suggesting this nestedv2
>> workaround.
>> 
>> I actually don't think that is necessary though, we could treat it
>> like other interrupts.  I think that would solve the problem without
>> having to test dec here.
>> 
>> I am wondering though, what workload slows down that this patch
>> was needed in the first place. We'd only get here after a cede
>> returns, then we'd dequeue the dec and stop having to GET_STATE
>> it here.
>> 
>> Thanks,
>> Nick
>> 
>>>
>>> Fixes: 180c6b072bf3 ("KVM: PPC: Book3S HV nestedv2: Do not cancel pending decrementer exception")
>>> Signed-off-by: Vaibhav Jain <vaibhav@linux.ibm.com>
>>> ---
>>>  arch/powerpc/kvm/book3s_hv.c | 2 +-
>>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>>
>>> diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
>>> index 0b921704da45..e47b954ce266 100644
>>> --- a/arch/powerpc/kvm/book3s_hv.c
>>> +++ b/arch/powerpc/kvm/book3s_hv.c
>>> @@ -4856,7 +4856,7 @@ int kvmhv_run_single_vcpu(struct kvm_vcpu *vcpu, u64 time_limit,
>>>  	 * entering a nested guest in which case the decrementer is now owned
>>>  	 * by L2 and the L1 decrementer is provided in hdec_expires
>>>  	 */
>>> -	if (!kvmhv_is_nestedv2() && kvmppc_core_pending_dec(vcpu) &&
>>> +	if (kvmppc_core_pending_dec(vcpu) &&
>>>  			((tb < kvmppc_dec_expires_host_tb(vcpu)) ||
>>>  			 (trap == BOOK3S_INTERRUPT_SYSCALL &&
>>>  			  kvmppc_get_gpr(vcpu, 3) == H_ENTER_NESTED)))
>> 

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] KVM: PPC: Book3S HV nestedv2: Cancel pending HDEC exception
  2024-04-05  3:20     ` Michael Ellerman
@ 2024-04-05  5:54       ` Thorsten Leemhuis
  2024-04-08  5:20         ` Michael Ellerman
  0 siblings, 1 reply; 7+ messages in thread
From: Thorsten Leemhuis @ 2024-04-05  5:54 UTC (permalink / raw
  To: Michael Ellerman, Nicholas Piggin, Vaibhav Jain, linuxppc-dev,
	kvm, kvm-ppc
  Cc: mikey, Linux kernel regressions list, sbhat, amachhiw,
	Jordan Niethe, gautam, David.Laight, kconsul,
	Vaidyanathan Srinivasan

On 05.04.24 05:20, Michael Ellerman wrote:
> "Linux regression tracking (Thorsten Leemhuis)"
> <regressions@leemhuis.info> writes:
>> Hi, Thorsten here, the Linux kernel's regression tracker. Top-posting
>> for once, to make this easily accessible to everyone.
>>
>> Was this regression ever resolved? Doesn't look like it, but maybe I
>> just missed something.
> 
> I'm not sure how it ended up on the regression list.

That is easy to explain: I let lei search for mails containing words
like regress, bisect, and revert to become aware of regressions that
might need tracking. And...

> IMHO it's not really a regression.

...sometimes I misjudge or misinterpret something and add it to the
regression tracking. Looks like that happened here.

Sorry for that and the noise it caused!

#regzbot resolve: invalid: was not really a regression in the first place

Ciao, Thorsten

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] KVM: PPC: Book3S HV nestedv2: Cancel pending HDEC exception
  2024-04-05  5:54       ` Thorsten Leemhuis
@ 2024-04-08  5:20         ` Michael Ellerman
  2024-04-10  4:28           ` Nicholas Piggin
  0 siblings, 1 reply; 7+ messages in thread
From: Michael Ellerman @ 2024-04-08  5:20 UTC (permalink / raw
  To: Thorsten Leemhuis, Nicholas Piggin, Vaibhav Jain, linuxppc-dev,
	kvm, kvm-ppc
  Cc: mikey, Linux kernel regressions list, sbhat, amachhiw,
	Jordan Niethe, gautam, David.Laight, kconsul,
	Vaidyanathan Srinivasan

Thorsten Leemhuis <regressions@leemhuis.info> writes:
> On 05.04.24 05:20, Michael Ellerman wrote:
>> "Linux regression tracking (Thorsten Leemhuis)"
>> <regressions@leemhuis.info> writes:
>>> Hi, Thorsten here, the Linux kernel's regression tracker. Top-posting
>>> for once, to make this easily accessible to everyone.
>>>
>>> Was this regression ever resolved? Doesn't look like it, but maybe I
>>> just missed something.
>> 
>> I'm not sure how it ended up on the regression list.
>
> That is easy to explain: I let lei search for mails containing words
> like regress, bisect, and revert to become aware of regressions that
> might need tracking. And...
>
>> IMHO it's not really a regression.
>
> ...sometimes I misjudge or misinterpret something and add it to the
> regression tracking. Looks like that happened here.
>
> Sorry for that and the noise it caused!

No worries.

cheers

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] KVM: PPC: Book3S HV nestedv2: Cancel pending HDEC exception
  2024-04-08  5:20         ` Michael Ellerman
@ 2024-04-10  4:28           ` Nicholas Piggin
  0 siblings, 0 replies; 7+ messages in thread
From: Nicholas Piggin @ 2024-04-10  4:28 UTC (permalink / raw
  To: Michael Ellerman, Thorsten Leemhuis, Vaibhav Jain, linuxppc-dev,
	kvm, kvm-ppc
  Cc: mikey, Linux kernel regressions list, sbhat, amachhiw,
	Jordan Niethe, gautam, David.Laight, kconsul,
	Vaidyanathan Srinivasan

On Mon Apr 8, 2024 at 3:20 PM AEST, Michael Ellerman wrote:
> Thorsten Leemhuis <regressions@leemhuis.info> writes:
> > On 05.04.24 05:20, Michael Ellerman wrote:
> >> "Linux regression tracking (Thorsten Leemhuis)"
> >> <regressions@leemhuis.info> writes:
> >>> Hi, Thorsten here, the Linux kernel's regression tracker. Top-posting
> >>> for once, to make this easily accessible to everyone.
> >>>
> >>> Was this regression ever resolved? Doesn't look like it, but maybe I
> >>> just missed something.
> >> 
> >> I'm not sure how it ended up on the regression list.
> >
> > That is easy to explain: I let lei search for mails containing words
> > like regress, bisect, and revert to become aware of regressions that
> > might need tracking. And...
> >
> >> IMHO it's not really a regression.
> >
> > ...sometimes I misjudge or misinterpret something and add it to the
> > regression tracking. Looks like that happened here.
> >
> > Sorry for that and the noise it caused!
>
> No worries.

It is actually a regression. It only really affects performance,
but the logic is broken (my fault). We were going to revert it, and
solve the initial regression a better way.

Thanks,
Nick

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2024-04-10  4:29 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-03-13  7:26 [PATCH] KVM: PPC: Book3S HV nestedv2: Cancel pending HDEC exception Vaibhav Jain
2024-03-20 13:43 ` Nicholas Piggin
2024-04-04  8:35   ` Linux regression tracking (Thorsten Leemhuis)
2024-04-05  3:20     ` Michael Ellerman
2024-04-05  5:54       ` Thorsten Leemhuis
2024-04-08  5:20         ` Michael Ellerman
2024-04-10  4:28           ` Nicholas Piggin

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).