From: David Woodhouse <dwmw2@infradead.org>
To: "9377995a-26a4-2523-e421-be1cd92bdc34@oracle.com"
<9377995a-26a4-2523-e421-be1cd92bdc34@oracle.com>,
"dongli.zhang@oracle.com" <dongli.zhang@oracle.com>
Cc: "corbet@lwn.net" <corbet@lwn.net>,
"kvm@vger.kernel.org" <kvm@vger.kernel.org>,
"shuah@kernel.org" <shuah@kernel.org>,
"dave.hansen@linux.intel.com" <dave.hansen@linux.intel.com>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
"seanjc@google.com" <seanjc@google.com>,
"mingo@redhat.com" <mingo@redhat.com>,
"pbonzini@redhat.com" <pbonzini@redhat.com>,
"tglx@linutronix.de" <tglx@linutronix.de>,
"linux-doc@vger.kernel.org" <linux-doc@vger.kernel.org>,
"hpa@zytor.com" <hpa@zytor.com>, "paul@xen.org" <paul@xen.org>,
"bp@alien8.de" <bp@alien8.de>,
"linux-kselftest@vger.kernel.org"
<linux-kselftest@vger.kernel.org>,
"Allister, Jack" <jalliste@amazon.co.uk>,
"x86@kernel.org" <x86@kernel.org>
Subject: Re: [PATCH 2/2] KVM: selftests: Add KVM/PV clock selftest to prove timer drift correction
Date: Thu, 11 Apr 2024 14:28:37 +0100
Message-ID: <f6c9c757a1f4eb2d3c7ce06d27827e099923ef6b.camel@infradead.org>
In-Reply-To: <4f1ca4e1a8a9a31eae8057f9a813fc13d3172f77.camel@amazon.co.uk>
On Wed, 2024-04-10 at 10:15 +0000, Allister, Jack wrote:
> > AFAIR, I copied check_clocksource() from existing code during that
> > time.
>
> > The commit e440c5f2e ("KVM: selftests: Generalize check_clocksource()
> > from kvm_clock_test") has introduced sys_clocksource_is_tsc(). Later
> > it is renamed to sys_clocksource_is_based_on_tsc().
> > Any chance to re-use sys_clocksource_is_based_on_tsc()?
>
> Yes I'm more than happy to change it to that. I was using your original
> mail as a reference and did not realise there was a utility present for
> this.
>
> > Is configure_scaled_tsc() necessary? Or how about making it an
> > option/arg?
> > Then I will be able to test it on a VM/server without TSC scaling.
>
> So with TSC scaling from 3GHz (host) -> 1.5GHz (guest) I do see a skew of
> ~3500ns after the update. Whereas without scaling a delta can still be
> seen, but it is only ~180ns.

I don't think it's as simple as "TSC scaling makes the drift larger".
I suspect that's just the way the arithmetic precision works out for
those frequencies. With other frequencies of host and guest you might
find that it works out closer *with* the scaling.

Consider a graph of "time" on the Y axis, against the host TSC on the X
axis. As an example, let's assume the host has a TSC frequency of 3GHz.
Each of the three definitions of the KVM clock (A based on
CLOCK_MONOTONIC_RAW, B based on the guest TSC, C based directly on the
host TSC) will have a gradient of *roughly* 1 ns per three ticks.

Due to arithmetic precision, the gradient of each is going to vary
slightly. We hope that CLOCK_MONOTONIC_RAW is going to do the best, as
the other two are limited by the precision of the pvclock ABI that's
exposed to the guest. You can use http://david.woodhou.se/tsdrift.c to
see where the latter two land, for different TSC frequencies.

$ ./tsdrift 2500000000 3000000000 | tail -1
TSC 259200000000000, guest TSC 215999999979883, guest ns 86399999971836 host ns 86399999979883 (delta -8047)

$ ./tsdrift 2700000000 3000000000 | tail -1
TSC 259200000000000, guest TSC 233279999975860, guest ns 86399999983012 host ns 86399999979883 (delta 3129)

So after a day, let's assume CLOCK_MONOTONIC_RAW will have advanced by
86400 seconds. The KVM clock based on the host TSC will be 20µs slow,
while a KVM clock based on a guest TSC frequency of 2.5GHz would be an
*additional* 8µs slower. But a guest TSC frequency of 2.7GHz would
actually run *faster* than the host-based one, and would only be 17µs
behind reality.
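
For anyone who wants to poke at the arithmetic without building the C
program, here is a minimal Python sketch of the same kind of
calculation. It is not tsdrift.c; the helper names (time_scale,
pvclock_ns, scale_tsc, drift_after_one_day) are made up for
illustration. It assumes a 32-bit fractional TSC scaling ratio and a
simplified choice of the pvclock multiplier and shift, which may not
match every detail of what KVM does, although with these assumptions it
does land on the figures quoted above.

```python
NSEC_PER_SEC = 1_000_000_000
SECONDS_PER_DAY = 86400


def time_scale(hz):
    """Pick a pvclock-style (shift, mul) pair converting ticks at hz to ns.

    Simplified selection (an assumption, not KVM's exact routine): halve or
    double the frequency until it lies in (1e9, 2e9], so that the 32-bit
    multiplier keeps as many significant bits as possible.
    """
    base, shift = hz, 0
    while base > 2 * NSEC_PER_SEC:
        base >>= 1
        shift -= 1
    while base <= NSEC_PER_SEC:
        base <<= 1
        shift += 1
    return shift, (NSEC_PER_SEC << 32) // base


def pvclock_ns(tsc, shift, mul):
    """Convert a TSC delta to ns: shift first, then multiply and drop 32 bits."""
    tsc = tsc << shift if shift >= 0 else tsc >> -shift
    return (tsc * mul) >> 32


def scale_tsc(host_tsc, host_hz, guest_hz, frac_bits=32):
    """Scale host ticks to guest ticks with a truncated fixed-point ratio.

    frac_bits=32 is an assumption; the real width is hardware dependent.
    """
    ratio = (guest_hz << frac_bits) // host_hz
    return (host_tsc * ratio) >> frac_bits


def drift_after_one_day(guest_hz, host_hz):
    """Return (guest_tsc, guest_ns, host_ns, delta) after one day of host ticks."""
    host_tsc = host_hz * SECONDS_PER_DAY
    guest_tsc = scale_tsc(host_tsc, host_hz, guest_hz)
    guest_ns = pvclock_ns(guest_tsc, *time_scale(guest_hz))
    host_ns = pvclock_ns(host_tsc, *time_scale(host_hz))
    return guest_tsc, guest_ns, host_ns, guest_ns - host_ns


if __name__ == "__main__":
    for guest_hz in (2_500_000_000, 2_700_000_000):
        print(guest_hz, drift_after_one_day(guest_hz, 3_000_000_000))
```

The point to take from it is that every step truncates: the scaling
ratio, the multiplier, and the final shift. Which way the errors stack
up depends entirely on the two frequencies.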

Your test is measuring how *much* the host CLOCK_MONOTONIC_RAW (my
definition A) drifts from definition B which is derived from the guest
TSC.

It demonstrates the discontinuity that KVM_REQ_MASTERCLOCK_UPDATE
introduces, by clamping the KVM clock back to the 'definition A' line.

Fixing that is in the TODO list I shared. Basically it involves
realising that in use_master_clock mode, the delta between the KVM
clock and CLOCK_MONOTONIC_RAW (ka->kvmclock_offset) is *varying* over
time. So instead of just blindly using kvmclock_offset, we should
*recalculate* it in precisely the way that your KVM_SET_CLOCK_GUEST
does.
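
To make the discontinuity concrete, here is a toy model in Python. It
is only a sketch under stated assumptions: an unscaled 3GHz TSC, a
CLOCK_MONOTONIC_RAW that is taken to be exact, and a multiplier and
shift (0xAAAAAAAA, -1) chosen by hand for that frequency. The function
pvclock_read and the variable names are illustrative, and the
"recalculate" step is one plausible reading of the idea, not the
kernel code.

```python
MUL, SHIFT = 0xAAAAAAAA, -1      # assumed pvclock scale for a nominal 3GHz TSC
HOST_HZ = 3_000_000_000
DAY_NS = 86400 * 1_000_000_000


def pvclock_read(tsc, tsc_timestamp, system_time):
    """Guest-side read: base time plus the scaled TSC delta since the snapshot."""
    delta = (tsc - tsc_timestamp) >> -SHIFT
    return system_time + ((delta * MUL) >> 32)


# Initial snapshot: TSC 0 corresponds to CLOCK_MONOTONIC_RAW 0, with offset 0.
kvmclock_offset = 0
tsc_timestamp = 0
system_time = 0 + kvmclock_offset

# One day later a master clock update takes a fresh snapshot.
tsc_now = HOST_HZ * 86400
raw_now = DAY_NS                 # assumption: CLOCK_MONOTONIC_RAW is exact

# What the guest reads just before the update, using the old parameters.
guest_before = pvclock_read(tsc_now, tsc_timestamp, system_time)

# (1) Reuse kvmclock_offset blindly: the clock snaps back to definition A.
blind_after = pvclock_read(tsc_now, tsc_now, raw_now + kvmclock_offset)

# (2) Recalculate the offset so the new parameters continue from the value
#     the guest was already seeing.
new_offset = guest_before - raw_now
recalc_after = pvclock_read(tsc_now, tsc_now, raw_now + new_offset)

if __name__ == "__main__":
    print("jump when reusing the offset:", blind_after - guest_before, "ns")
    print("jump when recalculating:     ", recalc_after - guest_before, "ns")
    print("recalculated offset:         ", new_offset, "ns")
```

In this model the guest has fallen about 20µs behind after a day, so
reusing the old offset makes its clock leap forward by that amount,
while the recalculated offset absorbs the accumulated difference and
the guest sees no step.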

Having said all that... scaling from 3GHz to 1.5GHz *doesn't* lose any
precision; it shouldn't make any difference. But I guess your host TSC
isn't *really* 3GHz, it's measured against the PIT or something awful,
and comes out at a shade above or below 3GHz, leading to a more
interesting scaling factor?

> In V2 I've adjusted the test so that now by default scaling won't take
> place, however if someone wants to test with it enabled they can pass
> "-s/--scale-tsc" to induce the greater delta.

Please do it automatically based on the availability of the feature.