Linux Kernel Mentees Archive mirror
 help / color / mirror / Atom feed
From: Marco Elver <elver@google.com>
To: Juntong Deng <juntong.deng@outlook.com>
Cc: glider@google.com, dvyukov@google.com, akpm@linux-foundation.org,
	 kasan-dev@googlegroups.com, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org,
	 linux-kernel-mentees@lists.linuxfoundation.org
Subject: Re: [PATCH] kfence: Replace local_clock() with ktime_get_boot_fast_ns()
Date: Thu, 23 Nov 2023 10:42:48 +0100	[thread overview]
Message-ID: <CANpmjNNHe7YnA_n_Ek4_NJuq99jOH5PJfWtAkc5u8fMNJpFOSA@mail.gmail.com> (raw)
In-Reply-To: <VI1P193MB0752D8881930F88BACFB56A499B9A@VI1P193MB0752.EURP193.PROD.OUTLOOK.COM>

On Thu, 23 Nov 2023 at 10:29, Juntong Deng <juntong.deng@outlook.com> wrote:
>
> On 2023/11/23 6:19, Marco Elver wrote:
> > On Wed, 22 Nov 2023 at 22:36, Juntong Deng <juntong.deng@outlook.com> wrote:
> >>
> >> On 2023/11/23 4:35, Marco Elver wrote:
> >>> On Wed, 22 Nov 2023 at 21:01, Juntong Deng <juntong.deng@outlook.com> wrote:
> >>>>
> >>>> The time obtained by local_clock() is the local CPU time, which may
> >>>> drift between CPUs and is not suitable for comparison across CPUs.
> >>>>
> >>>> It is possible for allocation and free to occur on different CPUs,
> >>>> and using local_clock() to record timestamps may cause confusion.
> >>>
> >>> The same problem exists with printk logging.
> >>>
> >>>> ktime_get_boot_fast_ns() is based on clock sources and can be used
> >>>> reliably and accurately for comparison across CPUs.
> >>>
> >>> You may be right here, however, the choice of local_clock() was
> >>> deliberate: it's the same timestamp source that printk uses.
> >>>
> >>> Also, on systems where there is drift, the arch selects
> >>> CONFIG_HAVE_UNSTABLE_SCHED_CLOCK (like on x86) and the drift is
> >>> generally bounded.
> >>>
> >>>> Signed-off-by: Juntong Deng <juntong.deng@outlook.com>
> >>>> ---
> >>>>    mm/kfence/core.c | 2 +-
> >>>>    1 file changed, 1 insertion(+), 1 deletion(-)
> >>>>
> >>>> diff --git a/mm/kfence/core.c b/mm/kfence/core.c
> >>>> index 3872528d0963..041c03394193 100644
> >>>> --- a/mm/kfence/core.c
> >>>> +++ b/mm/kfence/core.c
> >>>> @@ -295,7 +295,7 @@ metadata_update_state(struct kfence_metadata *meta, enum kfence_object_state nex
> >>>>           track->num_stack_entries = num_stack_entries;
> >>>>           track->pid = task_pid_nr(current);
> >>>>           track->cpu = raw_smp_processor_id();
> >>>> -       track->ts_nsec = local_clock(); /* Same source as printk timestamps. */
> >>>> +       track->ts_nsec = ktime_get_boot_fast_ns();
> >>>
> >>> You have ignored the comment placed here - now it's no longer the same
> >>> source as printk timestamps. I think not being able to correlate
> >>> information from KFENCE reports with timestamps in lines from printk
> >>> is worse.
> >>>
> >>> For now, I have to Nack: Unless you can prove that
> >>> ktime_get_boot_fast_ns() can still be correlated with timestamps from
> >>> printk timestamps, I think this change only trades one problem for
> >>> another.
> >>>
> >>> Thanks,
> >>> -- Marco
> >>
> >> Honestly, the possibility of accurately matching a message in the printk
> >> log by the timestamp in the kfence report is very low, since allocation
> >> and free do not directly correspond to a certain event.
> >
> > It's about being able to compare the timestamps. I don't want to match
> > an exact event, but be able to figure out which event happened
> > before/after an allocation or free, i.e. the logical ordering of
> > events.
> >
> > With CONFIG_PRINTK_CALLER we can see the CPU ID in printk lines and
> > are therefore able to accurately compare printk lines with information
> > given by KFENCE alloc/free info.
> >
>
>
> That makes sense.
>
>
> >> Since time drifts across CPUs, timestamps may be different even if
> >> allocation and free can correspond to a certain event.
> >
> > This is not a problem with CONFIG_PRINTK_CALLER.
> >
> >> If we really need to find the relevant printk logs by the timestamps in
> >> the kfence report, all we can do is to look for messages that are within
> >> a certain time range.
> >>
> >> If we are looking for messages in a certain time range, there is not
> >> much difference between local_clock() and ktime_get_boot_fast_ns().
> >>
> >> Also, this patch is in preparation for my next patch.
> >>
> >> My next patch is to show the PID, CPU number, and timestamp when the
> >> error occurred, in this case time drift from different CPUs can
> >> cause confusion.
> >
> > It's not quite clear how there's a dependency between this patch and a
> > later patch, but generally it's good practice to send related patches
> > as a patch series. That way it's easier to see what the overall
> > changes are and provide feedback as a whole - as is, it's difficult to
> > provide feedback.
> >
> > However, from what you say this information is already given.
> > dump_stack_print_info() shows this - e.g this bit here is printed by
> > where the error occurred:
> >
> > | CPU: 0 PID: 484 Comm: kunit_try_catch Not tainted 5.13.0-rc3+ #7
> > | Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.14.0-2
> > 04/01/2014
> >
> > And if the printk log has timestamps, then these lines are prefixed
> > with the timestamp where the error occurred.
> >
>
>
> Thanks, I found that information.
>
> Since this information is at the bottom of the report, I had previously
> ignored them.
>
> I would suggest considering moving this information to the top of
> the report, for example
>
> BUG: KFENCE: out-of-bounds read in test_out_of_bounds_read+0xa6/0x234
>
> CPU: 0 PID: 484 Comm: kunit_try_catch Not tainted 5.13.0-rc3+ #7
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.14.0-2
> 04/01/2014
>
> Out-of-bounds read at 0xffff8c3f2e291fff (1B left of kfence-#72):
> ...
>
> This more clearly correlates this information with the occurrence of
> the error.

Most kernel warnings/bugs/etc. show this information at the bottom of
the report, hence KFENCE also showing it there. If you look at
kfence_report_error() where it prints this info, there is a mode where
KFENCE also dumps all registers via show_regs(). show_regs() itself
displays this information at the bottom as well, but showing a dump of
registers at the start of the KFENCE report is pretty distracting and
looks ugly.

The placement of this information is not the best, but at the same
time I found it to be the least bad compromise (when also considering
the mode where it dumps registers). We could of course untangle some
of these functions and e.g. have a show_regs() version that doesn't
show that info, but I find that to add more interfaces to the kernel
with unclear gains - overall probably not worth the time effort.

At least that's the reasoning for why things are the way they are
today. If there is an easier way I missed, any clear improvements are
of course welcome.

Thanks,
-- Marco

      reply	other threads:[~2023-11-23  9:43 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-11-22 20:00 [PATCH] kfence: Replace local_clock() with ktime_get_boot_fast_ns() Juntong Deng
2023-11-22 20:35 ` Marco Elver
2023-11-22 21:36   ` Juntong Deng
2023-11-22 22:19     ` Marco Elver
2023-11-23  9:29       ` Juntong Deng
2023-11-23  9:42         ` Marco Elver [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CANpmjNNHe7YnA_n_Ek4_NJuq99jOH5PJfWtAkc5u8fMNJpFOSA@mail.gmail.com \
    --to=elver@google.com \
    --cc=akpm@linux-foundation.org \
    --cc=dvyukov@google.com \
    --cc=glider@google.com \
    --cc=juntong.deng@outlook.com \
    --cc=kasan-dev@googlegroups.com \
    --cc=linux-kernel-mentees@lists.linuxfoundation.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).