From: Arnaldo Carvalho de Melo <acme@kernel.org>
To: Namhyung Kim <namhyung@kernel.org>
Cc: Ian Rogers <irogers@google.com>,
	Kan Liang <kan.liang@linux.intel.com>,
	Jiri Olsa <jolsa@kernel.org>,
	Adrian Hunter <adrian.hunter@intel.com>,
	Peter Zijlstra <peterz@infradead.org>,
	Ingo Molnar <mingo@kernel.org>,
	LKML <linux-kernel@vger.kernel.org>,
	linux-perf-users@vger.kernel.org
Subject: Re: [PATCH 4/6] perf annotate-data: Check memory access with two registers
Date: Sat, 4 May 2024 15:26:58 -0300
Message-ID: <ZjZ98gLSmr0qXih2@x1>
In-Reply-To: <CAM9d7cg_YL1x8YfJ5+7+o+0dccFJJxUye8L_FLrgdGeAh81LBA@mail.gmail.com>

On Thu, May 02, 2024 at 11:14:50AM -0700, Namhyung Kim wrote:
> On Thu, May 2, 2024 at 7:05 AM Arnaldo Carvalho de Melo <acme@kernel.org> wrote:
> >
> > On Wed, May 01, 2024 at 11:00:09PM -0700, Namhyung Kim wrote:
> > > The following instruction pattern is used to access a global variable.
> > >
> > >   mov     $0x231c0, %rax
> > >   movslq  %edi, %rcx
> > >   mov     -0x7dc94ae0(,%rcx,8), %rcx
> > >   cmpl    $0x0, 0xa60(%rcx,%rax,1)     <<<--- here
> > >
> > > The first instruction set the address of the per-cpu variable (here, it
> > > is 'runqueus' of struct rq).  The second instruction seems like a cpu
> >
> > You mean 'runqueues', i.e. this one:
> >
> > kernel/sched/core.c
> > DEFINE_PER_CPU_SHARED_ALIGNED(struct rq, runqueues);
> >
> > ?
> 
> Right, sorry for the typo.
> 
> >
> > But that 0xa60 would be in an alignment hole, at least in:
> >
> > $ pahole --hex rq | egrep 0xa40 -A12
> >         struct mm_struct *         prev_mm;              /* 0xa40   0x8 */
> >         unsigned int               clock_update_flags;   /* 0xa48   0x4 */
> >
> >         /* XXX 4 bytes hole, try to pack */
> >
> >         u64                        clock;                /* 0xa50   0x8 */
> >
> >         /* XXX 40 bytes hole, try to pack */
> >
> >         /* --- cacheline 42 boundary (2688 bytes) --- */
> >         u64                        clock_task __attribute__((__aligned__(64))); /* 0xa80   0x8 */
> >         u64                        clock_pelt;           /* 0xa88   0x8 */
> >         long unsigned int          lost_idle_time;       /* 0xa90   0x8 */
> > $ uname -a
> > Linux toolbox 6.7.11-200.fc39.x86_64 #1 SMP PREEMPT_DYNAMIC Wed Mar 27 16:50:39 UTC 2024 x86_64 GNU/Linux
> > $
> 
> This would differ depending on the kernel version, config, and
> other changes like backports or local modifications.
> 
> On my system, it was cpu_stop_work.arg.

Sure, so please include the pahole output for the data that led you to
the conclusions in the explanation of the results obtained, so that we
can have a better mental map of all the pieces, be convinced of the
results, and have a way to try to reproduce them on our systems.

In the future we will be grateful for this effort when looking back at
these patches :-)

Thanks for all your work in these features!

- Arnaldo
 
> $ pahole --hex rq | grep 0xa40 -C1
>     /* --- cacheline 41 boundary (2624 bytes) --- */
>     struct cpu_stop_work       active_balance_work;  /* 0xa40  0x30 */
>     int                        cpu;                  /* 0xa70   0x4 */
> 
> $ pahole --hex cpu_stop_work
> struct cpu_stop_work {
>     struct list_head           list;                 /*     0  0x10 */
>     cpu_stop_fn_t              fn;                   /*  0x10   0x8 */
>     long unsigned int          caller;               /*  0x18   0x8 */
>     void *                     arg;                  /*  0x20   0x8 */
>     struct cpu_stop_done *     done;                 /*  0x28   0x8 */
> 
>     /* size: 48, cachelines: 1, members: 5 */
>     /* last cacheline: 48 bytes */
> };
> 
> 
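The lookup behind that pahole output can be sketched as below. This is
only an illustration: the member tables are copied by hand from the
output quoted above, and find_member() is a hypothetical helper, not
anything in perf or pahole.

```python
# Member layouts copied from the pahole output quoted above:
# (name, offset, size). Only the members near 0xa60 are listed for rq.
RQ_MEMBERS = [
    ("active_balance_work", 0xa40, 0x30),
    ("cpu", 0xa70, 0x4),
]
CPU_STOP_WORK_MEMBERS = [
    ("list", 0x00, 0x10),
    ("fn", 0x10, 0x8),
    ("caller", 0x18, 0x8),
    ("arg", 0x20, 0x8),
    ("done", 0x28, 0x8),
]


def find_member(members, offset):
    """Return (name, offset inside that member), or None if offset is in a hole."""
    for name, start, size in members:
        if start <= offset < start + size:
            return name, offset - start
    return None


# Resolve 0xa60 in struct rq, then descend into the nested struct.
outer, inner_off = find_member(RQ_MEMBERS, 0xa60)
inner, rest = find_member(CPU_STOP_WORK_MEMBERS, inner_off)
path = f"{outer}.{inner}"
print(path, hex(inner_off), rest)
```

With the 6.7.11 layout quoted earlier (clock at 0xa50 with size 0x8,
clock_task at 0xa80) the same lookup returns None for 0xa60, i.e. a hole.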
> >
> > The paragraph then reads:
> >
> > ----
> > The first instruction sets the address of the per-cpu variable (here, it
> > is 'runqueues' of type 'struct rq').  The second instruction seems to get
> > the cpu number for the per-cpu base.  The third instruction gets the base
> > offset of the per-cpu area for that cpu.  The last instruction compares
> > the value of the per-cpu variable at the offset of 0xa60 with zero.
> > ----
> >
> > Ok?
> 
> Yep, looks good.
> 
> Thanks,
> Namhyung
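
For reference, the address arithmetic in that four-instruction sequence
can be sketched as follows. It only models the x86 AT&T operand form
disp(base,index,scale); the cpu number and the per-cpu base value are
made-up placeholders, and effective_address() is a hypothetical helper.

```python
MASK64 = (1 << 64) - 1


def effective_address(disp, base=0, index=0, scale=1):
    """AT&T operand disp(base,index,scale): disp + base + index*scale, wrapped to 64 bits."""
    return (disp + base + index * scale) & MASK64


RUNQUEUES_OFFSET = 0x231c0  # mov $0x231c0, %rax
cpu = 3                     # assumed cpu number, sign-extended from %edi into %rcx

# mov -0x7dc94ae0(,%rcx,8), %rcx: reads the 8-byte table slot for this cpu.
slot_addr = effective_address(-0x7dc94ae0, index=cpu, scale=8)

# Assumed value loaded from that slot: the per-cpu area base for this cpu.
percpu_base = 0xffff888100000000

# cmpl $0x0, 0xa60(%rcx,%rax,1): base and index together give this cpu's
# copy of the variable, so the displacement is the offset inside struct rq.
field_addr = effective_address(0xa60, base=percpu_base, index=RUNQUEUES_OFFSET)

print(hex(slot_addr), hex(field_addr))
print(hex(field_addr - (percpu_base + RUNQUEUES_OFFSET)))
```

The point of the sketch is that neither register alone identifies the
variable: only the sum of the two yields the address of 'runqueues' for
that cpu, which leaves 0xa60 as the offset to look up in the type.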
