From: weilin.wang@intel.com
To: weilin.wang@intel.com, Namhyung Kim <namhyung@kernel.org>,
	Ian Rogers <irogers@google.com>,
	Arnaldo Carvalho de Melo <acme@kernel.org>,
	Peter Zijlstra <peterz@infradead.org>,
	Ingo Molnar <mingo@redhat.com>,
	Alexander Shishkin <alexander.shishkin@linux.intel.com>,
	Jiri Olsa <jolsa@kernel.org>,
	Adrian Hunter <adrian.hunter@intel.com>,
	Kan Liang <kan.liang@linux.intel.com>
Cc: linux-perf-users@vger.kernel.org, linux-kernel@vger.kernel.org,
	Perry Taylor <perry.taylor@intel.com>,
	Samantha Alt <samantha.alt@intel.com>,
	Caleb Biggers <caleb.biggers@intel.com>
Subject: [RFC PATCH v8 0/7] TPEBS counting mode support
Date: Wed, 15 May 2024 01:44:22 -0400
Message-ID: <20240515054443.2824147-1-weilin.wang@intel.com>

From: Weilin Wang <weilin.wang@intel.com>


Changes in v8:
- In this revision, the code is rebased on Ian's patch for the R modifier
parser: https://lore.kernel.org/lkml/20240428053616.1125891-3-irogers@google.com/
With this change, no special code for the R modifier is required in
metricgroup.c and metricgroup.h.

Caveat of this change:
  Ideally, evsel needs special handling to skip counting events that carry the R
modifier. This is not implemented yet, so an event with :R is both counted and
sampled. A metric formula that uses retire_latency usually already counts the
same event, so we end up counting that event twice. This should be handled
properly once we finalize the design of R modifier support in evsel.

- Move TPEBS-specific code out of the main perf stat code into separate files,
util/intel-tpebs.c and util/intel-tpebs.h. [Namhyung]
- Use --control:fifo to ack perf stat from the forked perf record instead of sleep(2) [Namhyung]
- Add introductions to TPEBS and the R modifier to the documentation. [Namhyung]
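
The ack-instead-of-sleep idea above can be sketched in C as follows. This is
only an illustration, not the code from the patch: wait_for_ack() is a
hypothetical helper, and the "ack\n" reply string is an assumption about what
the peer writes to the ack FIFO. The waiter blocks on a read until the peer
reports that it is ready, rather than sleeping for a fixed time and hoping
the peer has started.

```c
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper: block until the peer writes one line to fd and
 * check that the line is "ack\n" (assumed reply string).
 * Returns 0 on ack, -1 on EOF, read error, or any other reply.
 * A production version would also retry reads interrupted by signals.
 */
static int wait_for_ack(int fd)
{
	char buf[8];
	size_t len = 0;

	/* Read one byte at a time so nothing past the newline is consumed. */
	while (len < sizeof(buf) - 1) {
		ssize_t n = read(fd, buf + len, 1);

		if (n <= 0)
			return -1;
		if (buf[len++] == '\n')
			break;
	}
	buf[len] = '\0';
	return strcmp(buf, "ack\n") == 0 ? 0 : -1;
}
```

The caller would open the read end of the ack FIFO, pass the descriptor to
wait_for_ack(), and continue only when it returns 0.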


Changes in v7:
- Update code and comments for better code quality [Namhyung]
- Add a separate commit for perf data [Namhyung]
- Update retire latency print function to improve alignment [Namhyung]

Changes in v6:
- Update code and add comments for better code quality [Namhyung]
- Remove the added fd var and directly pass the opened fd to data.file.fd [Namhyung]
- Add kill() to stop perf record when perf stat exits early [Namhyung]
- Add a command option check so that perf record is started only when -a/-C is given [Namhyung]
- Squash commits [Namhyung]

Changes in v5:
- Update code and add comments for better code quality [Ian]

Changes in v4:
- Remove unnecessary debug prints and update code and comments for better
readability and quality [Namhyung]
- Update mtl metric json file with consistent TmaL1 and TopdownL1 metricgroup

Changes in v3:
- Remove ':' when event name has '@' [Ian]
- Use 'R' as the modifier instead of "retire_latency" [Ian]

Changes in v2:
- Add MTL metric file
- Add more description and an example to the patch [Arnaldo]

Here is an example of running perf stat to collect a metric that uses the
retire_latency value of event MEM_INST_RETIRED.STLB_HIT_STORES on an MTL system.

In this simple example, there are no MEM_INST_RETIRED.STLB_HIT_STORES samples.
Therefore, the MEM_INST_RETIRED.STLB_HIT_STORES:p count and retire_latency value
are both 0.

./perf stat -M tma_dtlb_store -a -- sleep 1

[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.000 MB - ]

 Performance counter stats for 'system wide':

       181,047,168      cpu_core/TOPDOWN.SLOTS/          #      0.6 %  tma_dtlb_store
         3,195,608      cpu_core/topdown-retiring/
        40,156,649      cpu_core/topdown-mem-bound/
         3,550,925      cpu_core/topdown-bad-spec/
       117,571,818      cpu_core/topdown-fe-bound/
        57,118,087      cpu_core/topdown-be-bound/
            69,179      cpu_core/EXE_ACTIVITY.BOUND_ON_STORES/
             4,582      cpu_core/MEM_INST_RETIRED.STLB_HIT_STORES/
        30,183,104      cpu_core/CPU_CLK_UNHALTED.DISTRIBUTED/
        30,556,790      cpu_core/CPU_CLK_UNHALTED.THREAD/
           168,486      cpu_core/DTLB_STORE_MISSES.WALK_ACTIVE/
              0.00 MEM_INST_RETIRED.STLB_HIT_STORES:p       0        0

       1.003105924 seconds time elapsed

v1:
TPEBS is one of the features provided by the next generation of Intel PMU.
Please refer to Section 8.4.1 of "Intel® Architecture Instruction Set Extensions
Programming Reference" [1] for more details about this feature.

This set of patches supports TPEBS in counting mode. The code works as
follows: perf stat forks a perf record process when the retire_latency of one
or more events is used in a metric formula. Perf stat sends a SIGTERM signal
to perf record before it needs the retire latency value for metric
calculation. Perf stat then processes the sample data to extract the retire
latency values for metric calculation. Currently, the code uses the
arithmetic average of the retire latency values.
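
The flow described above can be sketched in C roughly as follows. This is a
minimal illustration under stated assumptions, not the code in
util/intel-tpebs.c: start_sampler(), stop_sampler() and mean_retire_latency()
are hypothetical names, the child only idles where the real code would run
perf record, and the latency samples are passed in as a plain array instead
of being parsed from perf record output.

```c
#include <signal.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork a child that stands in for perf record. */
static pid_t start_sampler(void)
{
	pid_t pid = fork();

	if (pid == 0) {
		/* Child: the real code would exec perf record here. */
		for (;;)
			pause();
	}
	return pid; /* > 0 in the parent, -1 if fork() failed */
}

/* Stop the sampler with SIGTERM and reap it. Returns 0 on success. */
static int stop_sampler(pid_t pid)
{
	int status;

	if (kill(pid, SIGTERM) < 0)
		return -1;
	if (waitpid(pid, &status, 0) < 0)
		return -1;
	return 0;
}

/* Arithmetic mean of the sampled retire latencies; 0 when there are none. */
static double mean_retire_latency(const uint64_t *lat, size_t n)
{
	uint64_t sum = 0;

	if (n == 0)
		return 0.0;
	for (size_t i = 0; i < n; i++)
		sum += lat[i];
	return (double)sum / (double)n;
}
```

Returning 0 for an empty sample set mirrors the example output above, where
no MEM_INST_RETIRED.STLB_HIT_STORES sample was captured and the reported
retire latency is 0.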

[1] https://www.intel.com/content/www/us/en/content-details/812218/intel-architecture-instruction-set-extensions-programming-reference.html?wapkw=future%20features

Weilin Wang (7):
  perf Document: Add TPEBS to Documents
  perf data: Allow to use given fd in data->file.fd
  perf stat: Fork and launch perf record when perf stat needs to get
    retire latency value for a metric.
  perf stat: Add retire latency values into the expr_parse_ctx to
    prepare for final metric calculation
  perf stat: Add retire latency print functions to print out at the very
    end of print out
  perf vendor events intel: Add MTL metric json files
  perf stat: Skip read retire_lat counters and plugin retire_lat data
    from sampled data

 tools/perf/Documentation/perf-list.txt        |    1 +
 tools/perf/Documentation/topdown.txt          |   18 +
 tools/perf/arch/x86/util/evlist.c             |    6 +
 tools/perf/builtin-stat.c                     |   19 +
 .../arch/x86/meteorlake/metricgroups.json     |  127 +
 .../arch/x86/meteorlake/mtl-metrics.json      | 2551 +++++++++++++++++
 tools/perf/util/Build                         |    1 +
 tools/perf/util/data.c                        |    7 +-
 tools/perf/util/evsel.h                       |    5 +
 tools/perf/util/intel-tpebs.c                 |  285 ++
 tools/perf/util/intel-tpebs.h                 |   29 +
 tools/perf/util/stat-display.c                |   74 +
 tools/perf/util/stat-shadow.c                 |   23 +
 tools/perf/util/stat.h                        |    3 +
 14 files changed, 3148 insertions(+), 1 deletion(-)
 create mode 100644 tools/perf/pmu-events/arch/x86/meteorlake/metricgroups.json
 create mode 100644 tools/perf/pmu-events/arch/x86/meteorlake/mtl-metrics.json
 create mode 100644 tools/perf/util/intel-tpebs.c
 create mode 100644 tools/perf/util/intel-tpebs.h

--
2.43.0

