* [linus:master] [x86/syscall] 1e3ad78334: will-it-scale.per_process_ops 1.4% improvement
@ 2024-04-19  5:49 kernel test robot
  2024-04-19  7:33 ` Josh Poimboeuf
From: kernel test robot @ 2024-04-19  5:49 UTC
  To: Linus Torvalds
  Cc: oe-lkp, lkp, linux-kernel, Thomas Gleixner, Daniel Sneddon,
	Josh Poimboeuf, ying.huang, feng.tang, fengwei.yin

Hi Linus,

We noticed that commit 1e3ad78334a6 caused performance fluctuations in
various micro benchmarks. The perf stat metrics related to branch
instructions changed noticeably, which may be an expected result of
this commit. We are sending this report to share the data, in the hope
that it helps with assessing the overall impact or with any further
investigation. Thanks.

kernel test robot noticed a 1.4% improvement of will-it-scale.per_process_ops on:

commit: 1e3ad78334a69b36e107232e337f9d693dcc9df2 ("x86/syscall: Don't force use of indirect calls for system calls")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master

testcase: will-it-scale
test machine: 104 threads 2 sockets (Skylake) with 192G memory
parameters:

	nr_task: 16
	mode: process
	test: futex4
	cpufreq_governor: performance


In addition, the commit has a significant impact on the following tests:

+------------------+-------------------------------------------------------------------------------------------+
| testcase: change | stress-ng: stress-ng.null.ops_per_sec -4.0% regression                                    |
| test machine     | 64 threads 2 sockets Intel(R) Xeon(R) Gold 6346 CPU @ 3.10GHz (Ice Lake) with 256G memory |
| test parameters  | cpufreq_governor=performance                                                              |
|                  | nr_threads=100%                                                                           |
|                  | test=null                                                                                 |
|                  | testtime=60s                                                                              |
+------------------+-------------------------------------------------------------------------------------------+
| testcase: change | stress-ng: stress-ng.fpunch.ops_per_sec -1.6% regression                                  |
| test machine     | 64 threads 2 sockets Intel(R) Xeon(R) Gold 6346 CPU @ 3.10GHz (Ice Lake) with 256G memory |
| test parameters  | cpufreq_governor=performance                                                              |
|                  | disk=1HDD                                                                                 |
|                  | fs=ext4                                                                                   |
|                  | nr_threads=100%                                                                           |
|                  | test=fpunch                                                                               |
|                  | testtime=60s                                                                              |
+------------------+-------------------------------------------------------------------------------------------+
| testcase: change | unixbench: unixbench.throughput -1.4% regression                                          |
| test machine     | 64 threads 2 sockets Intel(R) Xeon(R) Gold 6346 CPU @ 3.10GHz (Ice Lake) with 256G memory |
| test parameters  | cpufreq_governor=performance                                                              |
|                  | nr_task=100%                                                                              |
|                  | runtime=300s                                                                              |
|                  | test=fsbuffer                                                                             |
+------------------+-------------------------------------------------------------------------------------------+
| testcase: change | will-it-scale: will-it-scale.per_process_ops -1.1% regression                             |
| test machine     | 64 threads 2 sockets Intel(R) Xeon(R) Gold 6346 CPU @ 3.10GHz (Ice Lake) with 256G memory |
| test parameters  | cpufreq_governor=performance                                                              |
|                  | mode=process                                                                              |
|                  | nr_task=100%                                                                              |
|                  | test=pread1                                                                               |
+------------------+-------------------------------------------------------------------------------------------+
| testcase: change | will-it-scale: will-it-scale.per_thread_ops -3.4% regression                              |
| test machine     | 64 threads 2 sockets Intel(R) Xeon(R) Gold 6346 CPU @ 3.10GHz (Ice Lake) with 256G memory |
| test parameters  | cpufreq_governor=performance                                                              |
|                  | mode=thread                                                                               |
|                  | nr_task=100%                                                                              |
|                  | test=poll1                                                                                |
+------------------+-------------------------------------------------------------------------------------------+

Details are below:

The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20240419/202404191333.178a0eed-yujie.liu@intel.com

=========================================================================================
compiler/cpufreq_governor/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase:
  gcc-13/performance/x86_64-rhel-8.3/process/16/debian-12-x86_64-20240206.cgz/lkp-skl-fpga01/futex4/will-it-scale

commit: 
  0cd01ac5dc ("x86/bugs: Change commas to semicolons in 'spectre_v2' sysfs file")
  1e3ad78334 ("x86/syscall: Don't force use of indirect calls for system calls")

0cd01ac5dcb1e18e 1e3ad78334a69b36e107232e337 
---------------- --------------------------- 
         %stddev     %change         %stddev
             \          |                \  
    860611            -1.4%     848885        proc-vmstat.numa_hit
    753301            -1.6%     741136        proc-vmstat.numa_local
  21797058            +1.4%   22102512        will-it-scale.16.processes
   1362315            +1.4%    1381406        will-it-scale.per_process_ops
  21797058            +1.4%   22102512        will-it-scale.workload
      0.04 ±  7%      -7.4%       0.04        perf-stat.i.MPKI
  1.98e+09           +19.2%   2.36e+09        perf-stat.i.branch-instructions
      1.47            -1.2        0.30        perf-stat.i.branch-miss-rate%
  30820475           -70.4%    9118612        perf-stat.i.branch-misses
      3.45            -4.4%       3.30        perf-stat.i.cpi
 1.504e+10            +5.1%   1.58e+10        perf-stat.i.instructions
      0.29            +4.5%       0.31        perf-stat.i.ipc
      0.05 ±  2%      -4.2%       0.04        perf-stat.overall.MPKI
      1.56            -1.2        0.39        perf-stat.overall.branch-miss-rate%
      3.43            -4.3%       3.28        perf-stat.overall.cpi
      0.29            +4.5%       0.30        perf-stat.overall.ipc
    208138            +3.4%     215312        perf-stat.overall.path-length
 1.973e+09           +19.2%  2.353e+09        perf-stat.ps.branch-instructions
  30729762           -70.4%    9109071        perf-stat.ps.branch-misses
 1.499e+10            +5.1%  1.575e+10        perf-stat.ps.instructions
 4.537e+12            +4.9%  4.759e+12        perf-stat.total.instructions
     12.23            -0.6       11.60        perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.syscall
     10.09            -0.6        9.51        perf-profile.calltrace.cycles-pp.__x64_sys_futex.do_syscall_64.entry_SYSCALL_64_after_hwframe.syscall
     22.31            -0.4       21.88        perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.syscall
      9.25            +0.2        9.43        perf-profile.calltrace.cycles-pp.entry_SYSCALL_64.syscall
      8.79            +0.2        9.02        perf-profile.calltrace.cycles-pp.do_futex.__x64_sys_futex.do_syscall_64.entry_SYSCALL_64_after_hwframe.syscall
      7.13            +0.2        7.36        perf-profile.calltrace.cycles-pp.__futex_wait.futex_wait.do_futex.__x64_sys_futex.do_syscall_64
      8.37            +0.3        8.63        perf-profile.calltrace.cycles-pp.futex_wait.do_futex.__x64_sys_futex.do_syscall_64.entry_SYSCALL_64_after_hwframe
     12.38            -0.6       11.78        perf-profile.children.cycles-pp.do_syscall_64
     10.12            -0.5        9.57        perf-profile.children.cycles-pp.__x64_sys_futex
     22.63            -0.4       22.20        perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe
      0.48 ±  2%      -0.0        0.46        perf-profile.children.cycles-pp.get_futex_key
      0.00            +0.2        0.18 ±  2%  perf-profile.children.cycles-pp.x64_sys_call
      9.11            +0.2        9.29        perf-profile.children.cycles-pp.entry_SYSCALL_64
      8.88            +0.2        9.11        perf-profile.children.cycles-pp.do_futex
      7.13            +0.2        7.36        perf-profile.children.cycles-pp.__futex_wait
      8.43            +0.3        8.70        perf-profile.children.cycles-pp.futex_wait
      1.20            -0.7        0.47        perf-profile.self.cycles-pp.__x64_sys_futex
      1.46            -0.2        1.27        perf-profile.self.cycles-pp.do_syscall_64
      0.51            -0.1        0.44        perf-profile.self.cycles-pp.do_futex
      0.38 ±  5%      -0.1        0.32 ±  4%  perf-profile.self.cycles-pp.syscall_exit_to_user_mode
      0.48 ±  2%      -0.0        0.45        perf-profile.self.cycles-pp.get_futex_key
      0.00            +0.1        0.15 ±  2%  perf-profile.self.cycles-pp.x64_sys_call
      7.97            +0.1        8.12        perf-profile.self.cycles-pp.entry_SYSCALL_64
     10.43            +0.2       10.60        perf-profile.self.cycles-pp.entry_SYSCALL_64_after_hwframe
      0.72 ±  3%      +0.2        0.94 ±  3%  perf-profile.self.cycles-pp.__futex_wait
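
As a cross-check on the table above, the derived perf-stat rows can be
recomputed approximately from the raw counters. The sketch below is not part
of the robot's tooling: the helper names are hypothetical, and the formulas
(miss rate = branch-misses / branch-instructions, IPC = 1 / CPI, path-length =
total instructions / workload operations) are assumptions that agree with the
reported figures to within rounding.

```python
# Values copied from the will-it-scale futex4 table above (base -> patched).
# Helper names and formulas are illustrative assumptions, not lkp-tests code.

def pct_change(base, new):
    """Relative change in percent."""
    return (new - base) / base * 100.0

def branch_miss_rate(misses, branches):
    """Branch misses as a percentage of branch instructions."""
    return misses / branches * 100.0

def ipc_from_cpi(cpi):
    """Instructions per cycle is the reciprocal of cycles per instruction."""
    return 1.0 / cpi

def path_length(total_instructions, workload_ops):
    """Instructions retired per benchmark operation."""
    return total_instructions / workload_ops

# will-it-scale.per_process_ops and perf-stat.ps.branch-misses
print(f"per_process_ops change: {pct_change(1362315, 1381406):+.1f}%")
print(f"branch-misses change:   {pct_change(30729762, 9109071):+.1f}%")

# perf-stat.overall.branch-miss-rate% from the perf-stat.ps.* counters
print(f"miss rate base:    {branch_miss_rate(30729762, 1.973e9):.2f}")
print(f"miss rate patched: {branch_miss_rate(9109071, 2.353e9):.2f}")

# perf-stat.overall.ipc from perf-stat.overall.cpi
print(f"ipc base:    {ipc_from_cpi(3.43):.2f}")
print(f"ipc patched: {ipc_from_cpi(3.28):.2f}")

# perf-stat.overall.path-length from total instructions and workload
print(f"path-length base:    {path_length(4.537e12, 21797058):.0f}")
print(f"path-length patched: {path_length(4.759e12, 22102512):.0f}")
```

Read this way, the patched kernel retires more instructions per operation
(higher path-length) but with far fewer branch misses, which is consistent
with the slightly higher IPC and the 1.4% throughput gain in this test.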


***************************************************************************************************
lkp-icl-2sp8: 64 threads 2 sockets Intel(R) Xeon(R) Gold 6346 CPU @ 3.10GHz (Ice Lake) with 256G memory
=========================================================================================
compiler/cpufreq_governor/disk/fs/kconfig/nr_threads/rootfs/tbox_group/test/testcase/testtime:
  gcc-13/performance/1HDD/btrfs/x86_64-rhel-8.3/100%/debian-12-x86_64-20240206.cgz/lkp-icl-2sp8/utime/stress-ng/60s

commit: 
  0cd01ac5dc ("x86/bugs: Change commas to semicolons in 'spectre_v2' sysfs file")
  1e3ad78334 ("x86/syscall: Don't force use of indirect calls for system calls")

0cd01ac5dcb1e18e 1e3ad78334a69b36e107232e337 
---------------- --------------------------- 
         %stddev     %change         %stddev
             \          |                \  
    136026 ±  3%     +20.6%     164016 ± 11%  meminfo.DirectMap4k
 5.516e+10            +1.5%  5.598e+10        perf-stat.i.branch-instructions
 5.427e+10            +1.5%  5.508e+10        perf-stat.ps.branch-instructions
    137060 ± 23%     +35.5%     185722 ±  7%  numa-numastat.node0.local_node
     50345 ± 26%     -56.2%      22060 ± 77%  numa-numastat.node0.other_node
    289383 ±  9%     -17.6%     238445 ±  6%  numa-numastat.node1.local_node
     15965 ± 85%    +177.3%      44264 ± 38%  numa-numastat.node1.other_node
    136562 ± 23%     +35.6%     185165 ±  7%  numa-vmstat.node0.numa_local
     50345 ± 26%     -56.2%      22060 ± 77%  numa-vmstat.node0.numa_other
    288523 ±  9%     -17.7%     237526 ±  6%  numa-vmstat.node1.numa_local
     15965 ± 85%    +177.3%      44264 ± 38%  numa-vmstat.node1.numa_other
      1.71            -0.5        1.18        perf-profile.calltrace.cycles-pp.mnt_want_write.vfs_utimes.do_utimes.__x64_sys_utimensat.do_syscall_64
     43.01            -0.3       42.68        perf-profile.calltrace.cycles-pp.user_path_at_empty.do_utimes.__x64_sys_utimensat.do_syscall_64.entry_SYSCALL_64_after_hwframe
     23.61            -0.3       23.34        perf-profile.calltrace.cycles-pp.do_utimes.__x64_sys_utimensat.do_syscall_64.entry_SYSCALL_64_after_hwframe.utimensat
     26.52            -0.2       26.27        perf-profile.calltrace.cycles-pp.__x64_sys_utimensat.do_syscall_64.entry_SYSCALL_64_after_hwframe.utimensat
     16.22            -0.2       16.00        perf-profile.calltrace.cycles-pp.do_utimes.__x64_sys_utime.do_syscall_64.entry_SYSCALL_64_after_hwframe.syscall
     13.89            -0.2       13.68        perf-profile.calltrace.cycles-pp.user_path_at_empty.do_utimes.__x64_sys_utime.do_syscall_64.entry_SYSCALL_64_after_hwframe
     39.07            -0.2       38.87        perf-profile.calltrace.cycles-pp.do_utimes.__x64_sys_utimensat.do_syscall_64.entry_SYSCALL_64_after_hwframe
     16.75            -0.2       16.56        perf-profile.calltrace.cycles-pp.__x64_sys_utime.do_syscall_64.entry_SYSCALL_64_after_hwframe.syscall
     15.77            -0.2       15.58        perf-profile.calltrace.cycles-pp.filename_lookup.user_path_at_empty.do_utimes.__x64_sys_utimensat.do_syscall_64
     10.55            -0.2       10.37        perf-profile.calltrace.cycles-pp.getname_flags.user_path_at_empty.do_utimes.__x64_sys_utime.do_syscall_64
     13.78            -0.2       13.60        perf-profile.calltrace.cycles-pp.path_lookupat.filename_lookup.user_path_at_empty.do_utimes.__x64_sys_utimensat
      9.48            -0.2        9.31        perf-profile.calltrace.cycles-pp.strncpy_from_user.getname_flags.user_path_at_empty.do_utimes.__x64_sys_utime
     29.46            -0.1       29.32        perf-profile.calltrace.cycles-pp.utimensat
     25.18            -0.1       25.05        perf-profile.calltrace.cycles-pp.getname_flags.user_path_at_empty.do_utimes.__x64_sys_utimensat.do_syscall_64
     21.74            -0.1       21.62        perf-profile.calltrace.cycles-pp.strncpy_from_user.getname_flags.user_path_at_empty.do_utimes.__x64_sys_utimensat
     27.48            -0.1       27.35        perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.utimensat
     43.89            -0.1       43.77        perf-profile.calltrace.cycles-pp.__x64_sys_utimensat.do_syscall_64.entry_SYSCALL_64_after_hwframe
     17.24            -0.1       17.13        perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.syscall
     27.21            -0.1       27.11        perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.utimensat
     17.10            -0.1       17.00        perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.syscall
     18.02            -0.1       17.93        perf-profile.calltrace.cycles-pp.syscall
      3.82            -0.1        3.76        perf-profile.calltrace.cycles-pp.__check_object_size.strncpy_from_user.getname_flags.user_path_at_empty.do_utimes
      0.57            -0.0        0.54        perf-profile.calltrace.cycles-pp.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe
      1.61            -0.0        1.58        perf-profile.calltrace.cycles-pp.check_heap_object.__check_object_size.strncpy_from_user.getname_flags.user_path_at_empty
      2.91            -0.0        2.88        perf-profile.calltrace.cycles-pp.filename_lookup.user_path_at_empty.do_utimes.__x64_sys_utime.do_syscall_64
      2.43            -0.0        2.40        perf-profile.calltrace.cycles-pp.path_lookupat.filename_lookup.user_path_at_empty.do_utimes.__x64_sys_utime
     45.81            +0.1       45.96        perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe
     45.27            +0.2       45.45        perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe
     79.22            -0.7       78.54        perf-profile.children.cycles-pp.do_utimes
     57.10            -0.5       56.56        perf-profile.children.cycles-pp.user_path_at_empty
     70.66            -0.4       70.29        perf-profile.children.cycles-pp.__x64_sys_utimensat
     36.81            -0.3       36.49        perf-profile.children.cycles-pp.getname_flags
     31.75            -0.3       31.45        perf-profile.children.cycles-pp.strncpy_from_user
     20.12            -0.2       19.91        perf-profile.children.cycles-pp.filename_lookup
     17.70            -0.2       17.50        perf-profile.children.cycles-pp.path_lookupat
     16.79            -0.2       16.60        perf-profile.children.cycles-pp.__x64_sys_utime
     29.54            -0.1       29.40        perf-profile.children.cycles-pp.utimensat
     18.34            -0.1       18.25        perf-profile.children.cycles-pp.syscall
     19.31            -0.1       19.22        perf-profile.children.cycles-pp.vfs_utimes
      4.47            -0.1        4.40        perf-profile.children.cycles-pp.__check_object_size
      1.32            -0.1        1.26        perf-profile.children.cycles-pp.syscall_exit_to_user_mode
      3.38            -0.1        3.34        perf-profile.children.cycles-pp.walk_component
      2.56            -0.0        2.52        perf-profile.children.cycles-pp.lookup_fast
      2.08            -0.0        2.04        perf-profile.children.cycles-pp.__d_lookup_rcu
      2.33            -0.0        2.30        perf-profile.children.cycles-pp.check_heap_object
      2.44            -0.0        2.41        perf-profile.children.cycles-pp.complete_walk
      1.07            -0.0        1.05        perf-profile.children.cycles-pp.make_vfsuid
      1.30            -0.0        1.28        perf-profile.children.cycles-pp.path_put
      0.84            +0.0        0.88        perf-profile.children.cycles-pp.syscall_return_via_sysret
      0.00            +0.6        0.63        perf-profile.children.cycles-pp.x64_sys_call
     27.25            -0.2       27.02        perf-profile.self.cycles-pp.strncpy_from_user
      1.30            -0.1        1.22        perf-profile.self.cycles-pp.do_syscall_64
      0.24            -0.0        0.23        perf-profile.self.cycles-pp.may_setattr
      0.12            +0.0        0.15 ±  3%  perf-profile.self.cycles-pp.__x64_sys_utime
      0.84            +0.0        0.88        perf-profile.self.cycles-pp.syscall_return_via_sysret
      0.92            +0.1        1.04        perf-profile.self.cycles-pp.__x64_sys_utimensat
      0.00            +0.5        0.55        perf-profile.self.cycles-pp.x64_sys_call
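
A note on reading the %change column: rows whose change carries a "%" sign
appear to be relative changes, while rows without it (the perf-profile rows
and branch-miss-rate%) appear to be absolute differences in percentage
points. The sketch below illustrates the two conventions with values from the
table above. The function names are hypothetical, and this interpretation is
an assumption checked only against the numbers shown here.

```python
# Two ways a "change" can be reported; names are illustrative only.

def relative_change(base, new):
    """Relative change in percent, for rows whose change ends in '%'."""
    return (new - base) / base * 100.0

def point_delta(base, new):
    """Absolute difference, for metrics that are already percentages."""
    return new - base

# meminfo.DirectMap4k: a plain counter, reported as a relative change.
print(f"DirectMap4k:    {relative_change(136026, 164016):+.1f}%")

# perf-profile cycles-pp rows: already percentages, reported as a delta.
print(f"mnt_want_write: {point_delta(1.71, 1.18):+.1f}")
print(f"do_utimes:      {point_delta(79.22, 78.54):+.1f}")
```

The published deltas are presumably computed from unrounded samples, so a
value recomputed from the rounded columns can differ in the last digit.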



***************************************************************************************************
lkp-icl-2sp7: 64 threads 2 sockets Intel(R) Xeon(R) Gold 6346 CPU @ 3.10GHz (Ice Lake) with 256G memory
=========================================================================================
compiler/cpufreq_governor/kconfig/nr_threads/rootfs/tbox_group/test/testcase/testtime:
  gcc-13/performance/x86_64-rhel-8.3/100%/debian-12-x86_64-20240206.cgz/lkp-icl-2sp7/readahead/stress-ng/60s

commit: 
  0cd01ac5dc ("x86/bugs: Change commas to semicolons in 'spectre_v2' sysfs file")
  1e3ad78334 ("x86/syscall: Don't force use of indirect calls for system calls")

0cd01ac5dcb1e18e 1e3ad78334a69b36e107232e337 
---------------- --------------------------- 
         %stddev     %change         %stddev
             \          |                \  
 5.631e+10            +2.8%  5.787e+10        perf-stat.i.branch-instructions
  5.54e+10            +2.8%  5.695e+10        perf-stat.ps.branch-instructions
     55177 ± 10%     +36.4%      75281 ± 12%  sched_debug.cfs_rq:/.avg_vruntime.stddev
     55177 ± 10%     +36.4%      75281 ± 12%  sched_debug.cfs_rq:/.min_vruntime.stddev
     46.20            -0.5       45.74        perf-profile.calltrace.cycles-pp.vfs_read.__x64_sys_pread64.do_syscall_64.entry_SYSCALL_64_after_hwframe.__libc_pread
     35.83            -0.4       35.38        perf-profile.calltrace.cycles-pp.filemap_read.vfs_read.__x64_sys_pread64.do_syscall_64.entry_SYSCALL_64_after_hwframe
     20.24            -0.3       19.90        perf-profile.calltrace.cycles-pp._copy_to_iter.copy_page_to_iter.filemap_read.vfs_read.__x64_sys_pread64
     20.87            -0.3       20.54        perf-profile.calltrace.cycles-pp.copy_page_to_iter.filemap_read.vfs_read.__x64_sys_pread64.do_syscall_64
      1.66            -0.1        1.58        perf-profile.calltrace.cycles-pp.__fdget.ksys_readahead.do_syscall_64.entry_SYSCALL_64_after_hwframe.readahead
      0.66 ±  3%      -0.1        0.60 ±  2%  perf-profile.calltrace.cycles-pp.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe.readahead.stress_run
      0.63 ±  4%      -0.0        0.58 ±  2%  perf-profile.calltrace.cycles-pp.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe.readahead
      4.29            -0.0        4.25        perf-profile.calltrace.cycles-pp.__fsnotify_parent.vfs_read.__x64_sys_pread64.do_syscall_64.entry_SYSCALL_64_after_hwframe
      2.20            -0.0        2.16        perf-profile.calltrace.cycles-pp.touch_atime.filemap_read.vfs_read.__x64_sys_pread64.do_syscall_64
      1.88            -0.0        1.85        perf-profile.calltrace.cycles-pp.atime_needs_update.touch_atime.filemap_read.vfs_read.__x64_sys_pread64
      4.33 ±  3%      +0.3        4.68 ±  3%  perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.readahead
      3.66 ±  3%      +0.4        4.05 ±  3%  perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.readahead
     46.41            -0.5       45.94        perf-profile.children.cycles-pp.vfs_read
     48.17            -0.5       47.71        perf-profile.children.cycles-pp.__x64_sys_pread64
     36.13            -0.5       35.68        perf-profile.children.cycles-pp.filemap_read
     20.30            -0.3       19.96        perf-profile.children.cycles-pp._copy_to_iter
     20.97            -0.3       20.64        perf-profile.children.cycles-pp.copy_page_to_iter
     55.86            -0.3       55.60        perf-profile.children.cycles-pp.__libc_pread
     24.71            -0.2       24.48        perf-profile.children.cycles-pp.stress_readahead
      2.62            -0.2        2.46        perf-profile.children.cycles-pp.syscall_exit_to_user_mode
      4.54            -0.1        4.45        perf-profile.children.cycles-pp.ksys_readahead
      5.33            -0.0        5.28        perf-profile.children.cycles-pp.entry_SYSRETQ_unsafe_stack
      4.40            -0.0        4.36        perf-profile.children.cycles-pp.__fsnotify_parent
      2.28            -0.0        2.26        perf-profile.children.cycles-pp.touch_atime
      2.06            -0.0        2.04        perf-profile.children.cycles-pp.atime_needs_update
      0.08 ±  8%      -0.0        0.05 ±  8%  perf-profile.children.cycles-pp.ktime_get_update_offsets_now
      0.78            +0.0        0.81        perf-profile.children.cycles-pp.posix_fadvise
     59.97            +0.3       60.27        perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe
     58.28            +0.3       58.60        perf-profile.children.cycles-pp.do_syscall_64
     18.84            +0.5       19.32        perf-profile.children.cycles-pp.readahead
      0.00            +1.2        1.22        perf-profile.children.cycles-pp.x64_sys_call
     20.09            -0.3       19.76        perf-profile.self.cycles-pp._copy_to_iter
     24.32            -0.2       24.08        perf-profile.self.cycles-pp.stress_readahead
      2.65            -0.2        2.47        perf-profile.self.cycles-pp.do_syscall_64
      4.84            -0.0        4.80        perf-profile.self.cycles-pp.filemap_read
      5.16            -0.0        5.11        perf-profile.self.cycles-pp.entry_SYSRETQ_unsafe_stack
      4.27            -0.0        4.22        perf-profile.self.cycles-pp.__fsnotify_parent
      0.08 ±  6%      -0.0        0.05 ±  7%  perf-profile.self.cycles-pp.ktime_get_update_offsets_now
      1.82            -0.0        1.80        perf-profile.self.cycles-pp.entry_SYSCALL_64_after_hwframe
      0.70            +0.0        0.72        perf-profile.self.cycles-pp.__x64_sys_pread64
      0.00            +1.1        1.06        perf-profile.self.cycles-pp.x64_sys_call



***************************************************************************************************
lkp-icl-2sp7: 64 threads 2 sockets Intel(R) Xeon(R) Gold 6346 CPU @ 3.10GHz (Ice Lake) with 256G memory
=========================================================================================
compiler/cpufreq_governor/kconfig/nr_threads/rootfs/tbox_group/test/testcase/testtime:
  gcc-13/performance/x86_64-rhel-8.3/100%/debian-12-x86_64-20240206.cgz/lkp-icl-2sp7/null/stress-ng/60s

commit: 
  0cd01ac5dc ("x86/bugs: Change commas to semicolons in 'spectre_v2' sysfs file")
  1e3ad78334 ("x86/syscall: Don't force use of indirect calls for system calls")

0cd01ac5dcb1e18e 1e3ad78334a69b36e107232e337 
---------------- --------------------------- 
         %stddev     %change         %stddev
             \          |                \  
     19402 ± 14%     +63.7%      31762 ± 28%  sched_debug.cpu.nr_switches.max
      3272 ± 10%     +40.4%       4595 ± 21%  sched_debug.cpu.nr_switches.stddev
      3241           +10.1%       3569 ±  9%  vmstat.system.cs
    162368            -0.9%     160961        vmstat.system.in
   6303220            -3.7%    6068707        proc-vmstat.numa_hit
   6236896            -3.8%    6002419        proc-vmstat.numa_local
   6341375            -3.7%    6107478        proc-vmstat.pgalloc_normal
   6171078            -3.7%    5941105        proc-vmstat.pgfault
   6144519            -3.8%    5913179        proc-vmstat.pgfree
     19272            -3.3%      18627        stress-ng.null.MB_per_sec_/dev/null_write_rate
 2.902e+09            -4.0%  2.787e+09        stress-ng.null.ops
  48365768            -4.0%   46449880        stress-ng.null.ops_per_sec
   5809136            -3.9%    5580207        stress-ng.time.minor_page_faults
      2394            +1.6%       2431        stress-ng.time.system_time
      1324            -2.7%       1289        stress-ng.time.user_time
 3.529e+10           +18.8%   4.19e+10        perf-stat.i.branch-instructions
      0.24 ±  3%      -0.1        0.19 ±  3%  perf-stat.i.branch-miss-rate%
  85202098 ±  4%      -9.4%   77223454 ±  3%  perf-stat.i.branch-misses
      3168 ±  2%     +11.4%       3529 ± 10%  perf-stat.i.context-switches
      1.03            -2.7%       1.00        perf-stat.i.cpi
 1.897e+11            +2.7%  1.949e+11        perf-stat.i.instructions
      0.97            +2.7%       1.00        perf-stat.i.ipc
      3.14            -3.8%       3.03        perf-stat.i.metric.K/sec
    100663            -3.8%      96871        perf-stat.i.minor-faults
    100663            -3.8%      96871        perf-stat.i.page-faults
      0.24 ±  3%      -0.1        0.18 ±  3%  perf-stat.overall.branch-miss-rate%
      1.03            -2.7%       1.00        perf-stat.overall.cpi
      0.97            +2.7%       1.00        perf-stat.overall.ipc
 3.471e+10           +18.7%  4.121e+10        perf-stat.ps.branch-instructions
  83783190 ±  3%      -9.8%   75603241 ±  3%  perf-stat.ps.branch-misses
      3114 ±  2%     +11.0%       3457 ± 10%  perf-stat.ps.context-switches
 1.866e+11            +2.7%  1.916e+11        perf-stat.ps.instructions
     98965            -3.8%      95242        perf-stat.ps.minor-faults
     98966            -3.8%      95242        perf-stat.ps.page-faults
 1.139e+13            +2.6%  1.169e+13        perf-stat.total.instructions
      4.88 ±  2%      -0.3        4.62        perf-profile.calltrace.cycles-pp.entry_SYSCALL_64.llseek
      4.94 ±  2%      -0.2        4.74        perf-profile.calltrace.cycles-pp.vfs_write.ksys_write.do_syscall_64.entry_SYSCALL_64_after_hwframe.write
      3.29 ±  2%      -0.2        3.12        perf-profile.calltrace.cycles-pp.entry_SYSCALL_64.ioctl
      3.26            -0.1        3.13        perf-profile.calltrace.cycles-pp.setfl.do_fcntl.__x64_sys_fcntl.do_syscall_64.entry_SYSCALL_64_after_hwframe
      3.30            -0.1        3.17        perf-profile.calltrace.cycles-pp.entry_SYSCALL_64.fallocate64
      2.48            -0.1        2.36        perf-profile.calltrace.cycles-pp.do_vfs_ioctl.__x64_sys_ioctl.do_syscall_64.entry_SYSCALL_64_after_hwframe.ioctl
      2.34 ±  2%      -0.1        2.21        perf-profile.calltrace.cycles-pp.do_fcntl.__x64_sys_fcntl.do_syscall_64.entry_SYSCALL_64_after_hwframe
      2.12            -0.1        2.01        perf-profile.calltrace.cycles-pp.entry_SYSCALL_64
      0.84 ±  2%      -0.1        0.75 ±  2%  perf-profile.calltrace.cycles-pp.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe
      0.86 ±  2%      -0.1        0.76 ±  2%  perf-profile.calltrace.cycles-pp.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe.llseek
      0.89 ±  3%      -0.1        0.80        perf-profile.calltrace.cycles-pp.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe.llseek.stress_run
      1.63 ±  2%      -0.1        1.55        perf-profile.calltrace.cycles-pp.entry_SYSCALL_64.write
      0.86 ±  3%      -0.1        0.79 ±  2%  perf-profile.calltrace.cycles-pp.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe.stress_run
      2.40 ±  2%      -0.1        2.33        perf-profile.calltrace.cycles-pp.do_fcntl.__x64_sys_fcntl.do_syscall_64.entry_SYSCALL_64_after_hwframe.stress_run
      1.26            -0.1        1.20        perf-profile.calltrace.cycles-pp.__put_user_4.do_vfs_ioctl.__x64_sys_ioctl.do_syscall_64.entry_SYSCALL_64_after_hwframe
      0.58 ±  3%      -0.1        0.52        perf-profile.calltrace.cycles-pp.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe.ioctl
      1.45            -0.1        1.40        perf-profile.calltrace.cycles-pp._raw_spin_lock.setfl.do_fcntl.__x64_sys_fcntl.do_syscall_64
      0.55            +0.0        0.58        perf-profile.calltrace.cycles-pp.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe.fallocate64
      0.86 ±  3%      +0.1        0.91 ±  2%  perf-profile.calltrace.cycles-pp.__fdget_raw.__x64_sys_fcntl.do_syscall_64.entry_SYSCALL_64_after_hwframe.stress_run
      3.11            +0.1        3.19        perf-profile.calltrace.cycles-pp.__x64_sys_fallocate.do_syscall_64.entry_SYSCALL_64_after_hwframe.fallocate64
      1.50 ±  2%      +0.1        1.60 ±  2%  perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.fdatasync
      0.70            +0.1        0.83        perf-profile.calltrace.cycles-pp.__fdget.__x64_sys_fallocate.do_syscall_64.entry_SYSCALL_64_after_hwframe.fallocate64
      1.15 ±  2%      +0.1        1.29 ±  2%  perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.fdatasync
      1.41 ±  2%      +0.1        1.55 ±  2%  perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.fdatasync.stress_run
      1.18 ±  2%      +0.2        1.35 ±  2%  perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.fdatasync.stress_run
      3.90 ±  2%      +0.2        4.08 ±  2%  perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.write.stress_run
      3.95 ±  2%      +0.2        4.14        perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.ioctl
      1.36 ±  2%      +0.2        1.55 ±  2%  perf-profile.calltrace.cycles-pp.__fdget.__x64_sys_ioctl.do_syscall_64.entry_SYSCALL_64_after_hwframe.ioctl
      4.08 ±  2%      +0.2        4.31        perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.llseek
      4.66 ±  3%      +0.3        4.91        perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.ioctl.stress_run
      4.06 ±  3%      +0.3        4.36        perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.ioctl.stress_run
      4.58            +0.3        4.91        perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.fallocate64
      5.07            +0.3        5.40        perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.fallocate64
      4.20 ±  3%      +0.3        4.54 ±  2%  perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.llseek.stress_run
      6.71 ±  2%      +0.4        7.07        perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.stress_run
      0.17 ±141%      +0.4        0.58 ±  2%  perf-profile.calltrace.cycles-pp.__x64_sys_fdatasync.do_syscall_64.entry_SYSCALL_64_after_hwframe.fdatasync.stress_run
      0.00            +0.5        0.55 ±  2%  perf-profile.calltrace.cycles-pp.__x64_sys_fdatasync.do_syscall_64.entry_SYSCALL_64_after_hwframe.fdatasync
      8.22            -0.6        7.61        perf-profile.children.cycles-pp.syscall_exit_to_user_mode
     16.53            -0.6       15.94        perf-profile.children.cycles-pp.entry_SYSCALL_64
     16.36            -0.6       15.80        perf-profile.children.cycles-pp.entry_SYSRETQ_unsafe_stack
      5.04            -0.2        4.83        perf-profile.children.cycles-pp.do_fcntl
      5.14 ±  2%      -0.2        4.95        perf-profile.children.cycles-pp.vfs_write
      9.50            -0.2        9.31        perf-profile.children.cycles-pp.__x64_sys_fcntl
      3.54            -0.1        3.40        perf-profile.children.cycles-pp.setfl
      2.62            -0.1        2.48        perf-profile.children.cycles-pp.do_vfs_ioctl
      2.56            -0.1        2.47        perf-profile.children.cycles-pp.stress_null
      1.89            -0.1        1.80        perf-profile.children.cycles-pp.amd_clear_divider
      1.74            -0.1        1.65        perf-profile.children.cycles-pp.__libc_fcntl64
      1.38            -0.1        1.30        perf-profile.children.cycles-pp.__put_user_4
      1.54            -0.1        1.49        perf-profile.children.cycles-pp._raw_spin_lock
      0.44 ±  4%      -0.1        0.39        perf-profile.children.cycles-pp.__munmap
      0.42 ±  4%      -0.1        0.37        perf-profile.children.cycles-pp.__vm_munmap
      0.42 ±  4%      -0.1        0.37        perf-profile.children.cycles-pp.__x64_sys_munmap
      0.40 ±  4%      -0.0        0.36        perf-profile.children.cycles-pp.do_vmi_align_munmap
      2.46            -0.0        2.41        perf-profile.children.cycles-pp.syscall_exit_to_user_mode_prepare
      0.42 ±  4%      -0.0        0.38        perf-profile.children.cycles-pp.do_vmi_munmap
      0.32 ±  4%      -0.0        0.28        perf-profile.children.cycles-pp.unmap_region
      0.24 ±  3%      -0.0        0.22 ±  2%  perf-profile.children.cycles-pp.asm_exc_page_fault
      0.31 ±  3%      -0.0        0.29        perf-profile.children.cycles-pp.__mmap
      0.22 ±  3%      -0.0        0.19        perf-profile.children.cycles-pp.do_user_addr_fault
      0.55            -0.0        0.52        perf-profile.children.cycles-pp.fcntl64@plt
      0.75            -0.0        0.72        perf-profile.children.cycles-pp.security_file_fcntl
      0.29 ±  3%      -0.0        0.27        perf-profile.children.cycles-pp.vm_mmap_pgoff
      0.28 ±  3%      -0.0        0.25        perf-profile.children.cycles-pp.do_mmap
      0.22 ±  2%      -0.0        0.20 ±  2%  perf-profile.children.cycles-pp.exc_page_fault
      0.20 ±  3%      -0.0        0.18 ±  2%  perf-profile.children.cycles-pp.mmap_region
      0.56            -0.0        0.54        perf-profile.children.cycles-pp.null_lseek
      0.53            -0.0        0.51        perf-profile.children.cycles-pp.security_file_ioctl
      0.18 ±  3%      -0.0        0.16 ±  2%  perf-profile.children.cycles-pp.handle_mm_fault
      0.15 ±  7%      -0.0        0.13 ±  2%  perf-profile.children.cycles-pp.tlb_finish_mmu
      0.16 ±  3%      -0.0        0.15 ±  3%  perf-profile.children.cycles-pp.__handle_mm_fault
      0.12 ±  6%      -0.0        0.10        perf-profile.children.cycles-pp.__tlb_batch_free_encoded_pages
      0.07            -0.0        0.06        perf-profile.children.cycles-pp.__anon_vma_prepare
      3.28            +0.1        3.35        perf-profile.children.cycles-pp.__x64_sys_fallocate
      7.43            +0.1        7.51        perf-profile.children.cycles-pp.fdatasync
      4.15            +0.1        4.23        perf-profile.children.cycles-pp.syscall_return_via_sysret
      1.07            +0.1        1.22        perf-profile.children.cycles-pp.__x64_sys_fdatasync
      2.93            +0.4        3.35 ±  2%  perf-profile.children.cycles-pp.__fdget
     52.11            +1.6       53.71        perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe
     47.02            +1.9       48.87        perf-profile.children.cycles-pp.do_syscall_64
      0.00            +3.4        3.40        perf-profile.children.cycles-pp.x64_sys_call
      8.38            -0.7        7.68        perf-profile.self.cycles-pp.do_syscall_64
     15.83            -0.6       15.27        perf-profile.self.cycles-pp.entry_SYSRETQ_unsafe_stack
      5.47            -0.3        5.20        perf-profile.self.cycles-pp.entry_SYSCALL_64_after_hwframe
      5.17            -0.2        4.98        perf-profile.self.cycles-pp.entry_SYSCALL_64
      4.64            -0.2        4.49        perf-profile.self.cycles-pp.llseek
      4.24            -0.1        4.10        perf-profile.self.cycles-pp.ioctl
      4.36            -0.1        4.22        perf-profile.self.cycles-pp.syscall_exit_to_user_mode
      1.94            -0.1        1.84        perf-profile.self.cycles-pp.stress_null
      2.02            -0.1        1.93        perf-profile.self.cycles-pp.fdatasync
      2.18            -0.1        2.10        perf-profile.self.cycles-pp.fallocate64
      2.01            -0.1        1.94        perf-profile.self.cycles-pp.setfl
      1.33            -0.1        1.26        perf-profile.self.cycles-pp.__put_user_4
      1.54            -0.1        1.47        perf-profile.self.cycles-pp.do_fcntl
      1.19            -0.1        1.14 ±  2%  perf-profile.self.cycles-pp.do_vfs_ioctl
      1.34            -0.1        1.29        perf-profile.self.cycles-pp._raw_spin_lock
      2.30            -0.0        2.26        perf-profile.self.cycles-pp.__x64_sys_fcntl
      1.97            -0.0        1.93        perf-profile.self.cycles-pp.syscall_exit_to_user_mode_prepare
      0.97            -0.0        0.94        perf-profile.self.cycles-pp.amd_clear_divider
      0.56            -0.0        0.54        perf-profile.self.cycles-pp.security_file_fcntl
      0.39            -0.0        0.37        perf-profile.self.cycles-pp.fcntl64@plt
      0.30            -0.0        0.29        perf-profile.self.cycles-pp.rw_verify_area
      0.36            -0.0        0.35        perf-profile.self.cycles-pp.security_file_ioctl
      0.44            +0.0        0.48 ±  2%  perf-profile.self.cycles-pp.__x64_sys_fallocate
      0.34            +0.0        0.37 ±  6%  perf-profile.self.cycles-pp.__x64_sys_fdatasync
      4.14            +0.1        4.23        perf-profile.self.cycles-pp.syscall_return_via_sysret
      1.57            +0.1        1.66 ±  2%  perf-profile.self.cycles-pp.__fdget_raw
      2.61            +0.4        3.06 ±  2%  perf-profile.self.cycles-pp.__fdget
      0.00            +2.9        2.89        perf-profile.self.cycles-pp.x64_sys_call



***************************************************************************************************
lkp-icl-2sp7: 64 threads 2 sockets Intel(R) Xeon(R) Gold 6346 CPU @ 3.10GHz (Ice Lake) with 256G memory
=========================================================================================
compiler/cpufreq_governor/kconfig/nr_threads/rootfs/tbox_group/test/testcase/testtime:
  gcc-13/performance/x86_64-rhel-8.3/100%/debian-12-x86_64-20240206.cgz/lkp-icl-2sp7/monte-carlo/stress-ng/60s

commit: 
  0cd01ac5dc ("x86/bugs: Change commas to semicolons in 'spectre_v2' sysfs file")
  1e3ad78334 ("x86/syscall: Don't force use of indirect calls for system calls")

0cd01ac5dcb1e18e 1e3ad78334a69b36e107232e337 
---------------- --------------------------- 
         %stddev     %change         %stddev
             \          |                \  
      1.74            -0.1        1.62        perf-stat.overall.branch-miss-rate%
 1.411e+13            +1.2%  1.427e+13        perf-stat.total.instructions
   2838242            -1.6%    2793803        stress-ng.monte-carlo.samples/sec,_e_using_arc4
   7122323            -1.2%    7036665        stress-ng.monte-carlo.samples/sec,_exp_using_arc4
   3972723            -1.5%    3911813        stress-ng.monte-carlo.samples/sec,_pi_using_arc4
 1.016e+08            -1.3%  1.004e+08        stress-ng.monte-carlo.samples/sec,_pi_using_lcg
   6407021            -1.4%    6319313        stress-ng.monte-carlo.samples/sec,_sin_using_arc4
   7374513            -1.3%    7277983        stress-ng.monte-carlo.samples/sec,_sqrt_using_arc4
   3962914            -1.5%    3904274        stress-ng.monte-carlo.samples/sec,_squircle_using_arc4
      1108            +1.8%       1128        stress-ng.time.system_time
      3.02            -0.3        2.69        perf-profile.calltrace.cycles-pp.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe.__getpid.stress_mc_arc4_rand
      3.40 ±  2%      -0.2        3.16 ±  2%  perf-profile.calltrace.cycles-pp.__x64_sys_getpid.do_syscall_64.entry_SYSCALL_64_after_hwframe.__getpid.stress_mc_arc4_rand
      2.93 ±  3%      -0.2        2.71 ±  3%  perf-profile.calltrace.cycles-pp.__task_pid_nr_ns.__x64_sys_getpid.do_syscall_64.entry_SYSCALL_64_after_hwframe.__getpid
     17.72            -0.2       17.52        perf-profile.calltrace.cycles-pp.entry_SYSCALL_64.__getpid.stress_mc_arc4_rand
      0.94            -0.0        0.91        perf-profile.calltrace.cycles-pp.syscall_exit_to_user_mode_prepare.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe.__getpid
      3.13            -0.0        3.10        perf-profile.calltrace.cycles-pp.stress_mc_mwc64_rand
      2.07            -0.0        2.04        perf-profile.calltrace.cycles-pp.stress_mwc64.stress_mc_mwc64_rand
      0.56            -0.0        0.53        perf-profile.calltrace.cycles-pp.stress_monte_carlo_sqrt.stress_mc_arc4_rand
     43.17            +0.5       43.65        perf-profile.calltrace.cycles-pp.stress_mc_arc4_rand
     34.06            +0.5       34.58        perf-profile.calltrace.cycles-pp.__getpid.stress_mc_arc4_rand
     13.54            +0.8       14.33        perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.__getpid.stress_mc_arc4_rand
     10.40            +1.0       11.40        perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.__getpid.stress_mc_arc4_rand
      0.00            +1.4        1.38        perf-profile.calltrace.cycles-pp.x64_sys_call.do_syscall_64.entry_SYSCALL_64_after_hwframe.__getpid.stress_mc_arc4_rand
      3.94            -0.4        3.58        perf-profile.children.cycles-pp.syscall_exit_to_user_mode
      3.85 ±  2%      -0.2        3.61 ±  2%  perf-profile.children.cycles-pp.__x64_sys_getpid
      3.15 ±  2%      -0.2        2.93 ±  3%  perf-profile.children.cycles-pp.__task_pid_nr_ns
      7.88            -0.1        7.76        perf-profile.children.cycles-pp.entry_SYSRETQ_unsafe_stack
      4.44            -0.1        4.37        perf-profile.children.cycles-pp.stress_mc_xorshift_rand
      1.18            -0.1        1.13        perf-profile.children.cycles-pp.syscall_exit_to_user_mode_prepare
      2.10            -0.0        2.06        perf-profile.children.cycles-pp.stress_monte_carlo_pi
      2.38            -0.0        2.35        perf-profile.children.cycles-pp.stress_monte_carlo_sqrt
      2.32            -0.0        2.29        perf-profile.children.cycles-pp.stress_mwc64
     35.10            +0.6       35.66        perf-profile.children.cycles-pp.__getpid
     27.59            +0.7       28.32        perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe
     25.10            +0.7       25.84        perf-profile.children.cycles-pp.do_syscall_64
      0.00            +1.6        1.61        perf-profile.children.cycles-pp.x64_sys_call
      3.90            -0.2        3.68        perf-profile.self.cycles-pp.do_syscall_64
      2.90 ±  2%      -0.2        2.69 ±  3%  perf-profile.self.cycles-pp.__task_pid_nr_ns
      7.64            -0.1        7.52        perf-profile.self.cycles-pp.entry_SYSRETQ_unsafe_stack
      2.08            -0.1        2.00        perf-profile.self.cycles-pp.syscall_exit_to_user_mode
      0.95            -0.0        0.90        perf-profile.self.cycles-pp.syscall_exit_to_user_mode_prepare
      2.76            -0.0        2.72        perf-profile.self.cycles-pp.stress_mc_xorshift_rand
      1.64            -0.0        1.60        perf-profile.self.cycles-pp.stress_monte_carlo_pi
      2.05            -0.0        2.02        perf-profile.self.cycles-pp.stress_mwc64
      0.00            +1.4        1.37        perf-profile.self.cycles-pp.x64_sys_call



***************************************************************************************************
lkp-icl-2sp8: 64 threads 2 sockets Intel(R) Xeon(R) Gold 6346 CPU @ 3.10GHz (Ice Lake) with 256G memory
=========================================================================================
compiler/cpufreq_governor/disk/fs/kconfig/nr_threads/rootfs/tbox_group/test/testcase/testtime:
  gcc-13/performance/1HDD/ext4/x86_64-rhel-8.3/100%/debian-12-x86_64-20240206.cgz/lkp-icl-2sp8/fpunch/stress-ng/60s

commit: 
  0cd01ac5dc ("x86/bugs: Change commas to semicolons in 'spectre_v2' sysfs file")
  1e3ad78334 ("x86/syscall: Don't force use of indirect calls for system calls")

0cd01ac5dcb1e18e 1e3ad78334a69b36e107232e337 
---------------- --------------------------- 
         %stddev     %change         %stddev
             \          |                \  
 4.408e+10            +4.9%  4.623e+10        perf-stat.i.branch-instructions
      0.21            -0.0        0.19 ±  3%  perf-stat.overall.branch-miss-rate%
 4.336e+10            +4.9%  4.547e+10        perf-stat.ps.branch-instructions
 1.054e+08            -1.6%  1.037e+08        stress-ng.fpunch.ops
   1756286            -1.6%    1727644        stress-ng.fpunch.ops_per_sec
    879217            -2.0%     861604        stress-ng.time.voluntary_context_switches
     38.90            -0.6       38.29        perf-profile.calltrace.cycles-pp.vfs_write.__x64_sys_pwrite64.do_syscall_64.entry_SYSCALL_64_after_hwframe.__libc_pwrite
     31.84            -0.5       31.30        perf-profile.calltrace.cycles-pp.generic_file_write_iter.vfs_write.__x64_sys_pwrite64.do_syscall_64.entry_SYSCALL_64_after_hwframe
     40.48            -0.4       40.04        perf-profile.calltrace.cycles-pp.__x64_sys_pwrite64.do_syscall_64.entry_SYSCALL_64_after_hwframe.__libc_pwrite
     19.84            -0.4       19.48        perf-profile.calltrace.cycles-pp.vfs_write.ksys_write.do_syscall_64.entry_SYSCALL_64_after_hwframe.write
     22.29            -0.4       21.94        perf-profile.calltrace.cycles-pp.ksys_write.do_syscall_64.entry_SYSCALL_64_after_hwframe.write
     16.32            -0.3       16.01        perf-profile.calltrace.cycles-pp.generic_file_write_iter.vfs_write.ksys_write.do_syscall_64.entry_SYSCALL_64_after_hwframe
     26.13            -0.3       25.87        perf-profile.calltrace.cycles-pp.write
     23.37            -0.3       23.11        perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.write
     47.74            -0.2       47.50        perf-profile.calltrace.cycles-pp.__libc_pwrite
      0.52            -0.2        0.34 ± 70%  perf-profile.calltrace.cycles-pp.__mutex_unlock_slowpath.ksys_write.do_syscall_64.entry_SYSCALL_64_after_hwframe.write
      2.06            -0.1        1.98        perf-profile.calltrace.cycles-pp.up_write.generic_file_write_iter.vfs_write.__x64_sys_pwrite64.do_syscall_64
      1.33            -0.1        1.28        perf-profile.calltrace.cycles-pp.rwsem_wake.up_write.generic_file_write_iter.vfs_write.__x64_sys_pwrite64
      0.64 ±  2%      -0.1        0.59        perf-profile.calltrace.cycles-pp.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe.__libc_pwrite
      2.62            -0.0        2.58        perf-profile.calltrace.cycles-pp.simple_write_end.generic_perform_write.generic_file_write_iter.vfs_write.__x64_sys_pwrite64
      1.02            -0.0        0.99        perf-profile.calltrace.cycles-pp.up_write.generic_file_write_iter.vfs_write.ksys_write.do_syscall_64
      0.66            -0.0        0.64        perf-profile.calltrace.cycles-pp.rwsem_wake.up_write.generic_file_write_iter.vfs_write.ksys_write
      1.48            -0.0        1.46        perf-profile.calltrace.cycles-pp.simple_write_end.generic_perform_write.generic_file_write_iter.vfs_write.ksys_write
      0.66            -0.0        0.64        perf-profile.calltrace.cycles-pp.security_file_permission.rw_verify_area.vfs_write.ksys_write.do_syscall_64
      1.06            +0.0        1.08        perf-profile.calltrace.cycles-pp.vfs_fallocate.__x64_sys_fallocate.do_syscall_64.entry_SYSCALL_64_after_hwframe.syscall
      0.83 ±  3%      +0.1        0.91        perf-profile.calltrace.cycles-pp.__fdget.__x64_sys_pwrite64.do_syscall_64.entry_SYSCALL_64_after_hwframe.__libc_pwrite
      1.60            +0.1        1.71        perf-profile.calltrace.cycles-pp.__x64_sys_fallocate.do_syscall_64.entry_SYSCALL_64_after_hwframe.syscall
      0.71 ±  2%      +0.1        0.82        perf-profile.calltrace.cycles-pp.__fdget.__x64_sys_fallocate.do_syscall_64.entry_SYSCALL_64_after_hwframe.fallocate64
      3.00            +0.2        3.18        perf-profile.calltrace.cycles-pp.__x64_sys_fallocate.do_syscall_64.entry_SYSCALL_64_after_hwframe.fallocate64
      5.63            +0.2        5.85        perf-profile.calltrace.cycles-pp.syscall
      2.37            +0.2        2.60        perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.syscall
      2.65            +0.2        2.89        perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.syscall
     10.09            +0.3       10.44        perf-profile.calltrace.cycles-pp.fallocate64
      4.37            +0.4        4.74        perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.fallocate64
      4.84            +0.4        5.21        perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.fallocate64
     59.21            -1.0       58.23        perf-profile.children.cycles-pp.vfs_write
     48.48            -0.9       47.62        perf-profile.children.cycles-pp.generic_file_write_iter
     40.60            -0.4       40.17        perf-profile.children.cycles-pp.__x64_sys_pwrite64
     22.42            -0.4       22.06        perf-profile.children.cycles-pp.ksys_write
     26.21            -0.3       25.96        perf-profile.children.cycles-pp.write
     47.88            -0.2       47.65        perf-profile.children.cycles-pp.__libc_pwrite
      3.21            -0.1        3.10        perf-profile.children.cycles-pp.up_write
      2.62            -0.1        2.52        perf-profile.children.cycles-pp.syscall_exit_to_user_mode
      2.07            -0.1        2.00        perf-profile.children.cycles-pp.rwsem_wake
      4.34            -0.1        4.27        perf-profile.children.cycles-pp.simple_write_end
      1.28            -0.0        1.24        perf-profile.children.cycles-pp.wake_up_q
      0.69            -0.0        0.66        perf-profile.children.cycles-pp.wake_q_add
      1.06            -0.0        1.03 ±  2%  perf-profile.children.cycles-pp.__mutex_unlock_slowpath
      0.84 ±  2%      -0.0        0.81        perf-profile.children.cycles-pp.syscall_exit_to_user_mode_prepare
      0.76            -0.0        0.73        perf-profile.children.cycles-pp._raw_spin_lock_irqsave
      0.65            -0.0        0.62        perf-profile.children.cycles-pp.rwsem_mark_wake
      0.81            -0.0        0.79        perf-profile.children.cycles-pp.try_to_wake_up
      5.70            +0.2        5.92        perf-profile.children.cycles-pp.syscall
      2.03 ±  2%      +0.3        2.29        perf-profile.children.cycles-pp.__fdget
      4.83            +0.3        5.10        perf-profile.children.cycles-pp.__x64_sys_fallocate
     10.18            +0.3       10.52        perf-profile.children.cycles-pp.fallocate64
      0.00            +1.1        1.12        perf-profile.children.cycles-pp.x64_sys_call
      2.68            -0.1        2.55        perf-profile.self.cycles-pp.do_syscall_64
      4.24            -0.1        4.13        perf-profile.self.cycles-pp.fault_in_readable
      2.04            -0.0        1.99        perf-profile.self.cycles-pp.simple_write_end
      4.80            -0.0        4.76        perf-profile.self.cycles-pp.vfs_write
      1.67            -0.0        1.64        perf-profile.self.cycles-pp.entry_SYSCALL_64_after_hwframe
      0.66 ±  2%      -0.0        0.63        perf-profile.self.cycles-pp.wake_q_add
      0.72            -0.0        0.70        perf-profile.self.cycles-pp._raw_spin_lock_irqsave
      0.68 ±  3%      -0.0        0.66 ±  2%  perf-profile.self.cycles-pp.syscall_exit_to_user_mode_prepare
      0.50            -0.0        0.48        perf-profile.self.cycles-pp.wake_up_q
      0.59            +0.1        0.66        perf-profile.self.cycles-pp.__x64_sys_fallocate
      0.96            +0.1        1.05        perf-profile.self.cycles-pp.__fdget_pos
      0.57            +0.1        0.68        perf-profile.self.cycles-pp.__x64_sys_pwrite64
      1.87 ±  2%      +0.3        2.14        perf-profile.self.cycles-pp.__fdget
      0.00            +1.0        0.96        perf-profile.self.cycles-pp.x64_sys_call
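
The %change column in these tables appears to follow two conventions: counter and throughput rows carry a relative change in percent, while rows that are already percentages (branch-miss-rate%, perf-profile cycles-pp) carry an absolute difference in percentage points. A minimal Python sketch of that reading, using rows copied from the fpunch table above; the helper names are hypothetical and are not part of lkp-tests:

```python
def relative_change_pct(base: float, new: float) -> float:
    """Relative change in percent, as shown for counter and throughput rows."""
    return (new - base) / base * 100.0


def point_delta(base: float, new: float) -> float:
    """Absolute difference, as shown for rows that are already percentages."""
    return new - base


# Values copied from the fpunch table (0cd01ac5dc vs. 1e3ad78334).
rows = {
    "stress-ng.fpunch.ops_per_sec": (1756286, 1727644),
    "perf-stat.i.branch-instructions": (4.408e10, 4.623e10),
}
for name, (base, new) in rows.items():
    print(f"{name}: {relative_change_pct(base, new):+.1f}%")

# perf-profile rows are shares of cycles, so the delta is in percentage points.
print(f"perf-profile.self.cycles-pp.x64_sys_call: {point_delta(0.00, 0.96):+.1f}")
```

Under this reading the computed figures line up with the -1.6%, +4.9% and +1.0 entries reported in the table.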





Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki



* Re: [linus:master] [x86/syscall] 1e3ad78334: will-it-scale.per_process_ops 1.4% improvement
  2024-04-19  5:49 [linus:master] [x86/syscall] 1e3ad78334: will-it-scale.per_process_ops 1.4% improvement kernel test robot
@ 2024-04-19  7:33 ` Josh Poimboeuf
  2024-04-22  7:41   ` Yujie Liu
  0 siblings, 1 reply; 3+ messages in thread
From: Josh Poimboeuf @ 2024-04-19  7:33 UTC
  To: kernel test robot
  Cc: Linus Torvalds, oe-lkp, lkp, linux-kernel, Thomas Gleixner,
	Daniel Sneddon, ying.huang, feng.tang, fengwei.yin

On Fri, Apr 19, 2024 at 01:49:26PM +0800, kernel test robot wrote:
> Hi Linus,
> 
> We noticed that commit 1e3ad78334a6 caused performance fluctuations in
> various micro benchmarks. The perf stat metrics related with branch
> instructions do have noticeable changes, which may be an expected
> result of this commit. We are sending this report to provide these data
> and hope it can be helpful for the awareness of overall impact or any
> further investigation. Thanks.
> 
> kernel test robot noticed a 1.4% improvement of will-it-scale.per_process_ops on:
> 
> commit: 1e3ad78334a69b36e107232e337f9d693dcc9df2 ("x86/syscall: Don't force use of indirect calls for system calls")
> https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master

Thanks, these are significant regressions.

Since this is on Skylake (with IBRS enabled, presumably) I'd expect that
these regressions are fixed by my "Only harden syscalls when needed"
patch.  I'm planning on posting a new version of that tomorrow, but v3
[*] should be good enough to fix it.  Could you run these tests on the
same Skylake system with my patch added?

Also it would be helpful to see the same tests on Cascade/Ice Lake, or
some other system for which the 'spectre_v2' sysfs vulnerabilities file
shows "BHI: SW loop".  On such a system it shouldn't matter whether my
patch is added as it won't disable Linus' syscall change.  But it would
be very helpful to see the performance impact of that combination.

[*] https://lkml.kernel.org/lkml/eda0ec65f4612cc66875aaf76e738643f41fbc01.1713296762.git.jpoimboe@kernel.org
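
To tell which of the two cases a given test machine falls into, the 'spectre_v2' sysfs file mentioned above can be read directly. A minimal Python sketch, assuming the file lives at the usual /sys/devices/system/cpu/vulnerabilities/spectre_v2 path and that a plain substring match on "BHI: SW loop" is sufficient; the function name is hypothetical:

```python
from pathlib import Path

# Usual location of the vulnerabilities file on Linux (assumption: sysfs is mounted).
SPECTRE_V2 = Path("/sys/devices/system/cpu/vulnerabilities/spectre_v2")


def shows_bhi_sw_loop(status: str) -> bool:
    """True if a spectre_v2 status line reports the BHI software-loop mitigation."""
    return "BHI: SW loop" in status


if __name__ == "__main__":
    try:
        line = SPECTRE_V2.read_text().strip()
    except OSError as err:
        # Not Linux, sysfs unavailable, or the kernel predates this file.
        print(f"cannot read {SPECTRE_V2}: {err}")
    else:
        print(line)
        print("BHI: SW loop reported" if shows_bhi_sw_loop(line)
              else "BHI: SW loop not reported")
```

The check is deliberately a substring match rather than a field parser, since the exact separators in the file have changed between kernel versions.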

-- 
Josh


* Re: [linus:master] [x86/syscall] 1e3ad78334: will-it-scale.per_process_ops 1.4% improvement
  2024-04-19  7:33 ` Josh Poimboeuf
@ 2024-04-22  7:41   ` Yujie Liu
  0 siblings, 0 replies; 3+ messages in thread
From: Yujie Liu @ 2024-04-22  7:41 UTC
  To: Josh Poimboeuf
  Cc: Linus Torvalds, oe-lkp, lkp, linux-kernel, Thomas Gleixner,
	Daniel Sneddon, ying.huang, feng.tang, fengwei.yin

Hi Josh,

On Fri, Apr 19, 2024 at 12:33:46AM -0700, Josh Poimboeuf wrote:
> On Fri, Apr 19, 2024 at 01:49:26PM +0800, kernel test robot wrote:
> > Hi Linus,
> > 
> > We noticed that commit 1e3ad78334a6 caused performance fluctuations in
> > various micro benchmarks. The perf stat metrics related with branch
> > instructions do have noticeable changes, which may be an expected
> > result of this commit. We are sending this report to provide these data
> > and hope it can be helpful for the awareness of overall impact or any
> > further investigation. Thanks.
> > 
> > kernel test robot noticed a 1.4% improvement of will-it-scale.per_process_ops on:
> > 
> > commit: 1e3ad78334a69b36e107232e337f9d693dcc9df2 ("x86/syscall: Don't force use of indirect calls for system calls")
> > https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
> 
> Thanks, these are significant regressions.

First, to clarify: when running this specific will-it-scale futex4
benchmark on a Skylake machine, we observed a +1.4% performance
improvement, not a regression.

> Since this is on Skylake (with IBRS enabled, presumably) I'd expect that
> these regressions are fixed by my "Only harden syscalls when needed"
> patch.  I'm planning on posting a new version of that tomorrow, but v3
> [*] should be good enough to fix it.  Could you run these tests on the
> same Skylake system with my patch added?

The v3 patch [*] does not apply on top of commit 1e3ad78334a6. It seems
the code base has changed a lot since then, so we are not able to
directly compare 1e3ad78334a6 and 1e3ad78334a6+v3_patch.

The patch does apply on v6.9-rc4, so we tested v6.9-rc4 and
v6.9-rc4+v3_patch. Here are the test results for your reference:

Skylake
=========================================================================================
tbox_group/testcase/rootfs/kconfig/compiler/nr_task/mode/test/cpufreq_governor:
  lkp-skl-fpga01/will-it-scale/debian-12-x86_64-20240206.cgz/x86_64-rhel-8.3/gcc-13/16/process/futex4/performance

commit:
  0cd01ac5dcb1 ("x86/bugs: Change commas to semicolons in 'spectre_v2' sysfs file")
  1e3ad78334a6 ("x86/syscall: Don't force use of indirect calls for system calls")
  v6.9-rc4
  v6.9-rc4+v3_patch

    0cd01ac5dcb1                1e3ad78334a6                    v6.9-rc4           v6.9-rc4+v3_patch
---------------- --------------------------- --------------------------- ---------------------------
         %stddev     %change         %stddev     %change         %stddev     %change         %stddev
             \          |                \          |                \          |                \
   1362315            +1.4%    1381406            +1.5%    1382652            +0.5%    1369778        will-it-scale.per_process_ops
  21797058            +1.4%   22102512            +1.5%   22122442            +0.5%   21916453        will-it-scale.workload
      0.04 ±  7%      -7.4%       0.04            -6.1%       0.04 ±  2%      -4.0%       0.04        perf-stat.i.MPKI
  1.98e+09           +19.2%   2.36e+09           +19.2%  2.359e+09            +1.7%  2.014e+09        perf-stat.i.branch-instructions
      1.47            -1.2        0.30            -1.2        0.30 ±  3%      -0.0        1.45        perf-stat.i.branch-miss-rate%
  30820475           -70.4%    9118612           -71.0%    8945551            +0.5%   30985854        perf-stat.i.branch-misses
   7767463            -1.2%    7676829            -1.0%    7686158            -1.3%    7664542        perf-stat.i.cache-references
      3.45            -4.4%       3.30            -4.4%       3.30            -0.4%       3.43        perf-stat.i.cpi
 1.504e+10            +5.1%   1.58e+10            +5.2%  1.582e+10            +1.2%  1.522e+10        perf-stat.i.instructions
      0.29            +4.5%       0.31            +4.6%       0.31            +0.4%       0.29        perf-stat.i.ipc
      1.01 ±100%      -0.6%       1.00 ±100%    +104.1%       2.06            +0.3%       1.01 ±100%  perf-stat.i.metric.K/sec
      0.05 ±  2%      -4.2%       0.04            -3.9%       0.04 ±  2%      +0.4%       0.05        perf-stat.overall.MPKI
      1.56            -1.2        0.39            -1.2        0.38            -0.0        1.54        perf-stat.overall.branch-miss-rate%
      3.43            -4.3%       3.28            -4.4%       3.28            -0.5%       3.41        perf-stat.overall.cpi
      0.29            +4.5%       0.30            +4.6%       0.30            +0.5%       0.29        perf-stat.overall.ipc
    208138            +3.4%     215312            +3.5%     215474            +0.5%     209279        perf-stat.overall.path-length
 1.973e+09           +19.2%  2.353e+09           +19.1%  2.351e+09            +1.8%  2.008e+09        perf-stat.ps.branch-instructions
  30729762           -70.4%    9109071           -71.0%    8918595            +0.6%   30911752        perf-stat.ps.branch-misses
   7745419            -1.1%    7663567            -1.1%    7663740            -1.3%    7647834        perf-stat.ps.cache-references
 1.499e+10            +5.1%  1.575e+10            +5.2%  1.577e+10            +1.2%  1.517e+10        perf-stat.ps.instructions
 4.537e+12            +4.9%  4.759e+12            +5.1%  4.767e+12            +1.1%  4.587e+12        perf-stat.total.instructions
     12.23            -0.6       11.60            -0.6       11.64            -0.0       12.21        perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.syscall
     10.09            -0.6        9.51            -0.5        9.56            -0.1       10.01        perf-profile.calltrace.cycles-pp.__x64_sys_futex.do_syscall_64.entry_SYSCALL_64_after_hwframe.syscall
     22.31            -0.4       21.88            -0.4       21.94            +0.0       22.36        perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.syscall
     19.15            +0.2       19.30            +0.2       19.38            -0.1       19.04        perf-profile.calltrace.cycles-pp.syscall_return_via_sysret.syscall
      9.25            +0.2        9.43            +0.0        9.25            -0.0        9.23        perf-profile.calltrace.cycles-pp.entry_SYSCALL_64.syscall
      8.79            +0.2        9.02            +0.3        9.07            -0.1        8.72        perf-profile.calltrace.cycles-pp.do_futex.__x64_sys_futex.do_syscall_64.entry_SYSCALL_64_after_hwframe.syscall
      7.13            +0.2        7.36            +0.3        7.41            -0.1        7.07        perf-profile.calltrace.cycles-pp.__futex_wait.futex_wait.do_futex.__x64_sys_futex.do_syscall_64
      8.37            +0.3        8.63            +0.3        8.68            -0.1        8.28        perf-profile.calltrace.cycles-pp.futex_wait.do_futex.__x64_sys_futex.do_syscall_64.entry_SYSCALL_64_after_hwframe
     12.38            -0.6       11.78            -0.5       11.84            -0.0       12.38        perf-profile.children.cycles-pp.do_syscall_64
     10.12            -0.5        9.57            -0.5        9.63            -0.1       10.04        perf-profile.children.cycles-pp.__x64_sys_futex
     22.63            -0.4       22.20            -0.4       22.24            +0.0       22.65        perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe
      0.48 ±  2%      -0.0        0.46            -0.0        0.47 ±  2%      -0.0        0.46        perf-profile.children.cycles-pp.get_futex_key
     19.34            +0.1       19.49            +0.2       19.57            -0.1       19.24        perf-profile.children.cycles-pp.syscall_return_via_sysret
      0.00            +0.2        0.18 ±  2%      +0.2        0.18 ±  3%      +0.0        0.00        perf-profile.children.cycles-pp.x64_sys_call
      9.11            +0.2        9.29            +0.0        9.12            +0.0        9.12        perf-profile.children.cycles-pp.entry_SYSCALL_64
      8.88            +0.2        9.11            +0.3        9.16            -0.1        8.81        perf-profile.children.cycles-pp.do_futex
      7.13            +0.2        7.36            +0.3        7.41            -0.1        7.07        perf-profile.children.cycles-pp.__futex_wait
      8.43            +0.3        8.70            +0.3        8.75            -0.1        8.34        perf-profile.children.cycles-pp.futex_wait
      1.20            -0.7        0.47            -0.7        0.46 ±  3%      -0.0        1.20 ±  2%  perf-profile.self.cycles-pp.__x64_sys_futex
      1.46            -0.2        1.27            -0.2        1.26 ±  2%      +0.0        1.48 ±  2%  perf-profile.self.cycles-pp.do_syscall_64
      0.51            -0.1        0.44            -0.1        0.45 ±  2%      +0.0        0.52        perf-profile.self.cycles-pp.do_futex
      0.38 ±  5%      -0.1        0.32 ±  4%      -0.1        0.32 ±  5%      +0.0        0.39 ±  7%  perf-profile.self.cycles-pp.syscall_exit_to_user_mode
      0.48 ±  2%      -0.0        0.45            -0.0        0.45 ±  2%      -0.0        0.46 ±  2%  perf-profile.self.cycles-pp.get_futex_key
      1.21            +0.0        1.24 ±  2%      +0.0        1.23 ±  3%      -0.0        1.18        perf-profile.self.cycles-pp.futex_wait
      0.09 ± 14%      +0.0        0.12 ±  8%      +0.0        0.13 ±  6%      +0.0        0.12 ±  5%  perf-profile.self.cycles-pp.syscall_exit_to_user_mode_prepare
      0.00            +0.1        0.15 ±  2%      +0.2        0.15 ±  3%      +0.0        0.00        perf-profile.self.cycles-pp.x64_sys_call
      7.97            +0.1        8.12            -0.0        7.95            -0.0        7.96        perf-profile.self.cycles-pp.entry_SYSCALL_64
     19.28            +0.2       19.44            +0.2       19.53            -0.1       19.21        perf-profile.self.cycles-pp.syscall_return_via_sysret
     10.43            +0.2       10.60            +0.2       10.59            +0.0       10.46        perf-profile.self.cycles-pp.entry_SYSCALL_64_after_hwframe
      0.72 ±  3%      +0.2        0.94 ±  3%      +0.2        0.93 ±  4%      +0.0        0.74        perf-profile.self.cycles-pp.__futex_wait

> Also it would be helpful to see the same tests on Cascade/Ice Lake, or
> some other system for which the 'spectre_v2' sysfs vulnerabilities file
> shows "BHI: SW loop".  On such a system it shouldn't matter whether my
> patch is added as it won't disable Linus' syscall change.  But it would
> be very helpful to see the performance impact of that combination.
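
For anyone checking which machines fall into that category, here is a minimal
sketch of reading the BHI state from the 'spectre_v2' sysfs file. It assumes
the semicolon-separated layout introduced by 0cd01ac5dcb1; the helper names
bhi_field() and read_bhi_field() are made up for illustration and are not
part of any kernel tooling.

```python
from pathlib import Path
from typing import Optional

# Standard sysfs location of the Spectre v2 status line on Linux.
SPECTRE_V2 = Path("/sys/devices/system/cpu/vulnerabilities/spectre_v2")


def bhi_field(status: str) -> Optional[str]:
    """Return the text after 'BHI:' in a semicolon-separated status line.

    Assumes fields are separated by ';' (the layout after 0cd01ac5dcb1).
    Returns None when no BHI field is present.
    """
    for field in status.split(";"):
        field = field.strip()
        if field.startswith("BHI:"):
            return field[len("BHI:"):].strip()
    return None


def read_bhi_field(path: Path = SPECTRE_V2) -> Optional[str]:
    """Read the sysfs file and extract the BHI field, or None if unreadable."""
    try:
        return bhi_field(path.read_text())
    except OSError:
        # File missing (non-x86, old kernel) or not readable.
        return None
```

Calling read_bhi_field() on a test machine and checking whether the result
starts with "SW loop" should be enough to tell whether the software BHB
clearing sequence is in use there.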

The test results on Cascade/Ice Lake are as follows:

Intel Xeon Platinum 8260L (Cascade Lake)
=========================================================================================
tbox_group/testcase/rootfs/kconfig/compiler/nr_task/mode/test/cpufreq_governor:
  lkp-csl-2sp3/will-it-scale/debian-12-x86_64-20240206.cgz/x86_64-rhel-8.3/gcc-13/16/process/futex4/performance

commit:
  0cd01ac5dcb1 ("x86/bugs: Change commas to semicolons in 'spectre_v2' sysfs file")
  1e3ad78334a6 ("x86/syscall: Don't force use of indirect calls for system calls")
  v6.9-rc4
  v6.9-rc4+v3_patch

    0cd01ac5dcb1                1e3ad78334a6                    v6.9-rc4           v6.9-rc4+v3_patch
---------------- --------------------------- --------------------------- ---------------------------
         %stddev     %change         %stddev     %change         %stddev     %change         %stddev
             \          |                \          |                \          |                \
   3237910            -0.3%    3229309           -10.1%    2911031           -11.0%    2882769        will-it-scale.per_process_ops
  51806565            -0.3%   51668961           -10.1%   46576504           -11.0%   46124311        will-it-scale.workload
      0.02 ±  7%      -6.4%       0.02 ±  3%      -9.5%       0.02 ±  2%      -2.4%       0.02 ± 12%  perf-stat.i.MPKI
 4.649e+09           +17.4%  5.459e+09           +76.1%  8.186e+09           +75.4%  8.156e+09        perf-stat.i.branch-instructions
      0.72            -0.6        0.15 ±  4%      -0.6        0.12            -0.6        0.12 ±  2%  perf-stat.i.branch-miss-rate%
  34188248           -74.0%    8872232 ±  3%     -69.9%   10285664           -70.0%   10244122        perf-stat.i.branch-misses
      1.70            -4.2%       1.63            -8.0%       1.56            -8.3%       1.56        perf-stat.i.cpi
 3.326e+10            +3.6%  3.444e+10            +9.1%  3.628e+10            +8.2%  3.599e+10        perf-stat.i.instructions
      0.59            +4.3%       0.61            +8.7%       0.64            +9.0%       0.64        perf-stat.i.ipc
      0.18 ± 16%     -11.5%       0.16 ± 22%     -33.9%       0.12 ± 46%     -58.6%       0.08 ± 49%  perf-stat.i.major-faults
      0.02 ±  7%      -6.3%       0.02 ±  4%     -11.0%       0.02 ±  3%      -2.3%       0.02 ± 13%  perf-stat.overall.MPKI
      0.74            -0.6        0.16 ±  3%      -0.6        0.13            -0.6        0.13        perf-stat.overall.branch-miss-rate%
      1.70            -4.1%       1.63            -8.0%       1.56            -8.3%       1.56        perf-stat.overall.cpi
      0.59            +4.3%       0.61            +8.7%       0.64            +9.0%       0.64        perf-stat.overall.ipc
    193210            +3.9%     200708           +21.4%     234594           +21.5%     234812        perf-stat.overall.path-length
 4.633e+09           +17.4%  5.441e+09           +76.1%  8.159e+09           +75.4%  8.129e+09        perf-stat.ps.branch-instructions
  34084869           -74.0%    8860998 ±  2%     -69.9%   10274305           -70.0%   10220106        perf-stat.ps.branch-misses
 3.315e+10            +3.6%  3.433e+10            +9.1%  3.616e+10            +8.2%  3.587e+10        perf-stat.ps.instructions
      0.18 ± 16%     -11.5%       0.16 ± 22%     -33.8%       0.12 ± 46%     -58.6%       0.08 ± 49%  perf-stat.ps.major-faults
 1.001e+13            +3.6%  1.037e+13            +9.2%  1.093e+13            +8.2%  1.083e+13        perf-stat.total.instructions
     18.55            -0.3       18.23            -1.1       17.45            -1.1       17.46        perf-profile.calltrace.cycles-pp.__x64_sys_futex.do_syscall_64.entry_SYSCALL_64_after_hwframe.syscall
      1.82            -0.1        1.74            -0.2        1.60            -0.2        1.57        perf-profile.calltrace.cycles-pp.futex_q_unlock.futex_wait_setup.__futex_wait.futex_wait.do_futex
      3.58            -0.1        3.51            -0.5        3.11            -0.5        3.11        perf-profile.calltrace.cycles-pp.futex_get_value_locked.futex_wait_setup.__futex_wait.futex_wait.do_futex
     17.39            -0.1       17.32            -1.1       16.32            -1.1       16.30        perf-profile.calltrace.cycles-pp.do_futex.__x64_sys_futex.do_syscall_64.entry_SYSCALL_64_after_hwframe.syscall
      0.68 ±  2%      -0.0        0.66            -0.1        0.60 ±  2%      -0.1        0.60 ±  2%  perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_safe_stack.syscall
      2.73            -0.0        2.72            -0.3        2.40            -0.3        2.40        perf-profile.calltrace.cycles-pp.__get_user_nocheck_4.futex_get_value_locked.futex_wait_setup.__futex_wait.futex_wait
      0.60 ±  2%      +0.0        0.60 ±  2%      -0.0        0.57            -0.0        0.59 ±  2%  perf-profile.calltrace.cycles-pp.hrtimer_interrupt.__sysvec_apic_timer_interrupt.sysvec_apic_timer_interrupt.asm_sysvec_apic_timer_interrupt.cpuidle_enter_state
      0.00            +0.0        0.00            +6.3        6.26            +6.2        6.22        perf-profile.calltrace.cycles-pp.clear_bhb_loop.syscall
      0.61 ±  2%      +0.0        0.61 ±  2%      -0.0        0.58            -0.0        0.60 ±  2%  perf-profile.calltrace.cycles-pp.__sysvec_apic_timer_interrupt.sysvec_apic_timer_interrupt.asm_sysvec_apic_timer_interrupt.cpuidle_enter_state.cpuidle_enter
     72.70            +0.0       72.72            +0.1       72.78            +0.1       72.80        perf-profile.calltrace.cycles-pp.syscall
      1.78            +0.0        1.80            -0.2        1.59            -0.2        1.58        perf-profile.calltrace.cycles-pp._raw_spin_lock.futex_q_lock.futex_wait_setup.__futex_wait.futex_wait
     16.67            +0.0       16.71            -1.1       15.61            -1.1       15.59        perf-profile.calltrace.cycles-pp.futex_wait.do_futex.__x64_sys_futex.do_syscall_64.entry_SYSCALL_64_after_hwframe
     20.50            +0.0       20.55            -0.5       20.04            -0.5       20.01        perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.syscall
     21.76            +0.0       21.81            -0.5       21.22            -0.5       21.24        perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.syscall
      2.07            +0.1        2.13            -0.2        1.90            -0.2        1.91        perf-profile.calltrace.cycles-pp.futex_hash.futex_q_lock.futex_wait_setup.__futex_wait.futex_wait
      0.76            +0.1        0.84 ±  3%      -0.0        0.74            -0.0        0.74        perf-profile.calltrace.cycles-pp.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe.syscall
      5.09            +0.1        5.17            -0.5        4.60            -0.5        4.62        perf-profile.calltrace.cycles-pp.futex_q_lock.futex_wait_setup.__futex_wait.futex_wait.do_futex
      7.85            +0.1        7.94            -0.7        7.10            -0.8        7.07        perf-profile.calltrace.cycles-pp.entry_SYSCALL_64.syscall
      0.91 ±  2%      +0.1        1.04            +0.0        0.92            +0.0        0.92        perf-profile.calltrace.cycles-pp.get_futex_key.futex_wait_setup.__futex_wait.futex_wait.do_futex
     39.86            +0.2       40.02            -4.4       35.46            -4.4       35.48        perf-profile.calltrace.cycles-pp.entry_SYSRETQ_unsafe_stack.syscall
     14.13            +0.2       14.35            -0.9       13.20            -1.0       13.18        perf-profile.calltrace.cycles-pp.__futex_wait.futex_wait.do_futex.__x64_sys_futex.do_syscall_64
     12.44            +0.3       12.70            -1.1       11.33            -1.1       11.34        perf-profile.calltrace.cycles-pp.futex_wait_setup.__futex_wait.futex_wait.do_futex.__x64_sys_futex
     18.62            -0.3       18.30            -1.1       17.57            -1.0       17.58        perf-profile.children.cycles-pp.__x64_sys_futex
     17.59            -0.1       17.46            -1.0       16.54            -1.1       16.52        perf-profile.children.cycles-pp.do_futex
      1.82            -0.1        1.74            -0.2        1.60            -0.2        1.57        perf-profile.children.cycles-pp.futex_q_unlock
      3.19            -0.1        3.13            -0.4        2.77            -0.4        2.77        perf-profile.children.cycles-pp.__get_user_nocheck_4
      0.68 ±  2%      -0.0        0.66            -0.1        0.60 ±  2%      -0.1        0.60 ±  2%  perf-profile.children.cycles-pp.entry_SYSCALL_64_safe_stack
      3.58            -0.0        3.57            -0.4        3.16            -0.4        3.16        perf-profile.children.cycles-pp.futex_get_value_locked
      0.80            -0.0        0.79 ±  4%      -0.0        0.76 ±  2%      -0.1        0.75 ±  3%  perf-profile.children.cycles-pp.hrtimer_interrupt
      0.81            -0.0        0.80 ±  4%      -0.0        0.77 ±  2%      -0.0        0.76 ±  3%  perf-profile.children.cycles-pp.__sysvec_apic_timer_interrupt
      0.46 ±  2%      -0.0        0.45 ±  5%      -0.0        0.43 ±  2%      -0.0        0.42 ±  4%  perf-profile.children.cycles-pp.tick_nohz_handler
      0.35            -0.0        0.34            -0.0        0.30 ±  3%      -0.0        0.30 ±  2%  perf-profile.children.cycles-pp.testcase
      0.66            -0.0        0.65 ±  4%      -0.0        0.62            -0.0        0.62 ±  2%  perf-profile.children.cycles-pp.__hrtimer_run_queues
      0.38 ±  2%      -0.0        0.38 ±  5%      -0.0        0.36            -0.0        0.35 ±  5%  perf-profile.children.cycles-pp.update_process_times
      0.14 ±  5%      -0.0        0.14 ±  5%      -0.0        0.12 ±  4%      -0.0        0.12 ±  5%  perf-profile.children.cycles-pp.amd_clear_divider
      0.00            +0.0        0.00            +6.3        6.32            +6.3        6.29        perf-profile.children.cycles-pp.clear_bhb_loop
      0.28 ±  6%      +0.0        0.28 ±  4%      -0.0        0.25 ±  3%      -0.0        0.24 ±  2%  perf-profile.children.cycles-pp.syscall_exit_to_user_mode_prepare
      0.90            +0.0        0.91 ±  3%      -0.1        0.80            -0.1        0.81        perf-profile.children.cycles-pp.syscall_exit_to_user_mode
      1.88            +0.0        1.90 ±  2%      -0.2        1.69            -0.2        1.68        perf-profile.children.cycles-pp._raw_spin_lock
      0.17 ±  4%      +0.0        0.20 ±  2%      -0.1        0.12 ±  3%      -0.1        0.12 ±  7%  perf-profile.children.cycles-pp.futex_setup_timer
     20.64            +0.0       20.69            -0.5       20.18            -0.4       20.21        perf-profile.children.cycles-pp.do_syscall_64
     21.90            +0.0       21.94            -0.5       21.41            -0.5       21.43        perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe
      2.07            +0.1        2.13            -0.2        1.90            -0.2        1.91        perf-profile.children.cycles-pp.futex_hash
      5.13            +0.1        5.20            -0.5        4.64            -0.5        4.64        perf-profile.children.cycles-pp.entry_SYSCALL_64
     16.84            +0.1       16.91            -1.1       15.73            -1.1       15.71        perf-profile.children.cycles-pp.futex_wait
      5.30            +0.1        5.37            -0.5        4.79            -0.5        4.80        perf-profile.children.cycles-pp.futex_q_lock
      0.91 ±  2%      +0.1        1.05            +0.0        0.92            +0.0        0.92        perf-profile.children.cycles-pp.get_futex_key
     42.79            +0.2       42.98            -4.6       38.15            -4.6       38.18        perf-profile.children.cycles-pp.entry_SYSRETQ_unsafe_stack
     14.13            +0.2       14.36            -0.9       13.20            -1.0       13.18        perf-profile.children.cycles-pp.__futex_wait
     12.58            +0.3       12.83            -1.1       11.44            -1.1       11.44        perf-profile.children.cycles-pp.futex_wait_setup
      0.00            +0.4        0.41 ±  2%      +0.6        0.56 ±  2%      +0.6        0.58 ±  3%  perf-profile.children.cycles-pp.x64_sys_call
      4.04            -0.3        3.77            -0.7        3.36            -0.7        3.33        perf-profile.self.cycles-pp.syscall
      1.03 ±  2%      -0.3        0.76 ±  2%      -0.0        1.00            -0.0        1.02        perf-profile.self.cycles-pp.__x64_sys_futex
      0.88            -0.3        0.62            +0.0        0.91            +0.0        0.91        perf-profile.self.cycles-pp.do_futex
      2.50            -0.1        2.42            -0.2        2.30            -0.2        2.31        perf-profile.self.cycles-pp.futex_wait
      1.74            -0.1        1.68            -0.2        1.55            -0.2        1.52        perf-profile.self.cycles-pp.futex_q_unlock
      3.18            -0.1        3.12            -0.4        2.76            -0.4        2.76        perf-profile.self.cycles-pp.__get_user_nocheck_4
      0.54            -0.1        0.48 ±  3%      -0.1        0.43            -0.1        0.44 ±  3%  perf-profile.self.cycles-pp.syscall_exit_to_user_mode
      1.48            -0.0        1.45 ±  2%      +0.2        1.69 ±  2%      +0.2        1.67        perf-profile.self.cycles-pp.__futex_wait
      0.68 ±  2%      -0.0        0.66            -0.1        0.60 ±  2%      -0.1        0.60 ±  2%  perf-profile.self.cycles-pp.entry_SYSCALL_64_safe_stack
      0.35            -0.0        0.34            -0.0        0.30 ±  3%      -0.0        0.30 ±  2%  perf-profile.self.cycles-pp.testcase
      0.00            +0.0        0.00            +6.3        6.26            +6.2        6.22        perf-profile.self.cycles-pp.clear_bhb_loop
      1.33            +0.0        1.33            -0.1        1.23            -0.1        1.22        perf-profile.self.cycles-pp.entry_SYSCALL_64_after_hwframe
      1.44            +0.0        1.44            -0.1        1.30            -0.1        1.30        perf-profile.self.cycles-pp.futex_q_lock
      1.03            +0.0        1.05 ±  2%      +0.2        1.22            +0.3        1.28        perf-profile.self.cycles-pp.do_syscall_64
      1.80            +0.0        1.84 ±  2%      -0.2        1.62            -0.2        1.62        perf-profile.self.cycles-pp._raw_spin_lock
      2.42            +0.0        2.46            -0.2        2.19            -0.2        2.20        perf-profile.self.cycles-pp.entry_SYSCALL_64
      0.38 ±  6%      +0.1        0.44            +0.0        0.38            -0.0        0.38 ±  3%  perf-profile.self.cycles-pp.futex_get_value_locked
      2.00            +0.1        2.06            -0.2        1.83            -0.2        1.84        perf-profile.self.cycles-pp.futex_hash
      0.21 ±  6%      +0.1        0.28 ±  4%      +0.0        0.25 ±  3%      +0.0        0.24 ±  2%  perf-profile.self.cycles-pp.syscall_exit_to_user_mode_prepare
      1.11 ±  3%      +0.1        1.22 ±  3%      -0.0        1.08 ±  2%      -0.0        1.10        perf-profile.self.cycles-pp.futex_wait_setup
      0.90 ±  2%      +0.1        1.04            +0.0        0.92            +0.0        0.92 ±  2%  perf-profile.self.cycles-pp.get_futex_key
     42.61            +0.2       42.81            -4.6       38.00            -4.6       38.02        perf-profile.self.cycles-pp.entry_SYSRETQ_unsafe_stack
      0.00            +0.4        0.41 ±  2%      +0.6        0.55 ±  2%      +0.5        0.52 ±  3%  perf-profile.self.cycles-pp.x64_sys_call


Intel Xeon Gold 6346 (Ice Lake)
=========================================================================================
tbox_group/testcase/rootfs/kconfig/compiler/nr_task/mode/test/cpufreq_governor:
  lkp-icl-2sp9/will-it-scale/debian-12-x86_64-20240206.cgz/x86_64-rhel-8.3/gcc-13/16/process/futex4/performance

commit:
  0cd01ac5dcb1 ("x86/bugs: Change commas to semicolons in 'spectre_v2' sysfs file")
  1e3ad78334a6 ("x86/syscall: Don't force use of indirect calls for system calls")
  v6.9-rc4
  v6.9-rc4+v3_patch

    0cd01ac5dcb1                1e3ad78334a6                    v6.9-rc4           v6.9-rc4+v3_patch
---------------- --------------------------- --------------------------- ---------------------------
         %stddev     %change         %stddev     %change         %stddev     %change         %stddev
             \          |                \          |                \          |                \
   7907214            -1.8%    7763496           -15.4%    6686457           -15.5%    6678350        will-it-scale.per_process_ops
 1.265e+08            -1.8%  1.242e+08           -15.4%   1.07e+08           -15.5%  1.069e+08        will-it-scale.workload
 1.112e+10           +16.0%   1.29e+10           +67.3%   1.86e+10           +68.0%  1.868e+10        perf-stat.i.branch-instructions
      0.06 ±  2%      -0.0        0.06 ±  2%      -0.0        0.05            -0.0        0.05        perf-stat.i.branch-miss-rate%
   6858604 ±  2%      +0.6%    6900573 ±  2%      +8.4%    7434422            +7.7%    7388238        perf-stat.i.branch-misses
      0.72            -2.0%       0.71            -2.7%       0.70            -2.7%       0.70        perf-stat.i.cpi
 8.004e+10            +2.1%   8.17e+10            +2.8%  8.231e+10            +2.8%  8.232e+10        perf-stat.i.instructions
      1.38            +2.1%       1.41            +2.8%       1.42            +2.8%       1.42        perf-stat.i.ipc
      0.06 ±  2%      -0.0        0.05 ±  2%      -0.0        0.04            -0.0        0.04        perf-stat.overall.branch-miss-rate%
      0.72            -2.0%       0.71            -2.8%       0.70            -2.8%       0.70        perf-stat.overall.cpi
      1.38            +2.1%       1.41            +2.8%       1.42            +2.8%       1.42        perf-stat.overall.ipc
    190470            +3.9%     197929           +21.7%     231786           +21.8%     231973        perf-stat.overall.path-length
 1.108e+10           +16.0%  1.286e+10           +67.3%  1.854e+10           +68.0%  1.862e+10        perf-stat.ps.branch-instructions
   6893534 ±  2%      +0.5%    6924998 ±  2%      +8.3%    7462919            +7.5%    7410265        perf-stat.ps.branch-misses
 7.978e+10            +2.1%  8.143e+10            +2.8%  8.204e+10            +2.8%  8.205e+10        perf-stat.ps.instructions
  2.41e+13            +2.0%  2.459e+13            +2.9%   2.48e+13            +2.9%  2.479e+13        perf-stat.total.instructions
     48.06            -2.8       45.31            -9.9       38.20 ±  3%      -9.1       38.94        perf-profile.calltrace.cycles-pp.__futex_wait.futex_wait.do_futex.__x64_sys_futex.do_syscall_64
     42.93            -2.5       40.41            -8.8       34.12 ±  4%      -8.1       34.84        perf-profile.calltrace.cycles-pp.futex_wait_setup.__futex_wait.futex_wait.do_futex.__x64_sys_futex
     56.45            -2.4       54.10           -12.0       44.44           -11.4       45.05        perf-profile.calltrace.cycles-pp.futex_wait.do_futex.__x64_sys_futex.do_syscall_64.entry_SYSCALL_64_after_hwframe
     58.78            -2.3       56.48           -12.5       46.31           -11.9       46.86        perf-profile.calltrace.cycles-pp.do_futex.__x64_sys_futex.do_syscall_64.entry_SYSCALL_64_after_hwframe.syscall
     61.14            -2.3       58.86           -12.9       48.20           -12.5       48.67        perf-profile.calltrace.cycles-pp.__x64_sys_futex.do_syscall_64.entry_SYSCALL_64_after_hwframe.syscall
     71.08            -1.4       69.71           -12.1       58.96           -12.1       58.95        perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.syscall
     68.07            -1.2       66.88           -11.8       56.28           -11.8       56.26        perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.syscall
     17.13            -1.1       16.06            -3.8       13.38 ±  7%      -2.9       14.20 ±  5%  perf-profile.calltrace.cycles-pp.futex_q_lock.futex_wait_setup.__futex_wait.futex_wait.do_futex
    100.03            -0.8       99.22            -1.0       99.01            -1.4       98.60        perf-profile.calltrace.cycles-pp.syscall
     15.22            -0.8       14.42            -2.7       12.53 ±  5%      -2.7       12.52        perf-profile.calltrace.cycles-pp.futex_get_value_locked.futex_wait_setup.__futex_wait.futex_wait.do_futex
     12.02            -0.6       11.37            -2.1        9.89 ±  6%      -2.1        9.92        perf-profile.calltrace.cycles-pp.__get_user_nocheck_4.futex_get_value_locked.futex_wait_setup.__futex_wait.futex_wait
      3.12 ±  9%      -0.5        2.61            -0.9        2.22 ± 10%      -1.0        2.08 ±  7%  perf-profile.calltrace.cycles-pp.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe.syscall
      7.38 ±  2%      -0.4        6.99            -1.9        5.44 ±  5%      -1.6        5.76 ±  5%  perf-profile.calltrace.cycles-pp.futex_hash.futex_q_lock.futex_wait_setup.__futex_wait.futex_wait
      5.12            -0.3        4.78            -0.9        4.20 ± 10%      -0.6        4.47 ±  5%  perf-profile.calltrace.cycles-pp._raw_spin_lock.futex_q_lock.futex_wait_setup.__futex_wait.futex_wait
      4.99 ±  2%      -0.3        4.66            -1.0        4.00 ±  5%      -1.2        3.79 ±  6%  perf-profile.calltrace.cycles-pp.futex_q_unlock.futex_wait_setup.__futex_wait.futex_wait.do_futex
      3.02 ±  3%      -0.2        2.80            -0.9        2.17            -0.8        2.25 ±  3%  perf-profile.calltrace.cycles-pp.get_futex_key.futex_wait_setup.__futex_wait.futex_wait.do_futex
      1.58 ±  3%      -0.1        1.51            -0.3        1.29 ± 10%      -0.4        1.22 ±  7%  perf-profile.calltrace.cycles-pp.syscall_return_via_sysret.syscall
      0.94 ±  3%      -0.1        0.87            -0.2        0.75 ±  9%      -0.2        0.70 ±  8%  perf-profile.calltrace.cycles-pp.syscall_exit_to_user_mode_prepare.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe.syscall
      0.69 ±  3%      -0.0        0.65            -0.2        0.47 ± 46%      -0.3        0.36 ± 71%  perf-profile.calltrace.cycles-pp.amd_clear_divider.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe.syscall
      0.68 ±  3%      -0.0        0.66            -0.3        0.38 ± 71%      -0.2        0.45 ± 45%  perf-profile.calltrace.cycles-pp.entry_SYSRETQ_unsafe_stack.syscall
      0.00            +0.0        0.00           +15.2       15.21 ±  2%     +14.8       14.83        perf-profile.calltrace.cycles-pp.clear_bhb_loop.syscall
      1.04 ±  2%      +0.1        1.13 ±  2%      -0.1        0.96 ±  3%      -0.0        1.00 ±  5%  perf-profile.calltrace.cycles-pp.testcase
      1.57            +0.1        1.70            -0.1        1.48 ±  9%      -0.0        1.54 ±  5%  perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_safe_stack.syscall
      0.00            +1.3        1.29            +1.6        1.62 ±  8%      +1.3        1.34 ±  4%  perf-profile.calltrace.cycles-pp.x64_sys_call.do_syscall_64.entry_SYSCALL_64_after_hwframe.syscall
     16.18            +1.6       17.80            -0.6       15.63 ±  7%      +0.0       16.22 ±  6%  perf-profile.calltrace.cycles-pp.entry_SYSCALL_64.syscall
     48.52            -2.8       45.75           -10.0       38.57 ±  2%      -9.2       39.30        perf-profile.children.cycles-pp.__futex_wait
     44.20            -2.6       41.60            -9.1       35.12 ±  3%      -8.4       35.84        perf-profile.children.cycles-pp.futex_wait_setup
     57.11            -2.4       54.75           -12.1       44.98           -11.5       45.58        perf-profile.children.cycles-pp.futex_wait
     59.22            -2.3       56.91           -12.7       46.54           -12.1       47.10        perf-profile.children.cycles-pp.do_futex
     61.79            -2.3       59.51           -13.0       48.74           -12.6       49.19        perf-profile.children.cycles-pp.__x64_sys_futex
     69.05            -1.5       67.59           -12.1       56.90           -12.2       56.85        perf-profile.children.cycles-pp.do_syscall_64
     71.36            -1.4       70.00           -11.9       59.44           -11.9       59.43        perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe
     17.82            -1.1       16.70            -3.9       13.94 ±  7%      -3.0       14.79 ±  5%  perf-profile.children.cycles-pp.futex_q_lock
     14.54            -0.8       13.76            -2.6       11.96 ±  5%      -2.6       11.98        perf-profile.children.cycles-pp.futex_get_value_locked
     13.16            -0.7       12.46            -2.3       10.83 ±  5%      -2.3       10.84        perf-profile.children.cycles-pp.__get_user_nocheck_4
      3.96 ±  3%      -0.5        3.46            -1.0        2.95 ± 10%      -1.2        2.78 ±  7%  perf-profile.children.cycles-pp.syscall_exit_to_user_mode
      7.61 ±  2%      -0.4        7.21            -2.0        5.62 ±  4%      -1.6        5.96 ±  5%  perf-profile.children.cycles-pp.futex_hash
      5.35            -0.4        5.00            -1.0        4.40 ± 10%      -0.7        4.67 ±  5%  perf-profile.children.cycles-pp._raw_spin_lock
      5.22 ±  2%      -0.3        4.88            -1.0        4.19 ±  5%      -1.3        3.97 ±  6%  perf-profile.children.cycles-pp.futex_q_unlock
      3.26 ±  3%      -0.2        3.02            -0.9        2.35 ±  2%      -0.8        2.44 ±  3%  perf-profile.children.cycles-pp.get_futex_key
     98.50            -0.1       98.40            +0.1       98.60            +0.1       98.55        perf-profile.children.cycles-pp.syscall
      1.81 ±  3%      -0.1        1.73            -0.3        1.47 ± 10%      -0.4        1.40 ±  7%  perf-profile.children.cycles-pp.syscall_return_via_sysret
      1.17 ±  3%      -0.1        1.09            -0.2        0.94 ±  9%      -0.3        0.87 ±  8%  perf-profile.children.cycles-pp.syscall_exit_to_user_mode_prepare
      0.93 ±  3%      -0.1        0.86 ±  2%      -0.2        0.73 ± 10%      -0.2        0.70 ±  8%  perf-profile.children.cycles-pp.amd_clear_divider
      0.16 ±  2%      -0.0        0.15            -0.0        0.15 ±  5%      -0.0        0.15 ±  4%  perf-profile.children.cycles-pp.__sysvec_apic_timer_interrupt
      0.16 ±  3%      -0.0        0.14 ±  3%      -0.0        0.14 ±  3%      -0.0        0.14 ±  5%  perf-profile.children.cycles-pp.hrtimer_interrupt
      9.15 ±  2%      -0.0        9.14            -1.5        7.65 ±  7%      -1.6        7.54 ±  3%  perf-profile.children.cycles-pp.entry_SYSRETQ_unsafe_stack
      0.65 ±  4%      -0.0        0.64            -0.1        0.54 ±  7%      -0.1        0.54 ±  2%  perf-profile.children.cycles-pp.futex_setup_timer
      0.00            +0.0        0.00           +15.4       15.40 ±  2%     +15.0       15.01        perf-profile.children.cycles-pp.clear_bhb_loop
      0.62 ±  2%      +0.0        0.66 ±  2%      -0.1        0.56 ±  5%      -0.0        0.58 ±  5%  perf-profile.children.cycles-pp.syscall@plt
      1.45            +0.1        1.57            -0.1        1.33 ±  2%      -0.1        1.39 ±  5%  perf-profile.children.cycles-pp.testcase
      1.59            +0.1        1.72            -0.1        1.49 ±  9%      -0.0        1.56 ±  5%  perf-profile.children.cycles-pp.entry_SYSCALL_64_safe_stack
      9.45            +0.9       10.37            -0.3        9.12 ±  7%      +0.0        9.45 ±  5%  perf-profile.children.cycles-pp.entry_SYSCALL_64
      0.00            +1.5        1.51            +1.8        1.81 ±  8%      +1.5        1.51 ±  4%  perf-profile.children.cycles-pp.x64_sys_call
     12.90            -0.7       12.23            -2.3       10.63 ±  5%      -2.3       10.64        perf-profile.self.cycles-pp.__get_user_nocheck_4
      7.38 ±  2%      -0.4        6.99            -1.9        5.43 ±  5%      -1.6        5.76 ±  5%  perf-profile.self.cycles-pp.futex_hash
      5.09            -0.4        4.70            -1.0        4.11 ±  8%      -0.7        4.35 ±  6%  perf-profile.self.cycles-pp.futex_q_lock
      5.11            -0.3        4.78            -0.9        4.21 ± 10%      -0.6        4.47 ±  5%  perf-profile.self.cycles-pp._raw_spin_lock
      4.86 ±  2%      -0.3        4.54            -0.9        3.92 ±  5%      -1.1        3.71 ±  7%  perf-profile.self.cycles-pp.futex_q_unlock
      3.68            -0.2        3.45            -0.1        3.62 ±  9%      -0.1        3.56 ±  7%  perf-profile.self.cycles-pp.do_syscall_64
      3.02 ±  3%      -0.2        2.80            -0.9        2.16 ±  2%      -0.8        2.25 ±  3%  perf-profile.self.cycles-pp.get_futex_key
      4.33 ±  3%      -0.2        4.14            -0.9        3.45 ±  6%      -0.9        3.45 ±  2%  perf-profile.self.cycles-pp.__futex_wait
      4.17            -0.2        3.99 ±  3%      -0.9        3.32            -0.9        3.32        perf-profile.self.cycles-pp.futex_wait_setup
      2.09 ±  3%      -0.2        1.93            -0.4        1.64 ±  9%      -0.5        1.56 ±  7%  perf-profile.self.cycles-pp.syscall_exit_to_user_mode
      1.39 ±  2%      -0.1        1.29            -0.3        1.13 ±  2%      -0.3        1.12 ±  2%  perf-profile.self.cycles-pp.futex_get_value_locked
      1.81 ±  3%      -0.1        1.73            -0.3        1.47 ± 10%      -0.4        1.40 ±  7%  perf-profile.self.cycles-pp.syscall_return_via_sysret
      0.94 ±  3%      -0.1        0.88            -0.2        0.75 ± 10%      -0.2        0.70 ±  8%  perf-profile.self.cycles-pp.syscall_exit_to_user_mode_prepare
      0.46 ±  2%      -0.0        0.43 ±  2%      -0.1        0.37 ± 11%      -0.1        0.34 ±  6%  perf-profile.self.cycles-pp.amd_clear_divider
      0.43 ±  4%      +0.0        0.43 ±  2%      -0.1        0.36 ±  7%      -0.1        0.36 ±  2%  perf-profile.self.cycles-pp.futex_setup_timer
      0.00            +0.0        0.00           +15.2       15.22 ±  2%     +14.8       14.81        perf-profile.self.cycles-pp.clear_bhb_loop
      8.92 ±  2%      +0.0        8.92            -1.4        7.47 ±  7%      -1.6        7.36 ±  2%  perf-profile.self.cycles-pp.entry_SYSRETQ_unsafe_stack
      0.20            +0.0        0.22 ±  3%      -0.0        0.18 ±  8%      -0.0        0.20 ±  7%  perf-profile.self.cycles-pp.syscall@plt
      2.76            +0.0        2.81            -0.4        2.38 ± 10%      -0.5        2.27 ±  7%  perf-profile.self.cycles-pp.__x64_sys_futex
      1.90 ±  3%      +0.0        1.95            -0.7        1.23 ± 11%      -0.7        1.22 ±  8%  perf-profile.self.cycles-pp.do_futex
      2.50 ±  2%      +0.1        2.61            +0.1        2.56 ±  8%      +0.1        2.60 ±  4%  perf-profile.self.cycles-pp.entry_SYSCALL_64_after_hwframe
      1.24            +0.1        1.35 ±  2%      -0.1        1.14 ±  3%      -0.0        1.19 ±  5%  perf-profile.self.cycles-pp.testcase
      1.59            +0.1        1.72            -0.1        1.49 ±  9%      -0.0        1.56 ±  5%  perf-profile.self.cycles-pp.entry_SYSCALL_64_safe_stack
      2.72            +0.2        2.94 ±  2%      -0.1        2.61 ±  9%      -0.0        2.69 ±  6%  perf-profile.self.cycles-pp.entry_SYSCALL_64
      8.15 ±  3%      +0.4        8.56            -2.0        6.18 ± 10%      -2.1        6.04 ±  3%  perf-profile.self.cycles-pp.futex_wait
     11.95            +1.0       12.92            -1.0       10.97 ±  4%      -0.6       11.35 ±  4%  perf-profile.self.cycles-pp.syscall
      0.00            +1.3        1.30            +1.6        1.62 ±  8%      +1.3        1.33 ±  4%  perf-profile.self.cycles-pp.x64_sys_call
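
For readers who want to post-process the rows above, here is a minimal Python sketch of one way to split a row into its columns. It assumes the layout visible in the table: a base value, then three (delta, value) pairs, each value optionally followed by a "± N%" deviation, and the metric name as the last token. It also assumes the delta columns are percentage-point differences against the base, which matches the numbers shown but is not stated in this part of the report. The names parse_row and deltas_consistent are hypothetical helpers, not part of any lkp tool, and the sketch does not say which commit each column belongs to.

```python
import re

# A number, optionally signed, optionally followed by a "± N%" deviation.
NUM = re.compile(r"([+-]?\d+\.\d+)(?:\s*±\s*(\d+)%)?")


def parse_row(line):
    """Split one perf-profile comparison row into base value and comparisons."""
    # The metric name is the last whitespace-separated token; split it off
    # first so digits inside names such as "entry_SYSCALL_64" are not parsed.
    body, metric = line.rsplit(None, 1)
    nums = [(float(v), int(s) if s else None) for v, s in NUM.findall(body)]
    base, rest = nums[0], nums[1:]
    comparisons = [
        {
            "delta": rest[i][0],           # signed change column
            "value": rest[i + 1][0],       # measured value column
            "stddev_pct": rest[i + 1][1],  # None when no "± N%" is printed
        }
        for i in range(0, len(rest), 2)
    ]
    return {
        "metric": metric,
        "base": base[0],
        "base_stddev_pct": base[1],
        "comparisons": comparisons,
    }


def deltas_consistent(row, tol=0.1):
    """Check the assumption that delta is roughly value minus base."""
    return all(
        abs((c["value"] - row["base"]) - c["delta"]) <= tol
        for c in row["comparisons"]
    )


line = (
    "       0.00            +0.0        0.00           +15.4       15.40 ±  2%"
    "     +15.0       15.01        perf-profile.children.cycles-pp.clear_bhb_loop"
)
row = parse_row(line)
print(row["metric"], [c["value"] for c in row["comparisons"]])
print("deltas match value - base:", deltas_consistent(row))
```

The tolerance in deltas_consistent is deliberately loose because both the values and the deltas in the table are already rounded.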


BTW, we did observe some regressions when running other benchmarks on
commit 1e3ad78334a6, but they occur on Ice Lake, not Skylake.
Please contact us if you are interested in looking into them.

    stress-ng.null.ops_per_sec -4.0% regression on Intel Xeon Gold 6346 (Ice Lake)
    unixbench.fsbuffer.throughput -1.4% regression on Intel Xeon Gold 6346 (Ice Lake)

Thanks,
Yujie

> 
> [*] https://lkml.kernel.org/lkml/eda0ec65f4612cc66875aaf76e738643f41fbc01.1713296762.git.jpoimboe@kernel.org
> 
> -- 
> Josh
> 

Thread overview: 3+ messages
2024-04-19  5:49 [linus:master] [x86/syscall] 1e3ad78334: will-it-scale.per_process_ops 1.4% improvement kernel test robot
2024-04-19  7:33 ` Josh Poimboeuf
2024-04-22  7:41   ` Yujie Liu
