From: kernel test robot <oliver.sang@intel.com>
To: Amir Goldstein <amir73il@gmail.com>
Cc: <oe-lkp@lists.linux.dev>, <lkp@intel.com>, <oliver.sang@intel.com>
Subject: [amir73il:fsnotify-sbconn] [fsnotify] 629f30e073: unixbench.throughput 5.8% improvement
Date: Thu, 14 Mar 2024 16:23:10 +0800 [thread overview]
Message-ID: <202403141505.807a722b-oliver.sang@intel.com> (raw)
hi, Amir Goldstein,
as a follow-up to our earlier report
"[amir73il:sb_write_barrier] [fsnotify] 9c13a708ef: will-it-scale.per_thread_ops -4.2% regression"
(https://lore.kernel.org/all/Zc7KmlQ1cYVrPMQ+@xsang-OptiPlex-9020/),
our tests also found that "fsnotify: optimize the case of no permission event watchers"
caused some performance improvements in will-it-scale tests.
FYI, we report below its behavior on another branch, 'fsnotify-sbconn',
with unixbench tests.
Hello,
kernel test robot noticed a 5.8% improvement of unixbench.throughput on:
commit: 629f30e073f457c2209aa2517dd653fe7da9bfd6 ("fsnotify: optimize the case of no permission event watchers")
https://github.com/amir73il/linux fsnotify-sbconn
testcase: unixbench
test machine: 64 threads 2 sockets Intel(R) Xeon(R) Gold 6346 CPU @ 3.10GHz (Ice Lake) with 256G memory
parameters:
runtime: 300s
nr_task: 100%
test: fstime-r
cpufreq_governor: performance
Details are as below:
-------------------------------------------------------------------------------------------------->
The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20240314/202403141505.807a722b-oliver.sang@intel.com
=========================================================================================
compiler/cpufreq_governor/kconfig/nr_task/rootfs/runtime/tbox_group/test/testcase:
gcc-12/performance/x86_64-rhel-8.3/100%/debian-12-x86_64-20240206.cgz/300s/lkp-icl-2sp9/fstime-r/unixbench
commit:
20e86594eb ("fsnotify: use an enum for group priority constants")
629f30e073 ("fsnotify: optimize the case of no permission event watchers")
20e86594ebade045 629f30e073f457c2209aa2517dd
---------------- ---------------------------
%stddev %change %stddev
\ | \
3568 -1.7% 3507 proc-vmstat.nr_page_table_pages
1.334e+08 +5.8% 1.412e+08 unixbench.throughput
1648 +5.3% 1735 unixbench.time.user_time
5.74e+10 +5.9% 6.082e+10 unixbench.workload
3.28 +5.2% 3.45 perf-stat.i.MPKI
2.653e+10 +1.6% 2.695e+10 perf-stat.i.branch-instructions
32649261 -7.4% 30238184 perf-stat.i.branch-misses
5.186e+08 +6.1% 5.504e+08 perf-stat.i.cache-misses
1.655e+09 +5.3% 1.743e+09 perf-stat.i.cache-references
472.59 ± 2% -4.8% 450.03 ± 2% perf-stat.i.cycles-between-cache-misses
3.69 +6.0% 3.91 perf-stat.overall.MPKI
0.12 -0.0 0.11 perf-stat.overall.branch-miss-rate%
31.34 +0.2 31.58 perf-stat.overall.cache-miss-rate%
303.09 -6.3% 284.06 perf-stat.overall.cycles-between-cache-misses
1054 -5.5% 997.35 perf-stat.overall.path-length
2.65e+10 +1.6% 2.693e+10 perf-stat.ps.branch-instructions
32593661 -7.3% 30202456 perf-stat.ps.branch-misses
5.18e+08 +6.2% 5.5e+08 perf-stat.ps.cache-misses
1.653e+09 +5.3% 1.741e+09 perf-stat.ps.cache-references
8.43 -4.0 4.42 perf-profile.calltrace.cycles-pp.__fsnotify_parent.vfs_read.ksys_read.do_syscall_64.entry_SYSCALL_64_after_hwframe
79.00 -1.0 77.96 perf-profile.calltrace.cycles-pp.vfs_read.ksys_read.do_syscall_64.entry_SYSCALL_64_after_hwframe.read
82.99 -0.8 82.14 perf-profile.calltrace.cycles-pp.ksys_read.do_syscall_64.entry_SYSCALL_64_after_hwframe.read
86.80 -0.7 86.12 perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.read
88.17 -0.6 87.56 perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.read
4.90 ± 2% -0.3 4.55 ± 2% perf-profile.calltrace.cycles-pp.rw_verify_area.vfs_read.ksys_read.do_syscall_64.entry_SYSCALL_64_after_hwframe
98.46 -0.1 98.37 perf-profile.calltrace.cycles-pp.read
0.72 +0.1 0.78 perf-profile.calltrace.cycles-pp.aa_file_perm.apparmor_file_permission.security_file_permission.rw_verify_area.vfs_read
0.84 +0.1 0.89 perf-profile.calltrace.cycles-pp.r_test
1.37 +0.1 1.44 perf-profile.calltrace.cycles-pp.xas_descend.xas_load.filemap_get_read_batch.filemap_get_pages.filemap_read
3.18 +0.1 3.31 perf-profile.calltrace.cycles-pp.xas_load.filemap_get_read_batch.filemap_get_pages.filemap_read.vfs_read
3.85 +0.2 4.02 perf-profile.calltrace.cycles-pp.atime_needs_update.touch_atime.filemap_read.vfs_read.ksys_read
4.49 +0.2 4.69 perf-profile.calltrace.cycles-pp.touch_atime.filemap_read.vfs_read.ksys_read.do_syscall_64
8.11 +0.5 8.57 perf-profile.calltrace.cycles-pp.entry_SYSCALL_64.read
9.90 ± 2% +0.6 10.51 perf-profile.calltrace.cycles-pp.filemap_get_read_batch.filemap_get_pages.filemap_read.vfs_read.ksys_read
12.29 +0.8 13.07 perf-profile.calltrace.cycles-pp.filemap_get_pages.filemap_read.vfs_read.ksys_read.do_syscall_64
28.37 +1.8 30.19 perf-profile.calltrace.cycles-pp._copy_to_iter.copy_page_to_iter.filemap_read.vfs_read.ksys_read
29.72 +1.8 31.56 perf-profile.calltrace.cycles-pp.copy_page_to_iter.filemap_read.vfs_read.ksys_read.do_syscall_64
57.38 +3.2 60.61 perf-profile.calltrace.cycles-pp.filemap_read.vfs_read.ksys_read.do_syscall_64.entry_SYSCALL_64_after_hwframe
8.64 -4.1 4.53 perf-profile.children.cycles-pp.__fsnotify_parent
79.41 -1.0 78.38 perf-profile.children.cycles-pp.vfs_read
83.38 -0.8 82.55 perf-profile.children.cycles-pp.ksys_read
87.31 -0.7 86.65 perf-profile.children.cycles-pp.do_syscall_64
88.36 -0.6 87.76 perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe
5.10 ± 2% -0.3 4.76 perf-profile.children.cycles-pp.rw_verify_area
98.75 -0.1 98.68 perf-profile.children.cycles-pp.read
0.33 +0.0 0.35 perf-profile.children.cycles-pp.__x64_sys_read
0.39 +0.0 0.41 perf-profile.children.cycles-pp.make_vfsgid
0.39 +0.0 0.41 perf-profile.children.cycles-pp.make_vfsuid
0.54 +0.0 0.57 perf-profile.children.cycles-pp.read@plt
0.34 +0.0 0.37 perf-profile.children.cycles-pp.entry_SYSCALL_64_safe_stack
0.30 ± 3% +0.0 0.33 ± 4% perf-profile.children.cycles-pp.update_process_times
0.84 ± 2% +0.0 0.88 perf-profile.children.cycles-pp.generic_file_read_iter
0.82 +0.1 0.88 perf-profile.children.cycles-pp.aa_file_perm
1.75 ± 2% +0.1 1.81 perf-profile.children.cycles-pp.syscall_exit_to_user_mode
1.25 +0.1 1.33 perf-profile.children.cycles-pp.r_test
1.57 +0.1 1.66 perf-profile.children.cycles-pp.xas_descend
3.54 +0.1 3.66 ± 2% perf-profile.children.cycles-pp.security_file_permission
3.59 +0.1 3.74 perf-profile.children.cycles-pp.xas_load
3.60 +0.2 3.79 perf-profile.children.cycles-pp.entry_SYSRETQ_unsafe_stack
4.24 +0.2 4.43 perf-profile.children.cycles-pp.atime_needs_update
4.69 +0.2 4.90 perf-profile.children.cycles-pp.touch_atime
4.70 +0.3 4.96 perf-profile.children.cycles-pp.entry_SYSCALL_64
10.20 ± 2% +0.6 10.83 perf-profile.children.cycles-pp.filemap_get_read_batch
12.50 +0.8 13.28 perf-profile.children.cycles-pp.filemap_get_pages
28.48 +1.8 30.31 perf-profile.children.cycles-pp._copy_to_iter
29.94 +1.9 31.80 perf-profile.children.cycles-pp.copy_page_to_iter
57.92 +3.3 61.17 perf-profile.children.cycles-pp.filemap_read
8.38 -4.0 4.40 perf-profile.self.cycles-pp.__fsnotify_parent
1.54 ± 7% -0.5 1.09 ± 4% perf-profile.self.cycles-pp.rw_verify_area
0.20 +0.0 0.21 perf-profile.self.cycles-pp.__x64_sys_read
0.29 +0.0 0.31 perf-profile.self.cycles-pp.make_vfsgid
0.29 +0.0 0.31 perf-profile.self.cycles-pp.make_vfsuid
0.43 +0.0 0.46 perf-profile.self.cycles-pp.touch_atime
0.34 +0.0 0.36 perf-profile.self.cycles-pp.entry_SYSCALL_64_safe_stack
0.74 ± 2% +0.0 0.77 perf-profile.self.cycles-pp.generic_file_read_iter
0.70 +0.0 0.75 perf-profile.self.cycles-pp.aa_file_perm
1.35 +0.0 1.40 perf-profile.self.cycles-pp.current_time
0.88 +0.0 0.94 perf-profile.self.cycles-pp.security_file_permission
1.14 +0.1 1.20 perf-profile.self.cycles-pp.entry_SYSCALL_64_after_hwframe
1.30 +0.1 1.37 perf-profile.self.cycles-pp.ksys_read
1.26 +0.1 1.33 perf-profile.self.cycles-pp.entry_SYSCALL_64
1.36 +0.1 1.43 perf-profile.self.cycles-pp.xas_descend
1.12 +0.1 1.20 perf-profile.self.cycles-pp.r_test
1.18 +0.1 1.26 perf-profile.self.cycles-pp.xas_load
1.75 +0.1 1.83 perf-profile.self.cycles-pp.atime_needs_update
2.04 +0.1 2.15 perf-profile.self.cycles-pp.do_syscall_64
2.28 +0.2 2.44 perf-profile.self.cycles-pp.filemap_get_pages
3.47 +0.2 3.65 perf-profile.self.cycles-pp.entry_SYSRETQ_unsafe_stack
3.94 +0.2 4.16 perf-profile.self.cycles-pp.read
9.88 +0.4 10.28 perf-profile.self.cycles-pp.filemap_read
6.57 ± 3% +0.5 7.05 ± 2% perf-profile.self.cycles-pp.filemap_get_read_batch
28.21 +1.8 30.01 perf-profile.self.cycles-pp._copy_to_iter
Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
Thread overview: 4+ messages
2024-03-14 8:23 kernel test robot [this message]
2024-03-14 9:21 ` [amir73il:fsnotify-sbconn] [fsnotify] 629f30e073: unixbench.throughput 5.8% improvement Amir Goldstein
2024-03-19 2:26 ` Oliver Sang
2024-03-19 9:56 ` Amir Goldstein