From: kernel test robot <oliver.sang@intel.com>
To: Kuniyuki Iwashima <kuniyu@amazon.com>
Cc: <oe-lkp@lists.linux.dev>, <lkp@intel.com>,
	<linux-kernel@vger.kernel.org>, Jakub Kicinski <kuba@kernel.org>,
	<netdev@vger.kernel.org>, <ying.huang@intel.com>,
	<feng.tang@intel.com>, <fengwei.yin@intel.com>,
	<oliver.sang@intel.com>
Subject: [linus:master] [af_unix]  d9f21b3613:  stress-ng.sockfd.ops_per_sec 9.1% improvement
Date: Fri, 15 Mar 2024 11:17:26 +0800
Message-ID: <202403151041.2a9a00df-oliver.sang@intel.com>



Hello,

kernel test robot noticed a 9.1% improvement of stress-ng.sockfd.ops_per_sec on:


commit: d9f21b3613337b55cc9d4a6ead484dca68475143 ("af_unix: Try to run GC async.")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
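
For context, the commit subject describes the change at a high level: rather than having the sending task run the AF_UNIX SCM_RIGHTS garbage collector inline, GC can be kicked off asynchronously so ordinary senders do not pay for it. The following is a hypothetical userspace sketch of that general deferral pattern using pthreads; the names (gc_worker, maybe_run_gc_async, INFLIGHT_LIMIT) and the threshold logic are invented for illustration and are not the kernel implementation from this commit.

/*
 * Illustration only: defer cleanup work to a background worker,
 * and make the caller wait synchronously only when too much work
 * has piled up.  NOT the kernel code.
 */
#include <pthread.h>
#include <stdbool.h>
#include <stdio.h>

#define INFLIGHT_LIMIT 16000            /* arbitrary threshold for this sketch */

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  kick = PTHREAD_COND_INITIALIZER;
static pthread_cond_t  done = PTHREAD_COND_INITIALIZER;
static long inflight;                   /* objects awaiting collection */
static bool gc_requested;

static void collect(void)               /* stand-in for the actual GC pass */
{
	inflight = 0;
}

static void *gc_worker(void *arg)
{
	(void)arg;
	pthread_mutex_lock(&lock);
	for (;;) {
		while (!gc_requested)
			pthread_cond_wait(&kick, &lock);
		gc_requested = false;
		collect();
		pthread_cond_broadcast(&done);
	}
	return NULL;                    /* never reached */
}

/* Called from the "send" path whenever an object goes in flight. */
static void maybe_run_gc_async(void)
{
	pthread_mutex_lock(&lock);
	inflight++;
	gc_requested = true;
	pthread_cond_signal(&kick);     /* fire and forget */

	/* Only a flooding caller pays for GC by waiting for it. */
	while (inflight > INFLIGHT_LIMIT)
		pthread_cond_wait(&done, &lock);
	pthread_mutex_unlock(&lock);
}

int main(void)
{
	pthread_t t;

	pthread_create(&t, NULL, gc_worker, NULL);
	for (int i = 0; i < 100000; i++)
		maybe_run_gc_async();
	puts("done");
	return 0;
}

The only point of the sketch is the split between fire-and-forget scheduling and a synchronous wait reserved for heavy producers; how the kernel patch actually arranges this is best read from the commit itself.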

testcase: stress-ng
test machine: 224 threads 2 sockets Intel(R) Xeon(R) Platinum 8480CTDX (Sapphire Rapids) with 256G memory
parameters:

	nr_threads: 100%
	testtime: 60s
	test: sockfd
	cpufreq_governor: performance
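
The sockfd stressor exercises exactly the code path touched by the commit: it repeatedly passes file descriptors between processes over AF_UNIX sockets with SCM_RIGHTS, which drives the unix_attach_fds()/unix_inflight() and unix_detach_fds()/unix_notinflight() accounting visible in the perf profile further down. Below is a minimal standalone C sketch of that fd-passing pattern, written for illustration and not taken from stress-ng.

#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <sys/socket.h>
#include <sys/uio.h>

/* Send one fd over an AF_UNIX socket via an SCM_RIGHTS control message. */
static void send_fd(int sock, int fd)
{
	char dummy = 'x';
	struct iovec iov = { .iov_base = &dummy, .iov_len = 1 };
	union { struct cmsghdr hdr; char buf[CMSG_SPACE(sizeof(int))]; } u;
	struct msghdr msg = {
		.msg_iov = &iov, .msg_iovlen = 1,
		.msg_control = u.buf, .msg_controllen = sizeof(u.buf),
	};
	struct cmsghdr *cmsg;

	memset(&u, 0, sizeof(u));
	cmsg = CMSG_FIRSTHDR(&msg);
	cmsg->cmsg_level = SOL_SOCKET;
	cmsg->cmsg_type = SCM_RIGHTS;           /* attach the fd to the message */
	cmsg->cmsg_len = CMSG_LEN(sizeof(int));
	memcpy(CMSG_DATA(cmsg), &fd, sizeof(int));

	if (sendmsg(sock, &msg, 0) < 0)         /* kernel: unix_attach_fds() */
		perror("sendmsg");
}

/* Receive the fd back out of the control message. */
static int recv_fd(int sock)
{
	char dummy;
	struct iovec iov = { .iov_base = &dummy, .iov_len = 1 };
	union { struct cmsghdr hdr; char buf[CMSG_SPACE(sizeof(int))]; } u;
	struct msghdr msg = {
		.msg_iov = &iov, .msg_iovlen = 1,
		.msg_control = u.buf, .msg_controllen = sizeof(u.buf),
	};
	struct cmsghdr *cmsg;
	int fd = -1;

	if (recvmsg(sock, &msg, 0) < 0) {       /* kernel: unix_detach_fds() */
		perror("recvmsg");
		return -1;
	}
	cmsg = CMSG_FIRSTHDR(&msg);
	if (cmsg && cmsg->cmsg_level == SOL_SOCKET && cmsg->cmsg_type == SCM_RIGHTS)
		memcpy(&fd, CMSG_DATA(cmsg), sizeof(int));
	return fd;
}

int main(void)
{
	int sv[2], fd;

	if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0) {
		perror("socketpair");
		return 1;
	}
	send_fd(sv[0], STDOUT_FILENO);
	fd = recv_fd(sv[1]);
	if (fd >= 0)
		write(fd, "fd passed over AF_UNIX\n", 23);
	return 0;
}

With many workers doing this in a tight loop (nr_threads: 100% on 224 CPUs), the per-send/per-receive bookkeeping becomes the hot spot, which is why the spinlock shows up so prominently in the profile below.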






Details are as below:
-------------------------------------------------------------------------------------------------->


The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20240315/202403151041.2a9a00df-oliver.sang@intel.com

=========================================================================================
compiler/cpufreq_governor/kconfig/nr_threads/rootfs/tbox_group/test/testcase/testtime:
  gcc-12/performance/x86_64-rhel-8.3/100%/debian-12-x86_64-20240206.cgz/lkp-spr-r02/sockfd/stress-ng/60s

commit: 
  8b90a9f819 ("af_unix: Run GC on only one CPU.")
  d9f21b3613 ("af_unix: Try to run GC async.")

8b90a9f819dc2a06 d9f21b3613337b55cc9d4a6ead4 
---------------- --------------------------- 
         %stddev     %change         %stddev
             \          |                \  
     25305 ±  4%      +9.7%      27753 ±  2%  perf-c2c.HITM.total
     64392            +1.8%      65544        vmstat.system.cs
   1926720            +1.4%    1954260        proc-vmstat.numa_hit
   1694682            +1.5%    1719926        proc-vmstat.numa_local
   3151070            +3.4%    3257664        proc-vmstat.pgalloc_normal
      0.28 ±  8%     -15.0%       0.24 ±  9%  sched_debug.cfs_rq:/.h_nr_running.stddev
    259.21 ±  7%     -12.9%     225.86 ±  6%  sched_debug.cfs_rq:/.runnable_avg.stddev
     23.78 ± 13%     -20.9%      18.80 ± 27%  sched_debug.cpu.clock.stddev
  50265901            +9.1%   54861338        stress-ng.sockfd.ops
    837446            +9.1%     913917        stress-ng.sockfd.ops_per_sec
   2293458            -2.8%    2230066        stress-ng.time.involuntary_context_switches
   1581490            +8.1%    1709261        stress-ng.time.voluntary_context_switches
  26480342            +4.2%   27595498        perf-stat.i.cache-misses
  90320805            +3.9%   93807170        perf-stat.i.cache-references
      9.86            -1.7%       9.70        perf-stat.i.cpi
     25274            -5.1%      23975        perf-stat.i.cycles-between-cache-misses
 6.498e+10            +1.1%  6.571e+10        perf-stat.i.instructions
      0.11            +1.7%       0.11        perf-stat.i.ipc
     10.00            -1.7%       9.83        perf-stat.overall.cpi
     24733            -4.7%      23575        perf-stat.overall.cycles-between-cache-misses
      0.10            +1.7%       0.10        perf-stat.overall.ipc
 1.438e+10            +1.3%  1.458e+10        perf-stat.ps.branch-instructions
  24920120            +4.9%   26142747        perf-stat.ps.cache-misses
  86987270            +4.5%   90934893        perf-stat.ps.cache-references
 6.162e+10            +1.7%  6.268e+10        perf-stat.ps.instructions
 3.698e+12            +2.2%  3.781e+12        perf-stat.total.instructions
     66.00 ± 70%     -49.5       16.45 ±223%  perf-profile.calltrace.cycles-pp.stress_sockfd
     33.12 ± 70%     -24.9        8.24 ±223%  perf-profile.calltrace.cycles-pp.sendmsg.stress_sockfd
     33.08 ± 70%     -24.9        8.23 ±223%  perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.sendmsg.stress_sockfd
     33.08 ± 70%     -24.9        8.23 ±223%  perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.sendmsg.stress_sockfd
     33.05 ± 70%     -24.8        8.22 ±223%  perf-profile.calltrace.cycles-pp.__sys_sendmsg.do_syscall_64.entry_SYSCALL_64_after_hwframe.sendmsg.stress_sockfd
     33.04 ± 70%     -24.8        8.22 ±223%  perf-profile.calltrace.cycles-pp.___sys_sendmsg.__sys_sendmsg.do_syscall_64.entry_SYSCALL_64_after_hwframe.sendmsg
     32.99 ± 70%     -24.8        8.20 ±223%  perf-profile.calltrace.cycles-pp.____sys_sendmsg.___sys_sendmsg.__sys_sendmsg.do_syscall_64.entry_SYSCALL_64_after_hwframe
     32.95 ± 70%     -24.8        8.19 ±223%  perf-profile.calltrace.cycles-pp.unix_stream_sendmsg.____sys_sendmsg.___sys_sendmsg.__sys_sendmsg.do_syscall_64
     32.67 ± 70%     -24.5        8.16 ±223%  perf-profile.calltrace.cycles-pp.recvmsg.stress_sockfd
     32.65 ± 70%     -24.5        8.15 ±223%  perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.recvmsg.stress_sockfd
     32.65 ± 70%     -24.5        8.15 ±223%  perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.recvmsg.stress_sockfd
     32.64 ± 70%     -24.5        8.14 ±223%  perf-profile.calltrace.cycles-pp.__sys_recvmsg.do_syscall_64.entry_SYSCALL_64_after_hwframe.recvmsg.stress_sockfd
     32.63 ± 70%     -24.5        8.14 ±223%  perf-profile.calltrace.cycles-pp.___sys_recvmsg.__sys_recvmsg.do_syscall_64.entry_SYSCALL_64_after_hwframe.recvmsg
     32.60 ± 70%     -24.5        8.14 ±223%  perf-profile.calltrace.cycles-pp.____sys_recvmsg.___sys_recvmsg.__sys_recvmsg.do_syscall_64.entry_SYSCALL_64_after_hwframe
     32.60 ± 70%     -24.5        8.14 ±223%  perf-profile.calltrace.cycles-pp.sock_recvmsg.____sys_recvmsg.___sys_recvmsg.__sys_recvmsg.do_syscall_64
     32.59 ± 70%     -24.5        8.13 ±223%  perf-profile.calltrace.cycles-pp.unix_stream_recvmsg.sock_recvmsg.____sys_recvmsg.___sys_recvmsg.__sys_recvmsg
     32.58 ± 70%     -24.5        8.13 ±223%  perf-profile.calltrace.cycles-pp.unix_stream_read_generic.unix_stream_recvmsg.sock_recvmsg.____sys_recvmsg.___sys_recvmsg
     32.51 ± 70%     -24.4        8.10 ±223%  perf-profile.calltrace.cycles-pp.unix_scm_to_skb.unix_stream_sendmsg.____sys_sendmsg.___sys_sendmsg.__sys_sendmsg
     32.51 ± 70%     -24.4        8.10 ±223%  perf-profile.calltrace.cycles-pp.unix_attach_fds.unix_scm_to_skb.unix_stream_sendmsg.____sys_sendmsg.___sys_sendmsg
     32.44 ± 70%     -24.4        8.07 ±223%  perf-profile.calltrace.cycles-pp.unix_inflight.unix_attach_fds.unix_scm_to_skb.unix_stream_sendmsg.____sys_sendmsg
     32.43 ± 70%     -24.4        8.07 ±223%  perf-profile.calltrace.cycles-pp._raw_spin_lock.unix_inflight.unix_attach_fds.unix_scm_to_skb.unix_stream_sendmsg
     32.37 ± 70%     -24.3        8.06 ±223%  perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock.unix_inflight.unix_attach_fds.unix_scm_to_skb
     32.31 ± 70%     -24.2        8.06 ±223%  perf-profile.calltrace.cycles-pp.unix_detach_fds.unix_stream_read_generic.unix_stream_recvmsg.sock_recvmsg.____sys_recvmsg
     32.30 ± 70%     -24.2        8.06 ±223%  perf-profile.calltrace.cycles-pp.unix_notinflight.unix_detach_fds.unix_stream_read_generic.unix_stream_recvmsg.sock_recvmsg
     32.30 ± 70%     -24.2        8.06 ±223%  perf-profile.calltrace.cycles-pp._raw_spin_lock.unix_notinflight.unix_detach_fds.unix_stream_read_generic.unix_stream_recvmsg
     32.23 ± 70%     -24.2        8.04 ±223%  perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock.unix_notinflight.unix_detach_fds.unix_stream_read_generic
     66.37 ± 70%     -49.8       16.57 ±223%  perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe
     66.36 ± 70%     -49.8       16.56 ±223%  perf-profile.children.cycles-pp.do_syscall_64
     66.00 ± 70%     -49.5       16.45 ±223%  perf-profile.children.cycles-pp.stress_sockfd
     64.86 ± 70%     -48.7       16.17 ±223%  perf-profile.children.cycles-pp._raw_spin_lock
     64.64 ± 70%     -48.5       16.11 ±223%  perf-profile.children.cycles-pp.native_queued_spin_lock_slowpath
     33.13 ± 70%     -24.9        8.24 ±223%  perf-profile.children.cycles-pp.sendmsg
     33.06 ± 70%     -24.8        8.22 ±223%  perf-profile.children.cycles-pp.__sys_sendmsg
     33.04 ± 70%     -24.8        8.22 ±223%  perf-profile.children.cycles-pp.___sys_sendmsg
     32.99 ± 70%     -24.8        8.20 ±223%  perf-profile.children.cycles-pp.____sys_sendmsg
     32.95 ± 70%     -24.8        8.19 ±223%  perf-profile.children.cycles-pp.unix_stream_sendmsg
     32.68 ± 70%     -24.5        8.16 ±223%  perf-profile.children.cycles-pp.recvmsg
     32.64 ± 70%     -24.5        8.15 ±223%  perf-profile.children.cycles-pp.__sys_recvmsg
     32.63 ± 70%     -24.5        8.14 ±223%  perf-profile.children.cycles-pp.___sys_recvmsg
     32.61 ± 70%     -24.5        8.14 ±223%  perf-profile.children.cycles-pp.____sys_recvmsg
     32.60 ± 70%     -24.5        8.14 ±223%  perf-profile.children.cycles-pp.sock_recvmsg
     32.59 ± 70%     -24.5        8.13 ±223%  perf-profile.children.cycles-pp.unix_stream_read_generic
     32.59 ± 70%     -24.5        8.13 ±223%  perf-profile.children.cycles-pp.unix_stream_recvmsg
     32.51 ± 70%     -24.4        8.10 ±223%  perf-profile.children.cycles-pp.unix_scm_to_skb
     32.51 ± 70%     -24.4        8.10 ±223%  perf-profile.children.cycles-pp.unix_attach_fds
     32.44 ± 70%     -24.4        8.07 ±223%  perf-profile.children.cycles-pp.unix_inflight
     32.31 ± 70%     -24.2        8.06 ±223%  perf-profile.children.cycles-pp.unix_detach_fds
     32.30 ± 70%     -24.2        8.06 ±223%  perf-profile.children.cycles-pp.unix_notinflight
     64.36 ± 70%     -48.3       16.04 ±223%  perf-profile.self.cycles-pp.native_queued_spin_lock_slowpath




Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki

