From: Yury Norov <yury.norov@gmail.com>
To: Souradeep Chakrabarti <schakrabarti@microsoft.com>
Cc: Souradeep Chakrabarti <schakrabarti@linux.microsoft.com>,
	KY Srinivasan <kys@microsoft.com>,
	Haiyang Zhang <haiyangz@microsoft.com>,
	"wei.liu@kernel.org" <wei.liu@kernel.org>,
	Dexuan Cui <decui@microsoft.com>,
	"davem@davemloft.net" <davem@davemloft.net>,
	"edumazet@google.com" <edumazet@google.com>,
	"kuba@kernel.org" <kuba@kernel.org>,
	"pabeni@redhat.com" <pabeni@redhat.com>,
	Long Li <longli@microsoft.com>,
	"leon@kernel.org" <leon@kernel.org>,
	"cai.huoqing@linux.dev" <cai.huoqing@linux.dev>,
	"ssengar@linux.microsoft.com" <ssengar@linux.microsoft.com>,
	"vkuznets@redhat.com" <vkuznets@redhat.com>,
	"tglx@linutronix.de" <tglx@linutronix.de>,
	"linux-hyperv@vger.kernel.org" <linux-hyperv@vger.kernel.org>,
	"netdev@vger.kernel.org" <netdev@vger.kernel.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"linux-rdma@vger.kernel.org" <linux-rdma@vger.kernel.org>,
	Paul Rosswurm <paulros@microsoft.com>
Subject: Re: [EXTERNAL] [PATCH 3/3] net: mana: add a function to spread IRQs per CPUs
Date: Tue, 19 Dec 2023 06:03:49 -0800
Message-ID: <ZYGixTdW4PYF3RjR@yury-ThinkPad>
In-Reply-To: <PUZP153MB07886CE88351F6B7A2AA0096CC97A@PUZP153MB0788.APCP153.PROD.OUTLOOK.COM>

On Tue, Dec 19, 2023 at 10:18:49AM +0000, Souradeep Chakrabarti wrote:
> 
> 
> >-----Original Message-----
> >From: Yury Norov <yury.norov@gmail.com>
> >Sent: Monday, December 18, 2023 3:02 AM
> >To: Souradeep Chakrabarti <schakrabarti@linux.microsoft.com>; KY Srinivasan
> ><kys@microsoft.com>; Haiyang Zhang <haiyangz@microsoft.com>;
> >wei.liu@kernel.org; Dexuan Cui <decui@microsoft.com>; davem@davemloft.net;
> >edumazet@google.com; kuba@kernel.org; pabeni@redhat.com; Long Li
> ><longli@microsoft.com>; yury.norov@gmail.com; leon@kernel.org;
> >cai.huoqing@linux.dev; ssengar@linux.microsoft.com; vkuznets@redhat.com;
> >tglx@linutronix.de; linux-hyperv@vger.kernel.org; netdev@vger.kernel.org; linux-
> >kernel@vger.kernel.org; linux-rdma@vger.kernel.org
> >Cc: Souradeep Chakrabarti <schakrabarti@microsoft.com>; Paul Rosswurm
> ><paulros@microsoft.com>
> >Subject: [EXTERNAL] [PATCH 3/3] net: mana: add a function to spread IRQs per
> >CPUs
> >
> >Souradeep found that the driver performs faster if IRQs are spread across CPUs
> >according to the following heuristics:
> >
> >1. No more than one IRQ per CPU, if possible;
> >2. NUMA locality is the second priority;
> >3. Sibling dislocality is the last priority.
> >
> >Let's consider this topology:
> >
> >Node            0               1
> >Core        0       1       2       3
> >CPU       0   1   2   3   4   5   6   7
> >
> >The most performant IRQ distribution based on the above topology and heuristics
> >may look like this:
> >
> >IRQ     Node    Core    CPUs
> >0       0       0       0-1
> >1       0       1       2-3
> >2       0       0       0-1
> >3       0       1       2-3
> >4       1       2       4-5
> >5       1       3       6-7
> >6       1       2       4-5
> >7       1       3       6-7
> >
> >The irq_setup() routine introduced in this patch leverages the
> >for_each_numa_hop_mask() iterator and assigns IRQs to sibling groups as
> >described above.
> >
> >According to [1], for NUMA-aware but sibling-ignorant IRQ distribution based on
> >cpumask_local_spread() performance test results look like this:
> >
> >./ntttcp -r -m 16
> >NTTTCP for Linux 1.4.0
> >---------------------------------------------------------
> >08:05:20 INFO: 17 threads created
> >08:05:28 INFO: Network activity progressing...
> >08:06:28 INFO: Test run completed.
> >08:06:28 INFO: Test cycle finished.
> >08:06:28 INFO: #####  Totals:  #####
> >08:06:28 INFO: test duration    :60.00 seconds
> >08:06:28 INFO: total bytes      :630292053310
> >08:06:28 INFO:   throughput     :84.04Gbps
> >08:06:28 INFO:   retrans segs   :4
> >08:06:28 INFO: cpu cores        :192
> >08:06:28 INFO:   cpu speed      :3799.725MHz
> >08:06:28 INFO:   user           :0.05%
> >08:06:28 INFO:   system         :1.60%
> >08:06:28 INFO:   idle           :96.41%
> >08:06:28 INFO:   iowait         :0.00%
> >08:06:28 INFO:   softirq        :1.94%
> >08:06:28 INFO:   cycles/byte    :2.50
> >08:06:28 INFO: cpu busy (all)   :534.41%
> >
> >For NUMA- and sibling-aware IRQ distribution, the same test works 15% faster:
> >
> >./ntttcp -r -m 16
> >NTTTCP for Linux 1.4.0
> >---------------------------------------------------------
> >08:08:51 INFO: 17 threads created
> >08:08:56 INFO: Network activity progressing...
> >08:09:56 INFO: Test run completed.
> >08:09:56 INFO: Test cycle finished.
> >08:09:56 INFO: #####  Totals:  #####
> >08:09:56 INFO: test duration    :60.00 seconds
> >08:09:56 INFO: total bytes      :741966608384
> >08:09:56 INFO:   throughput     :98.93Gbps
> >08:09:56 INFO:   retrans segs   :6
> >08:09:56 INFO: cpu cores        :192
> >08:09:56 INFO:   cpu speed      :3799.791MHz
> >08:09:56 INFO:   user           :0.06%
> >08:09:56 INFO:   system         :1.81%
> >08:09:56 INFO:   idle           :96.18%
> >08:09:56 INFO:   iowait         :0.00%
> >08:09:56 INFO:   softirq        :1.95%
> >08:09:56 INFO:   cycles/byte    :2.25
> >08:09:56 INFO: cpu busy (all)   :569.22%
> >
> >[1]
> >https://lore.kernel.org/all/20231211063726.GA4977@linuxonhyperv3.guj3yctzbm1etfxqx2vob5hsef.xx.internal.cloudapp.net/
> >
> >Signed-off-by: Yury Norov <yury.norov@gmail.com>
> >Co-developed-by: Souradeep Chakrabarti <schakrabarti@linux.microsoft.com>
> >---
> > .../net/ethernet/microsoft/mana/gdma_main.c   | 28 +++++++++++++++++++
> > 1 file changed, 28 insertions(+)
> >
> >diff --git a/drivers/net/ethernet/microsoft/mana/gdma_main.c b/drivers/net/ethernet/microsoft/mana/gdma_main.c
> >index 6367de0c2c2e..11e64e42e3b2 100644
> >--- a/drivers/net/ethernet/microsoft/mana/gdma_main.c
> >+++ b/drivers/net/ethernet/microsoft/mana/gdma_main.c
> >@@ -1243,6 +1243,34 @@ void mana_gd_free_res_map(struct gdma_resource *r)
> >        r->size = 0;
> > }
> >
> >+static __maybe_unused int irq_setup(unsigned int *irqs, unsigned int len, int node)
> >+{
> >+       const struct cpumask *next, *prev = cpu_none_mask;
> >+       cpumask_var_t cpus __free(free_cpumask_var);
> >+       int cpu, weight;
> >+
> >+       if (!alloc_cpumask_var(&cpus, GFP_KERNEL))
> >+               return -ENOMEM;
> >+
> >+       rcu_read_lock();
> >+       for_each_numa_hop_mask(next, node) {
> >+               weight = cpumask_weight_andnot(next, prev);
> >+               while (weight-- > 0) {
> Make it while (weight > 0) {
> >+                       cpumask_andnot(cpus, next, prev);
> >+                       for_each_cpu(cpu, cpus) {
> >+                               if (len-- == 0)
> >+                                       goto done;
> >+                               irq_set_affinity_and_hint(*irqs++, topology_sibling_cpumask(cpu));
> >+                               cpumask_andnot(cpus, cpus, topology_sibling_cpumask(cpu));
> Do --weight here; otherwise this code will traverse the same node N^2 times,
> where each node has N CPUs.

Sure.

When building your series on top of this, can you please fix it
in place?

Thanks,
Yury

> >+                       }
> >+               }
> >+               prev = next;
> >+       }
> >+done:
> >+       rcu_read_unlock();
> >+       return 0;
> >+}
> >+
> > static int mana_gd_setup_irqs(struct pci_dev *pdev)
> > {
> >        unsigned int max_queues_per_port = num_online_cpus();
> >--
> >2.40.1
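
For reference, the loop with both requested changes applied (while (weight > 0) and --weight inside the inner loop) can be checked in userspace. The following is only a sketch, not the driver code: it hard-codes the 2-node, 4-core, 8-CPU topology from the commit message, uses plain bitmasks instead of cpumasks, and sibling_mask(), hop_mask(), mask_weight() and spread_irqs() are hypothetical stand-ins for topology_sibling_cpumask(), for_each_numa_hop_mask(), cpumask_weight() and irq_setup().

```c
#include <assert.h>

/* Assumed topology from the commit message: 2 nodes, 4 cores, 8 CPUs. */
#define NR_CPUS  8
#define NR_NODES 2

/* Stand-in for topology_sibling_cpumask(): CPUs 2k and 2k+1 share a core. */
static unsigned int sibling_mask(int cpu)
{
	return 0x3u << (cpu & ~1);
}

/*
 * Stand-in for for_each_numa_hop_mask(): hop 0 covers the local node,
 * hop 1 covers both nodes, so each mask is a superset of the previous one.
 */
static unsigned int hop_mask(int node, int hop)
{
	unsigned int local = 0xFu << (4 * node);

	return hop == 0 ? local : 0xFFu;
}

/* Stand-in for cpumask_weight(): number of set bits in the mask. */
static int mask_weight(unsigned int mask)
{
	int w = 0;

	for (; mask; mask &= mask - 1)
		w++;
	return w;
}

/*
 * Stores one sibling mask per IRQ in out[] and returns how many IRQs
 * were placed. Mirrors the quoted loop with the two review fixes.
 */
static unsigned int spread_irqs(unsigned int *out, unsigned int len, int node)
{
	unsigned int prev = 0, next, cpus, placed = 0;
	int hop, cpu, weight;

	assert(node >= 0 && node < NR_NODES);

	for (hop = 0; hop < NR_NODES; hop++) {
		next = hop_mask(node, hop);
		/* CPUs that are new at this hop distance. */
		weight = mask_weight(next & ~prev);
		while (weight > 0) {
			cpus = next & ~prev;
			for (cpu = 0; cpu < NR_CPUS; cpu++) {
				if (!(cpus & (1u << cpu)))
					continue;
				if (len-- == 0)
					goto done;
				out[placed++] = sibling_mask(cpu);
				/* Skip this CPU's siblings for the rest of the pass. */
				cpus &= ~sibling_mask(cpu);
				--weight;
			}
		}
		prev = next;
	}
done:
	return placed;
}
```

Calling spread_irqs(out, 8, 0) on this assumed topology places IRQs 0 and 2 on CPUs 0-1, IRQs 1 and 3 on CPUs 2-3, IRQs 4 and 6 on CPUs 4-5, and IRQs 5 and 7 on CPUs 6-7, each hop being visited weight/2 times rather than weight times.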
