From: Henning Fehrmann <henning.fehrmann@aei.mpg.de>
To: xdp-newbies@vger.kernel.org
Subject: traffic redirection and tapping
Date: Thu, 29 Sep 2022 16:56:21 +0200 [thread overview]
Message-ID: <YzWyFYccrHxKGsrQ@mephisto.aei.uni-hannover.de> (raw)
Hey folks,
we have a node with 4 dual-port MT28908 Mellanox cards installed. We want to
redirect the incoming traffic from an ingress port to an egress port on
the same NIC using the bpf_redirect() helper function.
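For context, the redirect part is essentially the following (a minimal sketch; egress_ifindex is a placeholder that user space would fill in before loading the object):

```c
#include <linux/bpf.h>
#include <bpf/bpf_helpers.h>

/* placeholder: the real code sets this before loading the object */
volatile const int egress_ifindex = 0;

SEC("xdp")
int xdp_redirect_prog(struct xdp_md *ctx)
{
	/* hand the frame to the egress port on the same NIC */
	return bpf_redirect(egress_ifindex, 0);
}

char _license[] SEC("license") = "GPL";
```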
Each ingress port redirects roughly 25 Gbit/s of traffic with a packet size of
about 1500 bytes. It is almost exclusively UDP multicast traffic: four
multicast groups per bridge, coming from 64 different source IPs.
I raised the RX ring buffer size of the ingress ports to 8192
and coalesced the interrupts:

  ethtool -C ingress_port_i rx-frames 512
  ethtool -C ingress_port_i rx-usecs 16

I still need to check whether these numbers make sense.
The CPU utilization is moderate, around 20-30%.
On top of that, we'd like to record the traffic streams using
the bpf_ringbuf_output() helper.
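Roughly, the tap part of the program looks like this (a simplified sketch; the map name, its size, and the fixed CHUNK length are placeholders, and the real program selects one of 16 per-CPU maps and redirects instead of passing):

```c
#include <linux/bpf.h>
#include <bpf/bpf_helpers.h>

#define CHUNK 256 /* placeholder: bounded copy length */

struct {
	__uint(type, BPF_MAP_TYPE_RINGBUF);
	__uint(max_entries, 1 << 24);
} events SEC(".maps");

SEC("xdp")
int xdp_tap_ingress_prog(struct xdp_md *ctx)
{
	void *data     = (void *)(long)ctx->data;
	void *data_end = (void *)(long)ctx->data_end;

	/* the verifier needs a bounded length here */
	if (data + CHUNK <= data_end)
		/* this is where the memcpy shows up in perf */
		bpf_ringbuf_output(&events, data, CHUNK, 0);

	return XDP_PASS;
}

char _license[] SEC("license") = "GPL";
```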
For now I only write into the ring buffer to make the data available in
user space. I have 16 different ring buffers, one for each CPU. I am
currently not sure how to enforce that a ring buffer sits on the right NUMA
node. Is there a way?
numastat tells me that I have zero NUMA misses, so that is probably OK.
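One thing I considered trying, though I have not verified that the kernel honors it for ringbuf maps, is setting the NUMA node via libbpf before load ("events" is a placeholder map name):

```c
#include <bpf/libbpf.h>

/* must run after bpf_object__open() but before bpf_object__load() */
int set_ringbuf_node(struct bpf_object *obj, int node)
{
	struct bpf_map *map = bpf_object__find_map_by_name(obj, "events");

	if (!map)
		return -1;
	return bpf_map__set_numa_node(map, node);
}
```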
In user land I start 16 threads, pinned to the cores, each running a handler
that processes the ring buffer content. Currently I only count packets.
With this setup all cores are fully utilized and I lose packets.
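Each thread is essentially the following sketch (map_fds[], holding the 16 ringbuf map fds, is assumed to be set up elsewhere):

```c
#define _GNU_SOURCE
#include <pthread.h>
#include <sched.h>
#include <bpf/libbpf.h>

extern int map_fds[16]; /* one ringbuf map fd per CPU, filled elsewhere */

static int handle_event(void *ctx, void *data, size_t len)
{
	(*(unsigned long *)ctx)++; /* currently we only count packets */
	return 0;
}

static void *consumer(void *arg)
{
	int cpu = (int)(long)arg;
	unsigned long count = 0;
	cpu_set_t set;

	/* pin this consumer to its CPU */
	CPU_ZERO(&set);
	CPU_SET(cpu, &set);
	pthread_setaffinity_np(pthread_self(), sizeof(set), &set);

	struct ring_buffer *rb =
		ring_buffer__new(map_fds[cpu], handle_event, &count, NULL);
	if (!rb)
		return NULL;

	while (ring_buffer__poll(rb, 100 /* ms */) >= 0)
		; /* keep draining */

	ring_buffer__free(rb);
	return NULL;
}
```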
perf top tells me (only CPU 8):
PerfTop: 3868 irqs/sec kernel:95.9% exact: 97.6% lost: 0/0 drop: 0/0 [4000Hz cycles], (all, CPU: 8)
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
25.72% 11804 [kernel] [k] xdp_do_redirect
8.73% 4008 [kernel] [k] memcpy
7.19% 3297 [kernel] [k] bq_enqueue
7.16% 3287 [kernel] [k] check_preemption_disabled
4.28% 1959 [kernel] [k] bpf_ringbuf_output
4.20% 1925 [kernel] [k] mlx5e_xdp_handle
and in perf record I indeed find the memcpy issue, plus some load for mlx5e_napi_poll:
--45.35%--__napi_poll
mlx5e_napi_poll
|
|--36.80%--mlx5e_poll_rx_cq
| |
| |--35.64%--mlx5e_handle_rx_cqe_mpwrq
| | |
| | --35.56%--mlx5e_skb_from_cqe_mpwrq_linear
| | |
| | --34.57%--mlx5e_xdp_handle
| | |
| | |--32.15%--bpf_prog_82775e2abf7feec0_xdp_tap_ingress_prog
| | | |
| | | --31.20%--bpf_ringbuf_output
| | | |
| | | --30.69%--memcpy
| | | |
| | | --0.77%--asm_common_interrupt
| | | common_interrupt
| | | |
| | | --0.72%--__common_interrupt
| | | handle_edge_irq
| | | |
Is there any chance to improve the ring buffer output performance?
Or could I get the packets onto disk in some other way using BPF helper
functions? Do I need to gather more or different information?
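One variant I wondered about is using bpf_ringbuf_reserve()/bpf_ringbuf_submit() instead of bpf_ringbuf_output(), copying a bounded chunk into the reserved slot directly. I am not sure whether this actually avoids a copy here, so treat this as a sketch (CHUNK and the events map are placeholders as before):

```c
	void *data     = (void *)(long)ctx->data;
	void *data_end = (void *)(long)ctx->data_end;

	if (data + CHUNK <= data_end) {
		void *slot = bpf_ringbuf_reserve(&events, CHUNK, 0);

		if (slot) {
			__builtin_memcpy(slot, data, CHUNK);
			bpf_ringbuf_submit(slot, 0);
		}
	}
```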
Thank you for your help.
Cheers,
Henning
Thread overview: 2+ messages
  2022-09-29 14:56 Henning Fehrmann [this message]
  2022-09-30 14:35 ` traffic redirection and tapping  David Ahern