All the mail mirrored from lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH net-next v2 0/2] ethtool: provide the dim profile fine-tuning channel
@ 2024-03-27  9:19 Heng Qi
  2024-03-27  9:19 ` [PATCH net-next v2 1/2] ethtool: provide customized dim profile management Heng Qi
  2024-03-27  9:19 ` [PATCH net-next v2 2/2] virtio-net: support dim profile fine-tuning Heng Qi
  0 siblings, 2 replies; 14+ messages in thread
From: Heng Qi @ 2024-03-27  9:19 UTC (permalink / raw)
  To: netdev, Jakub Kicinski, Eric Dumazet, Paolo Abeni,
	Vladimir Oltean, David S. Miller, Jason Wang, Michael S. Tsirkin,
	Xuan Zhuo

The NetDIM library provides excellent acceleration for many modern
network cards. However, the default profiles of DIM limits its maximum
capabilities for different NICs, so providing a way reusing "ethtool -C"
through which the NIC can be custom configured is necessary.

Please review, thank you very much!

Changelog
======
v1->v2:
  - Use ethtool tool instead of net-sysfs

V1 link:
https://lore.kernel.org/all/1710421773-61277-1-git-send-email-hengqi@linux.alibaba.com/#r

Heng Qi (2):
  ethtool: provide customized dim profile management
  virtio-net: support dim profile fine-tuning

 Documentation/networking/ethtool-netlink.rst |  8 +++++
 drivers/net/virtio_net.c                     | 54 ++++++++++++++++++++++++++--
 include/linux/dim.h                          |  7 ++++
 include/linux/ethtool.h                      | 16 ++++++++-
 include/uapi/linux/ethtool_netlink.h         |  4 +++
 lib/dim/net_dim.c                            |  6 ----
 net/ethtool/coalesce.c                       | 51 ++++++++++++++++++++++++--
 net/ethtool/netlink.h                        | 35 ++++++++++++++++++
 8 files changed, 170 insertions(+), 11 deletions(-)

-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH net-next v2 1/2] ethtool: provide customized dim profile management
  2024-03-27  9:19 [PATCH net-next v2 0/2] ethtool: provide the dim profile fine-tuning channel Heng Qi
@ 2024-03-27  9:19 ` Heng Qi
  2024-03-27  9:19 ` [PATCH net-next v2 2/2] virtio-net: support dim profile fine-tuning Heng Qi
  1 sibling, 0 replies; 14+ messages in thread
From: Heng Qi @ 2024-03-27  9:19 UTC (permalink / raw)
  To: netdev, Jakub Kicinski, Eric Dumazet, Paolo Abeni,
	Vladimir Oltean, David S. Miller, Jason Wang, Michael S. Tsirkin,
	Xuan Zhuo

The NetDIM library, currently leveraged by an array of NICs, delivers
excellent acceleration benefits. Nevertheless, NICs vary significantly
in their dim profile list prerequisites.

Specifically, virtio-net backends may present diverse sw or hw device
implementation, making a one-size-fits-all parameter list impractical.
On Alibaba Cloud, the virtio DPU's performance under the default DIM
profile falls short of expectations, partly due to a mismatch in
parameter configuration.

I also noticed that ice/idpf/ena and other NICs have customized
profilelist or placed some restrictions on dim capabilities.

Motivated by this, I tried adding new params for "ethtool -C" that provides
a per-device control to modify and access a device's interrupt parameters.

Usage
========
1. Query the currently customized list of the device

$ ethtool -c ethx
...
rx-eqe-profs:
{.usec =   1, .pkts = 256, .comps =   0,},
{.usec =   8, .pkts = 256, .comps =   0,},
{.usec =  64, .pkts = 256, .comps =   0,},
{.usec = 128, .pkts = 256, .comps =   0,},
rx-cqe-profs:   n/a
tx-eqe-profs:   n/a
tx-cqe-profs:   n/a

2. Tune
$ ethtool -C ethx rx-eqe-profs 1,1,0_2,2,0_3,3,0_4,4,0_5,5,0
$ ethtool -c ethx
...
rx-eqe-profs:
{.usec =   1, .pkts =   1, .comps =   0,},
{.usec =   2, .pkts =   2, .comps =   0,},
{.usec =   3, .pkts =   3, .comps =   0,},
{.usec =   4, .pkts =   4, .comps =   0,},
{.usec =   5, .pkts =   5, .comps =   0,}
rx-cqe-profs:   n/a
tx-eqe-profs:   n/a
tx-cqe-profs:   n/a

3. Hint
If the device does not support some type of customized dim
profiles, the corresponding "n/a" will display.

Signed-off-by: Heng Qi <hengqi@linux.alibaba.com>
---
 Documentation/networking/ethtool-netlink.rst |  8 +++++
 include/linux/dim.h                          |  7 ++++
 include/linux/ethtool.h                      | 16 ++++++++-
 include/uapi/linux/ethtool_netlink.h         |  4 +++
 lib/dim/net_dim.c                            |  6 ----
 net/ethtool/coalesce.c                       | 51 ++++++++++++++++++++++++++--
 net/ethtool/netlink.h                        | 35 +++++++++++++++++++
 7 files changed, 118 insertions(+), 9 deletions(-)

diff --git a/Documentation/networking/ethtool-netlink.rst b/Documentation/networking/ethtool-netlink.rst
index d583d9a..eb5be7b3 100644
--- a/Documentation/networking/ethtool-netlink.rst
+++ b/Documentation/networking/ethtool-netlink.rst
@@ -1033,6 +1033,10 @@ Kernel response contents:
   ``ETHTOOL_A_COALESCE_TX_AGGR_MAX_BYTES``     u32     max aggr size, Tx
   ``ETHTOOL_A_COALESCE_TX_AGGR_MAX_FRAMES``    u32     max aggr packets, Tx
   ``ETHTOOL_A_COALESCE_TX_AGGR_TIME_USECS``    u32     time (us), aggr, Tx
+  ``ETHTOOL_A_COALESCE_RX_EQE_PROFILE``        string  profile of DIM EQE, Rx
+  ``ETHTOOL_A_COALESCE_RX_CQE_PROFILE``        string  profile of DIM CQE, Rx
+  ``ETHTOOL_A_COALESCE_TX_EQE_PROFILE``        string  profile of DIM EQE, Tx
+  ``ETHTOOL_A_COALESCE_TX_CQE_PROFILE``        string  profile of DIM CQE, Tx
   ===========================================  ======  =======================
 
 Attributes are only included in reply if their value is not zero or the
@@ -1098,6 +1102,10 @@ Request contents:
   ``ETHTOOL_A_COALESCE_TX_AGGR_MAX_BYTES``     u32     max aggr size, Tx
   ``ETHTOOL_A_COALESCE_TX_AGGR_MAX_FRAMES``    u32     max aggr packets, Tx
   ``ETHTOOL_A_COALESCE_TX_AGGR_TIME_USECS``    u32     time (us), aggr, Tx
+  ``ETHTOOL_A_COALESCE_RX_EQE_PROFILE``        string  profile of DIM EQE, Rx
+  ``ETHTOOL_A_COALESCE_RX_CQE_PROFILE``        string  profile of DIM CQE, Rx
+  ``ETHTOOL_A_COALESCE_TX_EQE_PROFILE``        string  profile of DIM EQE, Tx
+  ``ETHTOOL_A_COALESCE_TX_CQE_PROFILE``        string  profile of DIM CQE, Tx
   ===========================================  ======  =======================
 
 Request is rejected if it attributes declared as unsupported by driver (i.e.
diff --git a/include/linux/dim.h b/include/linux/dim.h
index f343bc9..43398f5 100644
--- a/include/linux/dim.h
+++ b/include/linux/dim.h
@@ -10,6 +10,13 @@
 #include <linux/types.h>
 #include <linux/workqueue.h>
 
+/* Number of DIM profiles and period mode. */
+#define NET_DIM_PARAMS_NUM_PROFILES 5
+#define NET_DIM_DEFAULT_RX_CQ_PKTS_FROM_EQE 256
+#define NET_DIM_DEFAULT_TX_CQ_PKTS_FROM_EQE 128
+#define NET_DIM_DEF_PROFILE_CQE 1
+#define NET_DIM_DEF_PROFILE_EQE 1
+
 /*
  * Number of events between DIM iterations.
  * Causes a moderation of the algorithm run.
diff --git a/include/linux/ethtool.h b/include/linux/ethtool.h
index 9901e56..0fd81e8 100644
--- a/include/linux/ethtool.h
+++ b/include/linux/ethtool.h
@@ -18,6 +18,7 @@
 #include <linux/if_ether.h>
 #include <linux/netlink.h>
 #include <uapi/linux/ethtool.h>
+#include <linux/dim.h>
 
 struct compat_ethtool_rx_flow_spec {
 	u32		flow_type;
@@ -238,6 +239,10 @@ struct kernel_ethtool_coalesce {
 	u32 tx_aggr_max_bytes;
 	u32 tx_aggr_max_frames;
 	u32 tx_aggr_time_usecs;
+	struct dim_cq_moder rx_eqe_profs[NET_DIM_PARAMS_NUM_PROFILES];
+	struct dim_cq_moder rx_cqe_profs[NET_DIM_PARAMS_NUM_PROFILES];
+	struct dim_cq_moder tx_eqe_profs[NET_DIM_PARAMS_NUM_PROFILES];
+	struct dim_cq_moder tx_cqe_profs[NET_DIM_PARAMS_NUM_PROFILES];
 };
 
 /**
@@ -284,7 +289,11 @@ bool ethtool_convert_link_mode_to_legacy_u32(u32 *legacy_u32,
 #define ETHTOOL_COALESCE_TX_AGGR_MAX_BYTES	BIT(24)
 #define ETHTOOL_COALESCE_TX_AGGR_MAX_FRAMES	BIT(25)
 #define ETHTOOL_COALESCE_TX_AGGR_TIME_USECS	BIT(26)
-#define ETHTOOL_COALESCE_ALL_PARAMS		GENMASK(26, 0)
+#define ETHTOOL_COALESCE_RX_EQE_PROFILE         BIT(27)
+#define ETHTOOL_COALESCE_RX_CQE_PROFILE         BIT(28)
+#define ETHTOOL_COALESCE_TX_EQE_PROFILE         BIT(29)
+#define ETHTOOL_COALESCE_TX_CQE_PROFILE         BIT(30)
+#define ETHTOOL_COALESCE_ALL_PARAMS		GENMASK(30, 0)
 
 #define ETHTOOL_COALESCE_USECS						\
 	(ETHTOOL_COALESCE_RX_USECS | ETHTOOL_COALESCE_TX_USECS)
@@ -316,6 +325,11 @@ bool ethtool_convert_link_mode_to_legacy_u32(u32 *legacy_u32,
 	(ETHTOOL_COALESCE_TX_AGGR_MAX_BYTES |	\
 	 ETHTOOL_COALESCE_TX_AGGR_MAX_FRAMES |	\
 	 ETHTOOL_COALESCE_TX_AGGR_TIME_USECS)
+#define ETHTOOL_COALESCE_PROFILE		\
+	(ETHTOOL_COALESCE_RX_EQE_PROFILE |	\
+	 ETHTOOL_COALESCE_RX_CQE_PROFILE |	\
+	 ETHTOOL_COALESCE_TX_EQE_PROFILE |	\
+	 ETHTOOL_COALESCE_TX_CQE_PROFILE)
 
 #define ETHTOOL_STAT_NOT_SET	(~0ULL)
 
diff --git a/include/uapi/linux/ethtool_netlink.h b/include/uapi/linux/ethtool_netlink.h
index 3f89074..b7c0b81 100644
--- a/include/uapi/linux/ethtool_netlink.h
+++ b/include/uapi/linux/ethtool_netlink.h
@@ -416,6 +416,10 @@ enum {
 	ETHTOOL_A_COALESCE_TX_AGGR_MAX_BYTES,		/* u32 */
 	ETHTOOL_A_COALESCE_TX_AGGR_MAX_FRAMES,		/* u32 */
 	ETHTOOL_A_COALESCE_TX_AGGR_TIME_USECS,		/* u32 */
+	ETHTOOL_A_COALESCE_RX_EQE_PROFILE,              /* string */
+	ETHTOOL_A_COALESCE_RX_CQE_PROFILE,              /* string */
+	ETHTOOL_A_COALESCE_TX_EQE_PROFILE,              /* string */
+	ETHTOOL_A_COALESCE_TX_CQE_PROFILE,              /* string */
 
 	/* add new constants above here */
 	__ETHTOOL_A_COALESCE_CNT,
diff --git a/lib/dim/net_dim.c b/lib/dim/net_dim.c
index 4e32f7a..67d5beb 100644
--- a/lib/dim/net_dim.c
+++ b/lib/dim/net_dim.c
@@ -11,12 +11,6 @@
  *        There are different set of profiles for RX/TX CQs.
  *        Each profile size must be of NET_DIM_PARAMS_NUM_PROFILES
  */
-#define NET_DIM_PARAMS_NUM_PROFILES 5
-#define NET_DIM_DEFAULT_RX_CQ_PKTS_FROM_EQE 256
-#define NET_DIM_DEFAULT_TX_CQ_PKTS_FROM_EQE 128
-#define NET_DIM_DEF_PROFILE_CQE 1
-#define NET_DIM_DEF_PROFILE_EQE 1
-
 #define NET_DIM_RX_EQE_PROFILES { \
 	{.usec = 1,   .pkts = NET_DIM_DEFAULT_RX_CQ_PKTS_FROM_EQE,}, \
 	{.usec = 8,   .pkts = NET_DIM_DEFAULT_RX_CQ_PKTS_FROM_EQE,}, \
diff --git a/net/ethtool/coalesce.c b/net/ethtool/coalesce.c
index 83112c1..b5d5bc1 100644
--- a/net/ethtool/coalesce.c
+++ b/net/ethtool/coalesce.c
@@ -51,6 +51,10 @@ static u32 attr_to_mask(unsigned int attr_type)
 __CHECK_SUPPORTED_OFFSET(COALESCE_TX_USECS_HIGH);
 __CHECK_SUPPORTED_OFFSET(COALESCE_TX_MAX_FRAMES_HIGH);
 __CHECK_SUPPORTED_OFFSET(COALESCE_RATE_SAMPLE_INTERVAL);
+__CHECK_SUPPORTED_OFFSET(COALESCE_RX_EQE_PROFILE);
+__CHECK_SUPPORTED_OFFSET(COALESCE_RX_CQE_PROFILE);
+__CHECK_SUPPORTED_OFFSET(COALESCE_TX_EQE_PROFILE);
+__CHECK_SUPPORTED_OFFSET(COALESCE_TX_CQE_PROFILE);
 
 const struct nla_policy ethnl_coalesce_get_policy[] = {
 	[ETHTOOL_A_COALESCE_HEADER]		=
@@ -108,7 +112,9 @@ static int coalesce_reply_size(const struct ethnl_req_info *req_base,
 	       nla_total_size(sizeof(u8)) +	/* _USE_CQE_MODE_RX */
 	       nla_total_size(sizeof(u32)) +	/* _TX_AGGR_MAX_BYTES */
 	       nla_total_size(sizeof(u32)) +	/* _TX_AGGR_MAX_FRAMES */
-	       nla_total_size(sizeof(u32));	/* _TX_AGGR_TIME_USECS */
+	       nla_total_size(sizeof(u32)) +	/* _TX_AGGR_TIME_USECS */
+	       nla_total_size(sizeof(struct dim_cq_moder) *
+			      NET_DIM_PARAMS_NUM_PROFILES) * 4; /* _{R,T}X_{E,C}QE_PROFILE */
 }
 
 static bool coalesce_put_u32(struct sk_buff *skb, u16 attr_type, u32 val,
@@ -127,6 +133,27 @@ static bool coalesce_put_bool(struct sk_buff *skb, u16 attr_type, u32 val,
 	return nla_put_u8(skb, attr_type, !!val);
 }
 
+static bool coalesce_put_dim_profs(struct sk_buff *skb, u16 attr_type,
+				   const struct dim_cq_moder *profs,
+				   u32 supported_params)
+{
+	bool valid = false;
+	int i;
+
+	for (i = 0; i < NET_DIM_PARAMS_NUM_PROFILES; i++) {
+		if (profs[i].usec || profs[i].pkts || profs[i].comps) {
+			valid = true;
+			break;
+		}
+	}
+
+	if (!valid || !(supported_params & attr_to_mask(attr_type)))
+		return false;
+
+	return nla_put(skb, attr_type, sizeof(struct dim_cq_moder) *
+		       NET_DIM_PARAMS_NUM_PROFILES, profs);
+}
+
 static int coalesce_fill_reply(struct sk_buff *skb,
 			       const struct ethnl_req_info *req_base,
 			       const struct ethnl_reply_data *reply_base)
@@ -189,7 +216,15 @@ static int coalesce_fill_reply(struct sk_buff *skb,
 	    coalesce_put_u32(skb, ETHTOOL_A_COALESCE_TX_AGGR_MAX_FRAMES,
 			     kcoal->tx_aggr_max_frames, supported) ||
 	    coalesce_put_u32(skb, ETHTOOL_A_COALESCE_TX_AGGR_TIME_USECS,
-			     kcoal->tx_aggr_time_usecs, supported))
+			     kcoal->tx_aggr_time_usecs, supported) ||
+	    coalesce_put_dim_profs(skb, ETHTOOL_A_COALESCE_RX_EQE_PROFILE,
+				   kcoal->rx_eqe_profs, supported) ||
+	    coalesce_put_dim_profs(skb, ETHTOOL_A_COALESCE_RX_CQE_PROFILE,
+				   kcoal->rx_cqe_profs, supported) ||
+	    coalesce_put_dim_profs(skb, ETHTOOL_A_COALESCE_TX_EQE_PROFILE,
+				   kcoal->tx_eqe_profs, supported) ||
+	    coalesce_put_dim_profs(skb, ETHTOOL_A_COALESCE_TX_CQE_PROFILE,
+				   kcoal->tx_cqe_profs, supported))
 		return -EMSGSIZE;
 
 	return 0;
@@ -227,6 +262,10 @@ static int coalesce_fill_reply(struct sk_buff *skb,
 	[ETHTOOL_A_COALESCE_TX_AGGR_MAX_BYTES] = { .type = NLA_U32 },
 	[ETHTOOL_A_COALESCE_TX_AGGR_MAX_FRAMES] = { .type = NLA_U32 },
 	[ETHTOOL_A_COALESCE_TX_AGGR_TIME_USECS] = { .type = NLA_U32 },
+	[ETHTOOL_A_COALESCE_RX_EQE_PROFILE]     = { .type = NLA_NUL_STRING },
+	[ETHTOOL_A_COALESCE_RX_CQE_PROFILE]     = { .type = NLA_NUL_STRING },
+	[ETHTOOL_A_COALESCE_TX_EQE_PROFILE]     = { .type = NLA_NUL_STRING },
+	[ETHTOOL_A_COALESCE_TX_CQE_PROFILE]     = { .type = NLA_NUL_STRING },
 };
 
 static int
@@ -316,6 +355,14 @@ static int coalesce_fill_reply(struct sk_buff *skb,
 			 tb[ETHTOOL_A_COALESCE_TX_AGGR_MAX_FRAMES], &mod);
 	ethnl_update_u32(&kernel_coalesce.tx_aggr_time_usecs,
 			 tb[ETHTOOL_A_COALESCE_TX_AGGR_TIME_USECS], &mod);
+	ethnl_update_profs(kernel_coalesce.rx_eqe_profs,
+			   tb[ETHTOOL_A_COALESCE_RX_EQE_PROFILE], &mod);
+	ethnl_update_profs(kernel_coalesce.rx_cqe_profs,
+			   tb[ETHTOOL_A_COALESCE_RX_CQE_PROFILE], &mod);
+	ethnl_update_profs(kernel_coalesce.tx_eqe_profs,
+			   tb[ETHTOOL_A_COALESCE_TX_EQE_PROFILE], &mod);
+	ethnl_update_profs(kernel_coalesce.tx_cqe_profs,
+			   tb[ETHTOOL_A_COALESCE_TX_CQE_PROFILE], &mod);
 
 	/* Update operation modes */
 	ethnl_update_bool32(&coalesce.use_adaptive_rx_coalesce,
diff --git a/net/ethtool/netlink.h b/net/ethtool/netlink.h
index 9a333a8..5a30879 100644
--- a/net/ethtool/netlink.h
+++ b/net/ethtool/netlink.h
@@ -85,6 +85,41 @@ static inline void ethnl_update_u32(u32 *dst, const struct nlattr *attr,
 	*mod = true;
 }
 
+static inline void ethnl_update_profs(struct dim_cq_moder *dst,
+				      const struct nlattr *attr,
+				      bool *mod)
+{
+	struct dim_cq_moder profs[NET_DIM_PARAMS_NUM_PROFILES];
+	int ret, i, totlen = 0, retlen;
+	char *buf;
+	u16 len;
+
+	if (!attr)
+		return;
+
+	buf = nla_data(attr);
+	len = nla_len(attr);
+	for (i = 0; i < NET_DIM_PARAMS_NUM_PROFILES; i++) {
+		ret = sscanf(buf, "%hu,%hu,%hu_%n", &profs[i].usec,
+			     &profs[i].pkts, &profs[i].comps, &retlen);
+		if (ret != 3)
+			return;
+
+		totlen += retlen;
+		if (totlen > len)
+			return;
+
+		if (dst[i].usec  != profs[i].usec ||
+		    dst[i].pkts  != profs[i].pkts ||
+		    dst[i].comps != profs[i].comps)
+			*mod = true;
+
+		buf += retlen;
+	}
+
+	memcpy(dst, profs, sizeof(profs));
+}
+
 /**
  * ethnl_update_u8() - update u8 value from NLA_U8 attribute
  * @dst:  value to update
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH net-next v2 2/2] virtio-net: support dim profile fine-tuning
  2024-03-27  9:19 [PATCH net-next v2 0/2] ethtool: provide the dim profile fine-tuning channel Heng Qi
  2024-03-27  9:19 ` [PATCH net-next v2 1/2] ethtool: provide customized dim profile management Heng Qi
@ 2024-03-27  9:19 ` Heng Qi
  2024-03-27 14:45   ` Alexander Lobakin
  2024-03-28  7:27   ` Michael S. Tsirkin
  1 sibling, 2 replies; 14+ messages in thread
From: Heng Qi @ 2024-03-27  9:19 UTC (permalink / raw)
  To: netdev, Jakub Kicinski, Eric Dumazet, Paolo Abeni,
	Vladimir Oltean, David S. Miller, Jason Wang, Michael S. Tsirkin,
	Xuan Zhuo

Virtio-net has different types of back-end device
implementations. In order to effectively optimize
the dim library's gains for different device
implementations, let's use the new interface params
to fine-tune the profile list.

Signed-off-by: Heng Qi <hengqi@linux.alibaba.com>
---
 drivers/net/virtio_net.c | 54 ++++++++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 52 insertions(+), 2 deletions(-)

diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index e709d44..9b6c727 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -57,6 +57,16 @@
 
 #define VIRTNET_DRIVER_VERSION "1.0.0"
 
+/* This is copied from NET_DIM_RX_EQE_PROFILES in DIM library */
+#define VIRTNET_DIM_RX_PKTS 256
+static struct dim_cq_moder rx_eqe_conf[] = {
+	{.usec = 1,   .pkts = VIRTNET_DIM_RX_PKTS,},
+	{.usec = 8,   .pkts = VIRTNET_DIM_RX_PKTS,},
+	{.usec = 64,  .pkts = VIRTNET_DIM_RX_PKTS,},
+	{.usec = 128, .pkts = VIRTNET_DIM_RX_PKTS,},
+	{.usec = 256, .pkts = VIRTNET_DIM_RX_PKTS,}
+};
+
 static const unsigned long guest_offloads[] = {
 	VIRTIO_NET_F_GUEST_TSO4,
 	VIRTIO_NET_F_GUEST_TSO6,
@@ -3584,7 +3594,10 @@ static void virtnet_rx_dim_work(struct work_struct *work)
 		if (!rq->dim_enabled)
 			continue;
 
-		update_moder = net_dim_get_rx_moderation(dim->mode, dim->profile_ix);
+		if (dim->profile_ix >= ARRAY_SIZE(rx_eqe_conf))
+			dim->profile_ix = ARRAY_SIZE(rx_eqe_conf) - 1;
+
+		update_moder = rx_eqe_conf[dim->profile_ix];
 		if (update_moder.usec != rq->intr_coal.max_usecs ||
 		    update_moder.pkts != rq->intr_coal.max_packets) {
 			err = virtnet_send_rx_ctrl_coal_vq_cmd(vi, qnum,
@@ -3627,6 +3640,34 @@ static int virtnet_should_update_vq_weight(int dev_flags, int weight,
 	return 0;
 }
 
+static int virtnet_update_profile(struct virtnet_info *vi,
+				  struct kernel_ethtool_coalesce *kc)
+{
+	int i;
+
+	if (virtio_has_feature(vi->vdev, VIRTIO_NET_F_VQ_NOTF_COAL)) {
+		for (i = 0; i < NET_DIM_PARAMS_NUM_PROFILES; i++)
+			if (kc->rx_eqe_profs[i].comps)
+				return -EINVAL;
+	} else {
+		for (i = 0; i < NET_DIM_PARAMS_NUM_PROFILES; i++) {
+			if (kc->rx_eqe_profs[i].usec != rx_eqe_conf[i].usec ||
+			    kc->rx_eqe_profs[i].pkts != rx_eqe_conf[i].pkts ||
+			    kc->rx_eqe_profs[i].comps)
+				return -EINVAL;
+		}
+
+		return 0;
+	}
+
+	for (i = 0; i < NET_DIM_PARAMS_NUM_PROFILES; i++) {
+		rx_eqe_conf[i].usec = kc->rx_eqe_profs[i].usec;
+		rx_eqe_conf[i].pkts = kc->rx_eqe_profs[i].pkts;
+	}
+
+	return 0;
+}
+
 static int virtnet_set_coalesce(struct net_device *dev,
 				struct ethtool_coalesce *ec,
 				struct kernel_ethtool_coalesce *kernel_coal,
@@ -3653,6 +3694,10 @@ static int virtnet_set_coalesce(struct net_device *dev,
 		}
 	}
 
+	ret = virtnet_update_profile(vi, kernel_coal);
+	if (ret)
+		return ret;
+
 	if (virtio_has_feature(vi->vdev, VIRTIO_NET_F_NOTF_COAL))
 		ret = virtnet_send_notf_coal_cmds(vi, ec);
 	else
@@ -3689,6 +3734,10 @@ static int virtnet_get_coalesce(struct net_device *dev,
 			ec->tx_max_coalesced_frames = 1;
 	}
 
+	if (virtio_has_feature(vi->vdev, VIRTIO_NET_F_VQ_NOTF_COAL))
+		memcpy(kernel_coal->rx_eqe_profs, rx_eqe_conf,
+		       sizeof(rx_eqe_conf));
+
 	return 0;
 }
 
@@ -3868,7 +3917,8 @@ static int virtnet_set_rxnfc(struct net_device *dev, struct ethtool_rxnfc *info)
 
 static const struct ethtool_ops virtnet_ethtool_ops = {
 	.supported_coalesce_params = ETHTOOL_COALESCE_MAX_FRAMES |
-		ETHTOOL_COALESCE_USECS | ETHTOOL_COALESCE_USE_ADAPTIVE_RX,
+		ETHTOOL_COALESCE_USECS | ETHTOOL_COALESCE_USE_ADAPTIVE_RX |
+		ETHTOOL_COALESCE_RX_EQE_PROFILE,
 	.get_drvinfo = virtnet_get_drvinfo,
 	.get_link = ethtool_op_get_link,
 	.get_ringparam = virtnet_get_ringparam,
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [PATCH net-next v2 2/2] virtio-net: support dim profile fine-tuning
  2024-03-27  9:19 ` [PATCH net-next v2 2/2] virtio-net: support dim profile fine-tuning Heng Qi
@ 2024-03-27 14:45   ` Alexander Lobakin
  2024-03-28  0:32     ` Jakub Kicinski
                       ` (2 more replies)
  2024-03-28  7:27   ` Michael S. Tsirkin
  1 sibling, 3 replies; 14+ messages in thread
From: Alexander Lobakin @ 2024-03-27 14:45 UTC (permalink / raw)
  To: Heng Qi
  Cc: netdev, Jakub Kicinski, Eric Dumazet, Paolo Abeni,
	Vladimir Oltean, David S. Miller, Jason Wang, Michael S. Tsirkin,
	Xuan Zhuo

From: Heng Qi <hengqi@linux.alibaba.com>
Date: Wed, 27 Mar 2024 17:19:06 +0800

> Virtio-net has different types of back-end device
> implementations. In order to effectively optimize
> the dim library's gains for different device
> implementations, let's use the new interface params
> to fine-tune the profile list.

Nice idea, but

> 
> Signed-off-by: Heng Qi <hengqi@linux.alibaba.com>
> ---
>  drivers/net/virtio_net.c | 54 ++++++++++++++++++++++++++++++++++++++++++++++--
>  1 file changed, 52 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> index e709d44..9b6c727 100644
> --- a/drivers/net/virtio_net.c
> +++ b/drivers/net/virtio_net.c
> @@ -57,6 +57,16 @@
>  
>  #define VIRTNET_DRIVER_VERSION "1.0.0"
>  
> +/* This is copied from NET_DIM_RX_EQE_PROFILES in DIM library */
> +#define VIRTNET_DIM_RX_PKTS 256
> +static struct dim_cq_moder rx_eqe_conf[] = {
> +	{.usec = 1,   .pkts = VIRTNET_DIM_RX_PKTS,},
> +	{.usec = 8,   .pkts = VIRTNET_DIM_RX_PKTS,},
> +	{.usec = 64,  .pkts = VIRTNET_DIM_RX_PKTS,},
> +	{.usec = 128, .pkts = VIRTNET_DIM_RX_PKTS,},
> +	{.usec = 256, .pkts = VIRTNET_DIM_RX_PKTS,}
> +};

This is wrong.
This way you will have one global table for ALL the virtio devices in
the system, while Ethtool performs configuration on a per-netdevice basis.
What you need is to have 1 dim_cq_moder per each virtio netdevice,
embedded somewhere into its netdev_priv(). Then
virtio_dim_{rx,tx}_work() will take profiles from there, not the global
struct. The global struct can stay here as const to initialize default
per-netdevice params.

> +
>  static const unsigned long guest_offloads[] = {
>  	VIRTIO_NET_F_GUEST_TSO4,
>  	VIRTIO_NET_F_GUEST_TSO6,

Thanks,
Olek

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH net-next v2 2/2] virtio-net: support dim profile fine-tuning
  2024-03-27 14:45   ` Alexander Lobakin
@ 2024-03-28  0:32     ` Jakub Kicinski
  2024-03-28  2:12       ` Xuan Zhuo
  2024-03-28  2:26       ` Heng Qi
  2024-03-28  2:02     ` Heng Qi
  2024-03-28  7:27     ` Michael S. Tsirkin
  2 siblings, 2 replies; 14+ messages in thread
From: Jakub Kicinski @ 2024-03-28  0:32 UTC (permalink / raw)
  To: Alexander Lobakin
  Cc: Heng Qi, netdev, Eric Dumazet, Paolo Abeni, Vladimir Oltean,
	David S. Miller, Jason Wang, Michael S. Tsirkin, Xuan Zhuo,
	vadim.fedorenko

On Wed, 27 Mar 2024 15:45:50 +0100 Alexander Lobakin wrote:
> > +/* This is copied from NET_DIM_RX_EQE_PROFILES in DIM library */
> > +#define VIRTNET_DIM_RX_PKTS 256
> > +static struct dim_cq_moder rx_eqe_conf[] = {
> > +	{.usec = 1,   .pkts = VIRTNET_DIM_RX_PKTS,},
> > +	{.usec = 8,   .pkts = VIRTNET_DIM_RX_PKTS,},
> > +	{.usec = 64,  .pkts = VIRTNET_DIM_RX_PKTS,},
> > +	{.usec = 128, .pkts = VIRTNET_DIM_RX_PKTS,},
> > +	{.usec = 256, .pkts = VIRTNET_DIM_RX_PKTS,}
> > +};  
> 
> This is wrong.
> This way you will have one global table for ALL the virtio devices in
> the system, while Ethtool performs configuration on a per-netdevice basis.
> What you need is to have 1 dim_cq_moder per each virtio netdevice,
> embedded somewhere into its netdev_priv(). Then
> virtio_dim_{rx,tx}_work() will take profiles from there, not the global
> struct. The global struct can stay here as const to initialize default
> per-netdevice params.

I've been wondering lately if adaptive IRQ moderation isn't exactly
the kind of heuristic we would be best off deferring to BPF.
I have done 0 experiments -- are the thresholds enough
or do more interesting algos come to mind for anyone?

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH net-next v2 2/2] virtio-net: support dim profile fine-tuning
  2024-03-27 14:45   ` Alexander Lobakin
  2024-03-28  0:32     ` Jakub Kicinski
@ 2024-03-28  2:02     ` Heng Qi
  2024-03-28  7:27     ` Michael S. Tsirkin
  2 siblings, 0 replies; 14+ messages in thread
From: Heng Qi @ 2024-03-28  2:02 UTC (permalink / raw)
  To: Alexander Lobakin
  Cc: netdev, Jakub Kicinski, Eric Dumazet, Paolo Abeni,
	Vladimir Oltean, David S. Miller, Jason Wang, Michael S. Tsirkin,
	Xuan Zhuo



在 2024/3/27 下午10:45, Alexander Lobakin 写道:
> From: Heng Qi <hengqi@linux.alibaba.com>
> Date: Wed, 27 Mar 2024 17:19:06 +0800
>
>> Virtio-net has different types of back-end device
>> implementations. In order to effectively optimize
>> the dim library's gains for different device
>> implementations, let's use the new interface params
>> to fine-tune the profile list.
> Nice idea, but
>
>> Signed-off-by: Heng Qi <hengqi@linux.alibaba.com>
>> ---
>>   drivers/net/virtio_net.c | 54 ++++++++++++++++++++++++++++++++++++++++++++++--
>>   1 file changed, 52 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
>> index e709d44..9b6c727 100644
>> --- a/drivers/net/virtio_net.c
>> +++ b/drivers/net/virtio_net.c
>> @@ -57,6 +57,16 @@
>>   
>>   #define VIRTNET_DRIVER_VERSION "1.0.0"
>>   
>> +/* This is copied from NET_DIM_RX_EQE_PROFILES in DIM library */
>> +#define VIRTNET_DIM_RX_PKTS 256
>> +static struct dim_cq_moder rx_eqe_conf[] = {
>> +	{.usec = 1,   .pkts = VIRTNET_DIM_RX_PKTS,},
>> +	{.usec = 8,   .pkts = VIRTNET_DIM_RX_PKTS,},
>> +	{.usec = 64,  .pkts = VIRTNET_DIM_RX_PKTS,},
>> +	{.usec = 128, .pkts = VIRTNET_DIM_RX_PKTS,},
>> +	{.usec = 256, .pkts = VIRTNET_DIM_RX_PKTS,}
>> +};
> This is wrong.
> This way you will have one global table for ALL the virtio devices in
> the system, while Ethtool performs configuration on a per-netdevice basis.
> What you need is to have 1 dim_cq_moder per each virtio netdevice,
> embedded somewhere into its netdev_priv(). Then
> virtio_dim_{rx,tx}_work() will take profiles from there, not the global
> struct. The global struct can stay here as const to initialize default
> per-netdevice params.

You are right. Good catch!

Thanks,
Heng

>> +
>>   static const unsigned long guest_offloads[] = {
>>   	VIRTIO_NET_F_GUEST_TSO4,
>>   	VIRTIO_NET_F_GUEST_TSO6,
> Thanks,
> Olek


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH net-next v2 2/2] virtio-net: support dim profile fine-tuning
  2024-03-28  0:32     ` Jakub Kicinski
@ 2024-03-28  2:12       ` Xuan Zhuo
  2024-03-28 16:48         ` Jakub Kicinski
  2024-03-28  2:26       ` Heng Qi
  1 sibling, 1 reply; 14+ messages in thread
From: Xuan Zhuo @ 2024-03-28  2:12 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: Heng Qi, netdev, Eric Dumazet, Paolo Abeni, Vladimir Oltean,
	David S. Miller, Jason Wang, Michael S. Tsirkin, vadim.fedorenko,
	Alexander Lobakin

On Wed, 27 Mar 2024 17:32:58 -0700, Jakub Kicinski <kuba@kernel.org> wrote:
> On Wed, 27 Mar 2024 15:45:50 +0100 Alexander Lobakin wrote:
> > > +/* This is copied from NET_DIM_RX_EQE_PROFILES in DIM library */
> > > +#define VIRTNET_DIM_RX_PKTS 256
> > > +static struct dim_cq_moder rx_eqe_conf[] = {
> > > +	{.usec = 1,   .pkts = VIRTNET_DIM_RX_PKTS,},
> > > +	{.usec = 8,   .pkts = VIRTNET_DIM_RX_PKTS,},
> > > +	{.usec = 64,  .pkts = VIRTNET_DIM_RX_PKTS,},
> > > +	{.usec = 128, .pkts = VIRTNET_DIM_RX_PKTS,},
> > > +	{.usec = 256, .pkts = VIRTNET_DIM_RX_PKTS,}
> > > +};
> >
> > This is wrong.
> > This way you will have one global table for ALL the virtio devices in
> > the system, while Ethtool performs configuration on a per-netdevice basis.
> > What you need is to have 1 dim_cq_moder per each virtio netdevice,
> > embedded somewhere into its netdev_priv(). Then
> > virtio_dim_{rx,tx}_work() will take profiles from there, not the global
> > struct. The global struct can stay here as const to initialize default
> > per-netdevice params.
>
> I've been wondering lately if adaptive IRQ moderation isn't exactly
> the kind of heuristic we would be best off deferring to BPF.
> I have done 0 experiments -- are the thresholds enough
> or do more interesting algos come to mind for anyone?


Yes, we are working on this as well.

For netdim, I think profiles are an aspect. In many cases, this can solve many
problems.

It is a good idea to introduce bpf, but what bpf should do is some logical
situations that are not universal. In our practice, modifying profiles can solve
most situations. We also have a small number of scenarios that cannot be handled,
and we hope to use ebpf to solve them.

Thanks.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH net-next v2 2/2] virtio-net: support dim profile fine-tuning
  2024-03-28  0:32     ` Jakub Kicinski
  2024-03-28  2:12       ` Xuan Zhuo
@ 2024-03-28  2:26       ` Heng Qi
  1 sibling, 0 replies; 14+ messages in thread
From: Heng Qi @ 2024-03-28  2:26 UTC (permalink / raw)
  To: Jakub Kicinski, Alexander Lobakin
  Cc: netdev, Eric Dumazet, Paolo Abeni, Vladimir Oltean,
	David S. Miller, Jason Wang, Michael S. Tsirkin, Xuan Zhuo,
	vadim.fedorenko, Alexei Starovoitov, Daniel Borkmann,
	Andrii Nakryiko



在 2024/3/28 上午8:32, Jakub Kicinski 写道:
> On Wed, 27 Mar 2024 15:45:50 +0100 Alexander Lobakin wrote:
>>> +/* This is copied from NET_DIM_RX_EQE_PROFILES in DIM library */
>>> +#define VIRTNET_DIM_RX_PKTS 256
>>> +static struct dim_cq_moder rx_eqe_conf[] = {
>>> +	{.usec = 1,   .pkts = VIRTNET_DIM_RX_PKTS,},
>>> +	{.usec = 8,   .pkts = VIRTNET_DIM_RX_PKTS,},
>>> +	{.usec = 64,  .pkts = VIRTNET_DIM_RX_PKTS,},
>>> +	{.usec = 128, .pkts = VIRTNET_DIM_RX_PKTS,},
>>> +	{.usec = 256, .pkts = VIRTNET_DIM_RX_PKTS,}
>>> +};
>> This is wrong.
>> This way you will have one global table for ALL the virtio devices in
>> the system, while Ethtool performs configuration on a per-netdevice basis.
>> What you need is to have 1 dim_cq_moder per each virtio netdevice,
>> embedded somewhere into its netdev_priv(). Then
>> virtio_dim_{rx,tx}_work() will take profiles from there, not the global
>> struct. The global struct can stay here as const to initialize default
>> per-netdevice params.
> I've been wondering lately if adaptive IRQ moderation isn't exactly
> the kind of heuristic we would be best off deferring to BPF.
> I have done 0 experiments -- are the thresholds enough
> or do more interesting algos come to mind for anyone?

Hi Jakub.

I totally agree with your idea. In order to get the best practices for 
virtio DIM on our DPUs,
I actually spent a lot of energy on tuning, similar to the current 
custom profile list and virtio
ctrlq asynchronousization. There are some other tuning methods that may 
not be applicable to
other manufacturers, so they are cannot be released for the time being.

But anyway, adding a dim_bpf interface similar to native XDP (located at 
the beginning
of net_dim()) I think would be good, if you and BPF maintainers (Cc'd) 
are willing for a new bpf type
to be introduced. dim_bpf allows devices that want to implement best 
practices for adaptive
IRQ moderation to implement custom logic.

Also, I think the existing patches are still necessary, and this 
provides the simplest and most
direct way to tune.

Thanks!




^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH net-next v2 2/2] virtio-net: support dim profile fine-tuning
  2024-03-27  9:19 ` [PATCH net-next v2 2/2] virtio-net: support dim profile fine-tuning Heng Qi
  2024-03-27 14:45   ` Alexander Lobakin
@ 2024-03-28  7:27   ` Michael S. Tsirkin
  2024-03-28  7:51     ` Heng Qi
  1 sibling, 1 reply; 14+ messages in thread
From: Michael S. Tsirkin @ 2024-03-28  7:27 UTC (permalink / raw)
  To: Heng Qi
  Cc: netdev, Jakub Kicinski, Eric Dumazet, Paolo Abeni,
	Vladimir Oltean, David S. Miller, Jason Wang, Xuan Zhuo

On Wed, Mar 27, 2024 at 05:19:06PM +0800, Heng Qi wrote:
> Virtio-net has different types of back-end device
> implementations. In order to effectively optimize
> the dim library's gains for different device
> implementations, let's use the new interface params
> to fine-tune the profile list.
> 
> Signed-off-by: Heng Qi <hengqi@linux.alibaba.com>
> ---
>  drivers/net/virtio_net.c | 54 ++++++++++++++++++++++++++++++++++++++++++++++--
>  1 file changed, 52 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> index e709d44..9b6c727 100644
> --- a/drivers/net/virtio_net.c
> +++ b/drivers/net/virtio_net.c
> @@ -57,6 +57,16 @@
>  
>  #define VIRTNET_DRIVER_VERSION "1.0.0"
>  
> +/* This is copied from NET_DIM_RX_EQE_PROFILES in DIM library */

So maybe move it to a header and reuse?


> +#define VIRTNET_DIM_RX_PKTS 256
> +static struct dim_cq_moder rx_eqe_conf[] = {
> +	{.usec = 1,   .pkts = VIRTNET_DIM_RX_PKTS,},
> +	{.usec = 8,   .pkts = VIRTNET_DIM_RX_PKTS,},
> +	{.usec = 64,  .pkts = VIRTNET_DIM_RX_PKTS,},
> +	{.usec = 128, .pkts = VIRTNET_DIM_RX_PKTS,},
> +	{.usec = 256, .pkts = VIRTNET_DIM_RX_PKTS,}
> +};
> +
>  static const unsigned long guest_offloads[] = {
>  	VIRTIO_NET_F_GUEST_TSO4,
>  	VIRTIO_NET_F_GUEST_TSO6,
> @@ -3584,7 +3594,10 @@ static void virtnet_rx_dim_work(struct work_struct *work)
>  		if (!rq->dim_enabled)
>  			continue;
>  
> -		update_moder = net_dim_get_rx_moderation(dim->mode, dim->profile_ix);
> +		if (dim->profile_ix >= ARRAY_SIZE(rx_eqe_conf))
> +			dim->profile_ix = ARRAY_SIZE(rx_eqe_conf) - 1;
> +
> +		update_moder = rx_eqe_conf[dim->profile_ix];
>  		if (update_moder.usec != rq->intr_coal.max_usecs ||
>  		    update_moder.pkts != rq->intr_coal.max_packets) {
>  			err = virtnet_send_rx_ctrl_coal_vq_cmd(vi, qnum,
> @@ -3627,6 +3640,34 @@ static int virtnet_should_update_vq_weight(int dev_flags, int weight,
>  	return 0;
>  }
>  
> +static int virtnet_update_profile(struct virtnet_info *vi,
> +				  struct kernel_ethtool_coalesce *kc)
> +{
> +	int i;
> +
> +	if (virtio_has_feature(vi->vdev, VIRTIO_NET_F_VQ_NOTF_COAL)) {
> +		for (i = 0; i < NET_DIM_PARAMS_NUM_PROFILES; i++)
> +			if (kc->rx_eqe_profs[i].comps)
> +				return -EINVAL;
> +	} else {
> +		for (i = 0; i < NET_DIM_PARAMS_NUM_PROFILES; i++) {
> +			if (kc->rx_eqe_profs[i].usec != rx_eqe_conf[i].usec ||
> +			    kc->rx_eqe_profs[i].pkts != rx_eqe_conf[i].pkts ||
> +			    kc->rx_eqe_profs[i].comps)
> +				return -EINVAL;
> +		}
> +
> +		return 0;
> +	}
> +
> +	for (i = 0; i < NET_DIM_PARAMS_NUM_PROFILES; i++) {
> +		rx_eqe_conf[i].usec = kc->rx_eqe_profs[i].usec;
> +		rx_eqe_conf[i].pkts = kc->rx_eqe_profs[i].pkts;
> +	}
> +
> +	return 0;
> +}
> +
>  static int virtnet_set_coalesce(struct net_device *dev,
>  				struct ethtool_coalesce *ec,
>  				struct kernel_ethtool_coalesce *kernel_coal,
> @@ -3653,6 +3694,10 @@ static int virtnet_set_coalesce(struct net_device *dev,
>  		}
>  	}
>  
> +	ret = virtnet_update_profile(vi, kernel_coal);
> +	if (ret)
> +		return ret;
> +
>  	if (virtio_has_feature(vi->vdev, VIRTIO_NET_F_NOTF_COAL))
>  		ret = virtnet_send_notf_coal_cmds(vi, ec);
>  	else
> @@ -3689,6 +3734,10 @@ static int virtnet_get_coalesce(struct net_device *dev,
>  			ec->tx_max_coalesced_frames = 1;
>  	}
>  
> +	if (virtio_has_feature(vi->vdev, VIRTIO_NET_F_VQ_NOTF_COAL))
> +		memcpy(kernel_coal->rx_eqe_profs, rx_eqe_conf,
> +		       sizeof(rx_eqe_conf));
> +
>  	return 0;
>  }
>  
> @@ -3868,7 +3917,8 @@ static int virtnet_set_rxnfc(struct net_device *dev, struct ethtool_rxnfc *info)
>  
>  static const struct ethtool_ops virtnet_ethtool_ops = {
>  	.supported_coalesce_params = ETHTOOL_COALESCE_MAX_FRAMES |
> -		ETHTOOL_COALESCE_USECS | ETHTOOL_COALESCE_USE_ADAPTIVE_RX,
> +		ETHTOOL_COALESCE_USECS | ETHTOOL_COALESCE_USE_ADAPTIVE_RX |
> +		ETHTOOL_COALESCE_RX_EQE_PROFILE,
>  	.get_drvinfo = virtnet_get_drvinfo,
>  	.get_link = ethtool_op_get_link,
>  	.get_ringparam = virtnet_get_ringparam,
> -- 
> 1.8.3.1


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH net-next v2 2/2] virtio-net: support dim profile fine-tuning
  2024-03-27 14:45   ` Alexander Lobakin
  2024-03-28  0:32     ` Jakub Kicinski
  2024-03-28  2:02     ` Heng Qi
@ 2024-03-28  7:27     ` Michael S. Tsirkin
  2 siblings, 0 replies; 14+ messages in thread
From: Michael S. Tsirkin @ 2024-03-28  7:27 UTC (permalink / raw)
  To: Alexander Lobakin
  Cc: Heng Qi, netdev, Jakub Kicinski, Eric Dumazet, Paolo Abeni,
	Vladimir Oltean, David S. Miller, Jason Wang, Xuan Zhuo

On Wed, Mar 27, 2024 at 03:45:50PM +0100, Alexander Lobakin wrote:
> From: Heng Qi <hengqi@linux.alibaba.com>
> Date: Wed, 27 Mar 2024 17:19:06 +0800
> 
> > Virtio-net has different types of back-end device
> > implementations. In order to effectively optimize
> > the dim library's gains for different device
> > implementations, let's use the new interface params
> > to fine-tune the profile list.
> 
> Nice idea, but
> 
> > 
> > Signed-off-by: Heng Qi <hengqi@linux.alibaba.com>
> > ---
> >  drivers/net/virtio_net.c | 54 ++++++++++++++++++++++++++++++++++++++++++++++--
> >  1 file changed, 52 insertions(+), 2 deletions(-)
> > 
> > diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> > index e709d44..9b6c727 100644
> > --- a/drivers/net/virtio_net.c
> > +++ b/drivers/net/virtio_net.c
> > @@ -57,6 +57,16 @@
> >  
> >  #define VIRTNET_DRIVER_VERSION "1.0.0"
> >  
> > +/* This is copied from NET_DIM_RX_EQE_PROFILES in DIM library */
> > +#define VIRTNET_DIM_RX_PKTS 256
> > +static struct dim_cq_moder rx_eqe_conf[] = {
> > +	{.usec = 1,   .pkts = VIRTNET_DIM_RX_PKTS,},
> > +	{.usec = 8,   .pkts = VIRTNET_DIM_RX_PKTS,},
> > +	{.usec = 64,  .pkts = VIRTNET_DIM_RX_PKTS,},
> > +	{.usec = 128, .pkts = VIRTNET_DIM_RX_PKTS,},
> > +	{.usec = 256, .pkts = VIRTNET_DIM_RX_PKTS,}
> > +};
> 
> This is wrong.

Yes I'd expect anything global to be const.

> This way you will have one global table for ALL the virtio devices in
> the system, while Ethtool performs configuration on a per-netdevice basis.
> What you need is to have 1 dim_cq_moder per each virtio netdevice,
> embedded somewhere into its netdev_priv(). Then
> virtio_dim_{rx,tx}_work() will take profiles from there, not the global
> struct. The global struct can stay here as const to initialize default
> per-netdevice params.
> 
> > +
> >  static const unsigned long guest_offloads[] = {
> >  	VIRTIO_NET_F_GUEST_TSO4,
> >  	VIRTIO_NET_F_GUEST_TSO6,
> 
> Thanks,
> Olek


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH net-next v2 2/2] virtio-net: support dim profile fine-tuning
  2024-03-28  7:27   ` Michael S. Tsirkin
@ 2024-03-28  7:51     ` Heng Qi
  0 siblings, 0 replies; 14+ messages in thread
From: Heng Qi @ 2024-03-28  7:51 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: netdev, Jakub Kicinski, Eric Dumazet, Paolo Abeni,
	Vladimir Oltean, David S. Miller, Jason Wang, Xuan Zhuo



在 2024/3/28 下午3:27, Michael S. Tsirkin 写道:
> On Wed, Mar 27, 2024 at 05:19:06PM +0800, Heng Qi wrote:
>> Virtio-net has different types of back-end device
>> implementations. In order to effectively optimize
>> the dim library's gains for different device
>> implementations, let's use the new interface params
>> to fine-tune the profile list.
>>
>> Signed-off-by: Heng Qi <hengqi@linux.alibaba.com>
>> ---
>>   drivers/net/virtio_net.c | 54 ++++++++++++++++++++++++++++++++++++++++++++++--
>>   1 file changed, 52 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
>> index e709d44..9b6c727 100644
>> --- a/drivers/net/virtio_net.c
>> +++ b/drivers/net/virtio_net.c
>> @@ -57,6 +57,16 @@
>>   
>>   #define VIRTNET_DRIVER_VERSION "1.0.0"
>>   
>> +/* This is copied from NET_DIM_RX_EQE_PROFILES in DIM library */
> So maybe move it to a header and reuse?

Will do. And plan to put configurable parameters into 'vi'.

Thanks.

>
>
>> +#define VIRTNET_DIM_RX_PKTS 256
>> +static struct dim_cq_moder rx_eqe_conf[] = {
>> +	{.usec = 1,   .pkts = VIRTNET_DIM_RX_PKTS,},
>> +	{.usec = 8,   .pkts = VIRTNET_DIM_RX_PKTS,},
>> +	{.usec = 64,  .pkts = VIRTNET_DIM_RX_PKTS,},
>> +	{.usec = 128, .pkts = VIRTNET_DIM_RX_PKTS,},
>> +	{.usec = 256, .pkts = VIRTNET_DIM_RX_PKTS,}
>> +};
>> +
>>   static const unsigned long guest_offloads[] = {
>>   	VIRTIO_NET_F_GUEST_TSO4,
>>   	VIRTIO_NET_F_GUEST_TSO6,
>> @@ -3584,7 +3594,10 @@ static void virtnet_rx_dim_work(struct work_struct *work)
>>   		if (!rq->dim_enabled)
>>   			continue;
>>   
>> -		update_moder = net_dim_get_rx_moderation(dim->mode, dim->profile_ix);
>> +		if (dim->profile_ix >= ARRAY_SIZE(rx_eqe_conf))
>> +			dim->profile_ix = ARRAY_SIZE(rx_eqe_conf) - 1;
>> +
>> +		update_moder = rx_eqe_conf[dim->profile_ix];
>>   		if (update_moder.usec != rq->intr_coal.max_usecs ||
>>   		    update_moder.pkts != rq->intr_coal.max_packets) {
>>   			err = virtnet_send_rx_ctrl_coal_vq_cmd(vi, qnum,
>> @@ -3627,6 +3640,34 @@ static int virtnet_should_update_vq_weight(int dev_flags, int weight,
>>   	return 0;
>>   }
>>   
>> +static int virtnet_update_profile(struct virtnet_info *vi,
>> +				  struct kernel_ethtool_coalesce *kc)
>> +{
>> +	int i;
>> +
>> +	if (virtio_has_feature(vi->vdev, VIRTIO_NET_F_VQ_NOTF_COAL)) {
>> +		for (i = 0; i < NET_DIM_PARAMS_NUM_PROFILES; i++)
>> +			if (kc->rx_eqe_profs[i].comps)
>> +				return -EINVAL;
>> +	} else {
>> +		for (i = 0; i < NET_DIM_PARAMS_NUM_PROFILES; i++) {
>> +			if (kc->rx_eqe_profs[i].usec != rx_eqe_conf[i].usec ||
>> +			    kc->rx_eqe_profs[i].pkts != rx_eqe_conf[i].pkts ||
>> +			    kc->rx_eqe_profs[i].comps)
>> +				return -EINVAL;
>> +		}
>> +
>> +		return 0;
>> +	}
>> +
>> +	for (i = 0; i < NET_DIM_PARAMS_NUM_PROFILES; i++) {
>> +		rx_eqe_conf[i].usec = kc->rx_eqe_profs[i].usec;
>> +		rx_eqe_conf[i].pkts = kc->rx_eqe_profs[i].pkts;
>> +	}
>> +
>> +	return 0;
>> +}
>> +
>>   static int virtnet_set_coalesce(struct net_device *dev,
>>   				struct ethtool_coalesce *ec,
>>   				struct kernel_ethtool_coalesce *kernel_coal,
>> @@ -3653,6 +3694,10 @@ static int virtnet_set_coalesce(struct net_device *dev,
>>   		}
>>   	}
>>   
>> +	ret = virtnet_update_profile(vi, kernel_coal);
>> +	if (ret)
>> +		return ret;
>> +
>>   	if (virtio_has_feature(vi->vdev, VIRTIO_NET_F_NOTF_COAL))
>>   		ret = virtnet_send_notf_coal_cmds(vi, ec);
>>   	else
>> @@ -3689,6 +3734,10 @@ static int virtnet_get_coalesce(struct net_device *dev,
>>   			ec->tx_max_coalesced_frames = 1;
>>   	}
>>   
>> +	if (virtio_has_feature(vi->vdev, VIRTIO_NET_F_VQ_NOTF_COAL))
>> +		memcpy(kernel_coal->rx_eqe_profs, rx_eqe_conf,
>> +		       sizeof(rx_eqe_conf));
>> +
>>   	return 0;
>>   }
>>   
>> @@ -3868,7 +3917,8 @@ static int virtnet_set_rxnfc(struct net_device *dev, struct ethtool_rxnfc *info)
>>   
>>   static const struct ethtool_ops virtnet_ethtool_ops = {
>>   	.supported_coalesce_params = ETHTOOL_COALESCE_MAX_FRAMES |
>> -		ETHTOOL_COALESCE_USECS | ETHTOOL_COALESCE_USE_ADAPTIVE_RX,
>> +		ETHTOOL_COALESCE_USECS | ETHTOOL_COALESCE_USE_ADAPTIVE_RX |
>> +		ETHTOOL_COALESCE_RX_EQE_PROFILE,
>>   	.get_drvinfo = virtnet_get_drvinfo,
>>   	.get_link = ethtool_op_get_link,
>>   	.get_ringparam = virtnet_get_ringparam,
>> -- 
>> 1.8.3.1


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH net-next v2 2/2] virtio-net: support dim profile fine-tuning
  2024-03-28  2:12       ` Xuan Zhuo
@ 2024-03-28 16:48         ` Jakub Kicinski
  2024-03-29  8:56           ` Heng Qi
  0 siblings, 1 reply; 14+ messages in thread
From: Jakub Kicinski @ 2024-03-28 16:48 UTC (permalink / raw)
  To: Xuan Zhuo
  Cc: Heng Qi, netdev, Eric Dumazet, Paolo Abeni, Vladimir Oltean,
	David S. Miller, Jason Wang, Michael S. Tsirkin, vadim.fedorenko,
	Alexander Lobakin

On Thu, 28 Mar 2024 10:12:10 +0800 Xuan Zhuo wrote:
> For netdim, I think profiles are an aspect. In many cases, this can solve many
> problems.

Okay, but then you should try harder to hide all the config in the core.
The driver should be blissfully unaware that the user is changing 
the settings. It should just continue calling net_dim_get_*moderation().

You can create proper dim_init(), dim_destroy() functions for drivers
to call, instead of doing

	INIT_WORK(&bla->dim.work, my_driver_do_dim_work);

directly. In dim_init() you can hook the dim structure to net_device
and then ethtool code can operation on it without driver involvement.

About the uAPI - please make sure you add the new stuff to
Documentation/netlink/specs/ethtool.yaml
see: https://docs.kernel.org/next/userspace-api/netlink/specs.html

And break up the attributes, please, no raw C structs of this nature:

+	return nla_put(skb, attr_type, sizeof(struct dim_cq_moder) *
+		       NET_DIM_PARAMS_NUM_PROFILES, profs);

They are hard to extend.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH net-next v2 2/2] virtio-net: support dim profile fine-tuning
  2024-03-28 16:48         ` Jakub Kicinski
@ 2024-03-29  8:56           ` Heng Qi
  2024-03-29 15:18             ` Jakub Kicinski
  0 siblings, 1 reply; 14+ messages in thread
From: Heng Qi @ 2024-03-29  8:56 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: netdev, Eric Dumazet, Paolo Abeni, Vladimir Oltean,
	David S. Miller, Jason Wang, Michael S. Tsirkin, vadim.fedorenko,
	Alexander Lobakin, Xuan Zhuo



在 2024/3/29 上午12:48, Jakub Kicinski 写道:
> On Thu, 28 Mar 2024 10:12:10 +0800 Xuan Zhuo wrote:
>> For netdim, I think profiles are an aspect. In many cases, this can solve many
>> problems.
> Okay, but then you should try harder to hide all the config in the core.
> The driver should be blissfully unaware that the user is changing
> the settings. It should just continue calling net_dim_get_*moderation().
>
> You can create proper dim_init(), dim_destroy() functions for drivers
> to call, instead of doing
>
> 	INIT_WORK(&bla->dim.work, my_driver_do_dim_work);
>
> directly. In dim_init() you can hook the dim structure to net_device
> and then ethtool code can operation on it without driver involvement.

Ok. Will try this.

>
> About the uAPI - please make sure you add the new stuff to
> Documentation/netlink/specs/ethtool.yaml
> see: https://docs.kernel.org/next/userspace-api/netlink/specs.html
>
> And break up the attributes, please, no raw C structs of this nature:
>
> +	return nla_put(skb, attr_type, sizeof(struct dim_cq_moder) *
> +		       NET_DIM_PARAMS_NUM_PROFILES, profs);
>
> They are hard to extend.

Sorry, I don't seem to get your point, why does this make extending hard?

Are you referring to specifying ETHTOOL_A_COALESCE_RX_EQE_PROFILE
as a nested array, i.e. having each element explicitly have an attr 
name? or passing the
u16 pointer and length as arguments?

Thanks.


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH net-next v2 2/2] virtio-net: support dim profile fine-tuning
  2024-03-29  8:56           ` Heng Qi
@ 2024-03-29 15:18             ` Jakub Kicinski
  0 siblings, 0 replies; 14+ messages in thread
From: Jakub Kicinski @ 2024-03-29 15:18 UTC (permalink / raw)
  To: Heng Qi
  Cc: netdev, Eric Dumazet, Paolo Abeni, Vladimir Oltean,
	David S. Miller, Jason Wang, Michael S. Tsirkin, vadim.fedorenko,
	Alexander Lobakin, Xuan Zhuo

On Fri, 29 Mar 2024 16:56:12 +0800 Heng Qi wrote:
> > About the uAPI - please make sure you add the new stuff to
> > Documentation/netlink/specs/ethtool.yaml
> > see: https://docs.kernel.org/next/userspace-api/netlink/specs.html
> >
> > And break up the attributes, please, no raw C structs of this nature:
> >
> > +	return nla_put(skb, attr_type, sizeof(struct dim_cq_moder) *
> > +		       NET_DIM_PARAMS_NUM_PROFILES, profs);
> >
> > They are hard to extend.  
> 
> Sorry, I don't seem to get your point, why does this make extending hard?

It's not possible to make some fields optional later on.
It's also possible that user space will make assumptions about the size
of this struct so we won't be able to add fields.

So it's preferred to render the C struct members as individual netlink 
attributes. Look around ethtool netlink, you'll see there are no
structs dumped.

> Are you referring to specifying ETHTOOL_A_COALESCE_RX_EQE_PROFILE
> as a nested array, i.e. having each element explicitly have an attr 
> name? or passing the u16 pointer and length as arguments?


^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2024-03-29 15:18 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-03-27  9:19 [PATCH net-next v2 0/2] ethtool: provide the dim profile fine-tuning channel Heng Qi
2024-03-27  9:19 ` [PATCH net-next v2 1/2] ethtool: provide customized dim profile management Heng Qi
2024-03-27  9:19 ` [PATCH net-next v2 2/2] virtio-net: support dim profile fine-tuning Heng Qi
2024-03-27 14:45   ` Alexander Lobakin
2024-03-28  0:32     ` Jakub Kicinski
2024-03-28  2:12       ` Xuan Zhuo
2024-03-28 16:48         ` Jakub Kicinski
2024-03-29  8:56           ` Heng Qi
2024-03-29 15:18             ` Jakub Kicinski
2024-03-28  2:26       ` Heng Qi
2024-03-28  2:02     ` Heng Qi
2024-03-28  7:27     ` Michael S. Tsirkin
2024-03-28  7:27   ` Michael S. Tsirkin
2024-03-28  7:51     ` Heng Qi

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.