virtio-comment.lists.oasis-open.org archive mirror
From: Heng Qi <hengqi@linux.alibaba.com>
To: "Michael S. Tsirkin" <mst@redhat.com>, Parav Pandit <parav@nvidia.com>
Cc: "virtio-comment@lists.oasis-open.org"
	<virtio-comment@lists.oasis-open.org>,
	"virtio-dev@lists.oasis-open.org"
	<virtio-dev@lists.oasis-open.org>,
	Jason Wang <jasowang@redhat.com>,
	Yuri Benditovich <yuri.benditovich@daynix.com>,
	Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Subject: [virtio-comment] Re: [PATCH v16] virtio-net: support inner header hash
Date: Mon, 12 Jun 2023 10:29:13 +0800
Message-ID: <4d5d44f1-5106-9446-1530-1d295f480e1d@linux.alibaba.com>
In-Reply-To: <20230611191534-mutt-send-email-mst@kernel.org>



On 2023/6/12 at 7:18 AM, Michael S. Tsirkin wrote:
> On Sun, Jun 11, 2023 at 08:13:58PM +0000, Parav Pandit wrote:
>>> From: Heng Qi <hengqi@linux.alibaba.com>
>>> Sent: Saturday, June 10, 2023 12:11 AM
>>> +The class VIRTIO_NET_CTRL_HASH_TUNNEL has the following commands:
>>> +\begin{itemize}
>>> +\item VIRTIO_NET_CTRL_HASH_TUNNEL_SET: set \field{hash_tunnel_types}
>>> for the device using the virtnet_hash_tunnel_config_set structure, which is
>>> read-only for the driver.
>> The driver issues the set command, so it's read + write for the driver.
>> Read-only for the device.
> This is talking about buffers I think. These are read-only or
> write-only, there is no read+write.
>
>>> +\item VIRTIO_NET_CTRL_HASH_TUNNEL_GET: get \field{hash_tunnel_types}
>>> and \field{supported_hash_tunnel_types} from the device using the
>>> virtnet_hash_tunnel_config_get
>>> +      structure, which is write-only for the driver.
>> Device writes it, so
>> s/write-only for the driver/read-only for the driver
>
> Please use terminology consistent with how we describe buffers,
> which is from POV of the device. Thus buffers are
> device read-only or device write-only.

Sure. I'll use this terminology.
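
A minimal sketch of that convention, with buffer access always named from the device's point of view. The enum and function names are illustrative assumptions and do not come from the patch or the spec; the command values are placeholders as well:

```c
/* Placeholder command identifiers, for illustration only. */
enum hash_tunnel_cmd { HASH_TUNNEL_SET, HASH_TUNNEL_GET };

/* Buffer access, always stated from the device's point of view. */
enum buf_access { DEVICE_READ_ONLY, DEVICE_WRITE_ONLY };

/* Access mode of the command-specific data buffer for each command. */
static enum buf_access cmd_data_access(enum hash_tunnel_cmd cmd)
{
    /* SET: the driver fills the buffer and the device only reads it.
     * GET: the device fills the buffer and the driver reads it afterwards. */
    return cmd == HASH_TUNNEL_SET ? DEVICE_READ_ONLY : DEVICE_WRITE_ONLY;
}

/* Human-readable name using the device-centric wording. */
static const char *access_name(enum buf_access a)
{
    return a == DEVICE_READ_ONLY ? "device read-only" : "device write-only";
}
```

With this wording, the virtnet_hash_tunnel_config_set structure would be described as device read-only and virtnet_hash_tunnel_config_get as device write-only.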

Thanks!

>
>>> +\end{itemize}
>>> +
>>> +\subparagraph{Tunnel/Encapsulated packet} \label{sec:Device Types /
>>> +Network Device / Device Operation / Processing of Incoming Packets /
>>> +Hash calculation for incoming packets / Tunnel/Encapsulated packet}
>>> +
>>> +A tunnel packet is encapsulated from the original packet based on the
>>> +tunneling protocol (only a single level of encapsulation is currently
>>> +supported). The encapsulated packet contains an outer header and an inner
>>> header, and the device calculates the hash over either the inner header or the
>>> outer header.
>>> +
>>> +If VIRTIO_NET_F_HASH_TUNNEL is negotiated and a received encapsulated
>>> +packet's outer header matches one of the configured
>>> \field{hash_tunnel_types}, the hash of the inner header is calculated.
>>> +
>>> +Supported encapsulated packet types:
>>> +\begin{itemize}
>>> +\item \hyperref[intro:gre_rfc2784]{[GRE_rfc2784]}: the outer header is over
>>> IPv4 and the inner header is over IPv4. The outer header does not contain the
>>> transport protocol.
>>> +\item \hyperref[intro:gre_rfc2890]{[GRE_rfc2890]}: the outer header is over
>>> IPv4 and the inner header is over IPv4. The outer header does not contain the
>>> transport protocol.
>>> +\item \hyperref[intro:gre_rfc7676]{[GRE_rfc7676]}: the outer header is over
>>> IPv4/IPv6 and the inner header is over IPv4/IPv6. The outer header does not
>>> contain the transport protocol.
>>> +\item \hyperref[intro:gre_in_udp_rfc8086]{[GRE-in-UDP]}: the outer header is
>>> over IPv4/IPv6 and the inner header is over IPv4/IPv6. The outer header uses
>>> UDP as the transport protocol.
>>> +\item \hyperref[intro:vxlan]{[VXLAN]}: the outer header is over IPv4/IPv6 and
>>> the inner header is over IPv4/IPv6. The outer header uses UDP as the transport
>>> protocol.
>>> +\item \hyperref[intro:vxlan_gpe]{[VXLAN-GPE]}: the outer header is over
>>> IPv4/IPv6 and the inner header is over IPv4/IPv6. The outer header uses UDP as
>>> the transport protocol.
>>> +\item \hyperref[intro:geneve]{[GENEVE]}: the outer header is over IPv4/IPv6
>>> and the inner header is over IPv4/IPv6. The outer header uses UDP as the
>>> transport protocol.
>>> +\item \hyperref[intro:ipip]{[IPIP]}: the outer header is over IPv4 and the inner
>>> header is over IPv4. The outer header does not contain the transport protocol.
>>> +\item \hyperref[intro:nvgre]{[NVGRE]}: the outer header is over IPv4/IPv6 and
>>> the inner header is over IPv4/IPv6. The outer header does not contain the
>>> transport protocol.
>>> +\end{itemize}
>>> +
>> It does not matter much, but it may be good to arrange the above list so that all entries that do not have a transport header come first,
>> followed by the protocols with a transport header together (vxlan-gpe, geneve, nvgre).
>>
>>> +If VIRTIO_NET_HASH_TUNNEL_TYPE_NONE is set or the encapsulation type is
>>> +not included in the configured \field{hash_tunnel_types}, the hash of the outer
>>> header is calculated for the received encapsulated packet.
>>> +
>>> +The hash is calculated for the received non-encapsulated packet as if
>>> VIRTIO_NET_F_HASH_TUNNEL was not negotiated.
>>> +
>>> +\subparagraph{Supported/enabled encapsulation hash types}
>>> +\label{sec:Device Types / Network Device / Device Operation /
>>> +Processing of Incoming Packets / Hash calculation for incoming packets
>>> +/ Supported/enabled encapsulation hash types}
>>> +
>>> +\begin{lstlisting}
>>> +#define VIRTIO_NET_HASH_TUNNEL_TYPE_NONE        (1 << 0)
>>> +\end{lstlisting}
>>> +
>>> +Supported encapsulation hash types:
>>> +Hash type applicable for inner payload of the
>>> \hyperref[intro:gre_rfc2784]{[GRE_rfc2784]} packet:
>>> +\begin{lstlisting}
>>> +#define VIRTIO_NET_HASH_TUNNEL_TYPE_GRE_2784    (1 << 1)
>>> +\end{lstlisting}
>>> +Hash type applicable for inner payload of the
>>> \hyperref[intro:gre_rfc2890]{[GRE_rfc2890]} packet:
>>> +\begin{lstlisting}
>>> +#define VIRTIO_NET_HASH_TUNNEL_TYPE_GRE_2890    (1 << 2)
>>> +\end{lstlisting}
>>> +Hash type applicable for inner payload of the
>>> \hyperref[intro:gre_rfc7676]{[GRE_rfc7676]} packet:
>>> +\begin{lstlisting}
>>> +#define VIRTIO_NET_HASH_TUNNEL_TYPE_GRE_7676    (1 << 3)
>>> +\end{lstlisting}
>>> +Hash type applicable for inner payload of the
>>> \hyperref[intro:gre_in_udp_rfc8086]{[GRE-in-UDP]} packet:
>>> +\begin{lstlisting}
>>> +#define VIRTIO_NET_HASH_TUNNEL_TYPE_GRE_UDP     (1 << 4)
>>> +\end{lstlisting}
>>> +Hash type applicable for inner payload of the \hyperref[intro:vxlan]{[VXLAN]}
>>> packet:
>>> +\begin{lstlisting}
>>> +#define VIRTIO_NET_HASH_TUNNEL_TYPE_VXLAN       (1 << 5)
>>> +\end{lstlisting}
>>> +Hash type applicable for inner payload of the
>>> \hyperref[intro:vxlan_gpe]{[VXLAN-GPE]} packet:
>>> +\begin{lstlisting}
>>> +#define VIRTIO_NET_HASH_TUNNEL_TYPE_VXLAN_GPE   (1 << 6)
>>> +\end{lstlisting}
>>> +Hash type applicable for inner payload of the
>>> \hyperref[intro:geneve]{[GENEVE]} packet:
>>> +\begin{lstlisting}
>>> +#define VIRTIO_NET_HASH_TUNNEL_TYPE_GENEVE      (1 << 7)
>>> +\end{lstlisting}
>>> +Hash type applicable for inner payload of the \hyperref[intro:ipip]{[IPIP]}
>>> packet:
>>> +\begin{lstlisting}
>>> +#define VIRTIO_NET_HASH_TUNNEL_TYPE_IPIP        (1 << 8)
>>> +\end{lstlisting}
>>> +Hash type applicable for inner payload of the \hyperref[intro:nvgre]{[NVGRE]}
>>> packet:
>>> +\begin{lstlisting}
>>> +#define VIRTIO_NET_HASH_TUNNEL_TYPE_NVGRE       (1 << 9)
>>> +\end{lstlisting}
>>> +
>>> +\subparagraph{Advice}
>>> +Usage scenarios of inner header hash (but not limited to):
>>> +\begin{itemize}
>>> +\item Legacy tunneling protocols that lack entropy in the outer header use
>>> inner header hash to hash flows
>>> +      with the same outer header but different inner headers to different queues
>>> for better-receiving performance.
>>> +\item In scenarios where the same flow passing through different tunnels is
>>> expected to be received in the same queue,
>>> +      warm caches, lessing locking, etc. are optimized to obtain receiving
>>> performance.
>>> +\end{itemize}
>>> +
>> Small rewrite as,
>> to utilize warm caches, to have less locking etc.
>>
>>> +For scenarios with sufficient outer entropy or no inner header hash
>>> requirements, inner header hash may not be needed:
>>> +A tunnel is often expected to isolate the external network from the
>>> +internal one. By completely ignoring entropy in the outer header and
>>> +replacing it with entropy from the inner header, for hash calculations,
>>> +this expectation might be violated to a certain extent, depending on how the
>>> hash is used. When the hash use is limited to RSS queue selection, inner header
>>> hash may have quality of service (QoS) limitations.
>>> +
>>> +Possible mitigations:
>>> +\begin{itemize}
>>> +\item Use a tool with good forwarding performance to keep the receive queue
>>> from filling up.
>> Generally, filling up is fine as long as it's also emptied.
>> So a rewrite as,
>>
>> s/from filling up/from dropping packet/
>>
>>> +\item If the QoS is unavailable, the driver can set \field{hash_tunnel_types} to
>>> VIRTIO_NET_HASH_TUNNEL_TYPE_NONE
>>> +      to disable inner header hash for encapsulated packets.
>>> +\item Perform appropriate QoS before packets consume the receive buffers of
>>> the receive queues.
>>> +\end{itemize}
>>> +
>>> +\devicenormative{\subparagraph}{Inner Header Hash}{Device Types /
>>> +Network Device / Device Operation / Control Virtqueue / Inner Header
>>> +Hash}
>>> +
>>> +The device MUST calculate the hash on the outer header if the type of
>>> +the received encapsulated packet does not match any value of the configured
>>> \field{hash_tunnel_types}.
>>> +
>>> +The device MUST respond to the VIRTIO_NET_CTRL_HASH_TUNNEL_SET
>>> command
>>> +with VIRTIO_NET_ERR if the device receives an unsupported or unrecognized
>>> VIRTIO_NET_HASH_TUNNEL_TYPE_ flag.
>>> +
>>> +The device MUST provide the values of \field{supported_hash_tunnel_types} if
>>> it offers the VIRTIO_NET_F_HASH_TUNNEL feature.
>>> +
>>> +Upon reset, the device MUST initialize \field{hash_tunnel_type} to 0.
>>> +
>>> +\drivernormative{\subparagraph}{Inner Header Hash}{Device Types /
>>> +Network Device / Device Operation / Control Virtqueue / Inner Header
>>> +Hash}
>>> +
>>> +The driver MUST have negotiated the VIRTIO_NET_F_HASH_TUNNEL feature
>>> when issuing commands VIRTIO_NET_CTRL_HASH_TUNNEL_SET and
>>> VIRTIO_NET_CTRL_HASH_TUNNEL_GET.
>>> +
>>> +The driver MUST ignore the values received from the
>>> VIRTIO_NET_CTRL_HASH_TUNNEL_GET command if the device responds with
>>> VIRTIO_NET_ERR.
>>> +
>>> +The driver MUST NOT set any VIRTIO_NET_HASH_TUNNEL_TYPE_ flags which
>>> are not supported by the device.
>>> +
>>>   \paragraph{Hash reporting for incoming packets}  \label{sec:Device Types /
>>> Network Device / Device Operation / Processing of Incoming Packets / Hash
>>> reporting for incoming packets}
>>>
>>> diff --git a/device-types/net/device-conformance.tex b/device-types/net/device-conformance.tex
>>> index 54f6783..f88f48b 100644
>>> --- a/device-types/net/device-conformance.tex
>>> +++ b/device-types/net/device-conformance.tex
>>> @@ -14,4 +14,5 @@
>>>   \item \ref{devicenormative:Device Types / Network Device / Device Operation
>>> / Control Virtqueue / Automatic receive steering in multiqueue mode}  \item
>>> \ref{devicenormative:Device Types / Network Device / Device Operation /
>>> Control Virtqueue / Receive-side scaling (RSS) / RSS processing}  \item
>>> \ref{devicenormative:Device Types / Network Device / Device Operation /
>>> Control Virtqueue / Notifications Coalescing}
>>> +\item \ref{devicenormative:Device Types / Network Device / Device
>>> +Operation / Control Virtqueue / Inner Header Hash}
>>>   \end{itemize}
>>> diff --git a/device-types/net/driver-conformance.tex b/device-types/net/driver-conformance.tex
>>> index 97d0cc1..9d853d9 100644
>>> --- a/device-types/net/driver-conformance.tex
>>> +++ b/device-types/net/driver-conformance.tex
>>> @@ -14,4 +14,5 @@
>>>   \item \ref{drivernormative:Device Types / Network Device / Device Operation
>>> / Control Virtqueue / Offloads State Configuration / Setting Offloads State}
>>> \item \ref{drivernormative:Device Types / Network Device / Device Operation /
>>> Control Virtqueue / Receive-side scaling (RSS) }  \item
>>> \ref{drivernormative:Device Types / Network Device / Device Operation /
>>> Control Virtqueue / Notifications Coalescing}
>>> +\item \ref{drivernormative:Device Types / Network Device / Device
>>> +Operation / Control Virtqueue / Inner Header Hash}
>>>   \end{itemize}
>>> diff --git a/introduction.tex b/introduction.tex
>>> index b7155bf..3f34950 100644
>>> --- a/introduction.tex
>>> +++ b/introduction.tex
>>> @@ -102,6 +102,46 @@ \section{Normative References}\label{sec:Normative
>>> References}
>>>       Standards for Efficient Cryptography Group(SECG), ``SEC1: Elliptic Cureve
>>> Cryptography'', Version 1.0, September 2000.
>>>   	\newline\url{https://www.secg.org/sec1-v2.pdf}\\
>>>
>>> +	\phantomsection\label{intro:gre_rfc2784}\textbf{[GRE_rfc2784]} &
>>> +    Generic Routing Encapsulation. This protocol is only specified for IPv4 and
>>> used as either the payload or delivery protocol.
>>> +	\newline\url{https://datatracker.ietf.org/doc/rfc2784/}\\
>>> +	\phantomsection\label{intro:gre_rfc2890}\textbf{[GRE_rfc2890]} &
>>> +    Key and Sequence Number Extensions to GRE \ref{intro:gre_rfc2784}. This
>>> protocol describes extensions by which two fields, Key and
>>> +    Sequence Number, can be optionally carried in the GRE Header
>>> \ref{intro:gre_rfc2784}.
>>> +	\newline\url{https://www.rfc-editor.org/rfc/rfc2890}\\
>>> +	\phantomsection\label{intro:gre_rfc7676}\textbf{[GRE_rfc7676]} &
>>> +    IPv6 Support for Generic Routing Encapsulation (GRE). This protocol is
>>> specified for IPv6 and used as either the payload or
>>> +    delivery protocol. Note that this does not change the GRE header format or
>>> any behaviors specified by RFC 2784 or RFC 2890.
>>> +	\newline\url{https://datatracker.ietf.org/doc/rfc7676/}\\
>>> +	\phantomsection\label{intro:gre_in_udp_rfc8086}\textbf{[GRE-in-UDP]}
>>> &
>>> +    GRE-in-UDP Encapsulation. This specifies a method of encapsulating
>>> network protocol packets within GRE and UDP headers.
>>> +    This GRE-in-UDP encapsulation allows the UDP source port field to be used
>>> as an entropy field. This protocol is specified for IPv4 and IPv6,
>>> +    and used as either the payload or delivery protocol.
>>> +	\newline\url{https://www.rfc-editor.org/rfc/rfc8086}\\
>>> +	\phantomsection\label{intro:vxlan}\textbf{[VXLAN]} &
>>> +    Virtual eXtensible Local Area Network.
>>> +	\newline\url{https://datatracker.ietf.org/doc/rfc7348/}\\
>>> +	\phantomsection\label{intro:vxlan-gpe}\textbf{[VXLAN-GPE]} &
>>> +    Generic Protocol Extension for VXLAN. This protocol describes extending
>>> Virtual eXtensible Local Area Network (VXLAN) via changes to the VXLAN
>>> header.
>>> +	\newline\url{https://www.ietf.org/archive/id/draft-ietf-nvo3-vxlan-gpe-12.txt}\\
>>> +	\phantomsection\label{intro:geneve}\textbf{[GENEVE]} &
>>> +    Generic Network Virtualization Encapsulation.
>>> +	\newline\url{https://datatracker.ietf.org/doc/rfc8926/}\\
>>> +	\phantomsection\label{intro:ipip}\textbf{[IPIP]} &
>>> +    IP Encapsulation within IP.
>>> +	\newline\url{https://www.rfc-editor.org/rfc/rfc2003}\\
>>> +	\phantomsection\label{intro:nvgre}\textbf{[NVGRE]} &
>>> +    NVGRE: Network Virtualization Using Generic Routing Encapsulation
>>> +	\newline\url{https://www.rfc-editor.org/rfc/rfc7637.html}\\
>>> +	\phantomsection\label{intro:IP}\textbf{[IP]} &
>>> +    INTERNET PROTOCOL
>>> +	\newline\url{https://www.rfc-editor.org/rfc/rfc791}\\
>>> +	\phantomsection\label{intro:UDP}\textbf{[UDP]} &
>>> +    User Datagram Protocol
>>> +	\newline\url{https://www.rfc-editor.org/rfc/rfc768}\\
>>> +	\phantomsection\label{intro:TCP}\textbf{[TCP]} &
>>> +    TRANSMISSION CONTROL PROTOCOL
>>> +	\newline\url{https://www.rfc-editor.org/rfc/rfc793}\\
>>>   \end{longtable}
>>>
>>>   \section{Non-Normative References}
>>> --
>>> 2.19.1.6.gb485710b
>> With above small fixes,
>> Reviewed-by: Parav Pandit <parav@nvidia.com>
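
The inner/outer selection rule and the SET error handling quoted above can be sketched roughly as follows. This is only an illustration under stated assumptions: the flag values are the ones in the quoted patch, but `struct hash_tunnel_state`, `use_inner_header_hash()` and `hash_tunnel_set()` are hypothetical names, the packet's encapsulation type is assumed to be already classified into a single TYPE_ bit (0 for a non-encapsulated packet), and VIRTIO_NET_HASH_TUNNEL_TYPE_NONE is assumed to be always accepted by SET:

```c
#include <stdbool.h>
#include <stdint.h>

/* Control virtqueue ack values from the virtio-net specification. */
#define VIRTIO_NET_OK   0
#define VIRTIO_NET_ERR  1

/* Flag values as defined in the quoted patch. */
#define VIRTIO_NET_HASH_TUNNEL_TYPE_NONE        (1 << 0)
#define VIRTIO_NET_HASH_TUNNEL_TYPE_GRE_2784    (1 << 1)
#define VIRTIO_NET_HASH_TUNNEL_TYPE_GRE_2890    (1 << 2)
#define VIRTIO_NET_HASH_TUNNEL_TYPE_GRE_7676    (1 << 3)
#define VIRTIO_NET_HASH_TUNNEL_TYPE_GRE_UDP     (1 << 4)
#define VIRTIO_NET_HASH_TUNNEL_TYPE_VXLAN       (1 << 5)
#define VIRTIO_NET_HASH_TUNNEL_TYPE_VXLAN_GPE   (1 << 6)
#define VIRTIO_NET_HASH_TUNNEL_TYPE_GENEVE      (1 << 7)
#define VIRTIO_NET_HASH_TUNNEL_TYPE_IPIP        (1 << 8)
#define VIRTIO_NET_HASH_TUNNEL_TYPE_NVGRE       (1 << 9)

/* Hypothetical device-side state; field names follow the patch. */
struct hash_tunnel_state {
    uint32_t supported_hash_tunnel_types;
    uint32_t hash_tunnel_types;   /* initialized to 0 upon reset */
};

/* True if the hash is calculated over the inner header.
 * encap_type: 0 for a non-encapsulated packet, else one TYPE_ bit. */
static bool use_inner_header_hash(const struct hash_tunnel_state *s,
                                  uint32_t encap_type)
{
    if (encap_type == 0)
        return false;   /* hashed as if the feature was not negotiated */
    if (s->hash_tunnel_types & VIRTIO_NET_HASH_TUNNEL_TYPE_NONE)
        return false;   /* inner header hash disabled by the driver */
    return (s->hash_tunnel_types & encap_type) != 0;
}

/* SET handling: refuse any bit the device does not support. */
static int hash_tunnel_set(struct hash_tunnel_state *s, uint32_t requested)
{
    /* Assumption: the NONE bit is accepted regardless of the supported set. */
    uint32_t allowed = s->supported_hash_tunnel_types |
                       VIRTIO_NET_HASH_TUNNEL_TYPE_NONE;

    if (requested & ~allowed)
        return VIRTIO_NET_ERR;
    s->hash_tunnel_types = requested;
    return VIRTIO_NET_OK;
}
```

In this sketch a failed SET leaves the previously configured value untouched, and a freshly reset device (hash_tunnel_types == 0) hashes every encapsulated packet over the outer header.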



