virtio-dev.lists.oasis-open.org archive mirror
 help / color / mirror / Atom feed
From: Stefano Garzarella <sgarzare@redhat.com>
To: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Bobby Eshleman <bobbyeshleman@gmail.com>,
	 Bobby Eshleman <bobby.eshleman@bytedance.com>,
	virtio-dev@lists.oasis-open.org, cong.wang@bytedance.com,
	 jiang.wang@bytedance.com, cohuck@redhat.com,
	virtualization@lists.linux-foundation.org,  stefanha@redhat.com,
	virtio-comment@lists.oasis-open.org,
	 arseny.krasnov@kaspersky.com
Subject: [virtio-dev] Re: [PATCH] virtio-vsock: add VIRTIO_VSOCK_F_DGRAM feature bit
Date: Wed, 6 Sep 2023 16:28:12 +0200	[thread overview]
Message-ID: <6hwqps3x554jcanr2i6kv7u4jmz3lfwcoesuyzxbr3sh3xsnd7@2embgkcj264f> (raw)
In-Reply-To: <20230902043334-mutt-send-email-mst@kernel.org>

On Sat, Sep 02, 2023 at 04:35:25AM -0400, Michael S. Tsirkin wrote:
>On Sat, Sep 02, 2023 at 04:56:42AM +0000, Bobby Eshleman wrote:
>> On Fri, Sep 01, 2023 at 02:45:14PM +0200, Stefano Garzarella wrote:
>> > On Tue, Aug 29, 2023 at 09:29:45PM +0000, Bobby Eshleman wrote:
>> > > This adds support for datagrams to the virtio-vsock device.
>> > >
>> > > virtio-vsock already supports stream and seqpacket types. The existing
>> > > message types and header fields are extended to support datagrams.
>> > > Semantic differences between the flow types are stated, as well as any
>> > > additional requirements for devices and drivers implementing this
>> > > feature.
>> > >
>> > > Signed-off-by: Bobby Eshleman <bobby.eshleman@bytedance.com>
>> > > ---
>> > > device-types/vsock/description.tex | 95 +++++++++++++++++++++++++++---
>> > > 1 file changed, 88 insertions(+), 7 deletions(-)
>> > >
>> > > diff --git a/device-types/vsock/description.tex b/device-types/vsock/description.tex
>> > > index 7d91d159872f..638dca8e5da1 100644
>> > > --- a/device-types/vsock/description.tex
>> > > +++ b/device-types/vsock/description.tex
>> > > @@ -20,6 +20,7 @@ \subsection{Feature bits}\label{sec:Device Types / Socket Device / Feature bits}
>> > > \item[VIRTIO_VSOCK_F_STREAM (0)] stream socket type is supported.
>> > > \item[VIRTIO_VSOCK_F_SEQPACKET (1)] seqpacket socket type is supported.
>> > > \item[VIRTIO_VSOCK_F_NO_IMPLIED_STREAM (2)] stream socket type is not implied.
>> > > +\item[VIRTIO_VSOCK_F_DGRAM (3)] datagram socket type is supported.
>> > > \end{description}
>> > >
>> > > \drivernormative{\subsubsection}{Feature bits}{Device Types / Socket Device / Feature bits}
>> > > @@ -167,17 +168,22 @@ \subsubsection{Addressing}\label{sec:Device Types / Socket Device / Device Opera
>> > > consists of a (cid, port number) tuple. The header fields used for this are
>> > > \field{src_cid}, \field{src_port}, \field{dst_cid}, and \field{dst_port}.
>> > >
>> > > -Currently stream and seqpacket sockets are supported. \field{type} is 1 (VIRTIO_VSOCK_TYPE_STREAM)
>> > > -for stream socket types, and 2 (VIRTIO_VSOCK_TYPE_SEQPACKET) for seqpacket socket types.
>> > > +
>> > > +Currently stream, seqpacket, and datagram sockets are supported. \field{type} is
>> > > +1 (VIRTIO_VSOCK_TYPE_STREAM) for stream socket types, 2 (VIRTIO_VSOCK_TYPE_SEQPACKET) for
>> > > +seqpacket socket types, and 3 (VIRTIO_VSOCK_TYPE_DGRAM) for datagram socket types.
>> > >
>> > > \begin{lstlisting}
>> > > #define VIRTIO_VSOCK_TYPE_STREAM    1
>> > > #define VIRTIO_VSOCK_TYPE_SEQPACKET 2
>> > > +#define VIRTIO_VSOCK_TYPE_DGRAM     3
>> > > \end{lstlisting}
>> > >
>> > > Stream sockets provide in-order, guaranteed, connection-oriented delivery
>> > > without message boundaries. Seqpacket sockets provide in-order, guaranteed,
>> > > -connection-oriented delivery with message and record boundaries.
>> > > +connection-oriented delivery with message and record boundaries. Datagram
>> > > +sockets provide connection-less, best-effort delivery of messages, with no
>> > > +order or reliability guarantees.
>> > >
>> > > \subsubsection{Buffer Space Management}\label{sec:Device Types / Socket Device / Device Operation / Buffer Space Management}
>> > > \field{buf_alloc} and \field{fwd_cnt} are used for buffer space management of
>> > > @@ -203,16 +209,19 @@ \subsubsection{Buffer Space Management}\label{sec:Device Types / Socket Device /
>> > > previously receiving a VIRTIO_VSOCK_OP_CREDIT_REQUEST packet. This allows
>> > > communicating updates any time a change in buffer space occurs.
>> > >
>> > > +\field{buf_alloc} and \field{fwd_cnt} are reserved for future use by datagram
>> > > +sockets. These fields are not used for datagram buffer space management.
>> > > +
>> > > \drivernormative{\paragraph}{Device Operation: Buffer Space Management}{Device Types / Socket Device / Device Operation / Buffer Space Management}
>> > > -VIRTIO_VSOCK_OP_RW data packets MUST only be transmitted when the peer has
>> > > -sufficient free buffer space for the payload.
>> > > +For stream and seqpacket flows, VIRTIO_VSOCK_OP_RW data packets MUST only be
>> > > +transmitted when the peer has sufficient free buffer space for the payload.
>> > >
>> > > All packets associated with a stream flow MUST contain valid information in
>> > > \field{buf_alloc} and \field{fwd_cnt} fields.
>> > >
>> > > \devicenormative{\paragraph}{Device Operation: Buffer Space
>> > > Management}{Device Types / Socket Device / Device Operation / Buffer
>> > > Space Management}
>> > > -VIRTIO_VSOCK_OP_RW data packets MUST only be transmitted when the peer has
>> > > -sufficient free buffer space for the payload.
>> > > +For stream and seqpacket flows, VIRTIO_VSOCK_OP_RW data packets MUST only be
>> > > +transmitted when the peer has sufficient free buffer space for the payload.
>> > >
>> > > All packets associated with a stream flow MUST contain valid information in
>> > > \field{buf_alloc} and \field{fwd_cnt} fields.
>> > > @@ -299,6 +308,78 @@ \subsubsection{Seqpacket Sockets}\label{sec:Device Types / Socket Device / Devic
>> > > #define VIRTIO_VSOCK_SEQ_EOR (1 << 1)
>> > > \end{lstlisting}
>> > >
>> > > +\subsubsection{Datagram Sockets}\label{sec:Device Types / Socket Device / Device Operation / Datagram Sockets}
>> > > +
>> > > +\drivernormative{\paragraph}{Device Operation: Packet Fragmentation}{Device Types / Socket Device / Datagram Sockets / Fragmentation}
>> > > +
>> > > +Drivers MAY disassemble packets into smaller fragments. If drivers fragment a
>> > > +packet, they MUST follow the fragmentation rules described in section
>> > > +\ref{sec:Device Types / Socket Device / Device Operation / Datagram Sockets / Fragmentation}.
>> > > +
>> > > +Drivers MUST support assembly of received packet fragments according to the
>> > > +fragmentation rules described in section
>> > > +\ref{sec:Device Types / Socket Device / Device Operation / Datagram
>> > > Sockets / Fragmentation}.
>> > > +
>> > > +\devicenormative{\paragraph}{Device Operation: Packet Fragmentation}{Device Types / Socket Device / Datagram Sockets / Fragmentation}
>> > > +
>> > > +Devices MAY disassemble packets into smaller fragments. If devices fragment a
>> > > +packet, they MUST follow the fragmentation rules described in section
>> > > +\ref{sec:Device Types / Socket Device / Device Operation / Datagram Sockets / Fragmentation}.
>> > > +
>> > > +Devices MUST support assembly of received packet fragments according to the
>> > > +fragmentation rules described in section
>> > > +\ref{sec:Device Types / Socket Device / Device Operation / Datagram Sockets / Fragmentation}.
>> > > +
>> > > +\drivernormative{\paragraph}{Device Operation: Packet Dropping}{Device Types / Socket Device / Datagram Sockets / Dropping}
>> > > +
>> > > +The driver MAY drop received packets with no notification to the device. This
>> > > +can happen if, for example, there are insufficient resources or no socket
>> > > +exists for the destination address.
>> > > +
>> > > +\devicenormative{\paragraph}{Device Operation: Packet Dropping}{Device Types / Socket Device / Datagram Sockets / Dropping}
>> > > +
>> > > +The device MAY drop received packets with no notification to the driver. This
>> > > +can happen if, for example, there are insufficient resources or no socket
>> > > +exists for the destination address.
>> >
>> > Should we provide some notification if the socket does not exist at the
>> > destination?
>> >
>>
>> Yes, I think so. I believe a start/stop congestion notification scheme
>> actually manages this issue well.
>>
>> For example, the source begins sending packets to a destination.
>>
>> The destination finds that there exists no socket for that destination
>> address. The destination sends a "stop" notification to the source that
>> contains the address in question. Meanwhile, packets are still coming in
>> but they are being dropped.
>>
>> The source receives the "stop" notification with the address and adds it
>> to the "stopped destinations" list. Any new packet destination address
>> will be compared to that list. Any matches will be dropped before
>> sending (and ideally, before wasting time allocating the packet).
>>
>> Only when a socket is bound to an address that matches a "stopped"
>> address does the destination send a "start" notification to any source
>> it has previusly sent a "stop" notification to.

mmm, keeping the state forever could facilitate a DoS, perhaps we
should provide a timeout after which to try again

>>
>> Once "start" is received, flow may resume as normal.
>
>Again, dropping as control flow tactic has a bunch of problems.
>Blocking senders sounds more reasonable.

Yep, I think so.

>
>
>> > > +
>> > > +\paragraph{Datagram Fragmentation}\label{sec:Device Types / Socket Device / Device Operation / Datagram Sockets / Fragmentation}
>> > > +
>> > > +\field{flags} may have the following bit set:
>> > > +
>> > > +\begin{lstlisting}
>> > > +#define VIRTIO_VSOCK_DGRAM_EOM (1 << 0)
>> > > +\end{lstlisting}
>> > > +
>> > > +When the header \field{flags} field bit VIRTIO_VSOCK_DGRAM_EOM (bit 0) is set,
>> > > +it indicates that the current payload is the end of a datagram fragment
>> > > OR that
>> > > +the current payload is an entire datagram packet.
>> >
>> > In the destination, if we discard some fragments, then could we
>> > reconstruct a different datagram from the one sent?
>> >
>> > Is that anything acceptable?
>> >
>>
>> Dropping fragments should be explicitly disallowed. The sender is
>> explicitly disallowed from NOT placing fragments on the virtqueue, but I
>> see that I am missing the piece that states that they may not be dropped
>> on the receive side.
>>
>> I think it is worth mentioning that implicit in this spec is that
>> socket-to-socket dgram communication is unreliable, but device-to-driver
>> (and vice versa) is still reliable. That is, we can rely at least on the
>> virtqueues to work... and if they fail then the device/driver can simply
>> requeue (think send_pkt_queue in Linux)... so there is some reliability
>> at the lowest layer.

Yep, we should clarify this better.

Thanks,
Stefano


---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org


  parent reply	other threads:[~2023-09-06 14:28 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <20230829212945.3420727-1-bobby.eshleman@bytedance.com>
2023-08-29 22:21 ` [virtio-dev] Re: [PATCH] virtio-vsock: add VIRTIO_VSOCK_F_DGRAM feature bit Michael S. Tsirkin
2023-08-30  0:43   ` Bobby Eshleman
2023-09-01 13:31     ` Michael S. Tsirkin
2023-09-02  4:37       ` Bobby Eshleman
2023-09-02  8:41         ` Michael S. Tsirkin
2023-09-02  9:24           ` Bobby Eshleman
2023-09-06  8:17             ` Michael S. Tsirkin
2023-09-02 16:16               ` Bobby Eshleman
2023-09-01 12:45 ` Stefano Garzarella
2023-09-02  4:56   ` Bobby Eshleman
2023-09-02  8:35     ` Michael S. Tsirkin
2023-09-02 11:25       ` Bobby Eshleman
2023-09-06 14:28       ` Stefano Garzarella [this message]
2023-09-02 16:17         ` Bobby Eshleman

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=6hwqps3x554jcanr2i6kv7u4jmz3lfwcoesuyzxbr3sh3xsnd7@2embgkcj264f \
    --to=sgarzare@redhat.com \
    --cc=arseny.krasnov@kaspersky.com \
    --cc=bobby.eshleman@bytedance.com \
    --cc=bobbyeshleman@gmail.com \
    --cc=cohuck@redhat.com \
    --cc=cong.wang@bytedance.com \
    --cc=jiang.wang@bytedance.com \
    --cc=mst@redhat.com \
    --cc=stefanha@redhat.com \
    --cc=virtio-comment@lists.oasis-open.org \
    --cc=virtio-dev@lists.oasis-open.org \
    --cc=virtualization@lists.linux-foundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).