Linux-CIFS Archive mirror
From: Xin Long <lucien.xin@gmail.com>
To: Stefan Metzmacher <metze@samba.org>
Cc: network dev <netdev@vger.kernel.org>,
	davem@davemloft.net, kuba@kernel.org,
	 Eric Dumazet <edumazet@google.com>,
	Paolo Abeni <pabeni@redhat.com>,
	 Steve French <smfrench@gmail.com>,
	Namjae Jeon <linkinjeon@kernel.org>,
	 Chuck Lever III <chuck.lever@oracle.com>,
	Jeff Layton <jlayton@kernel.org>,
	 Sabrina Dubroca <sd@queasysnail.net>,
	Tyler Fanelli <tfanelli@redhat.com>,
	 Pengtao He <hepengtao@xiaomi.com>,
	 "linux-cifs@vger.kernel.org" <linux-cifs@vger.kernel.org>,
	 Samba Technical <samba-technical@lists.samba.org>
Subject: Re: [RFC PATCH net-next 0/5] net: In-kernel QUIC implementation with Userspace handshake
Date: Thu, 2 May 2024 14:08:14 -0400
Message-ID: <CADvbK_f-WCKp-_NJYOL=j__kxpFuXraFLst3=aPn6BOvX=o+Qg@mail.gmail.com>
In-Reply-To: <2365b657-bea4-4527-9fce-ad11c690bde3@samba.org>

On Mon, Apr 29, 2024 at 11:20 AM Stefan Metzmacher <metze@samba.org> wrote:
>
> Hi Xin Long,
>
> >>
> > Just confirmed from other ebpf experts, there are no in-kernel interfaces
> > for loading and interacting with BPF maps/programs(other than from BPF itself).
> >
> > It seems that we have to do this match in QUIC stack. In the latest QUIC
> > code, I added quic_packet_get_alpn(), a 59-line function, to parse ALPNs
> > and then it will search for the listen sock with these ALPNs in
> > quic_sock_lookup().
> >
> > I introduced 'alpn_match' module param, and it can be enabled when loading
> > the module QUIC by:
> >
> >    # modprobe quic alpn_match=1
> >
> > You can test it by tests/sample_test in the latest code:
> >
> >    Start 3 servers:
> >
> >      # ./sample_test server 0.0.0.0 1234 \
> >          ./keys/server-key.pem ./keys/server-cert.pem smbd
> >      # ./sample_test server 0.0.0.0 1234 \
> >          ./keys/server-key.pem ./keys/server-cert.pem h3
> >      # ./sample_test server 0.0.0.0 1234 \
> >          ./keys/server-key.pem ./keys/server-cert.pem ksmbd
> >
> >    Try to connect on clients with:
> >
> >      # ./sample_test client 127.0.0.1 1234 ksmbd
> >      # ./sample_test client 127.0.0.1 1234 smbd
> >      # ./sample_test client 127.0.0.1 1234 h3
> >
> >    to see if the corresponding server responds.
> >
> > There might be some concerns but it's also a useful feature that can not
> > be implemented in userland QUICs. The commit is here:
> >
> > https://github.com/lxin/quic/commit/de82f8135f4e9196b503b4ab5b359d88f2b2097f
> >
> > Please check if this is enough for SMB applications.
>
> It looks great, thanks!
>
> > Note as a listen socket is now identified by [address + port + ALPN] when
> > alpn_match=1, this feature does NOT require SO_REUSEPORT socket option to
> > be set, unless one wants multiple sockets to listen to
> > the same [address + port + ALPN].
>
> I'd argue that this should be the default and be required before listen()
> or maybe before bind(), so that it can return EADDRINUSE. As EADDRINUSE should only
> happen for servers, it might be useful to have a QUIC_SOCKOPT_LISTEN_ALPN instead of
> QUIC_SOCKOPT_ALPN, as QUIC_SOCKOPT_ALPN on a client socket should not let
> bind() care about the alpn value at all.
The latest patches now always do the ALPN match in the kernel, and also
support multiple ALPNs (separated by ',' when set via the sockopt) on both
the server and client side. Feel free to check.
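
To illustrate the ',' separated form, here is a minimal userspace sketch in
Python (not the kernel code). split_alpns() and encode_alpn_entries() are
hypothetical helper names; trimming whitespace around each name is an
assumption, and the one-byte-length-plus-name layout is the entry format of
the TLS ALPN extension (RFC 7301), which may or may not be how the kernel
stores the list internally.

```python
def split_alpns(value: str) -> list[str]:
    """Split a comma-separated ALPN string into individual protocol names.

    Whitespace trimming and dropping empty entries are assumptions made
    for this sketch only.
    """
    names = [name.strip() for name in value.split(",")]
    return [name for name in names if name]


def encode_alpn_entries(names: list[str]) -> bytes:
    """Encode each name as a one-byte length followed by the name bytes,
    the per-entry layout used inside a TLS ALPN protocol name list."""
    out = bytearray()
    for name in names:
        raw = name.encode("ascii")
        if not 1 <= len(raw) <= 255:
            raise ValueError(f"invalid ALPN length: {name!r}")
        out.append(len(raw))
        out += raw
    return bytes(out)


alpns = split_alpns("smbd,h3,ksmbd")
print(alpns)
print(encode_alpn_entries(alpns).hex())
```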

Note that:
1. As you expected, setsockopt(QUIC_SOCKOPT_ALPN) must be called before
   listen(), and it will return EADDRINUSE if there's a socket already
   listening to the same IP + PORT + ALPN.

2. ALPN bind/match is a *listening* socket thing, so it checks ALPN only
   when adding listening sockets in quic_hash(), and it matches ALPN only
   when looking up listening sockets in quic_sock_lookup().

   Setting ALPNs on a client socket will ONLY pack these ALPNs into the
   Client Initial Packet when it starts connecting; there is no bind/match
   for these regular sockets, as they can be found by 4-tuple or
   a source_connection_id. bind() doesn't need to care about ALPN for
   client/regular sockets either.

   So it's fine to use the QUIC_SOCKOPT_ALPN sockopt for both listening and
   regular/client sockets, as the kernel treats ALPNs differently for
   listening and regular sockets. (Sorry for the confusion, I could
   have created another hashtable for listening sockets.)

   In other words, a listening socket is identified by

        local_ip + local_port + ALPN(s)

   while a regular socket (representing a QUIC connection) is identified by:

       local_ip + local_port + remote_ip + remote_port

   or any of its

       source_connection_ids.

3. SO_REUSEPORT still applies, to do load balancing between multiple
   processes listening on the same IP + PORT + ALPN, like:

   on server:
   process A: skA = listen(127.0.0.1:1234:smbd)
   process B: skB = listen(127.0.0.1:1234:smbd)
   process C: skC = listen(127.0.0.1:1234:smbd)

   on client:
   connect(127.0.0.1:1234:smbd)
   connect(127.0.0.1:1234:smbd)
   ...

   On the server, the sk is selected among skA, skB and skC based on the
   source address + port in the request from the client.

4. Not sure if multiple-ALPN support is useful to you; here are some
   examples of how it works:
   - Without SO_REUSEPORT set:

     On server:
     process A: skA = listen(127.0.0.1:1234:smbd,h3,ksmbd)
     process B: skB = listen(127.0.0.1:1234:smbd,h3,ksmbd)

     listen() in process B fails and returns EADDRINUSE.

   - with SO_REUSEPORT set:

     On server:
     process A: skA = listen(127.0.0.1:1234:smbd,h3,ksmbd)
     process B: skB = listen(127.0.0.1:1234:smbd,h3,ksmbd)

     listen() in process B works.

   - with or without SO_REUSEPORT set:

     On server:
     process A: skA = listen(127.0.0.1:1234:h3,ksmbd)
     process B: skB = listen(127.0.0.1:1234:h3,smbd)
     (the ALPN lists overlap but are not exactly the same)

     listen() in process B fails and returns EADDRINUSE.

   - the match priority for multiple ALPNs is based on the order on the
     client ALPN list:

     On server:
     process A: skA = listen(127.0.0.1:1234:smbd)
     process B: skB = listen(127.0.0.1:1234:h3)
     process C: skC = listen(127.0.0.1:1234:ksmbd)

     On client:
     process X: skX = connect(127.0.0.1:1234:h3,ksmbd,smbd)

     skB will be the one selected to accept the connection, as h3 is the
     1st ALPN on the client ALPN list 'h3,ksmbd,smbd'.
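
The rules in points 1-4 can be modelled with a toy table like the one
below. This is only a userspace sketch in Python, not the kernel code:
ListenTable and its methods are hypothetical names, wildcard addresses are
not handled, treating "the same ALPNs" as an order-sensitive comparison is
an assumption, and crc32 over the client's address + port merely stands in
for whatever hash the kernel uses to pick a socket from a SO_REUSEPORT group.

```python
import errno
import zlib


class ListenTable:
    """Toy model of listening sockets keyed by IP + PORT + ALPN(s)."""

    def __init__(self):
        self.socks = []  # entries: (name, ip, port, alpns tuple, reuseport)

    def listen(self, name, ip, port, alpns, reuseport=False):
        new = tuple(alpns)
        for _, o_ip, o_port, o_alpns, o_reuse in self.socks:
            if (o_ip, o_port) != (ip, port) or not set(o_alpns) & set(new):
                continue  # different address/port, or no ALPN in common
            if o_alpns == new and o_reuse and reuseport:
                continue  # identical ALPN list and both set SO_REUSEPORT
            raise OSError(errno.EADDRINUSE, "IP + PORT + ALPN already in use")
        self.socks.append((name, ip, port, new, reuseport))

    def lookup(self, ip, port, client_alpns, src):
        # Walk the client's ALPN list in order; the first ALPN that has
        # a listener decides which socket (group) gets the connection.
        for alpn in client_alpns:
            group = [s[0] for s in self.socks
                     if (s[1], s[2]) == (ip, port) and alpn in s[3]]
            if group:
                # Stand-in for reuseport selection by source address + port.
                key = zlib.crc32(f"{src[0]}:{src[1]}".encode())
                return group[key % len(group)]
        return None


table = ListenTable()
table.listen("skA", "127.0.0.1", 1234, ["smbd"])
table.listen("skB", "127.0.0.1", 1234, ["h3"])
table.listen("skC", "127.0.0.1", 1234, ["ksmbd"])
client = ("127.0.0.1", 40000)  # hypothetical client source address + port
print(table.lookup("127.0.0.1", 1234, ["h3", "ksmbd", "smbd"], client))  # skB
```

A second listen() with an overlapping but different ALPN list raises
OSError with EADDRINUSE in this model, with or without reuseport=True.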

>
> For listens on tcp you also need to specify an explicit port (at least in order
> to be useful).
>
> And it would mean that all applications would use it and not block other applications
> from using an explicit alpn.
>
> Also a module parameter for this means the administrator would have to take care
> of it, which means it might be unusable if loaded with it.
Agree, already dropped this param.

>
> I hope to find some time in the next weeks to play with this.
> Should be relatively trivial to create a prototype for samba's smbd.
Sounds cool!

Thanks.

