From: Xin Long <lucien.xin@gmail.com>
To: Stefan Metzmacher <metze@samba.org>
Cc: network dev <netdev@vger.kernel.org>,
davem@davemloft.net, kuba@kernel.org,
Eric Dumazet <edumazet@google.com>,
Paolo Abeni <pabeni@redhat.com>,
Steve French <smfrench@gmail.com>,
Namjae Jeon <linkinjeon@kernel.org>,
Chuck Lever III <chuck.lever@oracle.com>,
Jeff Layton <jlayton@kernel.org>,
Sabrina Dubroca <sd@queasysnail.net>,
Tyler Fanelli <tfanelli@redhat.com>,
Pengtao He <hepengtao@xiaomi.com>,
"linux-cifs@vger.kernel.org" <linux-cifs@vger.kernel.org>,
Samba Technical <samba-technical@lists.samba.org>
Subject: Re: [RFC PATCH net-next 0/5] net: In-kernel QUIC implementation with Userspace handshake
Date: Thu, 2 May 2024 14:08:14 -0400
Message-ID: <CADvbK_f-WCKp-_NJYOL=j__kxpFuXraFLst3=aPn6BOvX=o+Qg@mail.gmail.com>
In-Reply-To: <2365b657-bea4-4527-9fce-ad11c690bde3@samba.org>
On Mon, Apr 29, 2024 at 11:20 AM Stefan Metzmacher <metze@samba.org> wrote:
>
> Hi Xin Long,
>
> >>
> > Just confirmed from other ebpf experts, there are no in-kernel interfaces
> > for loading and interacting with BPF maps/programs(other than from BPF itself).
> >
> > It seems that we have to do this match in QUIC stack. In the latest QUIC
> > code, I added quic_packet_get_alpn(), a 59-line function, to parse ALPNs
> > and then it will search for the listen sock with these ALPNs in
> > quic_sock_lookup().
> >
> > I introduced 'alpn_match' module param, and it can be enabled when loading
> > the module QUIC by:
> >
> > # modprobe quic alpn_match=1
> >
> > You can test it by tests/sample_test in the latest code:
> >
> > Start 3 servers:
> >
> > # ./sample_test server 0.0.0.0 1234 \
> > ./keys/server-key.pem ./keys/server-cert.pem smbd
> > # ./sample_test server 0.0.0.0 1234 \
> > ./keys/server-key.pem ./keys/server-cert.pem h3
> > # ./sample_test server 0.0.0.0 1234 \
> > ./keys/server-key.pem ./keys/server-cert.pem ksmbd
> >
> > Try to connect on clients with:
> >
> > # ./sample_test client 127.0.0.1 1234 ksmbd
> > # ./sample_test client 127.0.0.1 1234 smbd
> > # ./sample_test client 127.0.0.1 1234 h3
> >
> > to see if the corresponding server responds.
> >
> > There might be some concerns but it's also a useful feature that can not
> > be implemented in userland QUICs. The commit is here:
> >
> > https://github.com/lxin/quic/commit/de82f8135f4e9196b503b4ab5b359d88f2b2097f
> >
> > Please check if this is enough for SMB applications.
>
> It looks great, thanks!
>
> > Note as a listen socket is now identified by [address + port + ALPN] when
> > alpn_match=1, this feature does NOT require SO_REUSEPORT socket option to
> > be set, unless one wants multiple sockets to listen to
> > the same [address + port + ALPN].
>
> I'd argue that this should be the default and be required before listen()
> or maybe before bind(), so that it can return EADDRINUSE. As EADDRINUSE should only
> happen for servers it might be useful to have a QUIC_SOCKOPT_LISTEN_ALPN instead of
> QUIC_SOCKOPT_ALPN. As QUIC_SOCKOPT_ALPN on a client socket should not let
> bind() care about the alpn value at all.
The latest patches always do ALPN matching in the kernel, and also
support multiple ALPNs (separated by ',' when set via sockopt) on both
the server and client side. Feel free to check.
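
As a rough illustration of the comma-separated form (a sketch only, not
the kernel code): the helper names parse_alpn_list() and
encode_alpn_list() are hypothetical, and trimming spaces around names is
an assumption. The one-byte length prefix is the standard ALPN wire
encoding (RFC 7301); whether the kernel stores the list this way
internally is not stated here.

```python
def parse_alpn_list(value):
    """Split a comma-separated ALPN string such as 'smbd,h3,ksmbd'.

    Assumption: surrounding spaces are trimmed and empty names rejected.
    """
    names = [name.strip() for name in value.split(",")]
    for name in names:
        # An ALPN protocol name is 1..255 bytes on the wire.
        if not name or len(name.encode("ascii")) > 255:
            raise ValueError("invalid ALPN name: %r" % name)
    return names


def encode_alpn_list(names):
    """Encode names as length-prefixed strings (ALPN wire format)."""
    out = bytearray()
    for name in names:
        raw = name.encode("ascii")
        out.append(len(raw))  # one-byte length prefix
        out += raw
    return bytes(out)


names = parse_alpn_list("smbd,h3,ksmbd")
print(names)                    # ['smbd', 'h3', 'ksmbd']
print(encode_alpn_list(names))  # b'\x04smbd\x02h3\x05ksmbd'
```

The order of the names is kept as given, which matters on the client
side because the match priority follows the client's list order.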
Note that:
1. As you expected, setsockopt(QUIC_SOCKOPT_ALPN) must be called before
   listen(), and listen() will return EADDRINUSE if there's a socket
   already listening on the same IP + PORT + ALPN.
2. ALPN bind/match only concerns *listening* sockets: ALPN is checked only
   when adding listening sockets in quic_hash(), and matched only when
   looking up listening sockets in quic_sock_lookup().

   Setting ALPNs on a client socket ONLY packs these ALPNs into the
   Client Initial Packet when it starts connecting. There is no bind/match
   for these regular sockets, as they can be found by 4-tuple or by
   a source_connection_id. bind() doesn't need to care about ALPN for
   client/regular sockets either.

   So it's fine to use the QUIC_SOCKOPT_ALPN sockopt for both listening
   and regular/client sockets, as the kernel handles ALPNs differently
   for the two kinds. (Sorry for the confusion; I could have created
   another hashtable for listening sockets.)

   In other words, a listening socket is identified by:

      local_ip + local_port + ALPN(s)

   while a regular socket (which represents a QUIC connection) is
   identified by:

      local_ip + local_port + remote_ip + remote_port

   or by any of its source_connection_ids.
3. SO_REUSEPORT still applies, for load balancing between multiple
   processes listening on the same IP + PORT + ALPN, like:

   on server:
      process A: skA = listen(127.0.0.1:1234:smbd)
      process B: skB = listen(127.0.0.1:1234:smbd)
      process C: skC = listen(127.0.0.1:1234:smbd)
   on client:
      connect(127.0.0.1:1234:smbd)
      connect(127.0.0.1:1234:smbd)
      ...

   The server selects the sk among skA, skB and skC based on the
   source address + port of the client's request.
4. Not sure if multiple-ALPN support is useful to you; here are some
   examples of how it works:

   - Without SO_REUSEPORT set:
     On server:
        process A: skA = listen(127.0.0.1:1234:smbd,h3,ksmbd)
        process B: skB = listen(127.0.0.1:1234:smbd,h3,ksmbd)
     listen() in process B fails and returns EADDRINUSE.

   - With SO_REUSEPORT set:
     On server:
        process A: skA = listen(127.0.0.1:1234:smbd,h3,ksmbd)
        process B: skB = listen(127.0.0.1:1234:smbd,h3,ksmbd)
     listen() in process B works.

   - With or without SO_REUSEPORT set:
     On server:
        process A: skA = listen(127.0.0.1:1234:h3,ksmbd)
        process B: skB = listen(127.0.0.1:1234:h3,smbd)
     (The ALPN lists overlap but are not exactly the same.)
     listen() in process B fails and returns EADDRINUSE.

   - The match priority for multiple ALPNs follows the order of the
     client ALPN list:
     On server:
        process A: skA = listen(127.0.0.1:1234:smbd)
        process B: skB = listen(127.0.0.1:1234:h3)
        process C: skC = listen(127.0.0.1:1234:ksmbd)
     On client:
        process X: skX = connect(127.0.0.1:1234:h3,ksmbd,smbd)
     skB will be the one selected to accept the connection, as h3 is the
     1st ALPN on the client ALPN list 'h3,ksmbd,smbd'.
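
The listen and lookup rules in the notes above can be modelled in a few
lines of userspace Python. This is only a sketch under stated
assumptions, not the kernel implementation: ListenSock and ListenTable
are hypothetical names, "exactly the same ALPNs" is compared as a set
(whether order matters is not stated), addresses are compared literally
(no wildcard handling), both sockets are assumed to need SO_REUSEPORT
to share a key, and crc32 merely stands in for whatever the kernel uses
to pick a socket from the source address + port.

```python
import errno
from dataclasses import dataclass
from zlib import crc32


@dataclass
class ListenSock:
    name: str
    ip: str
    port: int
    alpns: tuple            # ALPNs in the order they were set
    reuseport: bool = False


class ListenTable:
    """Toy model of listening sockets keyed by ip + port + ALPN(s)."""

    def __init__(self):
        self.socks = []

    def listen(self, sk):
        """Return 0 on success, or errno.EADDRINUSE on a conflict."""
        for other in self.socks:
            if (other.ip, other.port) != (sk.ip, sk.port):
                continue
            if not set(other.alpns) & set(sk.alpns):
                continue    # disjoint ALPN lists can share ip + port
            same = set(other.alpns) == set(sk.alpns)  # assumption: set equality
            if same and other.reuseport and sk.reuseport:
                continue    # identical key is allowed with SO_REUSEPORT
            return errno.EADDRINUSE
        self.socks.append(sk)
        return 0

    def lookup(self, ip, port, client_alpns, src):
        """Pick a listener; the client's ALPN order sets the priority."""
        for alpn in client_alpns:
            group = [sk for sk in self.socks
                     if (sk.ip, sk.port) == (ip, port) and alpn in sk.alpns]
            if group:
                # Stand-in hash over source address + port (assumption).
                key = ("%s:%d" % src).encode()
                return group[crc32(key) % len(group)]
        return None


table = ListenTable()
for name, alpn in (("skA", "smbd"), ("skB", "h3"), ("skC", "ksmbd")):
    table.listen(ListenSock(name, "127.0.0.1", 1234, (alpn,)))
picked = table.lookup("127.0.0.1", 1234, ("h3", "ksmbd", "smbd"),
                      ("127.0.0.1", 40000))
print(picked.name)  # skB: h3 comes first on the client list
```

With several SO_REUSEPORT listeners on the same key, the same client
source address + port always maps to the same socket in this model,
which is the load-balancing behaviour described in note 3.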
>
> For listens on tcp you also need to specify an explicit port (at least in order
> to be useful).
>
> And it would mean that all applications would use it and not block other applications
> from using an explicit alpn.
>
> Also a module parameter for this means the administrator would have to take care
> of it, which means it might be unusable if loaded with it.
Agreed, I have already dropped this param.
>
> I hope to find some time in the next weeks to play with this.
> Should be relatively trivial to create a prototype for samba's smbd.
Sounds Cool!
Thanks.