From: YangYuxi <yx.atom1@gmail.com>
To: wensong@linux-vs.org, horms@verge.net.au, ja@ssi.bg,
pablo@netfilter.org, kadlec@netfilter.org, fw@strlen.de,
davem@davemloft.net, kuba@kernel.org
Cc: netdev@vger.kernel.org, lvs-devel@vger.kernel.org,
netfilter-devel@vger.kernel.org, coreteam@netfilter.org,
linux-kernel@vger.kernel.org, yx.atom1@gmail.com
Subject: [PATCH] ipvs: avoid drop first packet to reuse conntrack
Date: Thu, 11 Jun 2020 18:01:35 +0800 [thread overview]
Message-ID: <20200611100135.GA19243@VM_111_229_centos> (raw)
Since 'commit f719e3754ee2 ("ipvs: drop first packet to
redirect conntrack")', when a new TCP connection meet
the conditions that need reschedule, the first syn packet
is dropped, this cause one second latency for the new
connection, more discussion about this problem can easy
search from google, such as:
1)One second connection delay in masque
https://marc.info/?t=151683118100004&r=1&w=2
2)IPVS low throughput #70747
https://github.com/kubernetes/kubernetes/issues/70747
3)Apache Bench can fill up ipvs service proxy in seconds #544
https://github.com/cloudnativelabs/kube-router/issues/544
4)Additional 1s latency in `host -> service IP -> pod`
https://github.com/kubernetes/kubernetes/issues/90854
The root cause is when the old session is expired, the
conntrack related to the session is dropped by
ip_vs_conn_drop_conntrack. The code is as follows:
```
static void ip_vs_conn_expire(struct timer_list *t)
{
...
if ((cp->flags & IP_VS_CONN_F_NFCT) &&
!(cp->flags & IP_VS_CONN_F_ONE_PACKET)) {
/* Do not access conntracks during subsys cleanup
* because nf_conntrack_find_get can not be used after
* conntrack cleanup for the net.
*/
smp_rmb();
if (ipvs->enable)
ip_vs_conn_drop_conntrack(cp);
}
...
}
```
As the code show, only if the condition (cp->flags & IP_VS_CONN_F_NFCT)
is true, ip_vs_conn_drop_conntrack will be called.
So we solve this bug by following steps:
1) erase the IP_VS_CONN_F_NFCT flag (it is safely because no packets will
use the old session)
2) call ip_vs_conn_expire_now to release the old session, then the related
conntrack will not be dropped
3) then ipvs unnecessary to drop the first syn packet,
it just continue to pass the syn packet to the next process,
create a new ipvs session, and the new session will related to
the old conntrack(which is reopened by conntrack as a new one),
the next whole things is just as normal as that the old session
isn't used to exist.
This patch has been verified on our thousands of kubernets node servers on Tencent Inc.
Signed-off-by: YangYuxi <yx.atom1@gmail.com>
---
net/netfilter/ipvs/ip_vs_core.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/net/netfilter/ipvs/ip_vs_core.c b/net/netfilter/ipvs/ip_vs_core.c
index aa6a603a2425..2f750145172f 100644
--- a/net/netfilter/ipvs/ip_vs_core.c
+++ b/net/netfilter/ipvs/ip_vs_core.c
@@ -2086,11 +2086,11 @@ static int ip_vs_in_icmp_v6(struct netns_ipvs *ipvs, struct sk_buff *skb,
}
if (resched) {
+ if (uses_ct)
+ cp->flags &= ~IP_VS_CONN_F_NFCT;
if (!atomic_read(&cp->n_control))
ip_vs_conn_expire_now(cp);
__ip_vs_conn_put(cp);
- if (uses_ct)
- return NF_DROP;
cp = NULL;
}
}
--
1.8.3.1
next reply other threads:[~2020-06-11 10:01 UTC|newest]
Thread overview: 4+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-06-11 10:01 YangYuxi [this message]
-- strict thread matches above, loose matches on Subject: below --
2020-06-12 13:59 [PATCH] ipvs: avoid drop first packet to reuse conntrack YangYuxi
2020-06-11 9:28 YangYuxi
2020-06-11 18:27 ` Julian Anastasov
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20200611100135.GA19243@VM_111_229_centos \
--to=yx.atom1@gmail.com \
--cc=coreteam@netfilter.org \
--cc=davem@davemloft.net \
--cc=fw@strlen.de \
--cc=horms@verge.net.au \
--cc=ja@ssi.bg \
--cc=kadlec@netfilter.org \
--cc=kuba@kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=lvs-devel@vger.kernel.org \
--cc=netdev@vger.kernel.org \
--cc=netfilter-devel@vger.kernel.org \
--cc=pablo@netfilter.org \
--cc=wensong@linux-vs.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).