netfilter.vger.kernel.org archive mirror
From: Alberto <alberto@bersol.info>
To: netfilter@vger.kernel.org
Subject: Docker NFT rules conflict
Date: Tue, 2 Apr 2024 11:06:03 +0200
Message-ID: <62b7f69a-918d-b9c1-135a-a2052465ced4@bersol.info>
In-Reply-To: <dfe78721-55ec-c254-db22-20494b4f16e0@bersol.info>

Hi,
I have a Debian host that acts as firewall and gateway for my LAN, and 
it also runs some Docker containers.

I can see that Docker installs some nftables rules for its normal 
operation, such as these:
--------------------------------------------------------
table ip nat {
          chain DOCKER {
                  iifname "br-fc93beb65b60" counter packets 0 bytes 0 return
                  iifname "docker0" counter packets 0 bytes 0 return
                  iifname != "br-fc93beb65b60" tcp dport 3306 counter packets 0 bytes 0 dnat to 172.22.0.33:3306
                  ...
          }
          chain POSTROUTING {
                  type nat hook postrouting priority srcnat; policy accept;
                  oifname != "br-fc93beb65b60" ip saddr 172.22.0.0/24 counter packets 778 bytes 63705 masquerade
                  oifname != "docker0" ip saddr 172.17.0.0/16 counter packets 0 bytes 0 masquerade
                  ip saddr 172.22.0.33 ip daddr 172.22.0.33 tcp dport 3306 counter packets 0 bytes 0 masquerade
                  ...
          }
          chain PREROUTING {
                  type nat hook prerouting priority dstnat; policy accept;
                  fib daddr type local counter packets 53594 bytes 3220837 jump DOCKER
          }
          chain OUTPUT {
                  type nat hook output priority -100; policy accept;
                  ip daddr != 127.0.0.0/8 fib daddr type local counter packets 0 bytes 0 jump DOCKER
          }
}
table ip filter {
          chain DOCKER {
                  iifname != "br-fc93beb65b60" oifname "br-fc93beb65b60" ip daddr 172.22.0.33 tcp dport 3306 counter packets 0 bytes 0 accept
                  ...
          }
          chain DOCKER-ISOLATION-STAGE-1 {
                  iifname "br-fc93beb65b60" oifname != "br-fc93beb65b60" counter packets 284122 bytes 276578066 jump DOCKER-ISOLATION-STAGE-2
                  iifname "docker0" oifname != "docker0" counter packets 0 bytes 0 jump DOCKER-ISOLATION-STAGE-2
                  counter packets 596380 bytes 316649936 return
          }
          chain DOCKER-ISOLATION-STAGE-2 {
                  oifname "br-fc93beb65b60" counter packets 0 bytes 0 drop
                  oifname "docker0" counter packets 0 bytes 0 drop
                  counter packets 284122 bytes 276578066 return
          }
          chain FORWARD {
                  type filter hook forward priority filter; policy drop;
                  counter packets 596379 bytes 316649852 jump DOCKER-USER
                  counter packets 596379 bytes 316649852 jump DOCKER-ISOLATION-STAGE-1
                  oifname "br-fc93beb65b60" ct state related,established counter packets 245582 bytes 35730696 accept
                  oifname "br-fc93beb65b60" counter packets 54022 bytes 3255121 jump DOCKER
                  iifname "br-fc93beb65b60" oifname != "br-fc93beb65b60" counter packets 284122 bytes 276578066 accept
                  iifname "br-fc93beb65b60" oifname "br-fc93beb65b60" counter packets 385 bytes 25909 accept
                  oifname "docker0" ct state related,established counter packets 0 bytes 0 accept
                  oifname "docker0" counter packets 0 bytes 0 jump DOCKER
                  iifname "docker0" oifname != "docker0" counter packets 0 bytes 0 accept
                  iifname "docker0" oifname "docker0" counter packets 0 bytes 0 accept
          }
          chain DOCKER-USER {
                  counter packets 596380 bytes 316649936 return
          }
}
--------------------------------------------------------

I suppose these rules handle NAT and filtering for the Docker 
containers. But I also have my own NAT and filtering rules for the 
devices on my LAN, for example:
--------------------------------------------------------
table ip alb-nat {
          chain PREROUTING {
                  type nat hook prerouting priority 30; policy accept;
          }
          chain POSTROUTING {
                  type nat hook postrouting priority 30; policy accept;
                  oifname "eth0" ip saddr 192.168.9.0/24 masquerade
          }
}
table inet alb-fw {
          chain BASE_CHECKS {
                  ct state established,related,new accept
                  ct state invalid drop
          }
          chain INPUT {
                  type filter hook input priority filter + 10; policy drop;
                  jump BASE_CHECKS
                  iifname "lo" accept
                  iifname "br0" ip saddr 192.168.9.0/24 counter packets 0 bytes 0 accept
                  log prefix "[NFTABLES] Denied " flags all
          }
          chain FORWARD {
                  type filter hook forward priority filter + 10; policy accept;
                  jump BASE_CHECKS
                  iifname "br0" oifname "eth0" meta l4proto { tcp, udp } ip saddr 192.168.9.0/24 accept
          }
          chain OUTPUT {
                  type filter hook output priority filter + 10; policy accept;
                  jump BASE_CHECKS
          }
}
--------------------------------------------------------
I can see that the nftables systemd service has a flush rule, but only in the stop phase...

[Service]
Type=oneshot
...
ExecStop=/usr/sbin/nft flush ruleset

The Docker rules and mine are also in different tables, so I guess they 
shouldn't collide. However, when the system boots, I can see all the 
rules active, but my laptop (a LAN device that gets its DHCP lease and 
Internet access through the Debian host) cannot ping the outside:

$ ping www.google.com
PING www.google.com (172.217.168.164) 56(84) bytes of data.

If I restart the nftables service...

# systemctl restart nftables

the Docker tables disappear, and I can ping the outside:
$ ping www.google.com
PING www.google.com (172.217.168.164) 56(84) bytes of data.
64 bytes from mad07s10-in-f4.1e100.net (172.217.168.164): icmp_seq=1 ttl=115 time=10.4
64 bytes from mad07s10-in-f4.1e100.net (172.217.168.164): icmp_seq=2 ttl=115 time=10.3
...
If I then restart the Docker service, the Docker tables reappear, and the ping keeps working.
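
A note on the "different tables" guess: in nftables, separate tables do 
not isolate traffic from each other. Every base chain attached to the 
same hook evaluates the packet in priority order. A drop in any of them 
is final, while an accept only ends that one chain. In the listings 
above, both FORWARD chains sit on the forward hook, and the one in table 
ip filter has policy drop. Whether that is what blocks the laptop here 
is an assumption, not a confirmed diagnosis. The helper below 
(forward_chains is a hypothetical name) is a small sketch that prints 
each forward-hook base chain with its table and policy from 
`nft list ruleset` output, which makes such overlaps easier to spot:

```shell
#!/bin/sh
# Sketch: list every base chain attached to the forward hook.
# Reads `nft list ruleset` style text on stdin, so it also works on a saved dump.
# forward_chains is a hypothetical helper name, not an nft feature.
forward_chains() {
    awk '
        $1 == "table" { table = $2 " " $3 }   # e.g. "ip filter", "inet alb-fw"
        $1 == "chain" { chain = $2 }          # remember the chain being parsed
        /hook forward/ {
            line = $0
            sub(/^[ \t]+/, "", line)          # strip the listing indentation
            print table " " chain ": " line
        }
    '
}

# Usage (as root):
#   nft list ruleset | forward_chains
```

For the two listings in this message it should print one line for 
`ip filter FORWARD` (policy drop) and one for `inet alb-fw FORWARD` 
(policy accept).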

I'm trying to make docker.service a dependency of nftables.service, so 
that Docker starts before nftables, like this:

# cat /lib/systemd/system/nftables.service
[Unit]
Description=nftables
Documentation=man:nft(8) http://wiki.nftables.org
Wants=network-pre.target docker.service
Before=network-pre.target shutdown.target
Conflicts=shutdown.target
DefaultDependencies=no

[Service]
Type=oneshot
RemainAfterExit=yes
StandardInput=null
ProtectSystem=full
ProtectHome=true
ExecStart=/usr/sbin/nft -f /etc/nftables.conf
ExecReload=/usr/sbin/nft -f /etc/nftables.conf
ExecStop=/usr/sbin/nft flush ruleset

but it's not working.
I don't know what the best solution for this is.
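
One direction that might be worth testing, sketched here under 
assumptions: Docker's documentation describes the DOCKER-USER chain 
(visible in the listing above) as the place for user rules that are 
evaluated before Docker's own FORWARD rules. The sketch assumes Docker 
on this host manages its rules through iptables (the iptables-nft 
backend), and that accepting LAN traffic there is enough to get past 
the policy drop. The names allow_lan_forward, run and DRY_RUN are 
hypothetical helpers; br0, eth0 and 192.168.9.0/24 are taken from the 
rules above.

```shell
#!/bin/sh
# Sketch: accept LAN forwarding in DOCKER-USER, ahead of Docker's FORWARD rules.
# allow_lan_forward, run and DRY_RUN are hypothetical, not part of Docker or nftables.

run() {
    # With DRY_RUN set, print the command instead of executing it.
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

allow_lan_forward() {
    lan_if=$1
    wan_if=$2
    lan_net=$3
    # Replies coming back from the WAN side to the LAN.
    run iptables -I DOCKER-USER -i "$wan_if" -o "$lan_if" -d "$lan_net" \
        -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT
    # New connections from the LAN out through the WAN interface.
    run iptables -I DOCKER-USER -i "$lan_if" -o "$wan_if" -s "$lan_net" -j ACCEPT
}

# Usage (as root), with the interface names and subnet from this message:
#   allow_lan_forward br0 eth0 192.168.9.0/24
# Preview without changing anything:
#   DRY_RUN=1 allow_lan_forward br0 eth0 192.168.9.0/24
```

Packets accepted in DOCKER-USER still pass through the FORWARD chain of 
table inet alb-fw afterwards, so the existing LAN rules keep applying. 
A later `nft flush ruleset` would remove these rules together with 
Docker's tables.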

Best Regards,
Alberto

