From: "Gabriel L. Somlo" <gsomlo@gmail.com>
To: netfilter@vger.kernel.org
Subject: Re: Help/Advice with Ethernet NAT or "hub-mode" bridge
Date: Sat, 1 Apr 2023 14:59:32 -0400
Message-ID: <ZCh/FHuJ/AJ8Onxz@errol.ini.cmu.edu>
In-Reply-To: <ZCdWKnTTy+0owQZr@errol.ini.cmu.edu>

Here's my current (working!) solution, but I feel I shouldn't have to
jump through *this* many hoops (see below) to make it work. There should
be an easier, less painful way to pull it off! :)

On Fri, Mar 31, 2023 at 05:52:44PM -0400, Gabriel L. Somlo wrote:
> I have several VMs networked together on a cloud-based hypervisor
> solution, where the "vswitch" connecting the VMs enforces a strict
> "one MAC per VM network interface" policy.
>
> Typically, one of the VMs has no problem being the "default gateway"
> on such a "vswitch", serving all other VMs connected to the same
> virtualized "LAN" switch.
>
> In my case, the default gateway is inside a container running inside
> a network simulator on one of the VMs (many containers in that simulation
> are used to connect groups of VMs on this "router's" several interfaces
> across a simulated multi-hop "internet").
>
> The trouble is, if I use the simulator VM's interfaces as bridge ports
> into the simulation, the container-as-default gateway will have its
> traffic dropped by the vswitch outside its host VM. Here's an ASCII
> picture of the setup:
>
> -----------------------------
>   VM running simulation     |
>                             |
> sim. node,                  |
> (container),                |
> dflt gateway                |
> -----------     - br0 -     |             -----------------
>           |    /       \    |  inter-VM   |  External VM  |
>      eth0 + veth0     ens32 +-- vswitch --+ using in-sim  |
>   Sim.MAC |          VM.MAC |             | dflt. gateway |
> -----------                 |             -----------------
> -----------------------------
>
> IOW, the "inter-VM vswitch" only allows <VM.MAC> ethernet frames
> from/to the VM running the simulation.

#1. On the simulator VM, create a veth pair (`vi0` facing the container):
ip link add vi0 type veth peer name vo0
#2. create a bridge between "outward" facing `vo0` and `ens32`:
ip link add br0 type bridge
ip link set vo0 master br0
ip link set ens32 master br0
#3. bring up the "outward" facing bridge and its ports:
ip link set dev br0 up
ip link set dev vo0 up
ip link set dev ens32 up
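
Steps 1-3 can be wrapped so they are easy to re-run and undo. This is a
minimal sketch, assuming the interface names used above; the `run` helper,
the `APPLY` variable, and both function names are hypothetical and not part
of the original recipe. By default it only prints the commands; set
`APPLY=1` (as root) to execute them.

```shell
#!/bin/sh
# Sketch only: wraps steps 1-3. Helper and variable names are illustrative.
# Prints each command by default; set APPLY=1 (as root) to execute it.
run() {
    if [ "${APPLY:-0}" = "1" ]; then
        "$@"
    else
        echo "$*"
    fi
}

# $1 = outward-facing VM interface (e.g. ens32)
setup_outer_bridge() {
    ext_if=$1
    run ip link add vi0 type veth peer name vo0   # step 1
    run ip link add br0 type bridge               # step 2
    run ip link set vo0 master br0
    run ip link set "$ext_if" master br0
    run ip link set dev br0 up                    # step 3
    run ip link set dev vo0 up
    run ip link set dev "$ext_if" up
}

# Undo the above; deleting one end of a veth pair removes its peer too.
teardown_outer_bridge() {
    ext_if=$1
    run ip link set "$ext_if" nomaster
    run ip link del vi0
    run ip link del br0
}

setup_outer_bridge ens32
```

Printing first makes it possible to review the exact command sequence
before touching the VM's only outward-facing interface.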
#4. assign `vi0` as the "bridge" interface in the Net.Sim. (e.g., gns3
# or CORE network simulators).
#5. after Net.Sim. starts, we have a situation like the following:
---------------------------------------------
|-------------     bXYZ           br0       |               ---------
|| container |    /    \         /    \     |               | other |
||      eth0 + vethXYZ vi0 --- vo0    ens32 + -- vswitch -- + guest |
||           |                      Pub.MAC |               | VM(s) |
|-------------                              |               ---------
| < controlled by Net.Sim.> | <manual conf> |
|                                           |
|               Simulator VM                |
---------------------------------------------
#6. Set up "double MAC NAT" allowing container `eth0` to use `Pub.MAC`:
ebtables -t nat -F
ebtables -t nat -A PREROUTING -i ens32 -d <Pub.MAC> \
-j dnat --to-destination de:ad:be:ef:00:01
ebtables -t nat -A POSTROUTING -o ens32 -s de:ad:be:ef:00:01 \
-j snat --to-source <Pub.MAC>
ebtables -t nat -A PREROUTING -i vi0 -d de:ad:be:ef:00:01 \
-j dnat --to-destination <Pub.MAC>
ebtables -t nat -A POSTROUTING -o vi0 -s <Pub.MAC> \
-j snat --to-source de:ad:be:ef:00:01
# NOTE: If traffic arrives on a bridge with a destination MAC belonging
# to one of its own ports (a "permanent" FDB entry), it will not
# be forwarded. Therefore `de:ad:be:ef:00:01` is substituted for
# <Pub.MAC> on the `vi0` <--> `vo0` link, and NAT-ed back to the
# real <Pub.MAC> after the two bridges have been "tricked" into
# forwarding the frame!
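
To see which addresses a bridge treats as its own, the `bridge fdb show`
listing can be filtered for permanent entries. This is a minimal sketch,
assuming iproute2's usual one-entry-per-line format
(`<mac> dev <port> ... master <bridge> permanent`); the function name
`permanent_macs` is hypothetical. It reads the listing on stdin, so the
filter itself can be tried without root.

```shell
#!/bin/sh
# Sketch: print "<mac> <port>" for permanent FDB entries on bridge ports.
# Usage (as root): bridge fdb show br br0 | permanent_macs
permanent_macs() {
    awk '
        # keep entries flagged "permanent" that belong to a bridge ("master")
        / permanent/ && / master / && $2 == "dev" { print $1, $3 }
    '
}
```

If <Pub.MAC> shows up in that list for `ens32`, frames addressed to it are
not forwarded by that bridge, which is the behaviour the NOTE above works
around.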
#7. Set <Pub.MAC> as the mac address of the container's `eth0`:
ip link set dev eth0 down
ip link set dev eth0 address <Pub.MAC>
ip link set dev eth0 up
#8. Restart dhcp inside the container, and we're good to go!
# The Net.Sim. can have multiple containers assigned to multiple ens*
# interfaces, with multiple "enclaves" connected to different
# vswitches. Each "enclave" vswitch will see the simulator VM
# communicate using its assigned MAC address, but that traffic will
# actually originate from each respective "passed-through" container.
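
For several enclaves, the four rules from step 6 can be generated per
interface pair. This is a minimal sketch: the function name
`mac_nat_rules`, the `ens33`/`vi1` names, and every MAC other than
`de:ad:be:ef:00:01` are hypothetical stand-ins, and using a separate
placeholder MAC per enclave is an assumption rather than something tested
here. It only prints the commands, so the output can be reviewed before
being piped to `sh` as root.

```shell
#!/bin/sh
# Sketch: print the four "double MAC NAT" rules from step 6 for one enclave.
# $1 = outward VM interface, $2 = inner veth end (facing the Net.Sim.),
# $3 = public MAC, $4 = placeholder MAC used only on the vi <--> vo link.
mac_nat_rules() {
    ext_if=$1 in_if=$2 pub_mac=$3 tmp_mac=$4
    cat <<EOF
ebtables -t nat -A PREROUTING -i $ext_if -d $pub_mac -j dnat --to-destination $tmp_mac
ebtables -t nat -A POSTROUTING -o $ext_if -s $tmp_mac -j snat --to-source $pub_mac
ebtables -t nat -A PREROUTING -i $in_if -d $tmp_mac -j dnat --to-destination $pub_mac
ebtables -t nat -A POSTROUTING -o $in_if -s $pub_mac -j snat --to-source $tmp_mac
EOF
}

# Flush once, then emit one rule set per enclave.
echo "ebtables -t nat -F"
# First enclave: interface names from the steps above; the public MAC is a stand-in.
mac_nat_rules ens32 vi0 02:00:00:00:00:32 de:ad:be:ef:00:01
# Second enclave: hypothetical names and addresses.
mac_nat_rules ens33 vi1 02:00:00:00:00:33 de:ad:be:ef:00:02
```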

Anyway, once I realized that:

  - a single bridge refuses to forward frames destined to
    addresses present as "permanent" in its own fdb,
  - snat is only available in POSTROUTING,
  - dnat is only available in PREROUTING,

I decided to add an extra bridge hop and translate <Pub.MAC> back and
forth, to allow the inner container `eth0` to also use it, thus
solving the issue of ARP packets having mismatched "inner" and "outer"
mac addresses for the default gateway :)
If anyone else knows of a way to further "dumb down" a bridge to the
point where it can be convinced to ignore its "permanent" fdb entries
when making a forwarding decision, I can further simplify this setup.
Thanks much,
--Gabriel
PS. Figured I'd post my current solution in case anyone else ends up
looking for a neat workaround to a problem similar to mine, assuming
nothing cleaner and simpler becomes known or available :)