From: "Gabriel L. Somlo" <gsomlo@gmail.com>
To: netfilter@vger.kernel.org
Subject: Help/Advice with Ethernet NAT or "hub-mode" bridge
Date: Fri, 31 Mar 2023 17:52:42 -0400
Message-ID: <ZCdWKnTTy+0owQZr@errol.ini.cmu.edu>

Hi,

I have several VMs networked together on a cloud-based hypervisor
solution, where the "vswitch" connecting the VMs enforces a strict
"one MAC per VM network interface" policy.

Typically, one of the VMs has no problem being the "default gateway"
on such a "vswitch", serving all other VMs connected to the same
virtualized "LAN" switch.

In my case, the default gateway is inside a container running inside
a network simulator on one of the VMs (many containers in that simulation
are used to connect groups of VMs on this "router's" several interfaces
across a simulated multi-hop "internet").

The trouble is, if I use the simulator VM's interfaces as bridge ports
into the simulation, the container acting as default gateway will have
its traffic dropped by the vswitch outside its host VM. Here's an ASCII
picture of the setup:

-----------------------------
VM running simulation       |
                            |
sim. node,                  |
(container),                |
dflt gateway                |
-----------    - br0 -      |             -----------------
          |   /       \     |  inter-VM   | External VM   |
     eth0 + veth0    ens32  +-- vswitch --+ using in-sim  |
  Sim.MAC |          VM.MAC |             | dflt. gateway |
-----------                 |             -----------------
-----------------------------

IOW, the "inter-VM vswitch" only allows <VM.MAC> ethernet frames
from/to the VM running the simulation.

I've been trying two different approaches:

1. assign VM.MAC to eth0 inside the container, overwriting Sim.MAC
   (e.g., using `ip link set dev eth0 address <VM.MAC>` inside the
   container).

   I find that when I do that, `br0` will drop external incoming
   frames to <VM.MAC> rather than forward them through `veth0`, and
   that I can't find a way to force br0 to forward everything without
   considering its permanent fdb entries.

   If I could force br0 to act more like a hub (forward everything
   ignoring the fdb, learn nothing, ever), I could get frames to
   successfully travel between my container's eth0 and the external
   VMs trying to use it as the default gateway. The frames would
   have ens32's VM.MAC, which would satisfy the restrictive hypervisor
   and vswitch policies.

2. use ebtables to NAT between ens32's VM.MAC and the container's
   eth0's Sim.MAC:

     ebtables -t nat -A PREROUTING \
           -i ens32 -d <VM.MAC> -j dnat --to-destination <Sim.MAC>

     ebtables -t nat -A POSTROUTING \
           -o ens32 -s <Sim.MAC> -j snat --to-source <VM.MAC>

   This will get frames to successfully cross the bridge with the right
   MAC addresses in the Ethernet headers, but breaks ARP:

     - when the container replies to ARP requests from external VMs, the
       *payload* (inner) MAC address is still Sim.MAC, even though
       the Ethernet frame (outer) source MAC address has been rewritten
       to VM.MAC.
       The ebtables man page seems to indicate that the arpreply
       extension might take care of this, but so far I've failed to
       get external ARP requests dropped by adding such a rule;
       the external VMs still somehow obtain Sim.MAC as their default
       gateway's MAC address, and things don't work.

     - when the container itself sends out ARP requests for external VMs'
       MAC addresses, it places its own Sim.MAC in the inner source MAC
       field.

   Would this be a situation in which I could (or should) use the
   NFQUEUE target to "edit" packets myself in userspace?

   There seems to be no NFQUEUE support in ebtables, unlike iptables.
   Is that right, or am I missing something?

   Is there any other way to dynamically "fix up" ARP to match the changes
   made to the "outer" (Ethernet header) MAC addresses?
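
To make the ARP fix-up question concrete: the rewrite would have to touch
both the Ethernet header and the ARP payload. What follows is only a
minimal Python sketch of that byte-level rewrite, not a working NFQUEUE
or ebtables hook. The names mac_bytes, build_arp and rewrite_mac are made
up for illustration, and the sketch assumes untagged Ethernet frames
carrying Ethernet/IPv4 ARP (no VLAN tags).

```python
import struct

ETH_HDR_LEN = 14          # dst MAC (6) + src MAC (6) + ethertype (2)
ETHERTYPE_ARP = 0x0806
ARP_LEN = 28              # size of an Ethernet/IPv4 ARP payload
ARP_SHA_OFF = 8           # sender hardware address offset within ARP
ARP_THA_OFF = 18          # target hardware address offset within ARP


def mac_bytes(text):
    """Convert 'aa:bb:cc:dd:ee:ff' into 6 raw bytes."""
    return bytes.fromhex(text.replace(":", ""))


def build_arp(dst, src, oper, sha, spa, tha, tpa):
    """Build an untagged Ethernet frame carrying an Ethernet/IPv4 ARP packet."""
    eth = dst + src + struct.pack("!H", ETHERTYPE_ARP)
    arp = struct.pack("!HHBBH", 1, 0x0800, 6, 4, oper) + sha + spa + tha + tpa
    return eth + arp


def rewrite_mac(frame, old, new, outbound):
    """Return a copy of frame with MAC `old` replaced by `new`.

    outbound=True : rewrite the outer source MAC and, for ARP, the inner
                    sender hardware address (Sim.MAC -> VM.MAC direction).
    outbound=False: rewrite the outer destination MAC and, for ARP, the
                    inner target hardware address (VM.MAC -> Sim.MAC).
    """
    buf = bytearray(frame)

    # Outer (Ethernet header) address: same job as the ebtables snat/dnat rules.
    eth_off = 6 if outbound else 0
    if buf[eth_off:eth_off + 6] == old:
        buf[eth_off:eth_off + 6] = new

    # Inner (ARP payload) address: the part the ebtables rules leave untouched.
    (ethertype,) = struct.unpack_from("!H", buf, 12)
    if ethertype == ETHERTYPE_ARP and len(buf) >= ETH_HDR_LEN + ARP_LEN:
        htype, ptype, hlen, plen = struct.unpack_from("!HHBB", buf, ETH_HDR_LEN)
        if (htype, ptype, hlen, plen) == (1, 0x0800, 6, 4):
            arp_off = ETH_HDR_LEN + (ARP_SHA_OFF if outbound else ARP_THA_OFF)
            if buf[arp_off:arp_off + 6] == old:
                buf[arp_off:arp_off + 6] = new

    return bytes(buf)


if __name__ == "__main__":
    # Placeholder addresses standing in for <Sim.MAC>, <VM.MAC> and a peer VM.
    sim = mac_bytes("02:00:00:00:00:01")
    vm = mac_bytes("02:00:00:00:00:02")
    peer = mac_bytes("02:00:00:00:00:03")
    reply = build_arp(peer, sim, 2, sim, bytes([10, 0, 0, 1]),
                      peer, bytes([10, 0, 0, 2]))
    fixed = rewrite_mac(reply, sim, vm, outbound=True)
    print(fixed[6:12].hex(":"), fixed[22:28].hex(":"))
```

Frames leaving through ens32 would be passed with outbound=True (old is
Sim.MAC, new is VM.MAC); frames arriving on ens32 would be passed with
outbound=False and the two addresses swapped. How such a function could
be hooked into the bridge path is exactly the open question above.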

I've been advised to use a layer-2 VPN solution, but that would break
"realism" for the external client VMs, and, besides, I'm trying to avoid
imposing restrictions and requirements on them, since they're independently
developed and operated, and a "transparent" solution where the default
gateway is on the magic "router" VM, period, would be a huge usability
win.

Any ideas on what I'm missing, doing wrong, or should otherwise be looking
into would be much appreciated!

Thanks,
--Gabriel
