NAT and iptables: How a Container Sees the Internet

Written by:

Igor Gorovyy
DevOps Engineer Lead & Senior Solutions Architect


In the previous part we gave the container an IP address through a bridge and a veth pair. But 10.20.0.x is a private address (RFC 1918) that no router on the public internet will route. For the container to reach the internet, two things are needed: IP forwarding and NAT.

Two steps to the internet

// In the ensureBridge() function:

// 1. Enable IP forwarding
os.WriteFile("/proc/sys/net/ipv4/ip_forward", []byte("1"), 0644)

// 2. NAT for outgoing traffic
run("iptables", "-t", "nat", "-A", "POSTROUTING",
    "-s", BridgeSubnet, "-j", "MASQUERADE")

Two lines of code. Let's break down what they do.

IP forwarding

By default, Linux doesn't forward packets between network interfaces. Writing 1 to /proc/sys/net/ipv4/ip_forward tells the kernel: "forward packets from one interface to another."

Without this, a packet from the container (10.20.0.2) would reach the bridge (sheep0) but go no further.
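To make the check concrete, here is a minimal Go sketch that reads the same file and reports whether forwarding is on. The helper name `forwardingEnabled` is an illustration and not part of the project's code; it assumes the kernel reports the setting as an integer followed by a newline, with any non-zero value meaning enabled.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
	"strings"
)

// forwardingEnabled interprets the raw contents of
// /proc/sys/net/ipv4/ip_forward. Hypothetical helper for illustration.
// The value is trimmed because the kernel appends a newline.
func forwardingEnabled(raw string) bool {
	n, err := strconv.Atoi(strings.TrimSpace(raw))
	return err == nil && n != 0
}

func main() {
	raw, err := os.ReadFile("/proc/sys/net/ipv4/ip_forward")
	if err != nil {
		// Expected on non-Linux systems, where the file does not exist.
		fmt.Println("cannot read ip_forward:", err)
		return
	}
	fmt.Println("IP forwarding enabled:", forwardingEnabled(string(raw)))
}
```

Keep in mind that a value written to /proc lasts only until the next reboot, so a check like this is useful after the host restarts.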

MASQUERADE

MASQUERADE is a type of NAT that replaces the packet's source IP with the IP of the host's outgoing interface. When a packet from the container goes out to the internet, its source IP changes from 10.20.0.2 to the host's IP (say, 192.168.1.100). The response comes back to the host, and the kernel knows it needs to forward it to the container.
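One practical wrinkle with the rule shown earlier: `-A` appends unconditionally, so every run adds another identical MASQUERADE rule. A common pattern is to ask iptables whether the rule exists with `-C` and append only if it does not. The sketch below illustrates that pattern under some assumptions: `ensureMasquerade`, `runner`, and `fakeTable` are hypothetical names, and the in-memory fake stands in for the real iptables binary so the example runs without root. A real runner could wrap `exec.Command("iptables", args...).Run()`.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// runner executes iptables with the given arguments and returns
// a non-nil error on a non-zero exit status.
type runner func(args ...string) error

// ensureMasquerade appends the MASQUERADE rule only if `iptables -C`
// reports that it is missing. It returns true when a rule was added.
func ensureMasquerade(run runner, subnet string) (bool, error) {
	rule := []string{"POSTROUTING", "-s", subnet, "-j", "MASQUERADE"}
	check := append([]string{"-t", "nat", "-C"}, rule...)
	if run(check...) == nil {
		return false, nil // rule already present
	}
	add := append([]string{"-t", "nat", "-A"}, rule...)
	if err := run(add...); err != nil {
		return false, fmt.Errorf("adding MASQUERADE rule: %w", err)
	}
	return true, nil
}

// fakeTable is an in-memory stand-in for the nat table (illustration only).
type fakeTable struct {
	rules   map[string]bool
	appends int
}

func (f *fakeTable) run(args ...string) error {
	if len(args) < 4 {
		return errors.New("too few arguments")
	}
	key := strings.Join(args[3:], " ")
	switch args[2] {
	case "-C":
		if f.rules[key] {
			return nil
		}
		return errors.New("rule not found")
	case "-A":
		f.rules[key] = true
		f.appends++
		return nil
	}
	return errors.New("unsupported operation")
}

func main() {
	table := &fakeTable{rules: map[string]bool{}}
	for i := 0; i < 2; i++ {
		added, err := ensureMasquerade(table.run, "10.20.0.0/16")
		fmt.Println("added:", added, "err:", err)
	}
	fmt.Println("rules appended:", table.appends)
}
```

Note that `-C` also exits non-zero for reasons other than a missing rule (for example, insufficient privileges); in that case the subsequent `-A` fails too and its error is returned.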

The packet's journey

sequenceDiagram
    participant C as Container<br/>10.20.0.2
    participant BR as sheep0 bridge<br/>10.20.0.1
    participant NAT as iptables NAT
    participant ETH as host eth0<br/>192.168.1.100
    participant NET as Internet

    C->>BR: src: 10.20.0.2, dst: 8.8.8.8
    BR->>NAT: IP forwarding
    NAT->>ETH: src: 192.168.1.100, dst: 8.8.8.8<br/>(MASQUERADE)
    ETH->>NET: packet to internet

    NET->>ETH: src: 8.8.8.8, dst: 192.168.1.100
    ETH->>NAT: conntrack: this is a reply for the container
    NAT->>BR: src: 8.8.8.8, dst: 10.20.0.2<br/>(de-MASQUERADE)
    BR->>C: packet delivered

The kernel tracks connections via conntrack (connection tracking). It remembers: "a packet from 10.20.0.2:12345 to 8.8.8.8:80 went out through eth0 as 192.168.1.100:54321." When a reply arrives at 192.168.1.100:54321, conntrack maps everything back.
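The bookkeeping described above can be modeled in a few lines of Go. This is a deliberately simplified toy, not how the kernel implements it: real conntrack keys entries on the full tuple (protocol, both addresses, both ports) and expires them with timeouts, while this sketch only maps a container endpoint to a host port and back. All type and function names here are made up for the illustration.

```go
package main

import "fmt"

// endpoint is an IP address and port pair.
type endpoint struct {
	IP   string
	Port int
}

// natTable is a toy model of the state conntrack keeps for MASQUERADE.
type natTable struct {
	hostIP     string
	nextPort   int
	bySource   map[endpoint]int // container endpoint -> host port
	byHostPort map[int]endpoint // host port -> container endpoint
}

func newNATTable(hostIP string, firstPort int) *natTable {
	return &natTable{
		hostIP:     hostIP,
		nextPort:   firstPort,
		bySource:   make(map[endpoint]int),
		byHostPort: make(map[int]endpoint),
	}
}

// outbound rewrites the source of an outgoing packet to the host's
// address and remembers the mapping so replies can be translated back.
func (t *natTable) outbound(src endpoint) endpoint {
	port, seen := t.bySource[src]
	if !seen {
		port = t.nextPort
		t.nextPort++
		t.bySource[src] = port
		t.byHostPort[port] = src
	}
	return endpoint{IP: t.hostIP, Port: port}
}

// inbound maps the destination of a reply back to the container.
// It reports false when no connection is tracked for that port.
func (t *natTable) inbound(dst endpoint) (endpoint, bool) {
	if dst.IP != t.hostIP {
		return endpoint{}, false
	}
	orig, ok := t.byHostPort[dst.Port]
	return orig, ok
}

func main() {
	nat := newNATTable("192.168.1.100", 54321)
	out := nat.outbound(endpoint{IP: "10.20.0.2", Port: 12345})
	fmt.Printf("outbound source rewritten to %s:%d\n", out.IP, out.Port)
	if orig, ok := nat.inbound(out); ok {
		fmt.Printf("reply forwarded to %s:%d\n", orig.IP, orig.Port)
	}
}
```

The second lookup is the important part: a packet arriving on a host port with no tracked connection has nowhere to go, which is why unsolicited traffic from the internet does not reach the container.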

Why MASQUERADE instead of SNAT?

SNAT (Source NAT) requires a specific IP: -j SNAT --to-source 192.168.1.100. MASQUERADE automatically picks the IP of the outgoing interface. This is more convenient when the host IP can change (DHCP, cloud instances).

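But there's a trade-off: MASQUERADE has to look up the outgoing interface's current IP for each new connection, and the kernel forgets the tracked connections when that interface goes down. Packets of established connections are handled by conntrack either way, so the extra cost is per connection, not per packet. With a static host IP, SNAT avoids that lookup and is the usual choice for production. For a learning project, MASQUERADE is more convenient.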
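The difference between the two targets comes down to the tail of the iptables command. As a sketch, a small helper could pick the target based on whether a static IP is configured. `natRuleArgs` is a hypothetical function, not part of the project, and the subnet and address are the example values used in this article.

```go
package main

import (
	"fmt"
	"net"
	"strings"
)

// natRuleArgs builds iptables arguments for source NAT on the subnet.
// An empty staticIP selects MASQUERADE; otherwise SNAT with
// --to-source is used. Hypothetical helper for illustration.
func natRuleArgs(subnet, staticIP string) ([]string, error) {
	if _, _, err := net.ParseCIDR(subnet); err != nil {
		return nil, fmt.Errorf("invalid subnet %q: %w", subnet, err)
	}
	args := []string{"-t", "nat", "-A", "POSTROUTING", "-s", subnet}
	if staticIP == "" {
		return append(args, "-j", "MASQUERADE"), nil
	}
	if net.ParseIP(staticIP) == nil {
		return nil, fmt.Errorf("invalid source IP %q", staticIP)
	}
	return append(args, "-j", "SNAT", "--to-source", staticIP), nil
}

func main() {
	for _, ip := range []string{"", "192.168.1.100"} {
		args, err := natRuleArgs("10.20.0.0/16", ip)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Println("iptables", strings.Join(args, " "))
	}
}
```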

Default route

In the previous part we set up the route inside the container:

nsRun(pid, "ip", "route", "add", "default", "via", BridgeGateway)

This tells the container: "anything not on the local network, send to 10.20.0.1 (the bridge)." From there the host kernel takes over: it forwards the packet and applies the MASQUERADE rule on the way out.

Container-to-container communication

Containers in the same subnet (10.20.0.0/16) can talk directly through the bridge. Traffic from 10.20.0.2 to 10.20.0.3 never leaves the bridge, because the bridge works like a switch -- it forwards Ethernet frames between connected ports.

graph LR
    C1["Container 1<br/>10.20.0.2"] --- BR["sheep0 bridge"]
    C2["Container 2<br/>10.20.0.3"] --- BR
    C1 -.->|"direct communication<br/>through bridge"| C2
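The two cases, default route versus direct delivery on the bridge, come down to one question the container's routing table answers for every packet: is the destination inside my own subnet? Here is a rough Go model of that decision. `nextHop` is a hypothetical function for illustration; the real lookup happens in the kernel's routing code, and the addresses are this article's example values.

```go
package main

import (
	"fmt"
	"net"
)

// nextHop models the container's route lookup: destinations inside
// the subnet are reached directly over the bridge, everything else
// goes to the default gateway. Hypothetical helper for illustration.
func nextHop(subnet, gateway, dst string) (string, error) {
	_, ipnet, err := net.ParseCIDR(subnet)
	if err != nil {
		return "", fmt.Errorf("invalid subnet %q: %w", subnet, err)
	}
	ip := net.ParseIP(dst)
	if ip == nil {
		return "", fmt.Errorf("invalid destination %q", dst)
	}
	if ipnet.Contains(ip) {
		return dst, nil // on-link: deliver directly, no gateway involved
	}
	return gateway, nil // off-link: hand the packet to the default gateway
}

func main() {
	for _, dst := range []string{"10.20.0.3", "8.8.8.8"} {
		hop, err := nextHop("10.20.0.0/16", "10.20.0.1", dst)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("packet to %s goes to %s\n", dst, hop)
	}
}
```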

What's not implemented

Docker has more:
- Port mapping (-p 8080:80) via iptables DNAT
- Multiple networks (bridge networks) for isolation
- DNS resolution between containers by name
- Network policies

Our network is simple: one bridge, one subnet, NAT for internet access. Containers see each other by IP. That's enough to understand how container networking works.

Try it yourself

# Check iptables NAT rules:
sudo iptables -t nat -L POSTROUTING -n
# Check ip_forward:
cat /proc/sys/net/ipv4/ip_forward
# From inside a container, try ping:
ping -c 1 8.8.8.8

Networking works. Next up -- Image Management: how a tar archive becomes a container's filesystem.
