Module 18: Container Networking
Containers have transformed how we deploy applications, but they bring unique networking challenges. When you run an application directly on a server, networking is straightforward: the app binds to a port, the server has an IP, and traffic flows in. With containers, you have dozens or hundreds of isolated processes sharing a single host, each needing its own network identity, its own IP address, and the ability to discover and talk to other containers, potentially on different hosts. This module covers Docker networking, Kubernetes networking, and service mesh architectures, focusing on the mental models that make container networking predictable rather than magical.
Difficulty: Intermediate to Advanced
Prerequisites: Module 10 (NAT), Module 13 (Load Balancing)
18.1 Container Networking Fundamentals
The Challenge
Think of a traditional server as a house with one family. The house has one address, one mailbox, and one phone line. Now imagine turning that house into an apartment building with 50 units. Each apartment needs its own mailbox, its own doorbell, and the ability to receive deliveries independently, all while sharing the same physical building and street address. That is the container networking problem. Each container needs:
- Its own network namespace (isolated network stack: its own “apartment” with its own routing table, interfaces, and IP address)
- A unique IP address (its own “unit number”)
- Ability to communicate with other containers (the “intercom system” between apartments)
- (Sometimes) Access to the outside world (a shared “front door” to the street)
Linux Network Namespaces
Containers use Linux network namespaces for isolation. A network namespace is a complete, independent copy of the network stack: its own interfaces, routing tables, iptables rules, and even its own localhost. When a process inside a container calls bind(0.0.0.0:80), it binds to port 80 inside its namespace, not on the host. This is why multiple containers can all listen on port 80 without conflicting.
18.2 Docker Networking Modes
1. Bridge Network (Default)
Containers connect to a virtual bridge — a software-defined network switch inside your host. The bridge (docker0) acts like a physical Ethernet switch: containers are “plugged in” via virtual Ethernet pairs (veth pairs), and the bridge forwards frames between them. The host’s NAT layer (via iptables) gives containers access to the outside world, just like a home router gives your devices internet access.
2. User-Defined Bridge
Better than the default bridge in almost every way, and what you should use in practice. The critical upgrade: built-in DNS resolution. On the default bridge, containers can only reach each other by IP address (which can change whenever a container restarts). On a user-defined bridge, Docker runs an embedded DNS server that resolves container names to IPs automatically.
3. Host Network
Container shares the host’s network stack directly: no bridge, no NAT, no network namespace isolation. The container sees the host’s interfaces, the host’s IP, and binds directly to the host’s ports. It is as if you ran the application directly on the host, but in a container for packaging purposes only.
4. None Network
Container has no external network connectivity at all: it cannot reach the host, other containers, or the outside world. The container gets a loopback interface (127.0.0.1) and nothing else.
5. Overlay Network (Multi-Host)
Spans multiple Docker hosts, creating a single logical network across physical machines. Under the hood, overlay networks use VXLAN (Virtual Extensible LAN) tunnels to encapsulate container-to-container traffic inside UDP packets that traverse the underlay network. Think of it as building a private highway system on top of existing roads: the containers think they are on the same local network, but the traffic is actually tunneled across the physical infrastructure.
18.3 Port Publishing
Expose container ports to the host. This is how external traffic reaches your containers: by mapping a port on the host to a port inside the container. Under the hood, Docker creates iptables DNAT (Destination NAT) rules that rewrite the destination address of incoming packets from the host IP to the container IP.
Port Publishing Flow
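The flow can be sketched as a lookup-and-rewrite step. The following is only a toy model of what a DNAT rule does to a packet, not Docker's actual implementation; the `Packet` type, the `PUBLISHED` table, and all addresses are assumptions made up for illustration.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Packet:
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int


# Hypothetical published ports, as if created by `-p 8080:80`:
# host port -> (container IP, container port). Addresses are illustrative.
PUBLISHED = {8080: ("172.17.0.2", 80)}


def dnat(packet: Packet, host_ip: str) -> Packet:
    """Rewrite the destination of a packet that hits a published host port."""
    if packet.dst_ip != host_ip:
        return packet  # not addressed to this host
    target = PUBLISHED.get(packet.dst_port)
    if target is None:
        return packet  # port is not published: no rewrite happens
    container_ip, container_port = target
    # Only the destination changes; the client's source address is untouched.
    return replace(packet, dst_ip=container_ip, dst_port=container_port)


incoming = Packet("203.0.113.7", 51000, "192.0.2.10", 8080)
forwarded = dnat(incoming, host_ip="192.0.2.10")
print(forwarded)
```

The client keeps talking to the host address and port; only the host knows that the packet was redirected to the container.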
18.4 Kubernetes Networking Model
Kubernetes has specific networking requirements, and they are intentionally opinionated. Rather than prescribing a specific implementation, Kubernetes defines three rules that any networking solution must satisfy. These rules create a flat network where every Pod can reach every other Pod directly, with no NAT, no port mapping, and no surprises:
- All Pods can communicate with all other Pods without NAT: any Pod can send a packet to any other Pod’s IP and it arrives unmodified
- All Nodes can communicate with all Pods without NAT: the node (host) can reach any Pod directly, which is essential for health checks and monitoring
- The IP a Pod sees itself as is the same IP others see it as: no hidden NAT translations, which means applications do not need to know they are in a container
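One common way to satisfy these rules (though not the only one) is to give each node its own slice of a cluster-wide Pod address range, so that routing a Pod IP reduces to finding the node that owns its subnet. The sketch below assumes a `10.244.0.0/16` cluster range split into one `/24` per node; both values and the node names are illustrative assumptions.

```python
import ipaddress

# Assumed values for illustration: a cluster-wide Pod CIDR carved into one /24 per node.
CLUSTER_POD_CIDR = ipaddress.ip_network("10.244.0.0/16")
NODE_PREFIX_LEN = 24


def assign_node_subnets(node_names):
    """Give each node its own non-overlapping slice of the cluster Pod CIDR."""
    subnets = CLUSTER_POD_CIDR.subnets(new_prefix=NODE_PREFIX_LEN)
    return {name: next(subnets) for name in node_names}


def node_for_pod_ip(pod_ip, node_subnets):
    """Find which node owns a Pod IP, the same question a route lookup answers."""
    address = ipaddress.ip_address(pod_ip)
    for node, subnet in node_subnets.items():
        if address in subnet:
            return node
    return None


node_subnets = assign_node_subnets(["node-a", "node-b", "node-c"])
print(node_subnets["node-b"])                        # 10.244.1.0/24
print(node_for_pod_ip("10.244.2.15", node_subnets))  # node-c
```

Because the subnets never overlap, every Pod IP is unique cluster-wide and no address translation is needed between Pods.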
Pod Networking
Each Pod gets a unique IP.
18.5 Kubernetes Services
Services provide stable endpoints for Pods. Here is the core problem they solve: Pods are ephemeral. They get created, destroyed, rescheduled, and assigned new IPs constantly. If your frontend hardcodes a Pod IP to reach the backend, it breaks the moment that Pod restarts. A Service is an abstraction layer: a stable virtual IP (ClusterIP) and DNS name that never changes, backed by a dynamically updated set of Pod IPs. Think of it like a phone company’s customer service number: you always dial the same number, but the call might be routed to any of a dozen agents in the call center.
ClusterIP (Default)
Internal cluster access only. When a Pod sends traffic to the ClusterIP, for example 10.96.50.100:80, kube-proxy’s iptables rules intercept it, randomly select one of the backend Pod IPs, and rewrite the destination address. No Pod actually runs on the ClusterIP; it is a virtual IP that exists only in iptables rules. This is also why you typically cannot ping a ClusterIP in iptables mode: the rules match only the Service’s protocol and port, so nothing answers ICMP.
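The random selection can be modeled as a chain of rules evaluated in order, where each rule matches with a probability chosen so that every backend ends up equally likely. This is a simplified sketch of that idea rather than kube-proxy's code; the function names and backend addresses are hypothetical.

```python
import random
from fractions import Fraction


def rule_probabilities(backend_count):
    """Per-rule match probability so each backend is chosen equally often.

    Rules are evaluated in order; rule i only sees traffic the earlier rules
    passed over, so it must match 1/(remaining backends) of it.
    """
    return [Fraction(1, backend_count - i) for i in range(backend_count)]


def pick_backend(backends, rng=random):
    """Walk the rule chain and return the first backend whose rule matches."""
    for backend, probability in zip(backends, rule_probabilities(len(backends))):
        if rng.random() < probability:
            return backend
    return backends[-1]  # not reached: the last rule matches with probability 1


# Hypothetical Pod endpoints behind the ClusterIP 10.96.50.100:80.
backends = ["10.244.1.5:8080", "10.244.2.9:8080", "10.244.3.4:8080"]
print(rule_probabilities(len(backends)))
# [Fraction(1, 3), Fraction(1, 2), Fraction(1, 1)]
```

With three backends the first rule takes a third of the traffic, the second takes half of what remains, and the last takes the rest, which works out to one third each.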
NodePort
Exposes the service on a static port on every node’s IP address. This means external clients can reach the service by hitting any node’s IP on that port, even if the Pod is not running on that specific node. kube-proxy forwards the traffic to the correct node and Pod.
LoadBalancer
Provisions a cloud provider’s load balancer (AWS ELB/NLB, GCP Cloud Load Balancer, Azure Load Balancer) and wires it to the NodePort automatically. This is the standard way to expose services to the internet in cloud environments.
Headless Service
Returns Pod IPs directly via DNS instead of a virtual ClusterIP. No load balancing, no proxy: the client gets a list of IP addresses and decides which one to connect to. This is essential for stateful workloads where the client needs to connect to a specific Pod (like a specific database replica).
nslookup my-headless returns all Pod IPs as A records. When combined with StatefulSets, each Pod gets a predictable DNS name like my-app-0.my-headless, my-app-1.my-headless, which is how databases like PostgreSQL, Cassandra, and Kafka discover their peers.
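The predictable names follow a fixed pattern, so a client can compute its peer list without querying the API. A minimal sketch, assuming the `default` namespace and the common `cluster.local` cluster domain (both are assumptions and may differ in a given cluster); the helper name is hypothetical.

```python
def statefulset_pod_names(statefulset, replicas, service,
                          namespace="default", cluster_domain="cluster.local"):
    """Return the stable per-Pod DNS names a headless Service gives a StatefulSet."""
    return [
        f"{statefulset}-{ordinal}.{service}.{namespace}.svc.{cluster_domain}"
        for ordinal in range(replicas)
    ]


for name in statefulset_pod_names("my-app", 3, "my-headless"):
    print(name)
# my-app-0.my-headless.default.svc.cluster.local
# my-app-1.my-headless.default.svc.cluster.local
# my-app-2.my-headless.default.svc.cluster.local
```

Inside the same namespace the short form (my-app-0.my-headless) usually resolves as well, because the Pod's DNS search path supplies the rest.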
18.6 CNI (Container Network Interface)
CNI plugins implement the actual networking that makes Kubernetes’ flat network model work. Kubernetes itself does not implement networking. It defines the rules (every Pod gets an IP, every Pod can reach every other Pod) and delegates the implementation to a CNI plugin. Choosing the right CNI plugin is one of the most impactful infrastructure decisions you will make, because it affects performance, security (network policies), observability, and operational complexity.
Popular CNI Plugins
| Plugin | Features | Best For |
|---|---|---|
| Calico | Network policies, BGP, IPIP tunnels | General purpose, policy-heavy environments |
| Flannel | Simple overlay, VXLAN | Small clusters, getting started quickly |
| Cilium | eBPF-based, advanced observability, service mesh | Large clusters, security-focused teams |
| Weave | Mesh overlay, built-in encryption | Multi-cloud setups needing encryption |
| AWS VPC CNI | Native VPC IPs for Pods (each Pod gets a real VPC IP) | EKS clusters where you need Pods in VPC security groups |
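Several plugins in the table rely on VXLAN encapsulation, and its cost is easy to quantify. The sketch below packs the 8-byte VXLAN header described in RFC 7348 and computes the MTU left for inner packets; the function names are hypothetical, and the overhead figure assumes an IPv4 underlay without extra options.

```python
import struct

VXLAN_FLAG_VNI_VALID = 0x08  # "I" flag from RFC 7348: the VNI field is valid

# Bytes added in front of the inner IP packet on an IPv4 underlay (assumed).
OUTER_IPV4 = 20
OUTER_UDP = 8
VXLAN_HEADER = 8
INNER_ETHERNET = 14


def pack_vxlan_header(vni):
    """Build the VXLAN header: 8 flag bits, 24 reserved, 24-bit VNI, 8 reserved."""
    if not 0 <= vni < 2**24:
        raise ValueError("VNI must fit in 24 bits")
    return struct.pack("!II", VXLAN_FLAG_VNI_VALID << 24, vni << 8)


def unpack_vni(header):
    """Extract the 24-bit VNI from a VXLAN header."""
    _flags_word, vni_word = struct.unpack("!II", header[:8])
    return vni_word >> 8


def inner_mtu(underlay_mtu):
    """Largest inner IP packet that still fits in one underlay packet."""
    return underlay_mtu - (OUTER_IPV4 + OUTER_UDP + VXLAN_HEADER + INNER_ETHERNET)


header = pack_vxlan_header(42)
print(header.hex())      # 0800000000002a00
print(unpack_vni(header))  # 42
print(inner_mtu(1500))   # 1450
```

The 50 bytes of overhead are why overlay-based clusters commonly run Pods with an MTU of 1450 on a standard 1500-byte network.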
Calico Example
18.7 Network Policies
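As a rough mental model for this section: a Pod that no policy selects accepts everything, while a selected Pod accepts only what at least one rule allows. The sketch below models ingress only and is deliberately simplified; it ignores namespaces, egress, and IP blocks, and the dictionary layout is a made-up stand-in for the real NetworkPolicy schema.

```python
def selector_matches(selector, labels):
    """An empty selector matches everything; otherwise every key/value must match."""
    return all(labels.get(key) == value for key, value in selector.items())


def ingress_allowed(dest_labels, source_labels, port, policies):
    """Decide whether ingress traffic is allowed under a simplified policy model."""
    selecting = [p for p in policies if selector_matches(p["pod_selector"], dest_labels)]
    if not selecting:
        return True  # no policy selects the Pod: default allow
    # Selected Pods are default deny; any single matching rule allows the traffic.
    return any(
        selector_matches(rule["from"], source_labels) and port in rule["ports"]
        for policy in selecting
        for rule in policy["ingress"]
    )


# Hypothetical policy: only Pods labeled app=backend may reach app=db on 5432.
policies = [{
    "pod_selector": {"app": "db"},
    "ingress": [{"from": {"app": "backend"}, "ports": [5432]}],
}]

print(ingress_allowed({"app": "db"}, {"app": "backend"}, 5432, policies))  # True
print(ingress_allowed({"app": "db"}, {"app": "api"}, 5432, policies))      # False
print(ingress_allowed({"app": "cache"}, {"app": "api"}, 6379, policies))   # True
```

The second call shows the classic failure mode: the caller is healthy and the port is right, but its label does not match the rule, so the traffic is silently dropped.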
Control traffic between Pods. This is the Kubernetes equivalent of security groups. By default, Kubernetes allows all Pods to communicate with all other Pods (the flat network model). Network Policies let you restrict that. They follow a “default allow until selected, then allow-list” model: policies contain only allow rules, never explicit denies. Once you apply any NetworkPolicy that selects a Pod, that Pod switches to “default deny” for the specified direction (ingress, egress, or both), and only traffic matching the policy rules is allowed.
18.8 Ingress
Manage external access to services at Layer 7 (HTTP/HTTPS). Ingress solves the “one load balancer per service” cost problem by consolidating all external HTTP routing into a single entry point. The Ingress resource defines routing rules (hostname, path), and an Ingress Controller (a running Pod, typically NGINX, Traefik, or AWS ALB) reads those rules and configures itself to route traffic accordingly. Think of it as a smart receptionist at the front desk who reads the visitor’s badge and directs them to the right department.
18.9 Service Mesh
The Problem
As microservices grow from 5 to 50 to 500 services, networking concerns that used to be “someone else’s problem” suddenly become everyone’s problem:
- Service discovery: how does Service A find Service B when Pods are created and destroyed constantly?
- Load balancing: kube-proxy only spreads connections across backends (randomly in iptables mode, round-robin by default in IPVS mode), but you need smarter strategies (weighted, least-connections, retry-aware)
- Encryption (mTLS): every service-to-service call should be encrypted, but managing certificates across 500 services is a nightmare
- Observability: which service is calling which? What is the latency breakdown across the call chain? Where are the errors?
- Resilience: retries, timeouts, circuit breakers, rate limiting, all the patterns from the microservices playbook
Service Mesh Solution
Inject a sidecar proxy into every Pod. All network traffic in and out of the Pod passes through the sidecar (usually Envoy Proxy), which handles encryption, load balancing, retries, and telemetry transparently. The application code does not change. It still makes plain HTTP calls to http://other-service:8080, but the sidecar intercepts the call, encrypts it with mTLS, applies retry policies, collects metrics, and forwards it to the destination’s sidecar.
Popular Service Meshes
| Mesh | Sidecar | Features | Trade-off |
|---|---|---|---|
| Istio | Envoy | Full-featured: mTLS, traffic splitting, fault injection | Complex to operate, significant resource overhead (~100MB RAM per sidecar) |
| Linkerd | linkerd2-proxy (Rust) | Lightweight, simple, fast to adopt | Fewer advanced features than Istio |
| Consul Connect | Envoy | Integrates with HashiCorp Vault, Terraform, Nomad | Best if you are already in the HashiCorp ecosystem |
| AWS App Mesh | Envoy | Native AWS integration, managed control plane | AWS lock-in, less community support |
Istio Architecture
18.10 Debugging Container Networks
Docker
Kubernetes
18.11 Key Takeaways
Pods Get IPs
Services Abstract Pods
CNI Does the Work
Service Mesh for Complex Needs
Course Completion
Congratulations! You’ve completed the Networking Mastery course. You now have deep knowledge of:
- IP addressing, subnetting, and CIDR
- NAT and how private networks access the internet
- Routing protocols and how packets find their way
- DNS and domain name resolution
- Load balancing and reverse proxies
- Network troubleshooting tools
- VPNs and secure tunneling
- Firewalls and security groups
- Container and Kubernetes networking
Practice Resources
- Set up a home lab with VMs/containers
- Get hands-on with AWS VPC
- Deploy a Kubernetes cluster and explore networking
- Capture and analyze packets with Wireshark
Interview Deep-Dive
'Explain how a packet travels from one Pod on Node A to another Pod on Node B in a Kubernetes cluster. Walk me through every hop.'
'You deploy a new microservice in Kubernetes and it cannot reach the database service. Both Pods are running and healthy. How do you debug this?'
First, check the basics: kubectl get pods -o wide to confirm both Pods are Running and have IPs. Then kubectl get svc to verify the database Service exists and has the correct type and port. Then kubectl get endpoints db-service to check if the Service has endpoints. If this is empty, the Service selector labels do not match the database Pod labels, which is the most common cause of this issue. I have seen teams spend hours debugging networking when the problem was a typo in a label selector.
Second, test connectivity from inside the application Pod. I would exec into the microservice Pod (or deploy a netshoot debug Pod in the same namespace) and try: nslookup db-service to verify DNS resolution, then curl -v db-service:5432 or nc -zv db-service 5432 to test TCP connectivity. If DNS fails, the problem is CoreDNS (check kubectl logs -n kube-system -l k8s-app=kube-dns). If DNS works but the connection times out, the problem is network-level.
Third, check Network Policies. This is the most likely culprit in a cluster that has any network policies. I would run kubectl get networkpolicy -n <namespace> to see if there are policies affecting either the microservice or the database. If a NetworkPolicy selects the database Pod, it defaults to deny-all ingress, and there must be an explicit ingress rule allowing traffic from the microservice Pod. I would check the policy’s podSelector, namespaceSelector, and port configuration. A common mistake: the network policy allows traffic from app=backend but the new microservice is labeled app=api, so the labels do not match.
Fourth, bypass the Service and test direct Pod-to-Pod connectivity. I would ping or curl the database Pod IP directly from the microservice Pod. If this works but the Service does not, the issue is in kube-proxy or the Service configuration (wrong targetPort, wrong selector). If even direct Pod-to-Pod connectivity fails, the issue is at the CNI or node level.
Fifth, if cross-node communication fails but same-node works, I would check the CNI plugin. Are the Calico/Cilium/Flannel pods healthy? Are the node-to-node routes correct? Is there a firewall or security group on the cloud provider blocking traffic between nodes on the required ports (Calico BGP uses TCP port 179; VXLAN typically uses UDP 4789, though Flannel’s VXLAN backend defaults to UDP 8472)?
Follow-up: “The endpoints list is correct and DNS resolves, but the connection still times out. What next?”
I would check whether the database Pod is actually listening on the expected port. Exec into the database Pod and run ss -tlnp to see what ports are open. A common issue: the database is configured to listen on 127.0.0.1:5432 (localhost only) instead of 0.0.0.0:5432 (all interfaces). In a container, if the process binds to localhost, only processes in the same Pod (same network namespace) can connect, because traffic from other Pods arrives on the eth0 interface, not loopback. The fix is changing the database bind address to 0.0.0.0 or the Pod’s IP. If the port is correct, I would run tcpdump inside the database Pod to see if packets are arriving. If they arrive but get no response, the issue is the application itself (wrong authentication, connection limit reached, etc.).
'Compare Docker's default bridge network with Kubernetes' flat networking model. Why did Kubernetes choose a different approach?'
Docker’s default bridge puts containers on a private network behind the host. To reach a container from outside, you publish a port with -p 8080:80, which creates iptables DNAT rules. This works fine for a single developer machine with 5-10 containers, but it creates serious problems at scale.
The problems with Docker’s bridge model for orchestration: First, port conflicts. If two containers need port 80, you have to map them to different host ports (8080, 8081), and every client needs to know which host port corresponds to which service. Second, NAT obscures container identity. When a container makes an outbound connection, the source IP is the host’s IP (due to SNAT), not the container’s IP. This breaks audit logging, rate limiting, and any policy based on source identity. Third, multi-host communication requires complex overlay setup or manual port forwarding.
Kubernetes’ flat network model eliminates all of these problems by design. Every Pod gets a unique, routable IP. No NAT between Pods means the source IP is preserved: if Pod A calls Pod B, Pod B sees Pod A’s real IP. No port mapping means every Pod can use whatever port it wants (even port 80) without conflicts, because each Pod has its own IP address. Service discovery becomes a simple DNS lookup instead of “which host, which port.”
The trade-off is complexity in the network layer. The CNI plugin has to solve a harder problem: giving every Pod a routable IP across all nodes. This requires either an overlay network (VXLAN/IPIP tunnels, which add encapsulation overhead) or native routing integration (like AWS VPC CNI allocating real VPC IPs, or Calico using BGP to distribute routes). Docker’s bridge model is simpler to implement but does not scale; Kubernetes’ flat model is harder to implement but scales to thousands of nodes.
The design decision reflects Kubernetes’ philosophy: push complexity into the infrastructure layer (CNI plugins, written by networking experts) so that application developers get a simple, predictable model (every Pod has an IP, every Pod can reach every other Pod). Google’s internal Borg system had tasks share the host’s IP address and divided up the port space between them, and the operational pain that caused directly informed Kubernetes’ IP-per-Pod requirement.
Follow-up: “When would you still use Docker’s bridge networking instead of Kubernetes?”
For local development, CI/CD pipelines, and small single-host deployments where the operational overhead of Kubernetes is not justified. Docker Compose with user-defined bridges gives you DNS-based service discovery, network isolation between projects, and simple port publishing: everything you need for a development environment or a small production deployment with 5-10 services on a single server. The break-even point where Kubernetes networking starts paying for its complexity is roughly when you need multi-host orchestration, auto-scaling, or zero-downtime deployments.
'What is a service mesh, when would you adopt one, and when is it overkill? Give me specific criteria.'
'Your Kubernetes service has intermittent 5xx errors. About 10% of requests fail while 90% succeed. All Pods show as healthy. What is happening and how do you investigate?'
First, I would determine whether the failures are concentrated on specific Pods: list the Service’s backends (kubectl get endpoints) and then look at per-Pod metrics. If I have an Ingress controller or service mesh, the access logs show which backend Pod handled each request. If 10% of requests fail and I have 10 Pods, there is a good chance one Pod is the culprit, with every request routed to that Pod failing.
Second, I would check if the failing Pod is actually healthy or if the health check is too shallow. A common scenario: the readiness probe checks /health which returns 200, but the Pod’s database connection pool is exhausted, so actual requests fail. The health check passes (the endpoint responds) but the Pod cannot serve real traffic. The fix is a deeper health check that verifies downstream dependencies, or better yet, implementing dependency-aware readiness that marks the Pod not-ready when its connection pool is saturated.
Third, I would check for a rolling deployment in progress. If a new version is being deployed and the new Pods have a bug, the 10% failure rate might correspond to the fraction of new Pods that have rolled out. I would check kubectl rollout status deployment/<name> and kubectl get pods to see if some Pods are running a different image version.
Fourth, I would investigate network-level issues. If the failures are timeouts (not application errors), one possible cause is a node with network problems. If the failing Pod is on a node with a degraded network link, all requests routed to that Pod time out. I would correlate the failed request destinations with node placement. Another possibility: the CNI plugin has issues on one node, for example Calico Felix crashed, routes are missing, and Pods on that node are partially unreachable.
Fifth, I would check for resource limits. If a Pod hits its CPU limit, it gets throttled and responds slowly (which can trigger timeouts and appear as 5xx). If it hits its memory limit, it gets OOM-killed and restarted, and during the restart, requests to that Pod fail. kubectl top pods and kubectl describe pod (look for OOMKilled restart reasons) help here.
The systematic approach: add request-level logging at the Ingress or service mesh layer that records which backend Pod served each request, the response code, and the latency. Then correlate failed requests to specific Pods, specific nodes, or specific time windows (deployment events, autoscaling events). In my experience, 80% of intermittent failures in Kubernetes are caused by one of three things: a single unhealthy Pod that passes health checks, a rolling deployment with a buggy new version, or resource limits causing throttling or OOM kills.
Follow-up: “How would you implement a more robust health check to catch this scenario?”
I would implement three levels of health checks. The liveness probe stays simple: TCP check or a basic HTTP GET that verifies the process is running and not deadlocked. The readiness probe becomes deeper: it checks the database connection pool (are connections available?), checks circuit breaker state (are downstream services healthy?), and checks internal queue depth (is the Pod overwhelmed?). If any of these fail, the Pod is marked not-ready and removed from the Service’s endpoint list, so kube-proxy stops sending traffic to it. I would also add a startup probe with a longer timeout for applications that take time to initialize (JVM warmup, cache loading). This prevents the liveness probe from killing Pods that are still starting up.
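The deeper readiness check described above might look roughly like the following. This is a sketch under stated assumptions: the `PodState` fields, the queue threshold, and the reason strings are all hypothetical, and a real service would gather these signals from its own connection pool, circuit breakers, and queues before exposing the result on its readiness endpoint.

```python
from dataclasses import dataclass


@dataclass
class PodState:
    """Hypothetical snapshot of the signals a readiness endpoint could inspect."""
    free_db_connections: int
    open_circuit_breakers: int
    queue_depth: int


MAX_QUEUE_DEPTH = 100  # assumed threshold; tune per workload


def readiness(state: PodState):
    """Return (HTTP status, reasons) for a dependency-aware readiness probe."""
    reasons = []
    if state.free_db_connections <= 0:
        reasons.append("database connection pool exhausted")
    if state.open_circuit_breakers > 0:
        reasons.append("downstream circuit breaker open")
    if state.queue_depth > MAX_QUEUE_DEPTH:
        reasons.append("internal queue backlog too deep")
    # Any non-2xx/3xx status makes an HTTP readiness probe fail.
    return (503 if reasons else 200), reasons


print(readiness(PodState(free_db_connections=4, open_circuit_breakers=0, queue_depth=12)))
# (200, [])
print(readiness(PodState(free_db_connections=0, open_circuit_breakers=0, queue_depth=12)))
# (503, ['database connection pool exhausted'])
```

Returning the reasons alongside the status keeps the probe cheap for the kubelet while giving operators something to read when a Pod drops out of the endpoint list.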