Chapter 3: Networking Fundamentals - The VPC
In most cloud platforms, a network is a regional construct. In Google Cloud, the VPC (Virtual Private Cloud) is a global resource. This single architectural difference changes how you design IP ranges, how services in different regions talk to each other, and how you extend your network back to on‑premises. This chapter assumes no prior networking expertise. We will start from basic IP and CIDR concepts, then layer on VPC design, firewalling, Shared VPC, hybrid connectivity, and SRE‑grade patterns.
1. From Scratch: IP, Subnets, and CIDR
Before talking about VPCs, we need a solid mental model for IP addressing.
1.1 What is an IP Address?
An IP address is like a phone number for a machine:
- IPv4 addresses look like 10.0.1.25 (four numbers between 0 and 255)
- Each address has two parts:
  - Network part – which “street” or subnet
  - Host part – which “house” on that street
1.2 Private vs Public IPs
Private IP ranges (RFC 1918) are:
- 10.0.0.0/8 (10.x.x.x)
- 172.16.0.0/12 (172.16.x.x – 172.31.x.x)
- 192.168.0.0/16 (192.168.x.x)
Private addresses:
- Are not routable on the public Internet
- Are reused by many organizations
- Are ideal for internal service‑to‑service communication
In GCP, IP addresses are assigned to resources such as:
- VM instances
- Load balancers
- Certain managed services
1.3 CIDR Notation: 10.0.1.0/24
CIDR (Classless Inter-Domain Routing) expresses a network as:
- Base address: 10.0.1.0
- Prefix length: /24 (how many bits are the network)
/24 means:
- 24 bits for network, 8 bits for hosts
- Total addresses: 2^(32-24) = 256
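- Usable addresses: 254 in traditional networks (network and broadcast addresses are reserved); GCP reserves four addresses in each subnet’s primary IPv4 range, leaving 252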
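The arithmetic above can be checked with Python's standard ipaddress module. This is a small sketch; the helper name cidr_summary is made up for illustration and is not part of any GCP tooling.

```python
import ipaddress


def cidr_summary(cidr: str) -> dict:
    """Return basic size facts for an IPv4 CIDR block (illustrative helper)."""
    net = ipaddress.ip_network(cidr)      # raises ValueError if host bits are set
    host_bits = 32 - net.prefixlen        # bits left over for hosts
    return {
        "first_address": str(net.network_address),
        "last_address": str(net.broadcast_address),
        "prefix_length": net.prefixlen,
        "host_bits": host_bits,
        "total_addresses": 2 ** host_bits,  # equals net.num_addresses
    }


if __name__ == "__main__":
    for block in ("10.0.1.0/24", "10.0.0.0/20", "10.0.0.0/16"):
        info = cidr_summary(block)
        print(block, "has", info["total_addresses"], "addresses")
```

Each bit removed from the prefix length doubles the block, which is why moving from /24 to /20 multiplies the address count by 16.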
| CIDR | Total addresses | Typical Use |
|---|---|---|
| /32 | 1 | Single host or interface |
| /30 | 4 | Point‑to‑point links |
| /24 | 256 | Small subnet (one app tier) |
| /20 | 4,096 | Region‑level subnet |
| /16 | 65,536 | Large shared address space |
Size subnets larger than you think you need (e.g., /20 instead of /24) to avoid running out of addresses when your app succeeds.
2. Global VPCs and Regional Subnets
2.1 The VPC (Global)
A VPC network in GCP is:
- A global logical network spanning all regions
- Your private IP address space, routes, and firewall rules
- Fully managed by Andromeda, Google’s SDN fabric
2.2 Subnets (Regional)
Inside a global VPC, you create subnets, each tied to a specific region:
- Subnets are regional, not zonal
- Subnets define which IP range is available in that region
- VM instances get an IP address from the subnet in their region
2.3 Growing Without Downtime: Subnet Expansion
One powerful GCP feature: you can expand a subnet’s IP range without downtime, as long as:
- The new range contains the original range (expansion only shortens the prefix length, and an expanded subnet cannot be shrunk again)
- The new range does not overlap with other subnets
Example:
- Start with 10.0.1.0/24 (256 addresses)
- Later traffic grows; you expand to 10.0.0.0/20 (4096 addresses)
- Existing VMs keep their IPs; new VMs get addresses in the bigger range
2.4 Private Google Access
Private Google Access allows VMs without public IPs to reach Google APIs and services (Cloud Storage, BigQuery, etc.) over Google’s private backbone:
- Enabled per subnet
- Prevents the need for public IPs on internal workloads
- Common in regulated environments
3. Routing Inside a VPC
Every VPC has a routing table that tells Andromeda where to send packets.
3.1 System Routes
GCP automatically creates system routes such as:
- Subnet routes: one for each subnet, e.g. 10.0.1.0/24 via subnet-us-east1
- Default Internet route: 0.0.0.0/0 via the Internet gateway (for resources with public IP)
3.2 Custom Static Routes
For more advanced designs, you add custom static routes, e.g.:
- Send 10.20.0.0/16 to an on‑prem VPN
- Send 192.168.100.0/24 to an appliance VM
3.3 Longest Prefix Match (LPM)
When multiple routes match a packet:
- The most specific route (longest prefix, e.g. /24 vs /16) wins
- If equal, other preferences (such as priority) apply
Example:
- Route 1: 10.0.0.0/16 → on‑prem
- Route 2: 10.0.1.0/24 → internal service
A packet to 10.0.1.10 uses Route 2 because /24 is more specific than /16.
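The two-route example can be reproduced in a few lines of Python. This is a toy model of longest prefix match, not how Andromeda is implemented; the ROUTES table and select_route function are illustrative names.

```python
import ipaddress

# Toy routing table mirroring the two example routes.
ROUTES = [
    ("10.0.0.0/16", "on-prem"),
    ("10.0.1.0/24", "internal service"),
]


def select_route(dest_ip, routes):
    """Return (prefix, next_hop) of the most specific matching route, or None."""
    addr = ipaddress.ip_address(dest_ip)
    matches = []
    for cidr, next_hop in routes:
        net = ipaddress.ip_network(cidr)
        if addr in net:
            matches.append((net, next_hop))
    if not matches:
        return None
    # Longest prefix match: the largest prefix length is the most specific.
    net, next_hop = max(matches, key=lambda m: m[0].prefixlen)
    return str(net), next_hop


if __name__ == "__main__":
    print(select_route("10.0.1.10", ROUTES))  # matches both routes, /24 wins
    print(select_route("10.0.2.10", ROUTES))  # only the /16 matches
```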
4. Firewall Rules: The Real Guardrails
In GCP, the primary security control is the VPC firewall, not an appliance.
4.1 Firewall Basics
Each VPC has stateful firewall rules that apply to traffic:
- Ingress rules – control incoming traffic to a VM
- Egress rules – control outbound traffic from a VM
Each rule has:
- Direction: ingress / egress
- Action: allow / deny (the implied rules deny all ingress and allow all egress)
- Priority (lower number = evaluated first)
- Match criteria (source/destination ranges, protocols, ports, targets)
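To make the evaluation order concrete, here is a simplified Python model of priority-based matching. It is a sketch under several assumptions: rules match only on direction, one remote CIDR range, and an optional port; statefulness and targets are ignored; the two implied rules sit at the lowest priority (65535); and a deny is ordered ahead of an allow at the same priority. The Rule class and evaluate function are hypothetical names.

```python
import ipaddress
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Rule:
    name: str
    direction: str               # "ingress" or "egress"
    action: str                  # "allow" or "deny"
    priority: int                # 0-65535, lower number is evaluated first
    remote_range: str            # source range (ingress) or destination range (egress)
    port: Optional[int] = None   # None matches any port


# Implied rules (assumed lowest priority): deny all ingress, allow all egress.
IMPLIED = [
    Rule("implied-deny-ingress", "ingress", "deny", 65535, "0.0.0.0/0"),
    Rule("implied-allow-egress", "egress", "allow", 65535, "0.0.0.0/0"),
]


def evaluate(rules, direction, remote_ip, port):
    """Return the first rule that matches, scanning in priority order."""
    addr = ipaddress.ip_address(remote_ip)
    # Sort by priority; at equal priority, deny rules sort before allow rules.
    ordered = sorted(list(rules) + IMPLIED, key=lambda r: (r.priority, r.action != "deny"))
    for rule in ordered:
        if rule.direction != direction:
            continue
        if addr not in ipaddress.ip_network(rule.remote_range):
            continue
        if rule.port is not None and rule.port != port:
            continue
        return rule
    return None  # only reachable for addresses outside IPv4


if __name__ == "__main__":
    rules = [
        Rule("allow-https", "ingress", "allow", 1000, "0.0.0.0/0", 443),
        Rule("deny-bad-range", "ingress", "deny", 900, "203.0.113.0/24"),
    ]
    print(evaluate(rules, "ingress", "198.51.100.7", 443).name)
    print(evaluate(rules, "ingress", "203.0.113.5", 443).name)
```

Because the first match decides, a broad deny with a low priority number can silently override narrower allows, which is worth keeping in mind when numbering rules.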
4.2 Targeting by Tags vs Service Accounts
You can attach firewall rules to instances by:
- Network tags (string labels on VMs)
- Service accounts (identity‑based targeting)
4.3 Common Firewall Patterns
- Deny by default: Use default deny, then open only required ports
- Tiered access: web tier accessible from Internet; app tier only from web tier; db tier only from app tier
- SSH/Jumphost: Only a bastion host accepts SSH from the Internet; all other VMs accept SSH only from the bastion’s internal IP
5. Enterprise Connectivity: Shared VPC vs VPC Peering
As your company grows, you’ll have many projects and teams. You need a way to connect them without creating a mess of ad‑hoc tunnels.
5.1 Shared VPC: The Enterprise Standard
Shared VPC is the architectural “hub-and-spoke” model for Google Cloud. It allows a central network team to manage resources while allowing application teams to consume them.
The Relationship
- Host Project: The “Owner” of the network. It contains the VPC, subnets, firewall rules, and hybrid connectivity (VPN/Interconnect).
- Service Projects: Attached to the Host Project. They “borrow” subnets from the host to deploy VMs, GKE clusters, or Cloud SQL instances.
Administrative Delegation (Least Privilege)
A Principal Engineer uses IAM roles to maintain separation of duties:
- Compute Network Admin: In the Host Project. Can modify the network itself.
- Compute Network User: Granted to service project users/service accounts on a per-subnet basis. This allows them to use the subnet without being able to change firewall rules or IP ranges.
SRE Tip: Use Shared VPC to ensure all egress traffic from multiple projects flows through a single set of Cloud NATs or Firewall Appliances for centralized inspection.
5.2 VPC Network Peering (Decentralized)
Concept:
- Two independent VPCs establish a peering relationship.
- They exchange routes and can reach each other using private IPs.
- Not transitive: if A peers with B, and B peers with C, A cannot reach C unless A–C peering exists.
- Peering has quotas (e.g., 25 peerings per VPC).
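The non-transitivity rule is easy to state as code: reachability over peering is a direct-neighbour check, never a graph search. The sketch below is a toy model with made-up VPC names; it also counts how many peerings a full mesh would need, which shows why a per-VPC quota becomes a constraint.

```python
def reachable_via_peering(peerings, src, dst):
    """Two VPCs can talk over peering only if they are directly peered."""
    direct = {frozenset(pair) for pair in peerings}
    return src == dst or frozenset((src, dst)) in direct


def full_mesh_peerings(vpc_count):
    """Total peerings needed so every VPC is directly peered with every other."""
    return vpc_count * (vpc_count - 1) // 2


if __name__ == "__main__":
    peerings = [("A", "B"), ("B", "C")]
    print(reachable_via_peering(peerings, "A", "B"))  # directly peered
    print(reachable_via_peering(peerings, "A", "C"))  # would need an A-C peering
    print(full_mesh_peerings(10), "peerings for a 10-VPC full mesh")
```

In a full mesh each VPC holds one peering per other VPC, so with the example quota of 25 a mesh stops at 26 networks.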
5.3 Choosing Between Shared VPC and Peering
Shared VPC:
- Centralized networking team.
- Strong need for governance and control.
- Many small projects requiring consistent network policy.
VPC Peering:
- Separate organizations or business units.
- Need to connect networks without merging ownership.
- Limited number of connections.
6. Private Service Connect (PSC): The Modern Bridge
While Shared VPC and Peering connect networks, Private Service Connect (PSC) connects services. It is the evolution beyond VPC Peering for service consumption.
6.1 The Problem with VPC Peering
VPC Peering has significant limitations in large enterprises:
- IP Exhaustion: Both networks must have non-overlapping IP ranges.
- Transitivity: Peering is not transitive (A -> B -> C doesn’t mean A -> C).
- Security: Peering exposes the entire network to the peer.
6.2 How PSC Works: Endpoint-Based Consumption
PSC allows you to reach a service (like a Google API or a third-party managed database) using a Private IP address in your own subnet.
- Service Producer: Publishes a service via a Service Attachment.
- Service Consumer: Creates a PSC Endpoint (a private IP) in their VPC.
- The Magic: Andromeda maps that IP directly to the producer’s load balancer. No IP overlaps are required, as traffic undergoes NAT at the PSC boundary.
6.3 PSC for Google APIs
Instead of using Private Google Access, you can create a specific PSC Endpoint for Google APIs (e.g., 10.0.0.100 maps to storage.googleapis.com).
- Control: You can apply firewall rules to the PSC endpoint IP.
- On-Prem Access: Your on-prem servers can reach Google APIs by simply routing to that private IP over VPN/Interconnect.
7. Hybrid Connectivity: Bridging On‑Prem and Cloud
Most real‑world deployments are hybrid: some systems remain on‑premises, others move to GCP.
7.1 Cloud VPN (HA VPN)
Cloud VPN securely connects your on‑premises network to your VPC over the Internet.
- Speed: Typically 1.5–3 Gbps per tunnel
- Setup time: Minutes
- Availability: 99.99% when using HA VPN (two tunnels, one per gateway interface, with BGP)
Typical uses:
- Quick connectivity for POCs and small workloads
- Backup path for Interconnect
- Cost‑effective connectivity for moderate bandwidth
7.2 Cloud Interconnect (The Fast Lane)
Cloud Interconnect provides private, dedicated connections:
- Dedicated Interconnect:
  - Physical fiber connection to Google edge
  - Capacities: 10 Gbps, 100 Gbps ports
  - Used by large enterprises and latency‑sensitive workloads
- Partner Interconnect:
  - Connect to Google via a service provider (e.g., Equinix, Megaport)
  - Capacities from 50 Mbps up to 10 Gbps
  - Faster to provision than Dedicated Interconnect
7.3 Cloud Router & BGP Deep Dive
Cloud Router is a control‑plane service that:
- Exchanges routes using BGP with your on‑prem router
- Does not forward data packets (Andromeda does that)
- Automatically updates routes when networks change
BGP Communities (Traffic Scope)
You can use BGP communities to influence how Google advertises your routes:
- 15169:10001 – Advertise to local region only
- 15169:10002 – Advertise to local continent
- 15169:10003 – Advertise globally (default)
Route Selection (Simplified)
When choosing between multiple paths:
- Longest Prefix Match – most specific prefix wins
- Priority/Cost – lower numeric priority value wins
Example:
- VPN: priority 1000
- Interconnect: priority 100
If both paths advertise 10.20.0.0/16, Interconnect is preferred because 100 < 1000. VPN can act as automatic backup if the Interconnect fails.
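The simplified selection order, including the failover behaviour, can be sketched as follows. This is a teaching model only: the healthy flag stands in for a BGP session going down and its routes being withdrawn, and the Route class and best_route function are hypothetical names.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    prefix: str
    next_hop: str
    priority: int          # lower value is preferred
    healthy: bool = True   # stand-in for "route is still being advertised"


def best_route(routes, dest_ip):
    """Pick a route by longest prefix first, then by lowest priority value."""
    addr = ipaddress.ip_address(dest_ip)
    usable = [
        r for r in routes
        if r.healthy and addr in ipaddress.ip_network(r.prefix)
    ]
    if not usable:
        return None
    return min(
        usable,
        key=lambda r: (-ipaddress.ip_network(r.prefix).prefixlen, r.priority),
    )


if __name__ == "__main__":
    routes = [
        Route("10.20.0.0/16", "vpn", 1000),
        Route("10.20.0.0/16", "interconnect", 100),
    ]
    print(best_route(routes, "10.20.5.9").next_hop)   # interconnect (100 < 1000)

    # Simulate an Interconnect failure: only the VPN route remains usable.
    failed = [routes[0], Route("10.20.0.0/16", "interconnect", 100, healthy=False)]
    print(best_route(failed, "10.20.5.9").next_hop)   # vpn
```

Note that prefix length is compared before priority, so a more specific route over the VPN would win even against a lower-priority-value Interconnect route.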
8. Cloud NAT: Secure Outbound Access
If your VMs do not have public IP addresses, how do they download software updates or reach external APIs? The answer is Cloud NAT.
8.1 What Cloud NAT Does
- Enables outbound Internet access from private IPs
- Does not allow unsolicited inbound connections
- Scales automatically; no single VM acts as a choke‑point
8.2 Cloud NAT Configuration Pattern
- VMs in private-app-subnet have no public IP addresses
- They can still reach package repositories, external APIs, etc.
- Inbound Internet traffic is blocked by design
9. Lab: Architecting a Multi‑Tier Secure Network
In this lab, you will build a production‑grade network topology using:
- Custom VPC
- Private subnets
- Cloud NAT
- Firewall rules for tier isolation
9.1 Architecture Overview
We will create:
- VPC: prod-vpc
- Subnets:
  - web-subnet (public‑facing, limited public IPs)
  - app-subnet (private)
  - db-subnet (private, most restricted)
- Cloud NAT for app-subnet and db-subnet
- Firewall rules enforcing web → app → db traffic flow
9.2 Implementation Steps
9.3 Verification
- Deploy simple VMs in each subnet (using appropriate network tags)
- Verify:
  - Internet → web: HTTP/HTTPS allowed
  - web → app: port 8080 allowed
  - app → db: port 5432 allowed
  - db → Internet: outbound allowed via NAT (e.g. OS updates)
  - Internet → db: blocked
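The verification matrix can be written down as an executable checklist before any VM is deployed. The sketch below is only a model of the intended policy, not of the deployed firewall: the tier names are shorthand for the lab's subnets, ports 80 and 443 stand in for HTTP/HTTPS, and the assumption that the web tier reaches the Internet through its own public IPs is ours.

```python
# Intended inbound flows between tiers: (source, destination, port).
ALLOWED_FLOWS = {
    ("internet", "web", 80),
    ("internet", "web", 443),
    ("web", "app", 8080),
    ("app", "db", 5432),
}

# Tiers whose outbound Internet traffic goes through Cloud NAT.
NAT_TIERS = {"app", "db"}


def is_allowed(src, dst, port):
    """Return True if the intended policy permits this flow."""
    if dst == "internet":
        # Outbound: web uses its public IPs (assumption); app and db use NAT.
        return src == "web" or src in NAT_TIERS
    return (src, dst, port) in ALLOWED_FLOWS


if __name__ == "__main__":
    checks = [
        ("internet", "web", 443),
        ("web", "app", 8080),
        ("app", "db", 5432),
        ("db", "internet", 443),
        ("internet", "db", 5432),
        ("web", "db", 5432),
    ]
    for src, dst, port in checks:
        verdict = "allowed" if is_allowed(src, dst, port) else "blocked"
        print(f"{src} -> {dst}:{port} {verdict}")
```

Comparing this expected matrix against real test results from the deployed VMs makes tier-skipping mistakes, such as web reaching db directly, easy to spot.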
10. Security at the Edge: Cloud Armor and VPC-SC
While firewalls protect the VM, Cloud Armor and VPC Service Controls protect the perimeter.
10.1 Cloud Armor (The WAF)
Cloud Armor is Google’s Distributed Denial of Service (DDoS) protection and Web Application Firewall (WAF) service.
- L7 Protection: Blocks SQL Injection (SQLi), Cross-Site Scripting (XSS), and other OWASP Top 10 risks.
- Adaptive Protection: Uses machine learning to detect anomalous traffic patterns and suggest firewall rules.
- Pre-configured Rules: Ready-to-use rules for common application stacks (WordPress, Drupal, etc.).
10.2 VPC Service Controls (VPC-SC) Deep Dive
VPC Service Controls (VPC-SC) let you draw a virtual security perimeter around sensitive services (Cloud Storage, BigQuery, etc.). This is a critical enterprise security feature that prevents data exfiltration.
10.2.1 How it Works: The Perimeter
When a service is protected by a perimeter:
- Identity is not enough: Even if an attacker has your credentials, they cannot access the data if the request comes from outside the perimeter.
- Data Exfiltration Protection: A compromised VM inside the perimeter cannot copy data to a Cloud Storage bucket outside the perimeter.
- Context-Aware Access: You can allow access from specific IP ranges or only from devices that meet security requirements (e.g., encrypted, screen-locked) via Access Context Manager.
10.2.2 Perimeter Types
- Service Perimeter: The standard boundary.
- Perimeter Bridge: Allows services in different perimeters to communicate (e.g., Project A in Perimeter 1 needs to read a BigQuery table in Project B in Perimeter 2).
- Dry-Run Mode: Essential for production. It logs what would have been blocked without actually stopping traffic, allowing you to debug policies before enforcement.
10.2.3 Service Perimeter Configuration (Terraform Example)
11. Advanced Hybrid Connectivity: BGP and Routing Math
11.1 BGP Community Values
GCP uses BGP communities to give you control over how your routes are prioritized and where they are advertised.

| Community | Meaning |
|---|---|
| 15169:10001 | Advertise to Local Region only. |
| 15169:10002 | Advertise to Local Continent. |
| 15169:10003 | Advertise Globally (Default). |
11.2 Influencing Inbound Traffic (MED and AS-Path)
When you have two Interconnects (Primary and Backup), how do you ensure Google sends traffic to the Primary?
- MED (Multi-Exit Discriminator): A lower MED value is preferred. Set Primary to 100 and Backup to 200.
- AS-Path Prepending: Make the backup path look longer by repeating your AS number multiple times.
11.3 Cloud Router Route Priority Math
Andromeda calculates the base cost of a route. You add a Priority (0-65535).
- Formula: Total Cost = Base Cost + User Priority
- Rule: Lower Total Cost wins.
- SRE Tip: Always set your VPN priority higher (e.g., 1000) than your Interconnect priority (e.g., 100) to ensure the VPN is only used as a failover.
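The cost formula is simple enough to work through in code. In the sketch below the base cost values are invented for illustration, since the chapter does not say how the base cost is derived; total_cost and preferred_path are hypothetical helper names.

```python
def total_cost(base_cost, user_priority):
    """Total Cost = Base Cost + User Priority, with the priority range checked."""
    if not 0 <= user_priority <= 65535:
        raise ValueError("priority must be between 0 and 65535")
    return base_cost + user_priority


def preferred_path(paths):
    """paths maps a path name to (base_cost, user_priority); lowest total wins."""
    return min(paths, key=lambda name: total_cost(*paths[name]))


if __name__ == "__main__":
    # Equal base cost: the user priority alone decides.
    same_base = {"interconnect": (0, 100), "vpn": (0, 1000)}
    print(preferred_path(same_base))      # interconnect (100 < 1000)

    # Made-up base costs: a large base cost can outweigh a good priority.
    uneven_base = {"interconnect": (300, 100), "vpn": (0, 350)}
    print(preferred_path(uneven_base))    # vpn (350 < 400)
```

The second case is why a wide gap between the two priorities (such as 100 versus 1000) is safer than a narrow one: it leaves room for the base cost to vary without flipping the preferred path.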
12. Network Intelligence Center: The SRE’s Radar
12.1 Connectivity Tests
A static analysis tool that tells you why a packet is being dropped. It simulates the packet path through firewalls, routes, and peering.
- Scenario: “Why can’t VM A talk to Cloud SQL?”
- Tool: Runs a trace and identifies that Firewall Rule 105 is blocking ingress.
12.2 Performance Dashboard
Provides real-time and historical latency/packet loss metrics for:
- Google-to-Google (Inter-region)
- Google-to-Internet
- Google-to-On-prem (via Interconnect)
12.3 Firewall Insights
Identifies shadowed rules (rules that never get hit because a higher priority rule matches first) and overly permissive rules.
13. Private Service Connect (PSC) Deep Dive
PSC is the successor to VPC Peering for service consumption.
13.1 Service Producer vs Consumer
- Producer: A service provider (e.g., Snowflake, or your own internal Shared Services team) creates a Service Attachment.
- Consumer: An application team creates a PSC Endpoint (a private IP) in their own VPC.
13.2 Why PSC is Better than Peering
- No IP Overlaps: Both networks can use 10.0.0.0/24. PSC uses NAT to translate the traffic.
- Uni-directional: The producer cannot initiate traffic into the consumer’s network.
- Transitive: Unlike peering, PSC endpoints can be reached across VPC peering links.
14. Lab: Implementing a Hub-and-Spoke with Central Firewall
14.1 Architecture
- Hub Project: Contains a Central Firewall VM (Palo Alto/Fortinet) and the Hybrid Connectivity.
- Spoke Projects: Attached to the Hub via Shared VPC.
- Routing: All traffic from Spokes to the Internet or On-prem must be routed through the Hub’s Firewall VM.
14.2 Implementation (Route-Based Redirection)
15. Interview Preparation
Q1: How is a GCP VPC different from a traditional VPC in AWS or Azure?
Answer: The primary difference is scope. In GCP, a VPC is a global resource, not regional.
- Global Reach: A single VPC can span all Google regions worldwide.
- Subnets: Subnets are regional. You can have a subnet in us-east1 and another in asia-east1 within the same VPC.
- Internal Routing: VMs in different regions can communicate using internal IP addresses over Google’s private backbone (B4) without needing VPNs or peering.
- Simplicity: This simplifies global application architecture, as you don’t need to manage complex transit gateways or multiple peerings for global connectivity.
Q2: When would you choose Shared VPC over VPC Network Peering?
Q3: Explain how 'Private Google Access' works and why it's a security best practice.
Answer: Private Google Access allows VMs that only have internal IP addresses to reach the public APIs of Google services (like Cloud Storage, BigQuery, or Pub/Sub).
- Mechanism: It routes traffic over Google’s internal network to the service endpoints rather than exiting to the public internet.
- Security: It allows you to keep your VMs completely isolated from the internet (no public IPs) while still being able to use managed cloud services.
- Requirement: It must be enabled at the subnet level.
Q4: How does the Andromeda SDN handle firewall rules differently than traditional hardware firewalls?
Answer: Andromeda implements firewall rules at the vNIC (Virtual Network Interface Card) level of each VM.
- Distributed Enforcement: Rules are enforced on the host machine where the VM runs, before the packet even hits the physical wire.
- No Bottlenecks: Because enforcement is distributed across all hosts, there is no central “choke point” or firewall appliance that can become a performance bottleneck.
- Identity-Aware: VPC firewall rules can target Network Tags or Service Accounts rather than just IP ranges, making security policies dynamic and application-centric.
Q5: A VM in us-central1 needs to talk to a VM in us-east1. What is the routing path, and does it use the public internet?
Answer: No, it does not use the public internet.
- The packet is identified as internal VPC traffic by the Andromeda SDN.
- Andromeda encapsulates the packet and routes it over Google’s B4 global backbone (private fiber).
- The packet travels directly between data centers.
- It is decapsulated at the destination host and delivered to the target VM.