Module 6: Application Layer

The Application Layer is where users interact with the network. Everything below this layer (transport, network, data link, physical) exists to support what happens here — delivering web pages, resolving domain names, sending email, streaming video. When you open a browser, you are working at Layer 7.

6.1 HTTP / HTTPS

HyperText Transfer Protocol is the foundation of the web. Every time your browser loads a page, it speaks HTTP (or HTTPS) to the server.
  • Request/Response Model: Client sends a request, Server sends a response. HTTP is stateless — each request is independent. The server does not remember you between requests unless you use cookies or sessions.
  • Methods: GET (retrieve data), POST (send/create data), PUT (update/replace data), DELETE (remove data). These map naturally to CRUD operations.
  • Status Codes: 200 OK (success), 301 Moved Permanently (redirect), 404 Not Found (resource does not exist), 500 Internal Server Error (the server hit an unexpected error while handling the request). Knowing the common status code families is essential: 2xx = success, 3xx = redirect, 4xx = client error, 5xx = server error.
  • HTTPS: HTTP over TLS (Encrypted). The “S” stands for Secure. All modern websites should use HTTPS. Without it, anyone on the same Wi-Fi network can read your traffic in plain text.
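The status code families listed above can be sketched as a small lookup. This is a minimal illustration, not part of any HTTP library; the names `STATUS_FAMILIES` and `status_family` are made up for this example.

```python
# Map the first digit of a status code to the family names used above.
STATUS_FAMILIES = {
    2: "success",
    3: "redirect",
    4: "client error",
    5: "server error",
}


def status_family(code: int) -> str:
    """Return the family for an HTTP status code, e.g. 404 -> 'client error'."""
    if not 100 <= code <= 599:
        raise ValueError(f"not a valid HTTP status code: {code}")
    # 1xx (informational) is not covered in the list above, so label it separately.
    return STATUS_FAMILIES.get(code // 100, "informational")


for code in (200, 301, 404, 500):
    print(code, status_family(code))
```

When reading logs, grouping by family first (is this a 4xx or a 5xx?) usually narrows down whether the client or the server needs attention.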

What an HTTP request actually looks like

GET /api/users HTTP/1.1
Host: api.example.com
User-Agent: curl/7.68.0
Accept: application/json
Authorization: Bearer eyJhbGciOiJIUzI1NiJ9...
And the response:
HTTP/1.1 200 OK
Content-Type: application/json
Content-Length: 47
Cache-Control: max-age=3600

[{"id":1,"name":"Alice"},{"id":2,"name":"Bob"}]
Troubleshooting HTTP: Use curl -v to see the full request and response, including headers and the TLS handshake. When debugging “my API isn’t working,” the response headers and status code almost always tell you what is wrong. A 403 means the server understood your request but refused it (check permissions; missing or invalid credentials normally produce a 401 instead). A 502 means a proxy or load balancer could not get a valid response from the backend server.
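To make the structure of a response concrete, here is a minimal sketch that splits a raw HTTP/1.x response into its status line, headers, and body. The function name `parse_http_response` and the sample data are assumptions for illustration; real clients should rely on an HTTP library rather than hand parsing.

```python
def parse_http_response(raw: str):
    """Split a raw HTTP/1.x response into (status_code, reason, headers, body)."""
    # A blank line (CRLF CRLF) separates the header section from the body.
    head, _, body = raw.partition("\r\n\r\n")
    lines = head.split("\r\n")
    # Status line looks like: "HTTP/1.1 200 OK"
    _version, code, reason = lines[0].split(" ", 2)
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return int(code), reason, headers, body


RAW = (
    "HTTP/1.1 200 OK\r\n"
    "Content-Type: application/json\r\n"
    "Content-Length: 47\r\n"
    "\r\n"
    '[{"id":1,"name":"Alice"},{"id":2,"name":"Bob"}]'
)

code, reason, headers, body = parse_http_response(RAW)
print(code, reason, headers["content-type"])
# Content-Length counts body bytes, so it should match the encoded body size.
print(int(headers["content-length"]) == len(body.encode("utf-8")))
```

Header names are lowercased here because HTTP header names are case-insensitive.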

6.2 DNS (Domain Name System)

DNS is the phonebook of the internet. It translates human-readable domain names (google.com) into IP addresses (142.250.190.46). Without DNS, you would have to memorize IP addresses for every website — imagine typing 142.250.190.46 instead of google.com. DNS is arguably the most critical application-layer protocol. If DNS is down, the internet feels “broken” even though the underlying network is fine — your browser simply cannot figure out where to send packets.

Resolution Process

  1. Browser Cache: Check local cache. If you visited google.com recently, the browser may already know the IP.
  2. OS Cache: If the browser cache misses, the operating system checks its own DNS cache.
  3. Resolver: If nothing is cached locally, a query goes to your configured DNS resolver (often your ISP’s, or 8.8.8.8 for Google, or 1.1.1.1 for Cloudflare). The resolver does the heavy lifting.
  4. Root Server: The resolver asks a root server “where is .com handled?” There are 13 root server identities (a.root-servers.net through m.root-servers.net), and each one is actually many servers worldwide behind anycast.
  5. TLD Server: The resolver asks the .com TLD server “where is google.com handled?” The TLD server returns the authoritative nameserver.
  6. Authoritative Server: The resolver asks Google’s authoritative nameserver “what is the A record for google.com?” and gets back the IP address.
This multi-step process sounds slow, but caching makes it fast. Most DNS queries are answered from a cache, and a local cache hit typically takes under 1 ms. The full resolution chain (all 6 steps) only runs when no cache along the way holds a valid answer: on the first query for a domain, or on the first query after the cached record's TTL (time to live) expires. The result is then cached at each level until the TTL runs out.
For a comprehensive deep dive into DNS records, zones, propagation, DNSSEC, and troubleshooting, see Module 12.
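The effect of TTL-based caching can be sketched with a toy resolver. This is a simplified model under stated assumptions: the class `CachingResolver`, the `authoritative` dictionary, and the injectable `clock` are all hypothetical, and the root, TLD, and authoritative queries are collapsed into a single dictionary lookup.

```python
import time


class CachingResolver:
    """Toy resolver: answers from cache until the record's TTL expires."""

    def __init__(self, authoritative, clock=time.monotonic):
        self.authoritative = authoritative  # name -> (ip, ttl_seconds); hypothetical data
        self.clock = clock                  # injectable so the example can fake time
        self.cache = {}                     # name -> (ip, expires_at)
        self.full_lookups = 0

    def resolve(self, name):
        now = self.clock()
        cached = self.cache.get(name)
        if cached and cached[1] > now:
            return cached[0]  # cache hit: no upstream queries
        # Cache miss or expired entry: stand-in for the root -> TLD -> authoritative walk.
        ip, ttl = self.authoritative[name]
        self.full_lookups += 1
        self.cache[name] = (ip, now + ttl)
        return ip


now = [0.0]  # fake clock, in seconds
resolver = CachingResolver({"example.com": ("192.0.2.10", 300)}, clock=lambda: now[0])
resolver.resolve("example.com")  # first query: full lookup
resolver.resolve("example.com")  # answered from cache
now[0] = 301                     # move past the 300-second TTL
resolver.resolve("example.com")  # expired: full lookup again
print(resolver.full_lookups)
```

Three queries trigger only two full lookups; every query inside the TTL window is served from the cache.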
[Figure: DNS Resolution]

6.3 DHCP (Dynamic Host Configuration Protocol)

DHCP automatically assigns IP addresses to devices on a network. Without DHCP, you would have to manually configure the IP address, subnet mask, default gateway, and DNS server on every device that connects. In a coffee shop with 50 customers, that would be unmanageable.
  • DORA Process: The four-step handshake that gets your device an IP address:
  1. Discover: Your device broadcasts “Is there a DHCP server out there?” to the entire local network (because it does not have an IP address yet, so it cannot target a specific server).
  2. Offer: A DHCP server responds with “Here is 192.168.1.50, subnet 255.255.255.0, gateway 192.168.1.1, DNS 8.8.8.8. You can use this for 24 hours.”
  3. Request: Your device broadcasts “I would like to accept the offer of 192.168.1.50” (broadcast, because there might be multiple DHCP servers).
  4. Acknowledge: The DHCP server confirms “192.168.1.50 is yours for 24 hours.”
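The DORA exchange above can be sketched as a toy in-memory server. This is only a model of the bookkeeping, assuming a hypothetical `ToyDhcpServer` class and made-up MAC addresses; it sends no packets and ignores lease expiry, broadcasts, and competing servers.

```python
class ToyDhcpServer:
    """Toy DHCP server that hands out addresses from a small pool."""

    def __init__(self, pool, lease_seconds=86400):
        self.free = list(pool)   # addresses not yet offered
        self.offered = {}        # client MAC -> offered IP
        self.leases = {}         # client MAC -> (IP, lease_seconds)
        self.lease_seconds = lease_seconds

    def discover(self, mac):
        """Handle a Discover; return the offered IP, or None if the pool is exhausted."""
        if mac in self.leases:
            return self.leases[mac][0]
        if mac in self.offered:
            return self.offered[mac]
        if not self.free:
            return None
        ip = self.free.pop(0)
        self.offered[mac] = ip
        return ip

    def request(self, mac, ip):
        """Handle a Request; return True to Acknowledge."""
        if self.offered.get(mac) == ip:
            del self.offered[mac]
            self.leases[mac] = (ip, self.lease_seconds)
            return True
        # A repeat request for an address the client already holds is also acknowledged.
        return mac in self.leases and self.leases[mac][0] == ip


server = ToyDhcpServer(["192.168.1.50", "192.168.1.51"])
offer = server.discover("aa:bb:cc:dd:ee:01")     # Discover -> Offer
ack = server.request("aa:bb:cc:dd:ee:01", offer)  # Request -> Acknowledge
print(offer, ack)
```

Once both addresses are leased, a third client's `discover` returns `None`, which is the pool exhaustion case.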

What DHCP provides

| Parameter | Purpose | Example |
| --- | --- | --- |
| IP Address | Your device’s identity on the network | 192.168.1.50 |
| Subnet Mask | Defines the local network boundary | 255.255.255.0 |
| Default Gateway | Where to send traffic destined for other networks | 192.168.1.1 |
| DNS Server | Where to resolve domain names | 8.8.8.8, 1.1.1.1 |
| Lease Time | How long you keep this IP | 86400 seconds (24 hours) |
Troubleshooting DHCP failures: If your device gets an IP in the 169.254.x.x range (called APIPA or link-local), it means DHCP failed — your device could not find a DHCP server. Check if the DHCP server is running, if the network cable is connected, or if the DHCP pool (available addresses) is exhausted. In cloud environments, DHCP is handled invisibly by the platform, so this is mainly a concern for on-premises networks.
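The 169.254.x.x check can be automated with Python's standard `ipaddress` module. The function name `looks_like_dhcp_failure` is made up for this sketch; it only classifies an address string and does not inspect any real interface.

```python
import ipaddress


def looks_like_dhcp_failure(address: str) -> bool:
    """True if the address is in 169.254.0.0/16 (IPv4 link-local, also called APIPA)."""
    ip = ipaddress.ip_address(address)
    # IPv6 link-local (fe80::/10) is normal and does not indicate a DHCP problem.
    return ip.version == 4 and ip.is_link_local


print(looks_like_dhcp_failure("169.254.12.7"))
print(looks_like_dhcp_failure("192.168.1.50"))
```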

Next Module

Module 7: Network Security

Protecting the network.

Interview Deep-Dive

Strong Answer:
  • If DNS resolvers become unreachable, the internet effectively appears “broken” to users even though the underlying IP network is fully functional. Browsers cannot translate domain names to IP addresses, so every URL typed into the address bar fails. Email delivery stops because MX record lookups fail. API calls between microservices using hostnames time out. Essentially, any system that relies on domain names — which is nearly everything — stops working.
  • However, connections that were already established continue to work because they have already resolved the IP and have active TCP connections. Cached DNS entries also continue to work until their TTL expires. This is why in a DNS outage, some sites still load (cached) while others fail (not cached or TTL expired).
  • The 2016 Dyn DNS attack is the real-world proof of this. A massive DDoS against Dyn (a major DNS provider) took down Twitter, GitHub, Netflix, Reddit, and many other sites. The underlying servers were perfectly healthy — nobody could find them because DNS was down. This is why redundant DNS providers, DNS caching layers, and multi-provider DNS strategies are critical infrastructure decisions, not afterthoughts.
  • As a practical test: if a user reports “the internet is down,” I can distinguish a DNS failure from a network failure in 10 seconds. If ping 8.8.8.8 works but ping google.com fails, it is a DNS problem. If both fail, it is a network problem.
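The ping-based triage above boils down to a two-input decision. A minimal sketch, assuming hypothetical helper names: `name_resolves` uses the standard `socket` module for the DNS half, while reachability of an IP such as 8.8.8.8 is passed in as a boolean because ICMP ping usually needs elevated privileges.

```python
import socket


def name_resolves(hostname: str) -> bool:
    """Return True if the system resolver can turn the hostname into an IP."""
    try:
        socket.gethostbyname(hostname)
        return True
    except OSError:  # socket.gaierror is a subclass of OSError
        return False


def diagnose(ip_reachable: bool, dns_ok: bool) -> str:
    """Classify an 'internet is down' report from the two checks described above."""
    if not ip_reachable:
        return "network problem"  # cannot reach a known IP at all
    if not dns_ok:
        return "DNS problem"      # IP connectivity works, name lookups fail
    return "no fault found"


# Example: pinging 8.8.8.8 works, but google.com does not resolve.
print(diagnose(ip_reachable=True, dns_ok=False))
```

In practice the second argument would come from something like `name_resolves("google.com")`.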
Follow-up: Walk me through the TTL strategy you would use when migrating a production website to a new server IP.
The standard pattern has four phases:
  1. Days before the migration, I lower the TTL on the DNS record from its current value (typically 3600 seconds or higher) to something short like 60 seconds. Then I wait at least as long as the old TTL: if it was 3600 seconds, I wait at least one hour. This ensures every cached copy worldwide has expired and refreshed at the new, short TTL.
  2. I make the DNS change, updating the A record to the new server IP. Because the TTL is now 60 seconds, caches worldwide will pick up the new IP within about 2 minutes.
  3. I verify propagation using dig @8.8.8.8, dig @1.1.1.1, and tools like whatsmydns.net to confirm global propagation.
  4. After the migration is confirmed stable (I usually wait 24-48 hours), I raise the TTL back to 3600 or higher to reduce DNS query load.
The classic mistake is changing the IP without lowering the TTL first. If the TTL was 86400 seconds (24 hours), some users will be stuck hitting the old IP for up to a full day, and there is no way to force a cache flush on resolvers you do not control.
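The waiting periods in this answer follow directly from the TTL values. A small sketch of that arithmetic, with function names invented for this example and the simplifying assumption that every resolver honors the published TTL:

```python
def worst_case_stale_seconds(ttl_at_change: int) -> int:
    """Longest a well-behaved cache can keep serving the old IP after the record changes."""
    return ttl_at_change


def migration_waits(old_ttl: int, short_ttl: int = 60) -> dict:
    """Minimum waits (in seconds) for the TTL-lowering migration pattern."""
    return {
        "after_lowering_ttl": old_ttl,   # let every copy cached with the old TTL expire
        "after_changing_ip": short_ttl,  # caches refresh within one short TTL
    }


print(migration_waits(3600))
# Skipping the TTL-lowering step with a 24-hour TTL:
print(worst_case_stale_seconds(86400) // 3600, "hours of possible stale answers")
```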
Strong Answer:
  • HTTP/1.1 (1997) supports persistent connections (keep-alive) so you do not need a new TCP handshake for every request, and it supports pipelining (sending multiple requests without waiting for responses). But in practice, pipelining was poorly implemented by browsers and proxies. The main limitation is head-of-line blocking at the HTTP level: the server must send responses in the order requests were received. Browsers work around this by opening 6 parallel TCP connections per domain.
  • HTTP/2 (2015) fixes this with multiplexing: multiple requests and responses are interleaved on a single TCP connection as binary frames. It also adds header compression (HPACK), which reduces overhead significantly since HTTP headers are repetitive across requests, and server push (preemptively sending resources the client will need). The problem is that while HTTP/2 solves application-layer head-of-line blocking, it is still subject to TCP-level head-of-line blocking. If one packet is lost, TCP stalls all streams on that connection until the retransmission arrives.
  • HTTP/3 (2022) solves this by replacing TCP with QUIC over UDP. Each stream has independent loss recovery, so a lost packet on one stream does not block others. QUIC also integrates TLS 1.3 directly, enabling 0-RTT connection establishment for repeat clients. On lossy networks (mobile, satellite), HTTP/3 shows the biggest improvement.
  • In my experience, most production deployments use HTTP/2 between clients and load balancers, and HTTP/1.1 between load balancers and backend servers. HTTP/3 adoption is growing, driven by CDN providers like Cloudflare and cloud providers that support it at the edge.
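The difference between TCP-level head-of-line blocking and QUIC's per-stream recovery can be illustrated with a toy timing model. This is a deliberately simplified sketch, not a simulation of real TCP or QUIC: the function name, the "one packet arrives per time unit" assumption, and the single lost packet are all made up for illustration.

```python
def stream_completion_times(packets, lost_index, retransmit_delay, per_stream_recovery):
    """Toy model: packet i normally arrives at time i; the lost one arrives late.

    packets: list of stream ids in send order.
    per_stream_recovery: False models one ordered TCP byte stream (HTTP/2),
    True models independently ordered QUIC streams (HTTP/3).
    Returns {stream_id: time its last packet is delivered to the application}.
    """
    retransmit_arrival = lost_index + retransmit_delay
    lost_stream = packets[lost_index]
    done = {}
    for i, stream in enumerate(packets):
        # Packets at or after the loss wait for the retransmission if they share
        # its ordering domain: everything on TCP, only the same stream on QUIC.
        waits = i >= lost_index and (not per_stream_recovery or stream == lost_stream)
        delivered = max(i, retransmit_arrival) if waits else i
        done[stream] = max(done.get(stream, 0), delivered)
    return done


packets = ["A", "B", "A", "B", "C"]  # three streams interleaved on one connection
print(stream_completion_times(packets, 0, 10, per_stream_recovery=False))
print(stream_completion_times(packets, 0, 10, per_stream_recovery=True))
```

With the first packet of stream A lost, the TCP-style model delays all three streams until the retransmission arrives, while the QUIC-style model delays only stream A.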
Follow-up: What is the HSTS header and why does it matter for security?
HSTS (HTTP Strict Transport Security) is a response header that tells the browser “never connect to this domain over plain HTTP; always use HTTPS.” Once a browser sees Strict-Transport-Security: max-age=31536000; includeSubDomains, it will automatically upgrade any HTTP request to HTTPS for that domain for the next year (max-age is in seconds, and 31536000 seconds is 365 days), without ever sending the insecure request. This prevents SSL stripping attacks, where a man-in-the-middle downgrades the connection from HTTPS to HTTP. Without HSTS, the first request to a site might be HTTP (if the user types “example.com” without “https://”), and an attacker on the same Wi-Fi could intercept and modify that initial redirect. With HSTS preloading (submitting your domain to browser vendors’ built-in HSTS list), even the very first visit uses HTTPS. The risk with HSTS is that if you ever need to disable HTTPS (certificate emergency), users with cached HSTS entries cannot access your site at all until the max-age expires. This is why you should only enable HSTS after you are confident your HTTPS setup is solid.
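To see what the header value carries, here is a minimal parsing sketch. The function `parse_hsts` and its returned dictionary layout are assumptions for illustration; browsers implement this logic internally.

```python
def parse_hsts(value: str) -> dict:
    """Parse a Strict-Transport-Security header value into a small dict."""
    policy = {"max_age": None, "include_subdomains": False, "preload": False}
    for directive in value.split(";"):
        name, _, arg = directive.strip().partition("=")
        name = name.lower()  # directive names are case-insensitive
        if name == "max-age":
            policy["max_age"] = int(arg.strip('"'))
        elif name == "includesubdomains":
            policy["include_subdomains"] = True
        elif name == "preload":
            policy["preload"] = True
    return policy


policy = parse_hsts("max-age=31536000; includeSubDomains")
print(policy)
print(policy["max_age"] // 86400, "days")
```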
Strong Answer:
  • A 502 Bad Gateway means a proxy or load balancer tried to contact the upstream backend server and either got no response, got an invalid response, or the connection was refused. The problem is between the load balancer and the backend, not between the client and the load balancer.
  • My first step is to check load balancer health checks and logs. If the load balancer is marking backends as unhealthy, that tells me the backends are the problem. If all backends are “healthy” but 502s still occur intermittently, it suggests a timing issue — the backend is occasionally too slow to respond within the load balancer’s timeout window.
  • Next, I SSH into a backend server and check: is the application process running (ps aux | grep app)? Is it listening on the expected port (ss -tuln | grep 8080)? Is it bound to the correct interface (0.0.0.0 vs 127.0.0.1)? Is there high CPU or memory pressure (top, free -m)? I also check application logs for errors or exceptions.
  • I would use curl -v from the load balancer’s network to the backend directly to see if the connection succeeds or what error occurs. If I get “Connection refused,” the app is not listening on that address and port. If I get a timeout, the app is likely overloaded or hung, or a firewall is silently dropping the traffic. If I get a response but it is malformed, the app has a bug.
  • Common root causes I have seen: backend running out of file descriptors (too many open connections), connection pool exhaustion to a downstream database, a memory leak causing the process to be OOM-killed by the kernel, or a deployment that left some instances in a bad state. The load balancer access logs with response times and upstream response codes are the single most useful data source.
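The "is it listening, refusing, or timing out" check can also be scripted. A minimal sketch using only the standard `socket` module; the function name `probe_backend` and the result strings are made up here, and a TCP connect only shows that something accepts connections, not that the application is healthy.

```python
import socket


def probe_backend(host: str, port: int, timeout: float = 3.0) -> str:
    """Try a TCP connect to the backend and classify the outcome."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "listening"  # something accepted the connection
    except ConnectionRefusedError:
        return "refused"        # nothing is listening on that address and port
    except socket.timeout:
        return "timeout"        # no answer: overloaded host or dropped packets
    except OSError as exc:
        return f"error: {exc}"  # e.g. no route to host, name resolution failure
```

A call such as `probe_backend("10.0.0.12", 8080)` (a hypothetical backend address) run from the load balancer's subnet gives a quick first signal before digging into logs.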
Follow-up: What is the difference between a 502 and a 504, and how does that change your troubleshooting approach?
A 502 means the load balancer got a bad or no response from the backend: the backend either refused the connection, sent something unparseable, or closed the connection unexpectedly. A 504 means the load balancer timed out waiting for the backend to respond. The distinction changes the investigation: 502 points to the backend being down, crashing, or misbehaving. 504 points to the backend being alive but too slow, for example because of a long-running database query, a deadlock, or resource contention. For 504s, I focus on application performance: slow queries, lock contention, external dependency latency. I also check if the load balancer’s timeout is appropriate. If the backend legitimately needs 30 seconds for some operations but the LB timeout is 10 seconds, increasing the timeout (or better yet, making the operation asynchronous) solves the problem. For 502s, I focus on process health, port binding, and whether the backend is actually running.