Documentation Index
Fetch the complete documentation index at: https://resources.devweekends.com/llms.txt
Use this file to discover all available pages before exploring further.
Overview
Understanding computer science fundamentals helps you write better code, debug efficiently, and make informed architectural decisions. These concepts form the foundation for system design, performance optimization, and debugging complex issues. When a production database query suddenly takes 30 seconds instead of 30 milliseconds, it is your understanding of indexing, memory hierarchy, and I/O that lets you diagnose the problem in minutes rather than hours. When a distributed system silently drops writes under load, it is your grasp of CAP theorem and consistency models that tells you where to look.Operating Systems
Process vs Thread
Process
- Independent execution unit
- Own memory space (isolated)
- Higher overhead to create (~10ms)
- Inter-process communication needed
- Crash isolation (one crash doesn’t affect others)
Thread
- Lightweight execution unit
- Shared memory with parent process
- Lower overhead to create (~1ms)
- Direct memory sharing
- One thread crash can crash entire process
When to Use Processes vs Threads
Think of processes as separate apartments in a building — each has its own kitchen, bathroom, and walls. Residents (data) are fully isolated. Threads are like roommates sharing one apartment — they share the kitchen and bathroom (memory), which is efficient but requires coordination so nobody walks in on someone else.| Scenario | Use Process | Use Thread |
|---|---|---|
| Isolation needed | ✅ Browser tabs (Chrome uses one process per tab so a crashing tab does not take down the whole browser) | ❌ |
| Shared state | ❌ | ✅ Web server workers sharing a connection pool |
| CPU-bound tasks | ✅ Parallel processing (bypasses Python’s GIL) | ✅ (with limitations in languages with a GIL) |
| I/O-bound tasks | ❌ | ✅ Database connections, API calls, file reads |
| Fault tolerance | ✅ Microservices, worker processes | ❌ |
multiprocessing for CPU-bound work (math, image processing, data crunching) and threading or asyncio for I/O-bound work (API calls, database queries, file operations). Getting this wrong is a common source of confusing performance results.
Memory Management
Stack vs Heap Memory
Think of stack memory like a stack of cafeteria trays — you can only add to the top and remove from the top (LIFO), but it is blazingly fast. Heap memory is like a warehouse — you can store anything anywhere, but finding and managing free space takes more effort.| Aspect | Stack | Heap |
|---|---|---|
| Allocation | Automatic (LIFO) — allocated when function is called, freed on return | Manual (malloc/new) — you ask for memory, you free it |
| Speed | Very fast (just move a pointer) | Slower (must search for free blocks, causes fragmentation) |
| Size | Limited (~1-8 MB per thread) | Limited by available RAM (GBs) |
| Scope | Function-local — dies when function returns | Global access via pointers — lives until explicitly freed |
| Thread Safety | Each thread has its own stack — inherently safe | Shared across threads — needs synchronization (locks, atomics) |
| Memory Errors | Stack overflow (infinite recursion, huge arrays) | Memory leaks, dangling pointers, double-free, use-after-free |
Concurrency Fundamentals
Deadlock Conditions (All 4 must be present)
A deadlock is like two people meeting in a narrow hallway — each insists the other go first, and neither moves. In software, this happens when threads hold resources and wait for each other’s resources in a cycle. The four Coffman conditions must ALL be present simultaneously for a deadlock to occur — break any one condition and deadlock becomes impossible.- Mutual Exclusion: Resource can’t be shared (only one thread can hold a lock at a time)
- Hold and Wait: Process holds resource while waiting for another (holding lock A, waiting for lock B)
- No Preemption: Can’t force process to release resource (no timeout, no forced unlock)
- Circular Wait: Circular chain of processes waiting (A waits for B, B waits for A)
Race Condition
A race condition occurs when the correctness of your program depends on the timing of thread execution — like two people editing the same document simultaneously without seeing each other’s changes. The result depends on who saves last.Key Concepts
| Concept | Description | Interview Relevance |
|---|---|---|
| Deadlock | Circular wait for resources | High |
| Race Condition | Unpredictable behavior from timing | Very High |
| Mutex | Binary lock (0 or 1) | High |
| Semaphore | Counter-based lock (0 to N) | High |
| Virtual Memory | Abstraction over physical memory | Medium |
| Context Switching | Saving/restoring process state | Medium |
| Page Fault | Accessing memory not in RAM | Medium |
| Thrashing | Excessive paging, system slowdown | Medium |
Process Scheduling Algorithms
| Algorithm | Description | Pros | Cons |
|---|---|---|---|
| FCFS | First Come First Served | Simple | Convoy effect |
| SJF | Shortest Job First | Optimal avg wait | Starvation possible |
| Round Robin | Time quantum rotation | Fair | High context switching |
| Priority | Higher priority first | Important tasks fast | Starvation |
| Multilevel Queue | Multiple queues with priority | Flexible | Complex |
Networking
OSI Model (Simplified)
The OSI model is like the postal system for the internet. Layer 1 is the road the mail truck drives on. Layer 2 is the local post office that knows about addresses on its street. Layer 3 is the routing system that figures out which city to send the mail to. Layer 4 ensures the package arrives intact (or decides speed matters more than guaranteed delivery). Layer 7 is the content of the letter itself.TCP Three-Way Handshake
TCP vs UDP
TCP
- Connection-oriented (handshake)
- Reliable delivery (ACKs, retransmission)
- Ordered packets (sequence numbers)
- Flow control (sliding window)
- Congestion control
- Use: Web, Email, File transfer, SSH
UDP
- Connectionless (no handshake)
- Best-effort delivery (no guarantees)
- No ordering guarantee
- No flow/congestion control
- Lower latency, less overhead
- Use: Gaming, Streaming, DNS, VoIP
DNS Resolution Process
DNS is the internet’s phone book — it translates human-readable names (google.com) into machine-readable IP addresses (142.250.80.46). The resolution process works like asking for directions: you ask increasingly specific authorities until someone knows the exact answer.HTTP/1.1 vs HTTP/2 vs HTTP/3
| Feature | HTTP/1.1 | HTTP/2 | HTTP/3 |
|---|---|---|---|
| Connection | Multiple TCP | Single TCP, multiplexed | QUIC (UDP-based) |
| Head-of-line blocking | Yes | At TCP level | No |
| Header compression | No | HPACK | QPACK |
| Server push | No | Yes | Yes |
| Encryption | Optional | Practically required | Required (TLS 1.3) |
HTTP Methods & Status Codes
WebSocket vs HTTP
Database Fundamentals
ACID Properties
ACID is the set of guarantees that makes databases trustworthy for critical operations. Without ACID, a power failure during a bank transfer could debit your account without crediting the recipient — money literally vanishes. Every relational database provides these guarantees; the question is which trade-offs you accept (see isolation levels below).| Property | Description | Example |
|---|---|---|
| Atomicity | All or nothing — partial work is rolled back | Bank transfer: debit AND credit succeed or both fail. Never “money disappeared.” |
| Consistency | Database moves from one valid state to another | Balance can’t go negative, foreign keys always reference existing rows |
| Isolation | Concurrent transactions behave as if run sequentially | Two users buying the last item: only one succeeds, the other sees “out of stock” |
| Durability | Committed data survives crashes, power failures | Write-ahead logging (WAL) ensures data on disk before confirming commit |
Transaction Isolation Levels
| Level | Dirty Read | Non-Repeatable Read | Phantom Read | Performance |
|---|---|---|---|---|
| Read Uncommitted | ✅ Possible | ✅ Possible | ✅ Possible | Fastest |
| Read Committed | ❌ Prevented | ✅ Possible | ✅ Possible | Fast |
| Repeatable Read | ❌ Prevented | ❌ Prevented | ✅ Possible | Medium |
| Serializable | ❌ Prevented | ❌ Prevented | ❌ Prevented | Slowest |
SQL vs NoSQL
SQL (Relational)
- Structured schema (tables, rows, columns)
- ACID compliant
- Complex queries (JOINs, aggregations)
- Vertical scaling (scale up)
- Examples: PostgreSQL, MySQL, Oracle
- Use: Transactions, Analytics, Complex relationships
NoSQL
- Flexible schema (documents, key-value, graphs)
- BASE (Eventually consistent)
- Simple queries (by key/ID)
- Horizontal scaling (scale out)
- Examples: MongoDB, Redis, Cassandra, Neo4j
- Use: Real-time, Big Data, Caching, Social graphs
NoSQL Types
| Type | Data Model | Use Case | Example |
|---|---|---|---|
| Key-Value | Key → Value | Caching, Sessions | Redis, DynamoDB |
| Document | JSON documents | Content management, Catalogs | MongoDB, Couchbase |
| Column-Family | Wide columns | Time-series, Analytics | Cassandra, HBase |
| Graph | Nodes + Edges | Social networks, Recommendations | Neo4j, Amazon Neptune |
Indexing Deep Dive
A database index works like a book’s index: instead of reading every page to find “PostgreSQL,” you look up “P” in the index and jump straight to page 247. Without an index, the database performs a full table scan — reading every single row. On a table with 10 million rows, this is the difference between 5ms and 5 seconds.When NOT to Use Indexes
Indexes are not free — every index slows down writes because the database must update both the table and the index on every INSERT, UPDATE, or DELETE. A table with 10 indexes means 11 writes for every data change. Know when to skip them:- High write tables: Each index adds write overhead. A logging table with millions of inserts/second should have minimal indexes.
- Small tables (under ~1,000 rows): Full table scan may be faster than the overhead of reading the index then jumping to the table.
- Low cardinality columns: A boolean
is_activecolumn has only 2 values — an index barely narrows the search. Exception: if 99% of rows arefalseand you query fortrue, a partial index is useful. - Frequently updated columns: Every update rewrites the index entry. A
last_seen_attimestamp updated on every request is a poor indexing candidate.
Database Normalization
| Normal Form | Rule | Example Violation |
|---|---|---|
| 1NF | No repeating groups, atomic values | phone: "123,456,789" |
| 2NF | No partial dependencies | Non-key depends on part of composite key |
| 3NF | No transitive dependencies | zip → city → state |
| BCNF | Every determinant is a candidate key | More strict 3NF |
Computer Architecture
CPU Cache Hierarchy
Latency Numbers Every Programmer Should Know
| Operation | Time |
|---|---|
| L1 cache reference | 0.5 ns |
| Branch mispredict | 5 ns |
| L2 cache reference | 7 ns |
| Mutex lock/unlock | 25 ns |
| Main memory reference | 100 ns |
| Compress 1KB with Zippy | 3,000 ns (3 μs) |
| Send 1KB over 1 Gbps network | 10,000 ns (10 μs) |
| Read 4KB randomly from SSD | 150,000 ns (150 μs) |
| Round trip within datacenter | 500,000 ns (0.5 ms) |
| Read 1 MB sequentially from SSD | 1,000,000 ns (1 ms) |
| Disk seek | 10,000,000 ns (10 ms) |
| Read 1 MB sequentially from disk | 20,000,000 ns (20 ms) |
| Send packet CA → Netherlands → CA | 150,000,000 ns (150 ms) |
Distributed Systems Basics
CAP Theorem
In a distributed system, you can only guarantee 2 of 3: Consistency, Availability, Partition ToleranceIn practice, network partitions WILL happen (cables get cut, switches fail, cloud regions go down), so you are really choosing between Consistency and Availability during a partition. The question is not “CP or AP?” but rather “when the network splits, should the system refuse requests (consistent but unavailable) or serve potentially stale data (available but inconsistent)?” Banks choose consistency (better to reject a transaction than process it twice). Social media feeds choose availability (showing a slightly stale feed is better than showing an error page).
| Choice | Trade-off | Examples |
|---|---|---|
| CP | Sacrifice availability for consistency | MongoDB, HBase, Redis Cluster |
| AP | Sacrifice consistency for availability | Cassandra, DynamoDB, CouchDB |
| CA | Not practical (network partitions happen) | Traditional RDBMS (single node) |
Consistency Models
Understanding consistency models is critical because they define what “correct” means in your distributed system. Choosing the wrong model leads to bugs that are nearly impossible to reproduce in development but happen constantly in production.| Model | Description | Use Case | Real-World Analogy |
|---|---|---|---|
| Strong | All reads see the latest write, everywhere, immediately | Banking, Inventory counts | A shared whiteboard where everyone instantly sees every change |
| Eventual | Reads will eventually see the latest write (seconds or minutes of lag) | Social media feeds, DNS | A memo distributed to all offices — eventually everyone gets it |
| Causal | Respects cause-effect ordering (if A caused B, everyone sees A before B) | Messaging apps, comments | Reply always appears after the message it replies to |
| Read-your-writes | A user always sees their own writes immediately (others may lag) | User profile updates | You see your own post instantly, friends see it seconds later |
Practice Questions
What happens when you type a URL in a browser?
What happens when you type a URL in a browser?
- Browser checks caches: Browser cache, OS cache, router cache
- DNS resolution: Resolve domain to IP address
- Check local DNS cache
- Query recursive DNS resolver
- Query root → TLD → authoritative nameserver
- TCP connection: 3-way handshake (SYN → SYN-ACK → ACK)
- TLS handshake (if HTTPS):
- Client Hello (supported ciphers)
- Server Hello + Certificate
- Key exchange
- Encrypted connection established
- HTTP request sent: GET / HTTP/1.1 with headers
- Server processes request:
- Load balancer routes to server
- Server processes and queries database
- Generates response
- HTTP response received: HTML content with status code
- Browser renders:
- Parse HTML → DOM tree
- Parse CSS → CSSOM
- Execute JavaScript
- Render pixels to screen
How does a database index work?
How does a database index work?
- Balanced tree with sorted keys
- Each node can have multiple children
- Leaf nodes contain pointers to actual rows
- Instead of scanning all rows O(n)
- Binary search through tree O(log n)
- Follow pointers to find matching rows
- Faster reads (logarithmic lookup)
- Slower writes (must update index)
- Additional storage space
Explain deadlock and how to prevent it
Explain deadlock and how to prevent it
- Mutual exclusion - resource can’t be shared
- Hold and wait - holding one resource, waiting for another
- No preemption - can’t force release
- Circular wait - A waits for B, B waits for A
- Lock ordering: Always acquire locks in same global order
- Lock timeout: Give up after waiting too long
- Deadlock detection: Detect and kill one transaction
- Try-lock: Non-blocking attempt, back off if fails
- Avoid nested locks: Minimize lock scope
Explain the difference between processes and threads
Explain the difference between processes and threads
- Independent execution unit with own memory space
- Isolated - crash doesn’t affect other processes
- Higher creation/switching overhead (~10ms)
- Communication via IPC (pipes, sockets, shared memory)
- Lightweight execution unit within a process
- Shared memory with parent and sibling threads
- Lower creation/switching overhead (~1ms)
- Direct memory sharing (but needs synchronization)
- Processes: Isolation needed, fault tolerance, security
- Threads: Shared state, I/O-bound tasks, lower overhead needed
What is the difference between TCP and UDP?
What is the difference between TCP and UDP?
- Connection-oriented (handshake required)
- Reliable (acknowledgments, retransmission)
- Ordered (sequence numbers)
- Flow control (sliding window)
- Congestion control
- Use: Web, email, file transfer, SSH
- Connectionless (no handshake)
- Unreliable (no guarantees)
- Unordered (packets may arrive out of order)
- No flow/congestion control
- Lower latency, less overhead
- Use: Gaming, video streaming, DNS, VoIP
Explain database transaction isolation levels
Explain database transaction isolation levels
- Dirty read: See uncommitted data
- Non-repeatable read: Same query, different results
- Phantom read: New rows appear matching query
Quick Reference Card
Numbers to Remember
| Metric | Value |
|---|---|
| L1 cache | ~1 ns |
| L2 cache | ~10 ns |
| RAM | ~100 ns |
| SSD random read | ~150 μs |
| HDD seek | ~10 ms |
| Same datacenter RTT | ~0.5 ms |
| Cross-region RTT | ~100-150 ms |
Common Ports
| Port | Service |
|---|---|
| 22 | SSH |
| 80 | HTTP |
| 443 | HTTPS |
| 3306 | MySQL |
| 5432 | PostgreSQL |
| 6379 | Redis |
| 27017 | MongoDB |
Interview Deep-Dive
A production PostgreSQL query that used to return in 20ms is now taking 12 seconds. The table has grown from 500K rows to 50 million. Walk me through how you diagnose and fix this.
A production PostgreSQL query that used to return in 20ms is now taking 12 seconds. The table has grown from 500K rows to 50 million. Walk me through how you diagnose and fix this.
- First, I run EXPLAIN ANALYZE on the slow query to see the actual execution plan. I am looking for a Seq Scan (full table scan) where I expect an Index Scan. A table growing from 500K to 50M means a missing or ineffective index is the most likely cause — a full scan that was “fast enough” on 500K rows becomes catastrophic at 50M.
- Second, I check if the query uses functions on indexed columns. A common trap:
WHERE YEAR(created_at) = 2024cannot use a B-tree index on created_at because the database must evaluate the function on every row. Rewriting it asWHERE created_at >= '2024-01-01' AND created_at < '2025-01-01'lets the B-tree do its job. - Third, I check for composite index order issues. If the query filters on
user_idandstatusbut the composite index is(status, user_id), and most queries only provideuser_id, the index is useless due to the leftmost prefix rule. Column order in composite indexes must match query patterns. - Fourth, I look at table bloat and vacuum status. In PostgreSQL, dead tuples from updates and deletes accumulate if autovacuum is not keeping up. A table with 50M live rows but 200M dead tuples means the database is scanning 250M rows worth of pages. Running
VACUUM ANALYZEreclaims space and updates statistics so the query planner makes better decisions. - Fifth, I check if the query planner’s statistics are stale. After a large data load, the planner may estimate 1,000 rows when the actual result is 10 million, causing it to choose a nested loop join instead of a hash join. Running
ANALYZEon the table updates the statistics.
Explain the CAP theorem. A junior engineer says they want to build a system that is both strongly consistent and highly available. What do you tell them?
Explain the CAP theorem. A junior engineer says they want to build a system that is both strongly consistent and highly available. What do you tell them?
- The CAP theorem states that in a distributed system, during a network partition, you must choose between consistency (every read returns the most recent write) and availability (every request receives a response). You cannot have both simultaneously when the network is partitioned.
- The key nuance that most people miss: CAP only applies during a partition. When the network is healthy, you can have both consistency and availability. The real question is: “When the network inevitably partitions, which property do you sacrifice?”
- What I would tell the junior engineer: “You can absolutely have strong consistency and high availability — as long as you run on a single node. The moment you distribute across nodes for fault tolerance, network partitions become inevitable, and you must choose.” Then I would ask them what their system does. If it is a banking ledger, they need consistency (CP) — better to reject a transaction than process it twice. If it is a social media feed, they need availability (AP) — showing a slightly stale feed is better than showing an error page.
- In practice, most modern systems do not make a binary CP/AP choice. They use tunable consistency — strong consistency for critical paths (checkout, payment) and eventual consistency for less critical paths (recommendation updates, analytics). DynamoDB lets you choose per-read: strongly consistent reads cost 2x the capacity but guarantee freshness. Most teams use eventually consistent reads for 95% of operations and pay for strong reads only where correctness demands it.
- I would also mention that the PACELC theorem extends CAP: even when the network is operating normally (no Partition), you still face a trade-off between Latency and Consistency. Strong consistency requires coordination (consensus protocols, quorum writes), which adds latency. This is the real everyday trade-off — partitions are rare, but the latency-consistency trade-off affects every single request.
You have a multi-threaded application where two threads occasionally produce incorrect results, but the bug only appears under heavy load. How do you approach debugging this?
You have a multi-threaded application where two threads occasionally produce incorrect results, but the bug only appears under heavy load. How do you approach debugging this?
- This is almost certainly a race condition. The symptoms — intermittent, load-dependent, non-reproducible in isolation — are the classic fingerprint. Under heavy load, the thread scheduler runs faster context switches, which makes the timing window for the race condition more likely to be hit.
- Step 1: I would identify the shared mutable state. Race conditions require shared state that at least one thread is writing. I would audit the code for global variables, class-level variables, shared caches, or mutable objects passed between threads. In Python, even a simple
counter += 1is not atomic — it is a read-modify-write sequence that can be interleaved. - Step 2: I would add thread-safety annotations or use a race condition detector. For Java, ThreadSanitizer (TSan) can instrument the code to detect data races at runtime. For Python, I would use the
threadingmodule’s logging to trace lock acquisition order and identify unprotected critical sections. - Step 3: If the race condition is hard to reproduce, I would add deliberate delays (strategic
time.sleep(0.001)calls) around suspected critical sections to widen the timing window and make the bug more reproducible. This is counterintuitive — making the code slower to make the bug more frequent — but it is an effective technique. - Step 4: Fix by protecting the critical section with the appropriate synchronization primitive. A Lock/Mutex for simple mutual exclusion. A ReadWriteLock if reads vastly outnumber writes and you want to allow concurrent reads. An atomic operation or CAS (Compare-And-Swap) if the shared state is a single value and you want lock-free performance.
- Step 5: Verify the fix under the same heavy load conditions that triggered the original bug. Run the load test for 10x longer than the typical reproduction time to increase confidence. Race conditions are probabilistic — a short test passing does not prove the fix works.
jstack <pid> or kill -3 <pid>) which explicitly shows deadlocks — it will say “Found one Java-level deadlock” and show the lock cycle. In Python, I would use faulthandler.dump_traceback() or attach a debugger. The thread dump shows which locks each thread holds and which lock it is waiting for, revealing the circular dependency. The fix depends on the cause: if two threads acquire locks A and B in opposite order, I enforce a global lock ordering — every thread acquires A before B, everywhere in the codebase. This breaks the circular-wait Coffman condition. If lock ordering is impractical (locks are dynamically determined), I switch to try-lock with timeout: attempt to acquire the lock for N seconds, and if it fails, release all held locks, back off with a randomized delay, and retry. This breaks the no-preemption condition.Explain how TCP guarantees reliable delivery. Then tell me a scenario where you would choose UDP despite its unreliability.
Explain how TCP guarantees reliable delivery. Then tell me a scenario where you would choose UDP despite its unreliability.
- TCP guarantees reliability through four mechanisms working together: (1) Sequence numbers — every byte in the stream gets a sequence number, so the receiver can detect missing or out-of-order data and reassemble it correctly. (2) Acknowledgments — the receiver sends ACKs confirming which bytes it has received. If the sender does not receive an ACK within a timeout, it retransmits the data. (3) Checksums — every segment includes a checksum that detects corruption in transit. Corrupted segments are silently discarded and the retransmission mechanism handles recovery. (4) Flow control via the sliding window — the receiver advertises how much buffer space it has, preventing the sender from overwhelming a slow receiver.
- TCP also handles congestion control: algorithms like TCP Reno, CUBIC, and BBR detect network congestion (packet loss, increasing RTT) and reduce the sending rate to avoid making the congestion worse. This is why a large file download starts slow and speeds up — TCP is probing the network capacity.
- I would choose UDP for real-time multiplayer gaming. In a game running at 60 FPS, you send player position updates 60 times per second. If packet 45 is lost, by the time TCP retransmits it, packets 46-50 have already arrived with newer position data. The retransmitted packet 45 is stale and useless — the player has moved since then. UDP lets you simply drop the lost packet and process the next one. You implement your own “good enough” reliability: you send the current state with every packet so any single packet contains enough information to reconstruct the game state.
- Other strong UDP use cases: live video streaming (a dropped frame is better than a frozen stream), DNS lookups (a single request-response that fits in one packet — TCP handshake overhead is wasteful), and VoIP (a brief audio glitch is better than a pause while TCP retransmits). In all these cases, the application can tolerate data loss but cannot tolerate the latency that TCP’s retransmission adds.