Cross-Cutting Concerns
For every topic in this guide, consider these dimensions. They are the lens through which senior engineers evaluate every technical decision. Interviewers expect you to raise these proactively, not wait to be asked.
Trade-offs
Scale
Security
Performance
Observability
Logging
Error Handling
Monitoring & Alerting
Configuration Management
Testing
Failure Modes
Cost
Maintainability
Rollout Strategy
Backward Compatibility
User & Business Impact
What Interviewers Are Really Testing
When you face a technical question in a senior engineering interview, the question itself is rarely the point. Here is what they are actually evaluating:
| When asked about… | They are testing whether you… |
|---|---|
| CAP theorem | Understand that architecture is about trade-offs, not “best practices” |
| Microservices | Can identify when NOT to use them — not just the benefits |
| Caching | Understand the consistency implications, not just the performance boost |
| Database choice (SQL vs NoSQL) | Can reason about data access patterns rather than following trends |
| System design (URL shortener, etc.) | Can structure ambiguity, ask the right clarifying questions, and prioritize |
| Scaling | Know when NOT to over-engineer — start simple, scale when needed |
| Authentication | Understand security trade-offs, not just which library to use |
| Concurrency | Can identify race conditions and reason about shared state |
| Testing strategy | Understand the cost-benefit of different test types, not just “test everything” |
| Incident response | Stay calm, prioritize mitigation over root cause, and communicate clearly |
| Technical debt | Can quantify business impact and make strategic priority arguments |
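The caching row in the table above can be made concrete. The following is a minimal sketch, assuming plain dictionaries stand in for a real database and cache; the class `CacheAsideStore` and its method names are hypothetical and exist only for illustration. It shows how a write that skips invalidation leaves readers with a stale value, which is the consistency implication an interviewer is likely probing for.

```python
class CacheAsideStore:
    """Toy cache-aside store; `db` and `cache` are dicts standing in for real systems."""

    def __init__(self):
        self.db = {}
        self.cache = {}

    def read(self, key):
        # Cache hit: return the cached value, even if the database has moved on.
        if key in self.cache:
            return self.cache[key]
        # Cache miss: load from the database and remember the result.
        value = self.db.get(key)
        self.cache[key] = value
        return value

    def write_without_invalidation(self, key, value):
        # Updates the source of truth only; any cached copy becomes stale.
        self.db[key] = value

    def write_with_invalidation(self, key, value):
        # Updates the source of truth, then drops the cached copy
        # so the next read refills it from the database.
        self.db[key] = value
        self.cache.pop(key, None)


store = CacheAsideStore()
store.db["price"] = 100
first_read = store.read("price")                 # miss, fills the cache
store.write_without_invalidation("price", 120)
stale_read = store.read("price")                 # hit, served from the stale cache
store.write_with_invalidation("price", 130)
fresh_read = store.read("price")                 # miss again, refilled from db
```

Invalidate-on-write is only one option; a time-to-live or write-through policy trades staleness against write cost differently, and naming that trade-off is usually worth more than the code itself.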
Good vs Bad Answers: What Interviewers Hear
CAP Theorem
Microservices
Caching
Database Choice
System Design
Scaling
Testing Strategy
Incident Response
Technical Debt
Common Misconceptions That Trip Senior Engineers
These are beliefs that many engineers hold but that will get you corrected in a senior-level interview or architecture review.
NoSQL is faster than SQL
Microservices are always better than monoliths
Adding more caching always helps
Kubernetes is required for containers
Eventual consistency means data might never converge
REST means using JSON over HTTP
Horizontal scaling is always better than vertical
100% test coverage means no bugs
Premature optimization is always bad
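As one illustration of the "100% test coverage means no bugs" misconception, here is a minimal sketch in Python. The function `average` and its test are hypothetical examples, not taken from any real codebase. The single test executes every line of the function, so a line-coverage tool would report 100%, yet an untested input still fails.

```python
def average(values):
    """Return the arithmetic mean of a list of numbers."""
    return sum(values) / len(values)


def test_average():
    # This one test runs every line of `average`, so line coverage is 100%.
    assert average([2, 4, 6]) == 4


test_average()

# The empty-list input was never tested and still fails at runtime.
try:
    average([])
    empty_input_failed = False
except ZeroDivisionError:
    empty_input_failed = True
```

Coverage measures which lines ran, not which inputs or behaviors were checked, which is why it works better as a floor than as a goal.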
Essential Reading List
Curated resources for senior engineers preparing for interviews and leveling up their craft. Books are organized by category with difficulty levels and a note on why each one matters.
Fundamentals
| Book | Author(s) | Level | Why Read This |
|---|---|---|---|
| Designing Data-Intensive Applications | Martin Kleppmann | Intermediate | The single best book for understanding distributed systems, databases, and data pipelines — essential for any system design interview. Companion: Martin Kleppmann’s talks on YouTube cover the same topics in lecture format and are freely available. Free alternative: Kleppmann’s lecture series at Cambridge provides the distributed systems foundations in a structured course format. |
| Site Reliability Engineering | Betsy Beyer et al. (Google) | Intermediate | Defines how Google runs production systems; foundational for understanding reliability, monitoring, and incident response. Free: The full book is available free online at sre.google. |
| The Site Reliability Workbook | Betsy Beyer et al. (Google) | Intermediate | Practical companion to the SRE book with actionable exercises and real-world case studies. Free: Also available free online at sre.google. |
| Clean Code | Robert C. Martin | Beginner | Establishes baseline code quality principles that every engineer should internalize early in their career. Free alternative: Google’s Engineering Practices documentation covers many of the same code quality principles in a concise, freely available format. |
| A Philosophy of Software Design | John Ousterhout | Beginner | Short, opinionated guide to managing complexity — the single most important skill in software engineering. Free alternative: John Ousterhout’s Stanford lecture on the topic covers the key ideas in a single talk. |
Architecture & Design
| Book | Author(s) | Level | Why Read This |
|---|---|---|---|
| Building Microservices | Sam Newman | Intermediate | The definitive guide to microservices — including when not to use them, which is equally important. Free alternative: Sam Newman’s talks at conferences distill the key ideas into digestible presentations. |
| Microservices Patterns | Chris Richardson | Intermediate | Pattern catalog for solving common distributed systems problems: sagas, CQRS, event sourcing |
| Domain-Driven Design | Eric Evans | Advanced | The foundational text on modeling complex business domains; dense but transformative for how you think about system boundaries |
| Fundamentals of Software Architecture | Mark Richards, Neal Ford | Intermediate | Broad survey of architecture styles and decision-making frameworks — great for building architectural vocabulary |
| Release It! | Michael Nygard | Intermediate | Practical patterns for building production-ready systems: circuit breakers, bulkheads, timeouts, and stability patterns. Free alternative: Michael Nygard’s blog posts and conference talks cover many of the same resilience patterns with real-world examples. |
| Software Architecture: The Hard Parts | Neal Ford et al. | Advanced | Tackles the genuinely difficult architectural decisions with trade-off analysis frameworks |
Scalability & Systems
| Book | Author(s) | Level | Why Read This |
|---|---|---|---|
| The Art of Scalability | Abbott & Fisher | Intermediate | Introduces the AKF Scale Cube and systematic approaches to scaling organizations and technology together |
| Understanding Distributed Systems | Roberto Vitillo | Beginner | The most accessible introduction to distributed systems concepts — read this before Kleppmann if you are new to the topic. Free alternative: MIT 6.824 Distributed Systems lecture videos provide a rigorous, freely available foundation in distributed systems. |
| System Design Interview Vol 1 & 2 | Alex Xu | Beginner | Step-by-step walkthroughs of common system design problems; excellent for interview preparation specifically |
| Web Scalability for Startup Engineers | Artur Ejsmont | Beginner | Practical scalability guide tailored for engineers at growing startups who need to scale incrementally |
Observability & Operations
| Book | Author(s) | Level | Why Read This |
|---|---|---|---|
| Observability Engineering | Charity Majors et al. | Intermediate | Reframes monitoring as observability and teaches you how to ask questions of your production systems you did not anticipate. Free alternative: Charity Majors’ blog and her conference talks cover the core observability philosophy and are excellent standalone resources. |
| High Performance Browser Networking | Ilya Grigorik | Intermediate | Deep dive into networking fundamentals every web engineer needs: TCP, TLS, HTTP/2, WebSockets, and performance optimization. Free: The entire book is available free online at hpbn.co. |
| Systems Performance | Brendan Gregg | Advanced | The definitive guide to Linux performance analysis; essential for anyone debugging production performance issues. Free alternative: Brendan Gregg’s blog and his Linux Performance Tools talk are freely available and cover the core performance analysis methodologies. |
Delivery & Engineering Culture
| Book | Author(s) | Level | Why Read This |
|---|---|---|---|
| Accelerate | Nicole Forsgren, Jez Humble, Gene Kim | Beginner | Research-backed evidence for what actually makes engineering teams high-performing; presents the research behind the DORA metrics |
| The Phoenix Project | Gene Kim, Kevin Behr, George Spafford | Beginner | A novel that makes DevOps principles visceral and memorable; read this to understand why continuous delivery matters. Companion: The DevOps Handbook by Gene Kim et al. turns the narrative lessons into actionable practices — read Phoenix Project for the “why,” then DevOps Handbook for the “how.” |
| Continuous Delivery | Jez Humble, David Farley | Intermediate | The foundational text on deployment pipelines, automated testing, and releasing software safely and frequently |
| The Staff Engineer’s Path | Tanya Reilly | Intermediate | Practical guide for engineers moving beyond senior into staff-plus roles — covers technical leadership, influence, and scope |
| Staff Engineer | Will Larson | Intermediate | Explores the archetypes and operating modes of staff engineers through stories and frameworks for navigating the role |
| An Elegant Puzzle | Will Larson | Intermediate | Systems thinking applied to engineering management; valuable for senior engineers who want to understand organizational dynamics |
| Team Topologies | Matthew Skelton, Manuel Pais | Intermediate | Explains how team structure shapes software architecture (Conway’s Law made actionable) and how to design teams for fast flow |
Data Engineering
| Book | Author(s) | Level | Why Read This |
|---|---|---|---|
| Fundamentals of Data Engineering | Joe Reis, Matt Housley | Intermediate | Comprehensive overview of the data engineering lifecycle: ingestion, storage, transformation, and serving |
Interview Preparation
| Resource | Type | Level | Why Use This |
|---|---|---|---|
| Grokking the System Design Interview | Course | Beginner | Structured walkthroughs of the most commonly asked system design problems with clear frameworks |
| NeetCode.io | Practice | Beginner | Curated coding problems organized by pattern — the most efficient path through LeetCode-style preparation |
| Tech Interview Handbook | Guide | Beginner | Comprehensive free guide covering resume writing, behavioral questions, negotiation, and technical preparation |
| Google’s Engineering Practices — Code Review | Guide | Intermediate | Learn how Google approaches code review; useful for both giving and receiving feedback in interview code review exercises |
Tool Reference Index
A categorized reference of tools commonly discussed in senior engineering interviews and architecture discussions.
Observability
APM & Distributed Tracing
| Tool | Description |
|---|---|
| Datadog | Full-stack observability platform with APM, logs, and infrastructure monitoring in a single pane |
| New Relic | Application performance monitoring with deep code-level visibility and error tracking |
| Dynatrace | AI-powered observability with automatic dependency mapping and root cause analysis |
| Jaeger | Open-source distributed tracing system, originally built by Uber, CNCF graduated project |
| Zipkin | Open-source distributed tracing system, originally built by Twitter, lightweight alternative to Jaeger |
| Azure Application Insights | Microsoft’s APM service, tightly integrated with Azure services and .NET applications |
| AWS X-Ray | AWS-native distributed tracing for applications running on AWS infrastructure |
| Honeycomb | Observability platform built around high-cardinality, high-dimensionality event data exploration |
| OpenTelemetry | Vendor-neutral, open-source specification and SDKs for instrumenting applications and collecting traces, metrics, and logs; increasingly the default choice for telemetry collection |
Metrics & Monitoring
| Tool | Description |
|---|---|
| Prometheus | Open-source metrics collection and alerting toolkit; the de facto standard for Kubernetes monitoring |
| Grafana | Open-source visualization and dashboarding platform; pairs with Prometheus, InfluxDB, and many data sources |
| InfluxDB | Purpose-built time-series database optimized for high-write-throughput metrics storage |
| StatsD | Lightweight daemon for aggregating and summarizing application metrics before shipping to backends |
| Graphite | Veteran time-series database and graphing system; still widely used for infrastructure metrics |
| CloudWatch | AWS-native monitoring service for AWS resources and custom application metrics |
| Azure Monitor | Microsoft’s comprehensive monitoring service for Azure infrastructure and applications |
Logging
| Tool | Description |
|---|---|
| ELK Stack | Elasticsearch + Logstash + Kibana — the classic open-source log aggregation and search stack |
| Grafana Loki | Log aggregation system designed for cost efficiency; indexes labels, not full text, unlike Elasticsearch |
| Splunk | Enterprise log analytics platform with powerful search and machine learning capabilities |
| Datadog Logs | Log management integrated with Datadog’s APM and infrastructure monitoring |
| Fluentd | Open-source unified logging layer for collecting and routing logs from diverse sources (CNCF graduated) |
| Fluent Bit | Lightweight log processor and forwarder; ideal for resource-constrained environments and edge computing |
Incident Management
| Tool | Description |
|---|---|
| PagerDuty | Incident management platform with intelligent alerting, escalation policies, and on-call scheduling |
| Opsgenie | Alert management and on-call scheduling by Atlassian; integrates tightly with Jira and Confluence |
| Statuspage | Public and internal status page hosting for communicating incidents to users and stakeholders |
CI/CD & Delivery
CI/CD Pipelines
| Tool | Description |
|---|---|
| GitHub Actions | CI/CD built into GitHub with YAML-based workflows; the most popular choice for open-source projects |
| GitLab CI | Integrated CI/CD within GitLab with powerful pipeline visualization and environment management |
| Jenkins | The original open-source automation server; extremely flexible but requires significant maintenance |
| CircleCI | Cloud-native CI/CD with fast build times, Docker-layer caching, and parallelism support |
| ArgoCD | Declarative GitOps continuous delivery tool for Kubernetes; syncs cluster state to Git repositories |
| Flux | GitOps toolkit for Kubernetes; CNCF graduated project for keeping clusters in sync with Git |
Feature Flags
| Tool | Description |
|---|---|
| LaunchDarkly | Enterprise feature management platform with targeting, experimentation, and audit trails |
| Unleash | Open-source feature flag system with a self-hosted option and a solid community edition |
| Flagsmith | Open-source feature flag and remote config service with an intuitive UI |
| Flipt | Open-source, self-hosted feature flag solution built in Go; lightweight and simple to operate |
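A common technique behind gradual rollouts with feature flags is deterministic percentage bucketing. The sketch below is a minimal illustration under stated assumptions: the functions `bucket` and `is_enabled` and the flag name `new-checkout` are hypothetical, and this is not the API or the algorithm of any tool listed above. It hashes the flag name and user ID into a stable bucket from 0 to 99, so a given user gets the same answer on every request and stays enabled as the percentage grows.

```python
import hashlib


def bucket(flag_name: str, user_id: str) -> int:
    """Map a (flag, user) pair to a stable bucket in the range 0-99."""
    digest = hashlib.sha256(f"{flag_name}:{user_id}".encode("utf-8")).hexdigest()
    return int(digest, 16) % 100


def is_enabled(flag_name: str, user_id: str, rollout_percent: int) -> bool:
    """Enable the flag for users whose bucket falls below the rollout percentage."""
    return bucket(flag_name, user_id) < rollout_percent


users = [f"user-{i}" for i in range(1000)]
enabled_at_10 = {u for u in users if is_enabled("new-checkout", u, 10)}
enabled_at_50 = {u for u in users if is_enabled("new-checkout", u, 50)}
```

Including the flag name in the hash input keeps buckets independent across flags, so the same users are not always the first to receive every new feature.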
Databases
| Tool | Type | Description |
|---|---|---|
| PostgreSQL | Relational | The most advanced open-source relational database; excels at complex queries, ACID compliance, and extensibility |
| MySQL | Relational | Widely adopted relational database; known for read-heavy workloads and ease of replication |
| MongoDB | Document | Document-oriented NoSQL database; flexible schema, good for rapid prototyping and document-shaped data |
| DynamoDB | Key-Value / Document | AWS-managed NoSQL database designed for single-digit millisecond latency at scale; offers on-demand (pay-per-request) and provisioned capacity pricing |
| Cassandra | Wide-Column | Distributed NoSQL database designed for high write throughput across multiple data centers |
| CockroachDB | Distributed SQL | Distributed SQL database with strong consistency and horizontal scaling; PostgreSQL-compatible wire protocol |
| Cloud Spanner | Distributed SQL | Google’s globally distributed relational database with strong consistency and an availability SLA of up to 99.999% for multi-region configurations |
| Redis | In-Memory | In-memory data structure store used as cache, message broker, and primary database for specific use cases |
| Elasticsearch | Search / Analytics | Distributed search and analytics engine; excels at full-text search, log analytics, and real-time data exploration |
Database Migrations
| Tool | Description |
|---|---|
| Flyway | Version-based migration tool for JVM applications; simple SQL-based migrations |
| Liquibase | Database-agnostic schema change management with XML, YAML, JSON, or SQL changelogs |
| Alembic | Migration tool for SQLAlchemy (Python); generates migrations from model changes |
| Knex | Query builder and migration tool for Node.js applications |
| EF Migrations | Entity Framework migrations for .NET; code-first schema management |
| golang-migrate | Database migration tool written in Go; supports CLI and library usage |
| dbmate | Lightweight, framework-agnostic migration tool supporting multiple database engines |
Messaging & Streaming
Message Brokers & Event Streaming
| Tool | Description |
|---|---|
| Kafka | Distributed event streaming platform for high-throughput, fault-tolerant, real-time data pipelines |
| RabbitMQ | Feature-rich message broker supporting multiple protocols (AMQP, MQTT, STOMP); excellent for task queues |
| AWS SQS/SNS | Managed message queue (SQS) and pub/sub (SNS) services; zero operational overhead for AWS-native architectures |
| Azure Service Bus | Enterprise message broker with advanced features: sessions, dead-lettering, scheduled delivery |
| Google Pub/Sub | Global-scale messaging service with at-least-once delivery and exactly-once processing support |
| NATS | Lightweight, high-performance messaging system designed for cloud-native and edge computing |
| Redis Streams | Append-only log data structure in Redis for lightweight event streaming without a dedicated broker |
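Several of the systems above offer at-least-once delivery, which means a consumer may see the same message more than once. A minimal sketch of an idempotent consumer follows, assuming each message carries a unique `id` field; the class `IdempotentConsumer` and the message shape are hypothetical, and a real system would typically keep the processed IDs in durable storage rather than in memory.

```python
class IdempotentConsumer:
    """Applies each message at most once by remembering processed message IDs."""

    def __init__(self):
        self.processed_ids = set()  # in-memory only; an assumption for this sketch
        self.balance = 0

    def handle(self, message):
        # A redelivered message is acknowledged but not applied again.
        if message["id"] in self.processed_ids:
            return False
        self.balance += message["amount"]
        self.processed_ids.add(message["id"])
        return True


deliveries = [
    {"id": "m1", "amount": 50},
    {"id": "m2", "amount": 25},
    {"id": "m1", "amount": 50},  # the broker redelivers m1
]

consumer = IdempotentConsumer()
applied = [consumer.handle(m) for m in deliveries]
```

Without the ID check, the redelivery would be applied twice and the balance would be wrong, which is the usual reason to pair at-least-once delivery with idempotent handlers.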
Infrastructure
Infrastructure as Code
| Tool | Description |
|---|---|
| Terraform | The industry standard for multi-cloud infrastructure as code using declarative HCL configuration |
| Pulumi | Infrastructure as code using general-purpose programming languages (TypeScript, Python, Go, C#) |
| CloudFormation | AWS-native infrastructure as code service; deep integration with all AWS services |
| Bicep | Domain-specific language for deploying Azure resources; cleaner syntax than ARM templates |
| Ansible | Agentless configuration management and automation tool using YAML playbooks over SSH |
Containers & Orchestration
| Tool | Description |
|---|---|
| Docker | The standard for containerization; packages applications with their dependencies into portable images |
| Kubernetes | Container orchestration platform for automating deployment, scaling, and management of containerized applications |
| Helm | Package manager for Kubernetes; bundles related manifests into reusable, versioned charts |
Service Mesh
| Tool | Description |
|---|---|
| Istio | Feature-rich service mesh providing traffic management, security, and observability for Kubernetes workloads |
| Linkerd | Lightweight, security-focused service mesh designed for simplicity and low resource overhead |
API Gateways
| Tool | Description |
|---|---|
| Kong | Open-source API gateway and microservices management layer with a rich plugin ecosystem |
| Ambassador | Kubernetes-native API gateway built on Envoy proxy for managing edge and service traffic |
| AWS API Gateway | Managed API gateway for creating, publishing, and securing APIs at any scale on AWS |
| Azure API Management | Full-lifecycle API management platform with developer portal, analytics, and policy enforcement |
Security
Security Scanning
| Tool | Description |
|---|---|
| OWASP ZAP | Open-source web application security scanner for finding vulnerabilities during development and testing |
| Burp Suite | Professional web security testing toolkit with intercepting proxy and automated scanning |
| Snyk | Developer-first security platform for finding and fixing vulnerabilities in code, dependencies, and containers |
| Dependabot | GitHub-native automated dependency updates with security vulnerability alerts |
| Trivy | Comprehensive open-source vulnerability scanner for containers, filesystems, and Git repositories |
| SonarQube | Code quality and security analysis platform with rules for bugs, vulnerabilities, and code smells |
Secrets Management
| Tool | Description |
|---|---|
| HashiCorp Vault | Industry-standard secrets management with dynamic secrets, encryption as a service, and identity-based access |
| AWS Secrets Manager | AWS-managed secrets storage with automatic rotation and fine-grained IAM access control |
| Azure Key Vault | Azure-managed service for securely storing keys, secrets, and certificates |
| GCP Secret Manager | Google Cloud’s managed secrets storage with automatic replication and IAM-based access |
| Doppler | Universal secrets manager that syncs secrets across environments, CI/CD, and cloud platforms |
Authorization
Testing
Load Testing
| Tool | Description |
|---|---|
| k6 | Modern load testing tool using JavaScript scripts; developer-friendly with excellent CI/CD integration |
| JMeter | Apache’s mature load testing tool with a GUI for designing test plans; supports many protocols |
| Gatling | Scala-based load testing tool with detailed HTML reports and a powerful DSL for test scenarios |
| Locust | Python-based load testing framework where you define user behavior in code; easy to distribute |
| Artillery | Node.js load testing toolkit with YAML-based test definitions and cloud-native distributed testing |
Unit Testing
| Tool | Description |
|---|---|
| Jest | JavaScript/TypeScript testing framework with built-in mocking, coverage, and snapshot testing |
| pytest | Python’s most popular testing framework; powerful fixtures, parametrization, and plugin ecosystem |
| JUnit | The standard unit testing framework for Java applications |
| xUnit | Modern testing framework for .NET with a clean architecture and parallel test execution |
| Go testing | Go’s built-in testing package with benchmarking and fuzzing support |
| RSpec | Behavior-driven testing framework for Ruby with expressive, readable test syntax |
Integration & E2E Testing
| Tool | Description |
|---|---|
| Testcontainers | Library for spinning up real Docker containers (databases, brokers) for integration tests |
| WireMock | HTTP API mock server for simulating external service dependencies in tests |
| LocalStack | Local AWS cloud emulator for testing AWS integrations without real AWS resources |
| Azurite | Local Azure Storage emulator for testing Blob, Queue, and Table storage operations |
| Playwright | Microsoft’s browser automation framework for reliable cross-browser E2E testing |
| Cypress | JavaScript E2E testing framework with time-travel debugging and automatic waiting |
| Selenium | The original browser automation tool; supports multiple languages and browsers |
Contract Testing
| Tool | Description |
|---|---|
| Pact | Consumer-driven contract testing framework ensuring API compatibility between services |
| Spring Cloud Contract | Contract testing for Spring/JVM services with auto-generated stubs and tests |
Mocking
| Tool | Description |
|---|---|
| Mockito | The most popular mocking framework for Java; clean API for creating mocks and verifying interactions |
| Moq | .NET mocking library with a fluent API for setting up mock behavior and assertions |
| NSubstitute | .NET mocking library focused on simplicity and natural syntax |
| unittest.mock | Python’s built-in mocking library; part of the standard library, no additional dependencies |
| Sinon.js | JavaScript test spies, stubs, and mocks; works with any testing framework |
| testify/mock | Go mocking package from the testify suite; widely used for Go unit testing |
Chaos Engineering
| Tool | Description |
|---|---|
| Chaos Monkey | Netflix’s tool for randomly terminating production instances to test system resilience |
| Gremlin | Enterprise chaos engineering platform with controlled failure injection experiments |
| Litmus | Open-source chaos engineering framework for Kubernetes with a library of pre-built experiments |
Resilience Libraries
| Tool | Description |
|---|---|
| Polly | .NET resilience library with retry, circuit breaker, timeout, bulkhead, and fallback policies |
| Resilience4j | Lightweight fault-tolerance library for JVM applications inspired by Netflix Hystrix |
| cockatiel | Node.js resilience library with retry, circuit breaker, timeout, and bulkhead patterns |
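The circuit breaker pattern that these libraries provide can be sketched in a few lines. This is a simplified illustration, not the implementation or API of Polly, Resilience4j, or cockatiel; the class `CircuitBreaker`, the `CircuitOpenError` exception, and the injectable `clock` parameter are all assumptions made for the example. After a threshold of consecutive failures the breaker opens and rejects calls immediately, then allows a trial call once the reset timeout has elapsed.

```python
import time


class CircuitOpenError(Exception):
    """Raised when the breaker rejects a call without invoking the wrapped function."""


class CircuitBreaker:
    def __init__(self, failure_threshold=3, reset_timeout=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self.clock = clock          # injectable so the example can use a fake clock
        self.failures = 0
        self.opened_at = None       # None means the circuit is closed

    def call(self, func, *args, **kwargs):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_timeout:
                raise CircuitOpenError("circuit is open; call rejected")
            # Timeout elapsed: half-open, so let one trial call through.
        try:
            result = func(*args, **kwargs)
        except Exception:
            self.failures += 1
            if self.opened_at is not None or self.failures >= self.failure_threshold:
                self.opened_at = self.clock()  # open (or re-open) the circuit
            raise
        # A success closes the circuit and resets the failure count.
        self.failures = 0
        self.opened_at = None
        return result


now = [0.0]  # fake clock value, advanced manually below
breaker = CircuitBreaker(failure_threshold=2, reset_timeout=10.0, clock=lambda: now[0])


def flaky():
    raise ConnectionError("downstream unavailable")


outcomes = []
for _ in range(3):
    try:
        breaker.call(flaky)
    except CircuitOpenError:
        outcomes.append("rejected")
    except ConnectionError:
        outcomes.append("failed")

now[0] = 11.0  # move the fake clock past the reset timeout
recovered = breaker.call(lambda: "ok")
```

The third call is rejected without touching the failing dependency, which is the point of the pattern: it sheds load from a struggling downstream service and fails fast for the caller.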
Podcasts & Blogs
Engineering blogs and podcasts from teams solving problems at scale. These are invaluable for staying current with real-world architecture decisions and operational lessons.
Engineering Blogs
| Blog | Focus | Why Follow |
|---|---|---|
| Netflix Tech Blog | Distributed systems, streaming, microservices | Pioneered chaos engineering, circuit breakers, and many patterns now considered industry standard |
| Uber Engineering | Real-time systems, data platforms, infrastructure | Deep dives into problems at massive scale: geospatial indexing, real-time pricing, multi-region architecture |
| Stripe Engineering | API design, payments, reliability | Excellent writing on API design philosophy, idempotency, and building systems where correctness is non-negotiable |
| Meta Engineering | Infrastructure, AI/ML, developer tools | Insights from operating services for billions of users: caching at scale, social graph, and content delivery |
| Google Research Blog | Distributed systems, ML, infrastructure | Original papers and posts on technologies that shaped the industry: MapReduce, Spanner, Borg |
| AWS Architecture Blog | Cloud architecture, well-architected patterns | Reference architectures and best practices for building on AWS; excellent for system design preparation |
| Cloudflare Blog | Networking, security, edge computing | Exceptionally well-written posts on networking internals, DDoS mitigation, and edge computing |
| LinkedIn Engineering | Data infrastructure, search, real-time processing | Originators of Kafka; excellent posts on data pipelines, search ranking, and large-scale service architectures |
| Shopify Engineering | Monolith architecture, scaling Ruby, platform | Rare perspective on scaling a massive Rails monolith; counterpoint to the microservices-first narrative |
| GitHub Engineering | Developer tools, Git internals, reliability | Insights into running one of the world’s largest Git hosting platforms and improving developer experience |
| Martin Fowler’s Blog | Architecture, patterns, agile practices | Thoughtful, evergreen writing on software architecture concepts, refactoring, and design patterns |
Podcasts
| Podcast | Focus | Why Listen |
|---|---|---|
| Software Engineering Daily | Broad software engineering | Daily interviews with engineers building real systems; covers infrastructure, data, AI, and more |
| The Pragmatic Engineer | Senior engineering career, industry trends | Gergely Orosz’s newsletter and podcast covering how big tech actually works; essential for career growth |
| CoRecursive | Software engineering stories | Deep, narrative-driven episodes exploring the stories behind significant software projects |
| Engineering Enablement | Developer productivity, platform engineering | Focuses on how to measure and improve engineering team effectiveness |
| Ship It! | Infrastructure, operations, deployment | Practical conversations about how teams ship and operate software in production |
| The Changelog | Open source, software development | Long-running podcast covering the people, projects, and practices shaping the software industry; excellent for broadening your engineering perspective |
YouTube Channels
| Channel | Focus | Why Watch |
|---|---|---|
| ByteByteGo | System design | Alex Xu’s visual system design explanations brought to life in video format; the best YouTube channel for system design interview preparation |
| Systems Design Fight Club | System design debates | Engineers debate architectural trade-offs in real-time, exposing the messiness of real design decisions that textbooks gloss over |
Individual Blogs
These are personal blogs by engineers whose writing consistently provides deep, original insight. Unlike company engineering blogs, these represent individual perspectives shaped by years of hands-on experience.
| Blog | Author | Focus | Why Read |
|---|---|---|---|
| Irrational Exuberance | Will Larson | Engineering leadership, systems | The companion blog to his books (Staff Engineer, An Elegant Puzzle); covers engineering strategy, organizational design, and the mechanics of technical leadership with unusual clarity |
| danluu.com | Dan Luu | Systems, performance, industry analysis | Rigorous, data-driven posts that challenge conventional wisdom. His posts on hardware latency numbers, developer productivity, and tech industry practices are widely cited |
| Jessie Frazelle’s Blog | Jessie Frazelle | Containers, infrastructure, security | Deep technical posts on Linux containers, kernel security, and infrastructure from a former Docker and Google engineer who shaped the container ecosystem |
| Murat Demirbas’ Blog | Murat Demirbas | Distributed systems | Academic-yet-accessible paper reviews and commentary on distributed systems. Essential reading for anyone who wants to understand the theory behind systems like Raft, Paxos, and CRDTs |
| Charity Majors’ Blog | Charity Majors | Observability, engineering culture | Candid, opinionated posts on observability, debugging production systems, and engineering management from the co-founder of Honeycomb |
Newsletters
| Newsletter | Focus | Why Subscribe |
|---|---|---|
| The Pragmatic Engineer | Big tech, career, engineering culture | The most respected engineering newsletter; covers industry trends, compensation, and technical deep dives |
| ByteByteGo | System design | Visual explanations of system design concepts; excellent companion for interview preparation |
| TLDR | Tech news digest | Curated daily summary of the most important tech news, keeping you current without the noise |
| Pointer | Engineering leadership | Curated reading list for engineering leaders; surfaces the best technical blog posts each week |
This course is a living document. It grows as engineering grows. Contribute, share, and build on it. Think Like an Engineer — A Dev Weekends Course