Database Services
This chapter will teach you everything about databases in Azure, starting from absolute basics. We’ll explain what databases are, why you need them, and how to choose and use the right database service for your needs.What You’ll Learn
By the end of this chapter, you’ll understand:- What a database is and why applications need them (from scratch)
- The difference between relational and NoSQL databases
- When to use Azure SQL vs Cosmos DB vs PostgreSQL/MySQL
- How to design database schemas and optimize queries
- Backup, replication, and disaster recovery strategies
- Cost optimization techniques for databases
- Security best practices (encryption, access control, auditing)
What is a Database? (Start Here if You’re New)
Let’s start with the absolute basics.The Simple Explanation
Database = Organized storage for structured data A database is like a super-powered Excel spreadsheet that:- Stores LOTS of data (millions of rows, not just thousands)
- Allows FAST searching (find a specific customer in milliseconds)
- Supports MULTIPLE users at once (100s of people reading/writing simultaneously)
- Ensures DATA INTEGRITY (prevents conflicts, ensures consistency)
Why Not Just Use Files?
Without a database (storing data in text files or JSON files):Real-World Example: E-Commerce Store
What needs to be stored:Types of Data and Their Databases
Different applications have different data needs: Structured Data (Tables, Rows, Columns)Database vs Storage: What’s the Difference?
This confuses many beginners. Let’s clarify: Storage (Blob Storage, Files)- Stores FILES (images, videos, PDFs, backups)
- No searching (must download entire file)
- No relationships
- Cheap ($1-20/TB/month)
- Use for: Images, videos, backups, logs
- Stores STRUCTURED DATA (tables, documents)
- Fast searching (find specific record instantly)
- Relationships between data
- More expensive ($30-500+/month)
- Use for: Customer records, orders, inventory, user accounts
Understanding Relational vs NoSQL Databases
The first major decision: relational or NoSQL?Relational Databases (SQL)
What it is: Data stored in tables with rows and columns, connected by relationships. Real-World Analogy: Filing Cabinet with Cross-References- Azure SQL Database (Microsoft SQL Server)
- Azure Database for PostgreSQL
- Azure Database for MySQL
- Azure Database for MariaDB
NoSQL Databases (Document, Key-Value, Graph)
What it is: Flexible data storage without fixed schemas. Real-World Analogy: Boxes with Labels- Cosmos DB (multi-model: document, key-value, graph, column)
Comparison Table
| Feature | Relational (SQL) | NoSQL (Cosmos DB) |
|---|---|---|
| Schema | Fixed (must define columns) | Flexible (can change anytime) |
| Data Structure | Tables with rows/columns | JSON documents |
| Relationships | Foreign keys, joins | Embedded documents |
| Scaling | Vertical (bigger server) | Horizontal (more servers) |
| Transactions | ACID (strong consistency) | Eventually consistent (configurable) |
| Query Language | SQL (standard) | SQL-like or API-specific |
| Best For | Structured data, complex queries | Flexible data, massive scale |
| Cost | $30-500+/month | $24-1000+/month |
The Hybrid Approach (What Most Apps Actually Do)
Most real-world applications use BOTH:Under the Hood: The Distributed SQL Problem
To a Principal Engineer, the biggest challenge with databases is Scale vs. Consistency.1. Scaling the “Unscalable” (Sharding)
Traditional SQL databases like SQL Server or PostgreSQL are designed to run on a single machine. When you hit the limit of that machine (even if it’s 128 vCPUs), you have to Shard.- The Problem: You split your “Customers” into two databases: A-M in DB1, N-Z in DB2.
- The Principal’s Headache: How do you do a
JOINbetween a customer in DB1 and an order in DB2? You can’t. Your application code now has to handle the routing and the distributed logic.
2. Read Replicas vs. Write Bottlenecks
You can add hundreds of Read Replicas to handle users browsing your site. But every Write (an order) must go to the Single Primary. This is the bottleneck for every relational database in the cloud.3. The CAP Theorem & Databases
- Azure SQL: Choose C (Consistency). If the primary fails, the database is unavailable for a few seconds until a new primary is elected.
- Cosmos DB: Choose A (Availability). It can be configured for “Multi-Master” writes, allowing you to write to any region, even if other regions are down.
[!IMPORTANT] Pro Insight: NewSQL Azure SQL Hyperscale is Microsoft’s answer to the “Distributed SQL” problem. It separates Compute from Storage, allowing the storage to grow to 100 TB while providing multiple read-compute nodes, effectively blurring the line between traditional SQL and cloud-scale distribution.
1. Database Decision Tree
2. Azure SQL Database
Azure SQL Database is a fully managed relational database with built-in intelligence.Purchasing Models
- DTU Model
- vCore Model
Database Transaction Units (simpler)When to use: Simple workloads, predictable performance
[!WARNING] Gotcha: DTU vs vCore DTU is cheaper for small databases but scales poorly. vCore is more expensive but allows you to save money with “Reserved Instances” (1-3 year commitment). Moving from DTU to vCore is easy, but going back can be tricky.
[!TIP] Jargon Alert: ACID vs BASE ACID (Azure SQL): Data is always correct immediately (Bank transactions). BASE (Cosmos DB): Data is eventually correct (Social media feed). Cosmos DB creates copies around the world, so it might take a few milliseconds for a “Like” to show up in Japan if it happened in Brazil.
High Availability Options
General Purpose
Standard availability
Business Critical
Always On availability
Geo-Replication
Performance Tuning
1. Query Performance Insights
1. Query Performance Insights
2. Indexes
2. Indexes
The Principal’s Guide to Indexing
To a pro, indexes aren’t just “go-faster stripes”. They are physical data structures (B-Trees) that occupy disk space.- The Covered Index: The
INCLUDEclause in the example above is critical. By includingFirstNameandLastNamein the index, the query can get all the data it needs directly from the index leaf nodes without having to go back to the main table (the “Key Lookup”). This can turn a 500ms query into a 5ms query. - SARGable Queries: “Search ARGument-able”. If you use a function on a column in your
WHEREclause (e.g.,WHERE YEAR(JoinDate) = 2024), the database CANNOT use an index onJoinDate. It must scan every row. - Write Penalty: Every index you add makes your
INSERTandUPDATEoperations slower. In high-traffic write systems, “Over-indexing” is as dangerous as “Under-indexing”.
3. Read Scale-Out
3. Read Scale-Out
Use read-only replica for reporting:Available in: Business Critical and Premium tiers
3. Azure Cosmos DB
Cosmos DB is a globally distributed, multi-model NoSQL database.The PACELC Trade-off: Why so many choices?
In a distributed system, you don’t just have CAP (Consistency, Availability, Partition Tolerance). You also have PACELC:- If there is a Partition, you choose between Availability or Consistency.
- Else, you choose between Latency or Consistency.
| Level | Consistency | Latency | RU Cost | Use Case |
|---|---|---|---|---|
| Strong | Highest | Highest | 2.0x | FinTech / Transfers |
| Bounded | Guaranteed Lag | Medium | 1.0x | Stock Tickers |
| Session | Read-Your-Own-Write | Low | 1.0x | Social Media / Cart |
| Prefix | Ordering Guaranteed | Lowest | 1.0x | Chat / Comments |
| Eventual | No Guarantees | Lowest | 1.0x | View Analytics |
[!WARNING] Performance Gotcha: The Strong Consistency Tax Strong consistency requires a majority of replicas to ACK a write before success. If you have replicas in US, Europe, and Asia, a “Strong” write will take hundreds of milliseconds (speed of light limit). NEVER use Strong consistency for global applications unless absolutely necessary.
Partitioning Strategy
Partition Key is the most critical design decision in Cosmos DB.- Good Partition Keys
- Bad Partition Keys
- Partition Key Patterns
✅ High cardinality (many unique values):✅ Even distribution (no hot partitions)
✅ Natural query pattern (most queries filter by this)
Request Units (RU/s)
RUs are the currency of Cosmos DB.- RU Consumption
- Provisioning Models
- Optimize RU Consumption
Global Distribution
4. PostgreSQL & MySQL
Azure Database for PostgreSQL/MySQL are fully managed open-source databases.Deployment Options
- Single Server
- Flexible Server (Recommended)
Basic, simple deployment
High Availability
Read Replicas
5. Database Comparison
| Feature | Azure SQL | Cosmos DB | PostgreSQL | MySQL |
|---|---|---|---|---|
| Model | Relational | NoSQL | Relational | Relational |
| Max Size | 100 TB | Unlimited | 16 TB | 16 TB |
| ACID | Full | Tunable | Full | Full |
| Global Distribution | Active geo-replication | Multi-master | Read replicas | Read replicas |
| SLA | 99.995% | 99.999% | 99.99% | 99.99% |
| Use Case | OLTP, complex queries | Global apps, massive scale | Open-source, extensions | Open-source, web apps |
| Cost | $$$ | $$$$ | $ | $ |
6. Interview Questions
Beginner
Q1: When would you choose Cosmos DB over Azure SQL?
Q1: When would you choose Cosmos DB over Azure SQL?
Choose Cosmos DB when:
- Global distribution (multi-region writes)
- Massive scale (>1 TB, millions of requests/sec)
- Low latency (<10ms reads)
- Schema flexibility (JSON documents)
- Eventual consistency acceptable
- ACID transactions required
- Complex JOINs and relations
- Strong consistency mandatory
- Existing SQL code/expertise
- Cost-sensitive (Cosmos DB more expensive)
Q2: Explain DTU vs vCore pricing models
Q2: Explain DTU vs vCore pricing models
DTU (Database Transaction Unit):
- Bundled CPU + memory + I/O
- Simpler (choose tier: Basic/Standard/Premium)
- Less control over resources
- Use for: Simple workloads, getting started
- Choose CPU cores, memory, storage separately
- More control and flexibility
- Better for complex workloads
- Use for: Production, specific requirements
Intermediate
Q3: Design a globally distributed e-commerce database
Q3: Design a globally distributed e-commerce database
Q4: Optimize Cosmos DB for cost
Q4: Optimize Cosmos DB for cost
Strategies:
Advanced
Q5: Implement multi-tenant database isolation
Q5: Implement multi-tenant database isolation
Strategy Options:
Troubleshooting: When Databases Slow Down
When your app feels slow, it’s almost always the database. Use this playbook to find the culprit.1. Azure SQL: “The Query Timeout”
- Blocking: Check for long-running transactions that are locking rows. Use
sys.dm_tran_locks. - CPU Throttling: If your compute utilization is at 100%, Azure will “throttle” your queries. The solution is usually vertical scaling (more vCores) or fixing a missing index.
- Connection pooling: Is your app opening a new connection for every request? This is expensive. Use a connection pool (like HikariCP or Entity Framework pooling).
2. Cosmos DB: “The 429 Throttle”
If you see Error 429 (Too Many Requests):- Hot Partition: You are hitting one partition too hard. Check the portal for “Normalized RU Consumption per Partition Key Range”.
- Cross-Partition Query: Are you querying without the Partition Key? That’s the most common cause of high RU consumption.
- Solution: Increase RU/s (expensive) or fix your data model (correct).
3. The “Transient Failure”
Cloud databases sometimes reset connections during maintenance.- The Fix: Implement Retry Logic in your application code. Most Azure SDKs have this built-in, but you must enable it. Use “Exponential Backoff” to avoid overwhelming the database after a reset.
[!TIP] Pro Tool: Query Store In Azure SQL, enable Query Store. It’s a “black box flight recorder” for your database that tracks every query’s performance history. It can even automatically roll back a bad execution plan!
7. Key Takeaways
Choose Right Database
Azure SQL for relational, Cosmos DB for global scale, PostgreSQL for open-source.
Partition Key is Critical
In Cosmos DB, partition key determines performance and cost. Design carefully.
Geo-Replication for DR
Use auto-failover groups (SQL) or multi-region writes (Cosmos DB) for resilience.
Monitor and Optimize
Query Performance Insights, missing indexes, RU consumption monitoring.
Security Layers
Private endpoints, Azure AD auth, encryption at rest, audit logging.
Cost Optimization
Right-size throughput, optimize queries, use read replicas, implement TTL.
Next Steps
Continue to Chapter 7
Master Azure Kubernetes Service (AKS) and container orchestration