Chapter 6: Advanced DynamoDB Features
Introduction
Beyond basic CRUD operations, DynamoDB provides powerful advanced features that enable sophisticated application architectures. This chapter explores DynamoDB Streams, Transactions, Global Tables, TTL, and other advanced capabilities that differentiate DynamoDB from traditional databases.
DynamoDB Streams
Overview
DynamoDB Streams captures a time-ordered sequence of item-level modifications (create, update, delete) and stores this information for up to 24 hours.
Stream View Types
A stream is configured with one of four view types, which determine what each stream record contains:
- KEYS_ONLY: only the key attributes of the modified item
- NEW_IMAGE: the item as it appears after the modification
- OLD_IMAGE: the item as it appeared before the modification
- NEW_AND_OLD_IMAGES: both the new and the old item images
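As a minimal sketch, a stream with a chosen view type can be enabled on an existing table via boto3 (the table name is illustrative):

```python
import boto3

dynamodb = boto3.client("dynamodb")

# Enable a stream on an existing table. StreamViewType controls what each
# record carries: KEYS_ONLY, NEW_IMAGE, OLD_IMAGE, or NEW_AND_OLD_IMAGES.
dynamodb.update_table(
    TableName="Orders",  # hypothetical table name
    StreamSpecification={
        "StreamEnabled": True,
        "StreamViewType": "NEW_AND_OLD_IMAGES",
    },
)
```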
Lambda Stream Processing
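As a minimal sketch of such a consumer (the processing in each branch is illustrative), a Lambda handler receives batches of stream records and dispatches on each record's eventName:

```python
# Minimal DynamoDB Streams handler sketch. The Lambda event source mapping
# delivers batches of records; each record carries the item Keys plus the
# images selected by the table's StreamViewType.
def lambda_handler(event, context):
    for record in event["Records"]:
        event_name = record["eventName"]  # INSERT, MODIFY, or REMOVE
        keys = record["dynamodb"]["Keys"]
        if event_name == "INSERT":
            new_image = record["dynamodb"].get("NewImage", {})
            print(f"Created {keys}: {new_image}")
        elif event_name == "MODIFY":
            old_image = record["dynamodb"].get("OldImage", {})
            new_image = record["dynamodb"].get("NewImage", {})
            print(f"Updated {keys}: {old_image} -> {new_image}")
        elif event_name == "REMOVE":
            print(f"Deleted {keys}")
    # Report no partial-batch failures (assumes the partial batch
    # response feature is enabled on the event source mapping).
    return {"batchItemFailures": []}
```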
Stream Use Cases
1. Cross-Region Replication: stream records from one region drive writes to a table in another region (the mechanism behind Global Tables)
2. Real-Time Aggregations: maintain counters or materialized views as items change
3. Notifications: trigger alerts or downstream workflows on item changes
DynamoDB Transactions
ACID Transactions
DynamoDB supports atomic, consistent, isolated, and durable transactions across multiple items and tables.
Deep Dive: Transactional Isolation and RDBMS Comparison
While DynamoDB provides ACID transactions, the underlying implementation and isolation guarantees differ from traditional SQL databases like PostgreSQL or MySQL.
1. Isolation Levels
In a traditional RDBMS, you can choose between Read Uncommitted, Read Committed, Repeatable Read, and Serializable.
- DynamoDB Guarantee: DynamoDB provides Serializability for all items within a single transaction.
- The Catch: This serializability is achieved via Optimistic Concurrency Control (OCC) at the partition level. If two transactions attempt to modify the same item simultaneously, one will succeed and the other will be rejected with a `TransactionCanceledException`; a sketch of handling this follows below.
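Because conflicts surface as an immediate rejection rather than a blocked lock, callers are expected to catch the cancellation and retry. A minimal sketch with boto3; the table, key, and attribute names are illustrative:

```python
import boto3
from botocore.exceptions import ClientError

dynamodb = boto3.client("dynamodb")

try:
    dynamodb.transact_write_items(
        TransactItems=[
            {
                "Update": {
                    "TableName": "Accounts",  # hypothetical table
                    "Key": {"AccountId": {"S": "acct-1"}},
                    "UpdateExpression": "SET balance = balance - :amt",
                    "ConditionExpression": "balance >= :amt",
                    "ExpressionAttributeValues": {":amt": {"N": "100"}},
                }
            },
        ]
    )
except ClientError as err:
    if err.response["Error"]["Code"] == "TransactionCanceledException":
        # A conflicting concurrent transaction or a failed condition
        # cancelled the whole transaction; retry with backoff if safe.
        reasons = err.response.get("CancellationReasons", [])
        print("Transaction cancelled:", reasons)
    else:
        raise
```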
2. The 2-Phase Commit (2PC) under the Hood
DynamoDB uses a specialized 2-Phase Commit protocol managed by a “Transaction Coordinator.”
- Phase 1 (Prepare): The coordinator validates conditions for all items in the transaction and “locks” them.
- Phase 2 (Commit): If all conditions pass, the coordinator writes the changes. If any fail, it rolls back all changes.
- Latency Cost: Because it involves multiple round-trips between the coordinator and storage nodes, transactions have higher latency than standard `PutItem` or `UpdateItem` operations.
3. Key Differences from RDBMS
| Feature | DynamoDB Transactions | Traditional RDBMS (SQL) |
|---|---|---|
| Concurrency | Optimistic (Fail fast on conflict) | Pessimistic (Wait for locks) |
| Deadlocks | Impossible (Uses timestamps/OCC) | Possible (Requires deadlock detection) |
| Rollback | Automatic on condition failure | Manual or automatic |
| Scope | Up to 100 items | Entire database/tables |
| Throughput | Limited per partition | Scalable with hardware/sharding |
Transaction Constraints
Transactions are capped at 100 items (writes) or 25 items (reads) and 4 MB total size, and they consume roughly twice the capacity units of equivalent non-transactional operations (see Question 5 below for the full list).
Transaction Use Cases
Order Processing: atomically decrement inventory and record the order, so a sale can never succeed without stock being reserved; a sketch follows.
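A hedged sketch of this pattern with boto3's transact_write_items; the table names, keys, and attributes are assumptions, not a fixed schema:

```python
import boto3

dynamodb = boto3.client("dynamodb")

# All-or-nothing: the order row is written only if the stock check and
# decrement succeed; otherwise the whole transaction is cancelled.
dynamodb.transact_write_items(
    TransactItems=[
        {
            "Update": {
                "TableName": "Inventory",  # hypothetical table
                "Key": {"ProductId": {"S": "prod-42"}},
                "UpdateExpression": "SET stock = stock - :qty",
                "ConditionExpression": "stock >= :qty",
                "ExpressionAttributeValues": {":qty": {"N": "1"}},
            }
        },
        {
            "Put": {
                "TableName": "Orders",  # hypothetical table
                "Item": {
                    "OrderId": {"S": "order-1001"},
                    "ProductId": {"S": "prod-42"},
                    "Quantity": {"N": "1"},
                },
                "ConditionExpression": "attribute_not_exists(OrderId)",
            }
        },
    ]
)
```

If the stock condition fails, the Put is rolled back with it, so no orphan order row is ever written.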
Global Tables
Multi-Region Replication
Global Tables provide managed multi-region, multi-active replication.
Conflict Resolution
Deep Dive: Global Tables and Distributed Consistency
The “multi-active” nature of Global Tables introduces classic distributed-systems challenges. While DynamoDB handles the replication, developers must understand the theoretical underpinnings.
1. Last Writer Wins (LWW) vs. CRDTs
In the original 2007 Dynamo paper, conflict resolution was handled via Vector Clocks, allowing for complex merging or client-side reconciliation.
- Modern DynamoDB Approach: For simplicity and performance, Global Tables use Last Writer Wins (LWW) based on a system-level wall clock (NTP-synchronized).
- The Trade-off: LWW is simpler but can lead to data loss in high-concurrency “write-write” conflict scenarios where the “losing” write is completely overwritten.
- CRDTs (Conflict-free Replicated Data Types): While DynamoDB doesn’t natively expose CRDTs, you can implement them at the application level (e.g., G-Counters or PN-Counters) using the `ADD` action in `UpdateItem`, which is commutative; see the sketch below.
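As a minimal sketch of that idea, the `ADD` action below performs an atomic, commutative increment; the table and attribute names are illustrative:

```python
import boto3

dynamodb = boto3.client("dynamodb")

# ADD applies an atomic in-place increment. Because increments compose
# in any order, concurrent writers within a region do not blindly
# overwrite each other the way SET-style updates would.
dynamodb.update_item(
    TableName="PageViews",  # hypothetical table
    Key={"PageId": {"S": "home"}},
    UpdateExpression="ADD viewCount :inc",
    ExpressionAttributeValues={":inc": {"N": "1"}},
)
```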
2. Consistency Model: Eventual but Fast
Global Tables are Eventually Consistent across regions.
- Replication Latency: Typically under 1 second globally.
- Theoretical Limit: Because it’s an asynchronous replication model, Global Tables fall into the AP (Availability / Partition Tolerance) category of the CAP theorem. They prioritize being able to write to any region over immediate global consistency.
3. Preventing Replication Loops
DynamoDB uses internal metadata to track the origin region of a write. This ensures that a write replicated from Region A to Region B does not get “re-replicated” back to Region A, preventing infinite loops.
Time To Live (TTL)
Automatic Item Expiration
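A hedged sketch of the TTL workflow: enable TTL on a Number attribute holding a Unix epoch timestamp in seconds, then write items with that attribute set (table and attribute names are assumptions):

```python
import time
import boto3

dynamodb = boto3.client("dynamodb")

# Tell DynamoDB which Number attribute holds the expiry epoch (seconds).
dynamodb.update_time_to_live(
    TableName="Sessions",  # hypothetical table
    TimeToLiveSpecification={"Enabled": True, "AttributeName": "expiresAt"},
)

# Write a session that expires in one hour; deletion happens within
# roughly 48 hours after that timestamp, not at the exact second.
dynamodb.put_item(
    TableName="Sessions",
    Item={
        "SessionId": {"S": "sess-abc"},
        "expiresAt": {"N": str(int(time.time()) + 3600)},
    },
)
```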
Point-in-Time Recovery (PITR)
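PITR maintains continuous backups so a table can be restored to any second within the last 35 days. A minimal sketch for enabling it with boto3 (table names are illustrative):

```python
import boto3

dynamodb = boto3.client("dynamodb")

# Enable continuous backups with a 35-day point-in-time recovery window.
dynamodb.update_continuous_backups(
    TableName="Orders",  # hypothetical table
    PointInTimeRecoverySpecification={"PointInTimeRecoveryEnabled": True},
)

# Restores always target a new table at a chosen instant, e.g.:
# dynamodb.restore_table_to_point_in_time(
#     SourceTableName="Orders",
#     TargetTableName="Orders-restored",
#     RestoreDateTime=some_datetime_within_the_window,
# )
```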
PartiQL Support
SQL-Compatible Query Language
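PartiQL statements run through the execute_statement API. A hedged sketch; the table and key names are assumptions:

```python
import boto3

dynamodb = boto3.client("dynamodb")

# SELECT with PartiQL; parameters are bound positionally with '?'.
resp = dynamodb.execute_statement(
    Statement='SELECT * FROM "Orders" WHERE OrderId = ?',  # hypothetical table
    Parameters=[{"S": "order-1001"}],
)
for item in resp["Items"]:
    print(item)
```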
Contributor Insights
Identifying Hot Keys
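Contributor Insights publishes the most frequently accessed and most throttled keys to CloudWatch, which is how hot keys are identified. A minimal sketch for enabling it (table name is illustrative):

```python
import boto3

dynamodb = boto3.client("dynamodb")

# Enable CloudWatch Contributor Insights for the table; the resulting
# rules report the most accessed and most throttled partition keys.
dynamodb.update_contributor_insights(
    TableName="Orders",  # hypothetical table
    ContributorInsightsAction="ENABLE",
)
```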
Interview Questions and Answers
Question 1: How do DynamoDB Streams differ from Kinesis Data Streams?
Answer: DynamoDB Streams:
- Captures item-level changes in DynamoDB tables
- Retention: 24 hours (fixed)
- Ordering: Per partition key only
- Shards: Managed automatically
- Cost: Included with table (no extra charge for reads)
- Use case: React to DynamoDB changes
Kinesis Data Streams:
- General-purpose streaming service
- Retention: 24 hours to 365 days (configurable)
- Ordering: Per partition key (shard key)
- Shards: Manual management required
- Cost: Per shard-hour + PUT payload units
- Use case: Real-time data ingestion, custom streaming
Question 2: When should you use transactions vs batch operations?
Answer: Use Transactions When:
- Need ACID guarantees (all-or-nothing)
- Cross-item consistency required
- Conditional writes across multiple items
- Financial operations, inventory management
Use Batch Operations When:
- Independent operations (no dependencies)
- Can tolerate partial failures
- Higher throughput needed
- Cost optimization (batch is cheaper)
Question 3: Explain conflict resolution in Global Tables.
Answer: Global Tables use last-writer-wins conflict resolution based on timestamps. How it works:
- Each write includes an internal timestamp
- During concurrent writes to the same item in different regions, timestamps are compared
- The write with the latest timestamp wins
- The losing write is discarded
Question 4: How do you implement audit logging with DynamoDB Streams?
Answer: Enable a stream with NEW_AND_OLD_IMAGES, attach a Lambda consumer, and append each change record (event type, keys, old and new images, timestamp) to a dedicated audit store such as a separate audit table or S3. A sketch follows.
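A minimal, hedged sketch of the audit consumer; the audit table name and row layout are assumptions:

```python
import boto3

dynamodb = boto3.client("dynamodb")

def lambda_handler(event, context):
    # Append one immutable audit row per change record.
    for record in event["Records"]:
        change = record["dynamodb"]
        dynamodb.put_item(
            TableName="AuditLog",  # hypothetical audit table
            Item={
                "EntityKey": {"S": str(change["Keys"])},
                "ChangeId": {"S": record["eventID"]},
                "EventType": {"S": record["eventName"]},
                # Images are serialized naively here for brevity.
                "OldImage": {"S": str(change.get("OldImage", {}))},
                "NewImage": {"S": str(change.get("NewImage", {}))},
                "ChangedAt": {"N": str(change["ApproximateCreationDateTime"])},
            },
        )
```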
Question 5: What are the limitations of DynamoDB transactions?
Answer: Hard Limits:
- Max 100 items per TransactWriteItems
- Max 25 items per TransactGetItems
- Max 4 MB total transaction size
- All items must be in same AWS account/region
- Cannot mix transactions across tables with different billing modes (before 2020)
Performance Costs:
- Consumes 2x capacity units
- TransactWriteItems: 2 WCUs per KB per item
- TransactGetItems: 2 RCUs per 4KB per item
- Higher latency than non-transactional operations
- Limited throughput per partition
Question 6: How does TTL work and what are its limitations?
Answer: How TTL Works:
- Enable TTL on a table attribute
- Set attribute to Unix timestamp (seconds since epoch)
- DynamoDB automatically deletes items within 48 hours after expiration
- Deletions are eventually consistent
- No additional cost for TTL deletions
Limitations:
- Deletion within 48 hours (not immediate)
- Cannot guarantee exact deletion time
- TTL attribute must be Number type (Unix timestamp in seconds)
- Deleted items still consume storage until actually deleted
- GSIs are updated after base table deletion
Question 7: Design a real-time leaderboard using DynamoDB Streams.
Answer: Architecture (a sketch follows this answer):
- DynamoDB stores user scores
- Stream captures score updates
- Lambda updates Redis sorted set
- Application reads from Redis for rankings
Benefits:
- Sub-millisecond leaderboard queries
- Millions of score updates/sec
- Real-time ranking updates
- Scalable to billions of users
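A hedged sketch of the stream consumer from this architecture; the Redis endpoint, key names, and attribute layout are assumptions, and the stream is assumed to include new images:

```python
import redis

# Connection details are illustrative.
r = redis.Redis(host="leaderboard-cache.example.com", port=6379)

def lambda_handler(event, context):
    for record in event["Records"]:
        if record["eventName"] in ("INSERT", "MODIFY"):
            new_image = record["dynamodb"]["NewImage"]
            user_id = new_image["UserId"]["S"]
            score = float(new_image["Score"]["N"])
            # ZADD keeps the sorted set ordered by score, so rank
            # lookups stay O(log n).
            r.zadd("leaderboard", {user_id: score})

# Reads come straight from Redis, e.g. the top 10:
# r.zrevrange("leaderboard", 0, 9, withscores=True)
```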
Summary
Advanced Features Overview:
- DynamoDB Streams:
  - Captures item-level changes
  - Powers real-time processing
  - 24-hour retention
  - Use for: replication, aggregations, notifications
- Transactions:
  - ACID guarantees
  - Up to 100 items (writes) or 25 items (reads)
  - 2x capacity cost
  - Use for: financial operations, multi-item consistency
- Global Tables:
  - Multi-region, multi-active replication
  - Sub-second replication latency
  - Last-writer-wins conflict resolution
  - Use for: global applications, disaster recovery
- TTL:
  - Automatic item expiration
  - No additional cost
  - Deletion within 48 hours
  - Use for: sessions, temporary data, caching
- PITR:
  - 35-day recovery window
  - Point-in-time restore
  - Continuous backups
  - Use for: disaster recovery, compliance