Chapter 3: Data Modeling in DynamoDB
Introduction
Data modeling in DynamoDB represents a fundamental paradigm shift from traditional relational database design. Rather than normalizing data and using joins, DynamoDB requires pre-joining data and denormalizing based on access patterns. This chapter explores the principles, patterns, and best practices for effective DynamoDB data modeling.
Understanding NoSQL Data Modeling
The Paradigm Shift
Traditional relational databases follow a schema-first approach where you design normalized tables and then write queries. DynamoDB inverts this model:
- Identify Access Patterns First: Understand all the ways your application needs to read and write data
- Design for Access Patterns: Structure your data to support these patterns efficiently
- Denormalize and Pre-Join: Store data in a way that minimizes the number of requests needed
Key Principles
Principle 1: Understand Your Access Patterns
Before designing your data model, document every way your application needs to access data:
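A lightweight way to capture this is a machine-readable inventory of patterns. The sketch below uses the e-commerce entities that appear later in this chapter; the pattern names and key shapes are illustrative assumptions, not a prescribed format.

```python
# Hypothetical access-pattern inventory for the e-commerce example used later
# in this chapter; pattern names and key shapes are illustrative assumptions.
ACCESS_PATTERNS = [
    {"pattern": "Get user profile",         "input": "user_id",    "keys": "PK=USER#<id>, SK=PROFILE"},
    {"pattern": "List orders for a user",   "input": "user_id",    "keys": "PK=USER#<id>, SK begins_with ORDER#"},
    {"pattern": "Get product with reviews", "input": "product_id", "keys": "PK=PRODUCT#<id>"},
    {"pattern": "List reviews by a user",   "input": "user_id",    "keys": "GSI1PK=USER#<id>, GSI1SK begins_with REVIEW#"},
]

for p in ACCESS_PATTERNS:
    print(f'{p["pattern"]:<26} input={p["input"]:<11} keys={p["keys"]}')
```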
Single-Table Design
What is Single-Table Design?
Single-table design is a DynamoDB modeling technique where you store multiple entity types in one table, using generic attribute names (PK, SK, GSI1PK, GSI1SK) and item type identifiers.
Advantages of Single-Table Design
- Reduced Latency: Fetch related items in one request
- Atomic Operations: Transaction support across entity types
- Cost Efficiency: Fewer requests = lower costs
- Simplified Operations: Manage one table instead of many
Single-Table Design Example
Let’s model an e-commerce application with users, orders, products, and reviews:
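The sketch below assumes a single table named ECommerceTable with generic PK/SK attributes and a Type attribute on every item; the table name and key shapes are illustrative assumptions rather than a prescribed schema.

```python
import boto3

# Minimal sketch: one table, generic PK/SK, and the entity type recorded in a
# "Type" attribute. Table name "ECommerceTable" and key shapes are assumptions.
table = boto3.resource("dynamodb").Table("ECommerceTable")

table.put_item(Item={"PK": "USER#u1",    "SK": "PROFILE",             "Type": "User",    "Name": "Alice"})
table.put_item(Item={"PK": "USER#u1",    "SK": "ORDER#2024-01-15#o1", "Type": "Order",   "Status": "SHIPPED", "Total": 42})
table.put_item(Item={"PK": "PRODUCT#p1", "SK": "METADATA",            "Type": "Product", "Title": "Keyboard"})
table.put_item(Item={"PK": "PRODUCT#p1", "SK": "REVIEW#u1",           "Type": "Review",  "Rating": 5})
```

Querying the Single-Table Design
Because related items share a partition key, a single Query returns a user's profile together with that user's orders, and a sort-key prefix narrows the same collection to one entity type. Again, a sketch under the assumed key shapes:

```python
from boto3.dynamodb.conditions import Key

# The user's profile and orders live in the same item collection (same PK),
# so one request returns them all.
items = table.query(KeyConditionExpression=Key("PK").eq("USER#u1"))["Items"]
profile = [i for i in items if i["Type"] == "User"]
orders = [i for i in items if i["Type"] == "Order"]

# A sort-key prefix narrows the collection to a single entity type.
orders_only = table.query(
    KeyConditionExpression=Key("PK").eq("USER#u1") & Key("SK").begins_with("ORDER#")
)["Items"]
```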
Advanced Data Modeling Patterns
Pattern 1: Hierarchical Data
Model hierarchical relationships using composite sort keys:
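A minimal sketch, assuming an organization-to-store hierarchy with the full path encoded in the sort key; the table name and key format are assumptions.

```python
import boto3
from boto3.dynamodb.conditions import Key

# Sketch: encode the hierarchy path in the sort key so a prefix query returns
# any subtree. Table name "LocationsTable" and key format are assumptions.
table = boto3.resource("dynamodb").Table("LocationsTable")

table.put_item(Item={"PK": "ORG#acme", "SK": "US",                "Type": "Country"})
table.put_item(Item={"PK": "ORG#acme", "SK": "US#WA",             "Type": "State"})
table.put_item(Item={"PK": "ORG#acme", "SK": "US#WA#SEATTLE",     "Type": "City"})
table.put_item(Item={"PK": "ORG#acme", "SK": "US#WA#SEATTLE#S01", "Type": "Store"})

# Everything under Washington state: one range query, no recursion needed.
wa_items = table.query(
    KeyConditionExpression=Key("PK").eq("ORG#acme") & Key("SK").begins_with("US#WA")
)["Items"]
```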
Pattern 2: Many-to-Many Relationships
Use the adjacency list pattern for many-to-many relationships:
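A minimal sketch of the adjacency list pattern for users and groups, assuming a GSI (GSI1) that inverts the key pair; the table, index, and key names are assumptions.

```python
import boto3
from boto3.dynamodb.conditions import Key

# Sketch: one membership edge, indexed in both directions via an inverted GSI.
# Table "AppTable", index "GSI1", and key shapes are assumptions.
table = boto3.resource("dynamodb").Table("AppTable")

table.put_item(Item={
    "PK": "USER#u1", "SK": "GROUP#g1",          # query by user -> groups
    "GSI1PK": "GROUP#g1", "GSI1SK": "USER#u1",  # query by group -> users
    "Type": "Membership", "Role": "admin",
})

groups_for_user = table.query(
    KeyConditionExpression=Key("PK").eq("USER#u1") & Key("SK").begins_with("GROUP#")
)["Items"]

users_in_group = table.query(
    IndexName="GSI1",
    KeyConditionExpression=Key("GSI1PK").eq("GROUP#g1") & Key("GSI1SK").begins_with("USER#"),
)["Items"]
```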
Pattern 3: Time-Series Data
Efficiently store and query time-series data:
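A minimal sketch, assuming device metrics partitioned by device and day; the table name and key format are assumptions.

```python
import boto3
from boto3.dynamodb.conditions import Key

# Sketch: date-scoped partition keys keep each day's writes in separate
# partitions. Table "MetricsTable" and key format are assumptions.
table = boto3.resource("dynamodb").Table("MetricsTable")

table.put_item(Item={
    "PK": "DEVICE#d1#2024-01-15",      # device + day
    "SK": "TS#2024-01-15T10:30:00Z",   # timestamp enables range queries
    "Temperature": 21,
})

# All readings for one device on one day, within a time window.
readings = table.query(
    KeyConditionExpression=Key("PK").eq("DEVICE#d1#2024-01-15")
    & Key("SK").between("TS#2024-01-15T09:00:00Z", "TS#2024-01-15T12:00:00Z")
)["Items"]
```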
Pattern 4: Composite Sort Key Patterns
Use composite sort keys to enable multiple query patterns:
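A minimal sketch: encoding status and date into one sort key lets several query shapes share a single attribute. The table name and key layout are assumptions.

```python
import boto3
from boto3.dynamodb.conditions import Key

# Sketch: a composite sort key (ORDER#STATUS#DATE#ID) supports several query
# shapes from one attribute. Table name and key layout are assumptions.
table = boto3.resource("dynamodb").Table("ECommerceTable")

table.put_item(Item={"PK": "USER#u1", "SK": "ORDER#SHIPPED#2024-01-15#o1", "Total": 42})

all_orders     = Key("PK").eq("USER#u1") & Key("SK").begins_with("ORDER#")
shipped_orders = Key("PK").eq("USER#u1") & Key("SK").begins_with("ORDER#SHIPPED#")
shipped_in_jan = Key("PK").eq("USER#u1") & Key("SK").begins_with("ORDER#SHIPPED#2024-01")

for condition in (all_orders, shipped_orders, shipped_in_jan):
    print(len(table.query(KeyConditionExpression=condition)["Items"]))
```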
Pattern 5: Sparse Indexes
Use sparse GSIs to index only items with specific attributes:
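A minimal sketch, assuming a GSI2 whose key attributes are set only on flagged reviews, so the index contains nothing else; the table, index, and attribute names are assumptions.

```python
import boto3
from boto3.dynamodb.conditions import Key

# Sketch of a sparse GSI: only items that carry the GSI2 key attributes are
# written to the index. Table/index names and attributes are assumptions.
table = boto3.resource("dynamodb").Table("ECommerceTable")

# Most reviews never set the GSI2 keys, so they never appear in the index.
table.put_item(Item={"PK": "PRODUCT#p1", "SK": "REVIEW#u1", "Rating": 5})
table.put_item(Item={"PK": "PRODUCT#p1", "SK": "REVIEW#u2", "Rating": 1,
                     "FlaggedAt": "2024-01-15",
                     "GSI2PK": "FLAGGED", "GSI2SK": "2024-01-15"})

# Querying the sparse index touches only the (few) flagged reviews.
flagged = table.query(
    IndexName="GSI2",
    KeyConditionExpression=Key("GSI2PK").eq("FLAGGED"),
)["Items"]
```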
Data Modeling Best Practices
1. Avoid Hot Partitions
Distribute writes evenly across partitions:
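One common remedy is write sharding: spread writes for one hot logical key across a fixed number of shard keys and fan back in on reads. The sketch below is illustrative only; the table name, shard count, and key format are assumptions.

```python
import random
import uuid
import boto3
from boto3.dynamodb.conditions import Key

# Sketch of write sharding. Table name, shard count, and key format are assumptions.
SHARDS = 10
table = boto3.resource("dynamodb").Table("VotesTable")

def record_vote(candidate_id: str) -> None:
    shard = random.randint(0, SHARDS - 1)   # spread writes across N partitions
    table.put_item(Item={"PK": f"CANDIDATE#{candidate_id}#{shard}",
                         "SK": f"VOTE#{uuid.uuid4()}"})

def count_votes(candidate_id: str) -> int:
    total = 0
    for shard in range(SHARDS):             # fan-in: one query per shard
        resp = table.query(
            KeyConditionExpression=Key("PK").eq(f"CANDIDATE#{candidate_id}#{shard}"),
            Select="COUNT",
        )
        total += resp["Count"]
    return total
```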
2. Pre-compute and Denormalize
Store computed values to avoid expensive queries:
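For example, a review write can atomically bump denormalized counters on the product item so reads never have to aggregate reviews. A sketch using a transaction, with assumed table and attribute names:

```python
import boto3

# Sketch: store aggregates on the parent item so reads don't recompute them.
# Table name, attribute names, and transaction shape are assumptions.
client = boto3.client("dynamodb")

def add_review(product_id: str, user_id: str, rating: int) -> None:
    client.transact_write_items(TransactItems=[
        {"Put": {  # the review itself
            "TableName": "ECommerceTable",
            "Item": {"PK": {"S": f"PRODUCT#{product_id}"},
                     "SK": {"S": f"REVIEW#{user_id}"},
                     "Rating": {"N": str(rating)}},
        }},
        {"Update": {  # denormalized counters on the product metadata item
            "TableName": "ECommerceTable",
            "Key": {"PK": {"S": f"PRODUCT#{product_id}"}, "SK": {"S": "METADATA"}},
            "UpdateExpression": "ADD ReviewCount :one, RatingSum :r",
            "ExpressionAttributeValues": {":one": {"N": "1"}, ":r": {"N": str(rating)}},
        }},
    ])
```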
3. Use Item Collections Wisely
Keep related items together for efficient queries:
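A sketch, assuming a hypothetical user item collection: the profile, addresses, and payment methods share one partition key, so one Query returns them all.

```python
import boto3
from boto3.dynamodb.conditions import Key

# Sketch: related items share a PK, forming one item collection that a single
# Query can return. Table name and key shapes are assumptions.
table = boto3.resource("dynamodb").Table("ECommerceTable")

table.put_item(Item={"PK": "USER#u1", "SK": "PROFILE",       "Name": "Alice"})
table.put_item(Item={"PK": "USER#u1", "SK": "ADDRESS#home",  "City": "Seattle"})
table.put_item(Item={"PK": "USER#u1", "SK": "PAYMENT#card1", "Last4": "4242"})

everything_about_user = table.query(
    KeyConditionExpression=Key("PK").eq("USER#u1")
)["Items"]
```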
4. Version Your Schema
Include a schema version attribute for future migrations:
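A minimal sketch, assuming a SchemaVersion attribute and a hypothetical v1-to-v2 attribute rename; the attribute names and version numbers are illustrative only.

```python
import boto3

# Sketch: stamp every item with a schema version so future readers can
# transform old shapes on the fly. Attribute names are assumptions.
table = boto3.resource("dynamodb").Table("ECommerceTable")

CURRENT_SCHEMA_VERSION = 2

table.put_item(Item={
    "PK": "USER#u1", "SK": "PROFILE",
    "SchemaVersion": CURRENT_SCHEMA_VERSION,
    "FullName": "Alice Example",   # hypothetical v2 rename: "Name" -> "FullName"
})

def read_profile(user_id: str) -> dict:
    item = table.get_item(Key={"PK": f"USER#{user_id}", "SK": "PROFILE"})["Item"]
    if item.get("SchemaVersion", 1) < 2:   # upgrade v1 items on read
        item["FullName"] = item.pop("Name", "")
    return item
```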
5. Implement Soft Deletes
Use status flags instead of deleting items:
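A sketch of a soft delete, with assumed Status and DeletedAt attribute names; readers filter deleted items out, and a sparse GSI over active items is another option.

```python
import boto3
from boto3.dynamodb.conditions import Attr, Key

# Sketch: mark items deleted instead of removing them. Attribute names are
# assumptions; a TTL attribute could purge them later if desired.
table = boto3.resource("dynamodb").Table("ECommerceTable")

def soft_delete_review(product_id: str, user_id: str, deleted_at: str) -> None:
    table.update_item(
        Key={"PK": f"PRODUCT#{product_id}", "SK": f"REVIEW#{user_id}"},
        UpdateExpression="SET #st = :deleted, DeletedAt = :now",
        ExpressionAttributeNames={"#st": "Status"},   # "Status" is a reserved word
        ExpressionAttributeValues={":deleted": "DELETED", ":now": deleted_at},
    )

# Readers filter out soft-deleted items.
active_reviews = table.query(
    KeyConditionExpression=Key("PK").eq("PRODUCT#p1") & Key("SK").begins_with("REVIEW#"),
    FilterExpression=Attr("Status").ne("DELETED"),
)["Items"]
```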
Multi-Table vs Single-Table Design
When to Use Multiple Tables
Single-table design isn’t always the answer. Consider multiple tables when:
- Different Access Patterns: Entities have completely different access patterns
- Different Scaling Requirements: Some entities need different throughput
- Different Security Requirements: Different IAM policies per entity type
- Different Backup Requirements: Different backup/restore needs
- Team Boundaries: Different teams manage different entities
Common Anti-Patterns to Avoid
Anti-Pattern 1: Normalization
Anti-Pattern 2: Using Scan for Queries
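A common form of this anti-pattern is filtering a full-table Scan when a key-based Query would serve. The sketch below contrasts the two, reusing the assumed ECommerceTable and key shapes from earlier examples.

```python
import boto3
from boto3.dynamodb.conditions import Attr, Key

table = boto3.resource("dynamodb").Table("ECommerceTable")

# Anti-pattern: Scan reads (and bills for) every item, then filters afterwards.
slow = table.scan(FilterExpression=Attr("PK").eq("USER#u1"))["Items"]

# Better: Query reads only the matching item collection.
fast = table.query(KeyConditionExpression=Key("PK").eq("USER#u1"))["Items"]
```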
Anti-Pattern 3: Large Item Sizes
Anti-Pattern 4: Not Using Sort Keys
Data Modeling Workflow
Migration Strategies
Migrating from SQL to DynamoDB
Performance Optimization Through Data Modeling
1. Reduce Item Size
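One way to keep items small is to offload large payloads to S3 and store only a pointer, since DynamoDB items are capped at 400 KB. A sketch with assumed bucket, table, and attribute names:

```python
import boto3

# Sketch: keep the bulky payload in S3 and store a small pointer item in
# DynamoDB. Bucket, table, and attribute names are assumptions.
s3 = boto3.client("s3")
table = boto3.resource("dynamodb").Table("ECommerceTable")

def save_order_invoice(order_id: str, invoice_pdf: bytes) -> None:
    s3_key = f"invoices/{order_id}.pdf"
    s3.put_object(Bucket="example-invoice-bucket", Key=s3_key, Body=invoice_pdf)
    table.put_item(Item={
        "PK": f"ORDER#{order_id}",
        "SK": "INVOICE",
        "S3Key": s3_key,               # pointer instead of the large blob
        "SizeBytes": len(invoice_pdf),
    })
```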
2. Batch Related Queries
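When the items you need do not share a partition key, one BatchGetItem round trip (up to 100 keys) beats issuing many individual GetItem calls. A minimal sketch, assuming the same hypothetical ECommerceTable and keys as the earlier examples:

```python
import boto3

# Sketch: fetch several items that live in different partitions in one request.
# Table name and keys are assumptions.
dynamodb = boto3.resource("dynamodb")

resp = dynamodb.batch_get_item(RequestItems={
    "ECommerceTable": {
        "Keys": [
            {"PK": "PRODUCT#p1", "SK": "METADATA"},
            {"PK": "PRODUCT#p2", "SK": "METADATA"},
            {"PK": "PRODUCT#p3", "SK": "METADATA"},
        ]
    }
})
products = resp["Responses"]["ECommerceTable"]
# Production code should also retry resp["UnprocessedKeys"] if it is non-empty.
```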
3. Use Projection Expressions
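Projection expressions trim the response to just the attributes the caller needs, shrinking the payload sent over the network (read capacity is still charged on the full item size). A sketch, again assuming the hypothetical ECommerceTable:

```python
import boto3
from boto3.dynamodb.conditions import Key

table = boto3.resource("dynamodb").Table("ECommerceTable")

# Sketch: return only the attributes needed for an order summary view.
resp = table.query(
    KeyConditionExpression=Key("PK").eq("USER#u1") & Key("SK").begins_with("ORDER#"),
    ProjectionExpression="SK, #st, Total",
    ExpressionAttributeNames={"#st": "Status"},  # "Status" is a reserved word
)
order_summaries = resp["Items"]
```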
Interview Questions and Answers
Question 1: What is single-table design and when should you use it?
Answer: Single-table design is a DynamoDB modeling approach where you store multiple entity types in one table using generic attribute names (PK, SK, GSI1PK, etc.) and differentiate entities using a Type attribute.
Benefits:
- Retrieve related items in a single query
- Support transactions across entity types
- Reduce operational complexity
- Lower costs through fewer requests
Use single-table design when:
- Entities have related access patterns
- You need transactions across entity types
- You want to minimize request count
- Your access patterns are well-defined
Prefer multiple tables when:
- Entities have completely different scaling needs
- Different security requirements per entity
- Teams need separate ownership
- Access patterns are unknown or highly variable
Question 2: How do you model a many-to-many relationship in DynamoDB?
Answer: Use the adjacency list pattern by creating bidirectional edges:
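One concrete form stores the relationship twice, once per direction, so both lookups run against the base table. This is a sketch; the table name and key shapes are assumptions.

```python
import boto3
from boto3.dynamodb.conditions import Key

# Sketch: a student<->class relationship stored as two edge items, one per
# direction. Table name and key shapes are assumptions.
table = boto3.resource("dynamodb").Table("SchoolTable")

with table.batch_writer() as batch:
    batch.put_item(Item={"PK": "STUDENT#s1", "SK": "CLASS#math101", "Grade": "A"})
    batch.put_item(Item={"PK": "CLASS#math101", "SK": "STUDENT#s1", "Grade": "A"})

classes_for_student = table.query(
    KeyConditionExpression=Key("PK").eq("STUDENT#s1") & Key("SK").begins_with("CLASS#")
)["Items"]
students_in_class = table.query(
    KeyConditionExpression=Key("PK").eq("CLASS#math101") & Key("SK").begins_with("STUDENT#")
)["Items"]
```

The duplicated edge attributes (such as Grade here) must be kept in sync on both items; an inverted GSI, as in Pattern 2 above, avoids that duplication.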
Question 3: Explain the difference between normalization in SQL and denormalization in DynamoDB.
Answer: SQL Normalization:
- Eliminates redundancy by splitting data into multiple tables
- Uses JOINs to recombine data at query time
- Optimizes for write efficiency and consistency
- Flexible querying but variable performance
DynamoDB Denormalization:
- Duplicates data across items to avoid JOINs
- Pre-joins data at write time
- Optimizes for read efficiency and predictable performance
- Fixed access patterns but consistent latency
Question 4: How do you avoid hot partitions in DynamoDB?
Answer: Strategies:
- Add Random Suffix: append a random or calculated suffix to a hot partition key so its writes spread across partitions
- Use High-Cardinality Partition Keys: choose keys with many distinct values (for example user ID rather than status)
- Date-Based Partitioning: scope partition keys by date so time-series writes rotate across partitions
- Composite Keys: combine several attributes into the partition key to raise cardinality
- Write Sharding: fan writes out across N shard keys and fan them back in at read time (see the sketch under Avoid Hot Partitions above)
Question 5: What are the trade-offs between single-table and multi-table design?
Answer:
Single-Table Design:
Advantages:
- Fewer requests (lower latency)
- Transaction support across entities
- Simpler operations
- Lower costs
Disadvantages:
- Requires known access patterns upfront
- More complex schema
- Harder to understand
- Limited ad-hoc queries
Multi-Table Design:
Advantages:
- Clearer data organization
- Independent scaling per table
- Separate IAM policies
- Easier to understand
Disadvantages:
- More network requests
- No transactions across tables
- Higher operational overhead
- Potentially higher costs
Decision guidance:
- Use single-table when entities are tightly coupled
- Use multi-table when entities have different requirements
- Consider team structure and ownership
- Evaluate security and compliance needs
Question 6: How do you handle hierarchical data in DynamoDB?
Answer: Use composite sort keys with the hierarchy encoded in the SK (see Pattern 1 above). Benefits:
- A single query retrieves any level of the hierarchy
- Natural ordering
- Efficient range queries
- No recursive queries needed
Question 7: Describe how to model time-series data efficiently in DynamoDB.
Answer: Key strategies:
- Time-based partition keys: scope partition keys by day, week, or month so writes and queries stay within a bounded time window
- GSI for cross-partition queries: add a GSI keyed by metric or device to query across time partitions
- Aggregations with TTL: store pre-computed rollups and let TTL expire raw data points
- Cold data archival:
  - Use DynamoDB Streams to archive to S3
  - Keep recent data in DynamoDB
  - Query S3 for historical analysis
Question 8: How would you migrate from a relational database to DynamoDB?
Answer: Migration steps:
- Document access patterns: catalog every query the application currently runs against the relational schema
- Design DynamoDB schema: map entities and access patterns onto tables, keys, and GSIs
- Choose migration approach:
  - Gradual: dual writes, gradual read cutover
  - Big bang: full migration at once
  - Shadow mode: run in parallel, compare results
- Implementation: build the dual-write path and backfill existing data
- Validation:
  - Compare query results
  - Monitor error rates
  - Performance testing
- Cutover:
  - Switch reads to DynamoDB
  - Stop SQL writes
  - Archive SQL data
Question 9: What is the adjacency list pattern?
Answer: The adjacency list pattern stores different entity types together, using the PK/SK pair to represent relationships. It enables querying related items in a single request; a social-network example follows the list below. Benefits:
- Single query for related data
- Efficient 1-to-many relationships
- Natural grouping of related items
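Example - Social Network: a minimal sketch, with assumed table name and key shapes, where a profile item and its FOLLOWS edges share a partition key.

```python
import boto3
from boto3.dynamodb.conditions import Key

# Sketch: profile and "follows" edges share a partition key. Table name and
# key shapes are assumptions.
table = boto3.resource("dynamodb").Table("SocialTable")

table.put_item(Item={"PK": "USER#alice", "SK": "PROFILE",       "DisplayName": "Alice"})
table.put_item(Item={"PK": "USER#alice", "SK": "FOLLOWS#bob",   "Since": "2024-01-01"})
table.put_item(Item={"PK": "USER#alice", "SK": "FOLLOWS#carol", "Since": "2024-01-05"})

# Profile plus everyone Alice follows, in a single request.
alice_and_follows = table.query(
    KeyConditionExpression=Key("PK").eq("USER#alice")
)["Items"]
```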
Question 10: How do you handle schema evolution in DynamoDB?
Answer: Strategies:
- Schema versioning: stamp every item with a schema version attribute
- Additive changes: prefer adding new attributes over renaming or removing existing ones
- Lazy migration: upgrade old items to the new shape when they are next read or written
- GSI for new access patterns: add a GSI instead of reshaping existing keys when a new query pattern appears
Summary
Data modeling in DynamoDB requires a fundamental shift from traditional relational thinking.
Key Principles:
- Access patterns drive design
- Denormalize for performance
- Use composite keys for flexibility
- Leverage GSIs for alternate access patterns
- Pre-compute and store aggregations
Best Practices:
- Document all access patterns upfront
- Prefer single-table design for related entities
- Avoid hot partitions through sharding
- Use sparse indexes efficiently
- Version your schema for evolution
- Test with realistic data volumes
Common Patterns:
- Hierarchical data with composite sort keys
- Many-to-many with adjacency lists
- Time-series with date-based partitioning
- Denormalized attributes for efficiency