Skip to main content

Chapter 3: Data Modeling in DynamoDB

Introduction

Data modeling in DynamoDB represents a fundamental paradigm shift from traditional relational database design. Rather than normalizing data and using joins, DynamoDB requires pre-joining data and denormalizing based on access patterns. This chapter explores the principles, patterns, and best practices for effective DynamoDB data modeling.

Understanding NoSQL Data Modeling

The Paradigm Shift

Traditional relational databases follow a schema-first approach where you design normalized tables and then write queries. DynamoDB inverts this model:
  1. Identify Access Patterns First: Understand all the ways your application needs to read and write data
  2. Design for Access Patterns: Structure your data to support these patterns efficiently
  3. Denormalize and Pre-Join: Store data in a way that minimizes the number of requests needed
<svg viewBox="0 0 800 500" xmlns="http://www.w3.org/2000/svg">
  <!-- Title -->
  <text x="400" y="30" font-size="18" font-weight="bold" text-anchor="middle" fill="#333">
    Relational vs DynamoDB Modeling Approach
  </text>

  <!-- Relational Side -->
  <rect x="50" y="60" width="300" height="400" fill="#e3f2fd" stroke="#1976d2" stroke-width="2" rx="5"/>
  <text x="200" y="90" font-size="16" font-weight="bold" text-anchor="middle" fill="#1976d2">
    Relational (SQL)
  </text>

  <!-- Step 1 -->
  <rect x="70" y="110" width="260" height="60" fill="#fff" stroke="#666" stroke-width="1" rx="3"/>
  <text x="200" y="135" font-size="14" font-weight="bold" text-anchor="middle" fill="#333">
    1. Design Schema
  </text>
  <text x="200" y="155" font-size="12" text-anchor="middle" fill="#666">
    Normalize tables, define relationships
  </text>

  <!-- Arrow -->
  <path d="M 200 170 L 200 190" stroke="#666" stroke-width="2" fill="none" marker-end="url(#arrowhead)"/>

  <!-- Step 2 -->
  <rect x="70" y="190" width="260" height="60" fill="#fff" stroke="#666" stroke-width="1" rx="3"/>
  <text x="200" y="215" font-size="14" font-weight="bold" text-anchor="middle" fill="#333">
    2. Write Queries
  </text>
  <text x="200" y="235" font-size="12" text-anchor="middle" fill="#666">
    Use JOINs to combine data
  </text>

  <!-- Arrow -->
  <path d="M 200 250 L 200 270" stroke="#666" stroke-width="2" fill="none" marker-end="url(#arrowhead)"/>

  <!-- Step 3 -->
  <rect x="70" y="270" width="260" height="60" fill="#fff" stroke="#666" stroke-width="1" rx="3"/>
  <text x="200" y="295" font-size="14" font-weight="bold" text-anchor="middle" fill="#333">
    3. Add Indexes
  </text>
  <text x="200" y="315" font-size="12" text-anchor="middle" fill="#666">
    Optimize slow queries
  </text>

  <!-- Arrow -->
  <path d="M 200 330 L 200 350" stroke="#666" stroke-width="2" fill="none" marker-end="url(#arrowhead)"/>

  <!-- Result -->
  <ellipse cx="200" cy="390" rx="120" ry="40" fill="#4caf50" stroke="#2e7d32" stroke-width="2"/>
  <text x="200" y="395" font-size="13" font-weight="bold" text-anchor="middle" fill="#fff">
    Flexible Queries
  </text>
  <text x="200" y="410" font-size="11" text-anchor="middle" fill="#fff">
    Variable Performance
  </text>

  <!-- DynamoDB Side -->
  <rect x="450" y="60" width="300" height="400" fill="#fff3e0" stroke="#f57c00" stroke-width="2" rx="5"/>
  <text x="600" y="90" font-size="16" font-weight="bold" text-anchor="middle" fill="#f57c00">
    DynamoDB (NoSQL)
  </text>

  <!-- Step 1 -->
  <rect x="470" y="110" width="260" height="60" fill="#fff" stroke="#666" stroke-width="1" rx="3"/>
  <text x="600" y="135" font-size="14" font-weight="bold" text-anchor="middle" fill="#333">
    1. Identify Access Patterns
  </text>
  <text x="600" y="155" font-size="12" text-anchor="middle" fill="#666">
    List all query requirements
  </text>

  <!-- Arrow -->
  <path d="M 600 170 L 600 190" stroke="#666" stroke-width="2" fill="none" marker-end="url(#arrowhead)"/>

  <!-- Step 2 -->
  <rect x="470" y="190" width="260" height="60" fill="#fff" stroke="#666" stroke-width="1" rx="3"/>
  <text x="600" y="215" font-size="14" font-weight="bold" text-anchor="middle" fill="#333">
    2. Design Table Structure
  </text>
  <text x="600" y="235" font-size="12" text-anchor="middle" fill="#666">
    Denormalize, pre-join data
  </text>

  <!-- Arrow -->
  <path d="M 600 250 L 600 270" stroke="#666" stroke-width="2" fill="none" marker-end="url(#arrowhead)"/>

  <!-- Step 3 -->
  <rect x="470" y="270" width="260" height="60" fill="#fff" stroke="#666" stroke-width="1" rx="3"/>
  <text x="600" y="295" font-size="14" font-weight="bold" text-anchor="middle" fill="#333">
    3. Validate All Patterns
  </text>
  <text x="600" y="315" font-size="12" text-anchor="middle" fill="#666">
    Ensure efficient access
  </text>

  <!-- Arrow -->
  <path d="M 600 330 L 600 350" stroke="#666" stroke-width="2" fill="none" marker-end="url(#arrowhead)"/>

  <!-- Result -->
  <ellipse cx="600" cy="390" rx="120" ry="40" fill="#4caf50" stroke="#2e7d32" stroke-width="2"/>
  <text x="600" y="395" font-size="13" font-weight="bold" text-anchor="middle" fill="#fff">
    Fixed Access Patterns
  </text>
  <text x="600" y="410" font-size="11" text-anchor="middle" fill="#fff">
    Predictable Performance
  </text>

  <!-- Arrow marker definition -->
  <defs>
    <marker id="arrowhead" markerWidth="10" markerHeight="10" refX="9" refY="3" orient="auto">
      <polygon points="0 0, 10 3, 0 6" fill="#666"/>
    </marker>
  </defs>
</svg>

Key Principles

Principle 1: Understand Your Access Patterns Before designing your data model, document every way your application needs to access data:
Example: E-commerce Application Access Patterns

1. Get user profile by userId
2. Get all orders for a user (sorted by date)
3. Get order details by orderId
4. Get all items in an order
5. Get product details by productId
6. Get all products in a category
7. Get product inventory by warehouseId
8. Get user's shopping cart
9. Get order status history
10. Search orders by status
Principle 2: Minimize Request Count Design your table to retrieve related data in a single request:
// ANTI-PATTERN: Multiple requests (SQL approach)
const user = await getUser(userId);
const orders = await getOrders(userId);
const addresses = await getAddresses(userId);

// GOOD PATTERN: Single request (DynamoDB approach)
const userProfile = await query({
  KeyConditionExpression: 'PK = :pk',
  ExpressionAttributeValues: {
    ':pk': `USER#${userId}`
  }
});
// Returns: user metadata, orders, addresses in one response
Principle 3: Use Composite Keys Partition keys (PK) and sort keys (SK) should be composite values that enable efficient queries:
// Composite key examples
PK: 'USER#12345'
SK: 'ORDER#2024-01-15#54321'

PK: 'PRODUCT#98765'
SK: 'REVIEW#2024-01-10#USER#12345'

PK: 'ORG#ACME'
SK: 'EMPLOYEE#DEPT#ENGINEERING#EMP#67890'

Single-Table Design

What is Single-Table Design?

Single-table design is a DynamoDB modeling technique where you store multiple entity types in one table, using generic attribute names (PK, SK, GSI1PK, GSI1SK) and item type identifiers.
<svg viewBox="0 0 900 600" xmlns="http://www.w3.org/2000/svg">
  <!-- Title -->
  <text x="450" y="30" font-size="18" font-weight="bold" text-anchor="middle" fill="#333">
    Single-Table Design Structure
  </text>

  <!-- Main Table -->
  <rect x="50" y="60" width="800" height="500" fill="#f5f5f5" stroke="#333" stroke-width="2" rx="5"/>
  <text x="450" y="90" font-size="16" font-weight="bold" text-anchor="middle" fill="#333">
    Application Table
  </text>

  <!-- Table Header -->
  <rect x="70" y="110" width="760" height="40" fill="#1976d2" stroke="#0d47a1" stroke-width="1"/>
  <text x="150" y="135" font-size="14" font-weight="bold" text-anchor="middle" fill="#fff">PK</text>
  <text x="300" y="135" font-size="14" font-weight="bold" text-anchor="middle" fill="#fff">SK</text>
  <text x="450" y="135" font-size="14" font-weight="bold" text-anchor="middle" fill="#fff">Type</text>
  <text x="600" y="135" font-size="14" font-weight="bold" text-anchor="middle" fill="#fff">Attributes</text>
  <text x="750" y="135" font-size="14" font-weight="bold" text-anchor="middle" fill="#fff">GSI1PK</text>

  <!-- User Entity -->
  <rect x="70" y="150" width="760" height="35" fill="#e3f2fd" stroke="#90caf9" stroke-width="1"/>
  <text x="150" y="172" font-size="12" text-anchor="middle" fill="#333">USER#123</text>
  <text x="300" y="172" font-size="12" text-anchor="middle" fill="#333">METADATA</text>
  <text x="450" y="172" font-size="12" text-anchor="middle" fill="#333">User</text>
  <text x="600" y="172" font-size="11" text-anchor="middle" fill="#666">name, email...</text>
  <text x="750" y="172" font-size="12" text-anchor="middle" fill="#333">EMAIL#[email protected]</text>

  <!-- Order Entity -->
  <rect x="70" y="185" width="760" height="35" fill="#fff3e0" stroke="#ffb74d" stroke-width="1"/>
  <text x="150" y="207" font-size="12" text-anchor="middle" fill="#333">USER#123</text>
  <text x="300" y="207" font-size="12" text-anchor="middle" fill="#333">ORDER#2024-01-15</text>
  <text x="450" y="207" font-size="12" text-anchor="middle" fill="#333">Order</text>
  <text x="600" y="207" font-size="11" text-anchor="middle" fill="#666">total, status...</text>
  <text x="750" y="207" font-size="12" text-anchor="middle" fill="#333">STATUS#PENDING</text>

  <!-- Order Items -->
  <rect x="70" y="220" width="760" height="35" fill="#fff3e0" stroke="#ffb74d" stroke-width="1"/>
  <text x="150" y="242" font-size="12" text-anchor="middle" fill="#333">ORDER#001</text>
  <text x="300" y="242" font-size="12" text-anchor="middle" fill="#333">ITEM#PROD#456</text>
  <text x="450" y="242" font-size="12" text-anchor="middle" fill="#333">OrderItem</text>
  <text x="600" y="242" font-size="11" text-anchor="middle" fill="#666">qty, price...</text>
  <text x="750" y="242" font-size="12" text-anchor="middle" fill="#333">PRODUCT#456</text>

  <!-- Product Entity -->
  <rect x="70" y="255" width="760" height="35" fill="#e8f5e9" stroke="#81c784" stroke-width="1"/>
  <text x="150" y="277" font-size="12" text-anchor="middle" fill="#333">PRODUCT#456</text>
  <text x="300" y="277" font-size="12" text-anchor="middle" fill="#333">METADATA</text>
  <text x="450" y="277" font-size="12" text-anchor="middle" fill="#333">Product</text>
  <text x="600" y="277" font-size="11" text-anchor="middle" fill="#666">name, desc...</text>
  <text x="750" y="277" font-size="12" text-anchor="middle" fill="#333">CATEGORY#BOOKS</text>

  <!-- Review Entity -->
  <rect x="70" y="290" width="760" height="35" fill="#e8f5e9" stroke="#81c784" stroke-width="1"/>
  <text x="150" y="312" font-size="12" text-anchor="middle" fill="#333">PRODUCT#456</text>
  <text x="300" y="312" font-size="12" text-anchor="middle" fill="#333">REVIEW#USER#123</text>
  <text x="450" y="312" font-size="12" text-anchor="middle" fill="#333">Review</text>
  <text x="600" y="312" font-size="11" text-anchor="middle" fill="#666">rating, text...</text>
  <text x="750" y="312" font-size="12" text-anchor="middle" fill="#333">USER#123</text>

  <!-- More entities indicator -->
  <text x="450" y="345" font-size="14" text-anchor="middle" fill="#666" font-style="italic">
    ... more entities ...
  </text>

  <!-- Benefits Box -->
  <rect x="90" y="380" width="340" height="150" fill="#fff" stroke="#4caf50" stroke-width="2" rx="5"/>
  <text x="260" y="410" font-size="15" font-weight="bold" text-anchor="middle" fill="#4caf50">
    Benefits
  </text>
  <text x="110" y="440" font-size="12" fill="#333">
    ✓ Single table = simpler operations
  </text>
  <text x="110" y="465" font-size="12" fill="#333">
    ✓ Related data in one partition
  </text>
  <text x="110" y="490" font-size="12" fill="#333">
    ✓ Atomic transactions possible
  </text>
  <text x="110" y="515" font-size="12" fill="#333">
    ✓ Fewer network round trips
  </text>

  <!-- Challenges Box -->
  <rect x="470" y="380" width="340" height="150" fill="#fff" stroke="#ff9800" stroke-width="2" rx="5"/>
  <text x="640" y="410" font-size="15" font-weight="bold" text-anchor="middle" fill="#ff9800">
    Challenges
  </text>
  <text x="490" y="440" font-size="12" fill="#333">
    ⚠ Requires upfront access patterns
  </text>
  <text x="490" y="465" font-size="12" fill="#333">
    ⚠ Schema less obvious
  </text>
  <text x="490" y="490" font-size="12" fill="#333">
    ⚠ Harder to understand initially
  </text>
  <text x="490" y="515" font-size="12" fill="#333">
    ⚠ Limited ad-hoc queries
  </text>
</svg>

Advantages of Single-Table Design

  1. Reduced Latency: Fetch related items in one request
  2. Atomic Operations: Transaction support across entity types
  3. Cost Efficiency: Fewer requests = lower costs
  4. Simplified Operations: Manage one table instead of many

Single-Table Design Example

Let’s model an e-commerce application with users, orders, products, and reviews:
// Entity Relationships
User (1) ----< (N) Orders
Order (1) ----< (N) OrderItems
Product (1) ----< (N) Reviews
Product (1) ----< (N) OrderItems

// Table Design
{
  TableName: 'EcommerceApp',
  KeySchema: [
    { AttributeName: 'PK', KeyType: 'HASH' },
    { AttributeName: 'SK', KeyType: 'RANGE' }
  ],
  AttributeDefinitions: [
    { AttributeName: 'PK', AttributeType: 'S' },
    { AttributeName: 'SK', AttributeType: 'S' },
    { AttributeName: 'GSI1PK', AttributeType: 'S' },
    { AttributeName: 'GSI1SK', AttributeType: 'S' }
  ],
  GlobalSecondaryIndexes: [
    {
      IndexName: 'GSI1',
      KeySchema: [
        { AttributeName: 'GSI1PK', KeyType: 'HASH' },
        { AttributeName: 'GSI1SK', KeyType: 'RANGE' }
      ],
      Projection: { ProjectionType: 'ALL' }
    }
  ]
}

// Sample Items

// User
{
  PK: 'USER#12345',
  SK: 'METADATA',
  Type: 'User',
  userId: '12345',
  name: 'John Doe',
  email: '[email protected]',
  createdAt: '2024-01-01T00:00:00Z',
  GSI1PK: 'EMAIL#[email protected]',
  GSI1SK: 'USER#12345'
}

// User's Order
{
  PK: 'USER#12345',
  SK: 'ORDER#2024-01-15#ORD-001',
  Type: 'Order',
  orderId: 'ORD-001',
  userId: '12345',
  orderDate: '2024-01-15T10:30:00Z',
  total: 99.99,
  status: 'DELIVERED',
  GSI1PK: 'STATUS#DELIVERED',
  GSI1SK: '2024-01-15#ORD-001'
}

// Order Details (duplicated for efficient access)
{
  PK: 'ORDER#ORD-001',
  SK: 'METADATA',
  Type: 'Order',
  orderId: 'ORD-001',
  userId: '12345',
  orderDate: '2024-01-15T10:30:00Z',
  total: 99.99,
  status: 'DELIVERED',
  shippingAddress: '123 Main St',
  GSI1PK: 'USER#12345',
  GSI1SK: 'ORDER#2024-01-15#ORD-001'
}

// Order Item
{
  PK: 'ORDER#ORD-001',
  SK: 'ITEM#PRODUCT#PROD-789',
  Type: 'OrderItem',
  orderId: 'ORD-001',
  productId: 'PROD-789',
  productName: 'Wireless Mouse',
  quantity: 2,
  price: 29.99,
  GSI1PK: 'PRODUCT#PROD-789',
  GSI1SK: 'ORDER#ORD-001'
}

// Product
{
  PK: 'PRODUCT#PROD-789',
  SK: 'METADATA',
  Type: 'Product',
  productId: 'PROD-789',
  name: 'Wireless Mouse',
  description: 'Ergonomic wireless mouse',
  price: 29.99,
  category: 'Electronics',
  inventory: 150,
  GSI1PK: 'CATEGORY#Electronics',
  GSI1SK: 'PRODUCT#PROD-789'
}

// Product Review
{
  PK: 'PRODUCT#PROD-789',
  SK: 'REVIEW#2024-01-16#USER#12345',
  Type: 'Review',
  productId: 'PROD-789',
  userId: '12345',
  userName: 'John Doe',
  rating: 5,
  reviewText: 'Great mouse!',
  reviewDate: '2024-01-16T15:20:00Z',
  GSI1PK: 'USER#12345',
  GSI1SK: 'REVIEW#2024-01-16'
}

Querying the Single-Table Design

// Access Pattern 1: Get user profile
const getUserProfile = async (userId) => {
  return await dynamodb.query({
    TableName: 'EcommerceApp',
    KeyConditionExpression: 'PK = :pk AND SK = :sk',
    ExpressionAttributeValues: {
      ':pk': `USER#${userId}`,
      ':sk': 'METADATA'
    }
  });
};

// Access Pattern 2: Get all orders for a user (sorted by date)
const getUserOrders = async (userId) => {
  return await dynamodb.query({
    TableName: 'EcommerceApp',
    KeyConditionExpression: 'PK = :pk AND begins_with(SK, :sk)',
    ExpressionAttributeValues: {
      ':pk': `USER#${userId}`,
      ':sk': 'ORDER#'
    },
    ScanIndexForward: false // Most recent first
  });
};

// Access Pattern 3: Get order with all items
const getOrderDetails = async (orderId) => {
  return await dynamodb.query({
    TableName: 'EcommerceApp',
    KeyConditionExpression: 'PK = :pk',
    ExpressionAttributeValues: {
      ':pk': `ORDER#${orderId}`
    }
  });
  // Returns order metadata + all order items in one query
};

// Access Pattern 4: Get product with reviews
const getProductWithReviews = async (productId) => {
  return await dynamodb.query({
    TableName: 'EcommerceApp',
    KeyConditionExpression: 'PK = :pk',
    ExpressionAttributeValues: {
      ':pk': `PRODUCT#${productId}`
    }
  });
  // Returns product metadata + all reviews in one query
};

// Access Pattern 5: Get products by category (using GSI)
const getProductsByCategory = async (category) => {
  return await dynamodb.query({
    TableName: 'EcommerceApp',
    IndexName: 'GSI1',
    KeyConditionExpression: 'GSI1PK = :gsi1pk',
    ExpressionAttributeValues: {
      ':gsi1pk': `CATEGORY#${category}`
    }
  });
};

// Access Pattern 6: Get orders by status (using GSI)
const getOrdersByStatus = async (status) => {
  return await dynamodb.query({
    TableName: 'EcommerceApp',
    IndexName: 'GSI1',
    KeyConditionExpression: 'GSI1PK = :gsi1pk',
    ExpressionAttributeValues: {
      ':gsi1pk': `STATUS#${status}`
    },
    ScanIndexForward: false // Most recent first
  });
};

// Access Pattern 7: Get user by email (using GSI)
const getUserByEmail = async (email) => {
  return await dynamodb.query({
    TableName: 'EcommerceApp',
    IndexName: 'GSI1',
    KeyConditionExpression: 'GSI1PK = :gsi1pk',
    ExpressionAttributeValues: {
      ':gsi1pk': `EMAIL#${email}`
    }
  });
};

// Access Pattern 8: Get user's reviews (using GSI)
const getUserReviews = async (userId) => {
  return await dynamodb.query({
    TableName: 'EcommerceApp',
    IndexName: 'GSI1',
    KeyConditionExpression: 'GSI1PK = :gsi1pk AND begins_with(GSI1SK, :gsi1sk)',
    ExpressionAttributeValues: {
      ':gsi1pk': `USER#${userId}`,
      ':gsi1sk': 'REVIEW#'
    },
    ScanIndexForward: false
  });
};

Advanced Data Modeling Patterns

Pattern 1: Hierarchical Data

Model hierarchical relationships using composite sort keys:
// Organization Structure: Org -> Departments -> Teams -> Employees

// Organization
{
  PK: 'ORG#ACME',
  SK: 'METADATA',
  Type: 'Organization',
  name: 'ACME Corp',
  industry: 'Technology'
}

// Department
{
  PK: 'ORG#ACME',
  SK: 'DEPT#Engineering',
  Type: 'Department',
  deptName: 'Engineering',
  budget: 5000000
}

// Team
{
  PK: 'ORG#ACME',
  SK: 'DEPT#Engineering#TEAM#Backend',
  Type: 'Team',
  teamName: 'Backend Team',
  deptName: 'Engineering'
}

// Employee
{
  PK: 'ORG#ACME',
  SK: 'DEPT#Engineering#TEAM#Backend#EMP#12345',
  Type: 'Employee',
  employeeId: '12345',
  name: 'Alice Johnson',
  role: 'Senior Engineer',
  GSI1PK: 'EMPLOYEE#12345',
  GSI1SK: 'METADATA'
}

// Query all departments
dynamodb.query({
  KeyConditionExpression: 'PK = :pk AND begins_with(SK, :sk)',
  ExpressionAttributeValues: {
    ':pk': 'ORG#ACME',
    ':sk': 'DEPT#'
  }
});

// Query all teams in Engineering
dynamodb.query({
  KeyConditionExpression: 'PK = :pk AND begins_with(SK, :sk)',
  ExpressionAttributeValues: {
    ':pk': 'ORG#ACME',
    ':sk': 'DEPT#Engineering#TEAM#'
  }
});

// Query all employees in Backend team
dynamodb.query({
  KeyConditionExpression: 'PK = :pk AND begins_with(SK, :sk)',
  ExpressionAttributeValues: {
    ':pk': 'ORG#ACME',
    ':sk': 'DEPT#Engineering#TEAM#Backend#EMP#'
  }
});

Pattern 2: Many-to-Many Relationships

Use adjacency list pattern for many-to-many relationships:
// Students and Classes (many-to-many)

// Student
{
  PK: 'STUDENT#S001',
  SK: 'METADATA',
  Type: 'Student',
  studentId: 'S001',
  name: 'John Smith',
  email: '[email protected]'
}

// Student -> Class relationship
{
  PK: 'STUDENT#S001',
  SK: 'CLASS#C101',
  Type: 'Enrollment',
  studentId: 'S001',
  classId: 'C101',
  className: 'Database Systems',
  enrolledDate: '2024-01-10',
  grade: null,
  GSI1PK: 'CLASS#C101',
  GSI1SK: 'STUDENT#S001'
}

// Class
{
  PK: 'CLASS#C101',
  SK: 'METADATA',
  Type: 'Class',
  classId: 'C101',
  name: 'Database Systems',
  instructor: 'Dr. Smith',
  capacity: 30
}

// Class -> Student relationship (inverse)
{
  PK: 'CLASS#C101',
  SK: 'STUDENT#S001',
  Type: 'Enrollment',
  studentId: 'S001',
  studentName: 'John Smith',
  classId: 'C101',
  enrolledDate: '2024-01-10',
  grade: null,
  GSI1PK: 'STUDENT#S001',
  GSI1SK: 'CLASS#C101'
}

// Get all classes for a student
dynamodb.query({
  KeyConditionExpression: 'PK = :pk AND begins_with(SK, :sk)',
  ExpressionAttributeValues: {
    ':pk': 'STUDENT#S001',
    ':sk': 'CLASS#'
  }
});

// Get all students in a class
dynamodb.query({
  KeyConditionExpression: 'PK = :pk AND begins_with(SK, :sk)',
  ExpressionAttributeValues: {
    ':pk': 'CLASS#C101',
    ':sk': 'STUDENT#'
  }
});

Pattern 3: Time-Series Data

Efficiently store and query time-series data:
// IoT Sensor Data

// Strategy 1: Date-based partitioning
{
  PK: 'SENSOR#S123#DATE#2024-01-15',
  SK: 'TIMESTAMP#2024-01-15T10:30:45.123Z',
  Type: 'SensorReading',
  sensorId: 'S123',
  temperature: 72.5,
  humidity: 45.2,
  pressure: 1013.25,
  timestamp: '2024-01-15T10:30:45.123Z'
}

// Strategy 2: Hour-based partitioning (more granular)
{
  PK: 'SENSOR#S123#HOUR#2024-01-15-10',
  SK: 'TIMESTAMP#2024-01-15T10:30:45.123Z',
  Type: 'SensorReading',
  sensorId: 'S123',
  temperature: 72.5,
  timestamp: '2024-01-15T10:30:45.123Z',
  GSI1PK: 'SENSOR#S123',
  GSI1SK: 'TIMESTAMP#2024-01-15T10:30:45.123Z'
}

// Query readings for a specific hour
dynamodb.query({
  KeyConditionExpression: 'PK = :pk',
  ExpressionAttributeValues: {
    ':pk': 'SENSOR#S123#HOUR#2024-01-15-10'
  }
});

// Query readings in a time range (using GSI)
dynamodb.query({
  IndexName: 'GSI1',
  KeyConditionExpression: 'GSI1PK = :pk AND GSI1SK BETWEEN :start AND :end',
  ExpressionAttributeValues: {
    ':pk': 'SENSOR#S123',
    ':start': 'TIMESTAMP#2024-01-15T09:00:00.000Z',
    ':end': 'TIMESTAMP#2024-01-15T11:00:00.000Z'
  }
});

// Strategy 3: Aggregated data with TTL
{
  PK: 'SENSOR#S123#AGG#HOURLY',
  SK: 'HOUR#2024-01-15-10',
  Type: 'AggregatedReading',
  sensorId: 'S123',
  hour: '2024-01-15-10',
  avgTemperature: 72.3,
  minTemperature: 70.1,
  maxTemperature: 74.5,
  readingCount: 120,
  ttl: 1737849600 // Expire after 30 days
}

Pattern 4: Composite Sort Key Patterns

Use composite sort keys to enable multiple query patterns:
// E-commerce Product Catalog

// Product with composite SK for filtering
{
  PK: 'CATEGORY#Electronics',
  SK: 'BRAND#Apple#PRICE#0999.99#PRODUCT#iPhone15',
  Type: 'Product',
  productId: 'iPhone15',
  name: 'iPhone 15',
  brand: 'Apple',
  price: 999.99,
  rating: 4.8,
  inStock: true,
  GSI1PK: 'PRODUCT#iPhone15',
  GSI1SK: 'METADATA'
}

// Query products by category and brand
dynamodb.query({
  KeyConditionExpression: 'PK = :pk AND begins_with(SK, :sk)',
  ExpressionAttributeValues: {
    ':pk': 'CATEGORY#Electronics',
    ':sk': 'BRAND#Apple#'
  }
});

// Query products by category, brand, and price range
dynamodb.query({
  KeyConditionExpression: 'PK = :pk AND SK BETWEEN :start AND :end',
  ExpressionAttributeValues: {
    ':pk': 'CATEGORY#Electronics',
    ':start': 'BRAND#Apple#PRICE#0500.00',
    ':end': 'BRAND#Apple#PRICE#1000.00'
  }
});

Pattern 5: Sparse Indexes

Use sparse GSIs to index only items with specific attributes:
// Only create GSI keys for items that need to be queried

// Regular user (no special attributes)
{
  PK: 'USER#12345',
  SK: 'METADATA',
  Type: 'User',
  name: 'John Doe',
  email: '[email protected]'
  // No GSI1PK/SK - not indexed in GSI1
}

// Premium user (indexed for premium queries)
{
  PK: 'USER#67890',
  SK: 'METADATA',
  Type: 'User',
  name: 'Jane Smith',
  email: '[email protected]',
  isPremium: true,
  premiumSince: '2024-01-01',
  GSI1PK: 'PREMIUM_USERS',
  GSI1SK: '2024-01-01#USER#67890'
}

// Query only premium users (sparse index)
dynamodb.query({
  IndexName: 'GSI1',
  KeyConditionExpression: 'GSI1PK = :pk',
  ExpressionAttributeValues: {
    ':pk': 'PREMIUM_USERS'
  }
});

Data Modeling Best Practices

1. Avoid Hot Partitions

Distribute writes evenly across partitions:
// BAD: All writes go to same partition
{
  PK: 'GLOBAL_COUNTER',  // Hot partition!
  SK: 'COUNT',
  value: 1000000
}

// GOOD: Distribute using random suffix
const shardId = Math.floor(Math.random() * 10);
{
  PK: `COUNTER_SHARD#${shardId}`,
  SK: 'COUNT',
  value: 100000
}
// Read all shards and sum to get total

// GOOD: Use timestamp-based partitioning
{
  PK: `METRICS#${new Date().toISOString().split('T')[0]}`,
  SK: `TIMESTAMP#${Date.now()}`,
  metric: 'page_view'
}

2. Pre-compute and Denormalize

Store computed values to avoid expensive queries:
// Store aggregated data with detail data

// Order with pre-computed totals
{
  PK: 'ORDER#ORD-001',
  SK: 'METADATA',
  Type: 'Order',
  orderId: 'ORD-001',
  itemCount: 3,           // Pre-computed
  subtotal: 89.97,        // Pre-computed
  tax: 7.20,              // Pre-computed
  total: 97.17,           // Pre-computed
  status: 'COMPLETED'
}

// User with denormalized frequently-accessed data
{
  PK: 'USER#12345',
  SK: 'METADATA',
  Type: 'User',
  userId: '12345',
  name: 'John Doe',
  email: '[email protected]',
  orderCount: 25,         // Denormalized
  totalSpent: 2500.00,    // Denormalized
  lastOrderDate: '2024-01-15',  // Denormalized
  memberSince: '2023-01-01'
}

3. Use Item Collections Wisely

Keep related items together for efficient queries:
// Good: Related items share partition key
{
  PK: 'THREAD#T001',  // Discussion thread
  SK: 'METADATA',
  threadTitle: 'How to use DynamoDB?',
  createdBy: 'USER#123',
  createdAt: '2024-01-10'
}

{
  PK: 'THREAD#T001',
  SK: 'POST#2024-01-10T10:00:00#P001',
  postId: 'P001',
  author: 'USER#123',
  content: 'Original post content',
  timestamp: '2024-01-10T10:00:00Z'
}

{
  PK: 'THREAD#T001',
  SK: 'POST#2024-01-10T10:15:00#P002',
  postId: 'P002',
  author: 'USER#456',
  content: 'Reply to post',
  timestamp: '2024-01-10T10:15:00Z'
}

// Single query gets thread + all posts
dynamodb.query({
  KeyConditionExpression: 'PK = :pk',
  ExpressionAttributeValues: {
    ':pk': 'THREAD#T001'
  }
});

4. Version Your Schema

Include schema version for future migrations:
{
  PK: 'USER#12345',
  SK: 'METADATA',
  Type: 'User',
  SchemaVersion: '2.0',  // Track schema version
  userId: '12345',
  name: 'John Doe',
  // ... other attributes
}

// Handle different versions in application code
const handleUser = (item) => {
  switch (item.SchemaVersion) {
    case '1.0':
      return migrateFromV1(item);
    case '2.0':
      return item;
    default:
      throw new Error('Unknown schema version');
  }
};

5. Implement Soft Deletes

Use status flags instead of deleting items:
// Instead of deleting
{
  PK: 'USER#12345',
  SK: 'METADATA',
  Type: 'User',
  userId: '12345',
  name: 'John Doe',
  status: 'DELETED',           // Soft delete flag
  deletedAt: '2024-01-20',
  ttl: 1740787200              // Hard delete after 90 days
}

// Filter in queries
dynamodb.query({
  KeyConditionExpression: 'PK = :pk',
  FilterExpression: 'attribute_not_exists(#status) OR #status <> :deleted',
  ExpressionAttributeNames: {
    '#status': 'status'
  },
  ExpressionAttributeValues: {
    ':pk': 'USER#12345',
    ':deleted': 'DELETED'
  }
});

Multi-Table vs Single-Table Design

When to Use Multiple Tables

Single-table design isn’t always the answer. Consider multiple tables when:
  1. Different Access Patterns: Entities have completely different access patterns
  2. Different Scaling Requirements: Some entities need different throughput
  3. Different Security Requirements: Different IAM policies per entity type
  4. Different Backup Requirements: Different backup/restore needs
  5. Team Boundaries: Different teams manage different entities
// Multi-table example: Different requirements

// Users Table (high read, low write)
{
  TableName: 'Users',
  BillingMode: 'PAY_PER_REQUEST',
  PointInTimeRecoveryEnabled: true
}

// Sessions Table (high read/write, TTL enabled)
{
  TableName: 'Sessions',
  BillingMode: 'PROVISIONED',
  ProvisionedThroughput: {
    ReadCapacityUnits: 5000,
    WriteCapacityUnits: 5000
  },
  TimeToLiveSpecification: {
    Enabled: true,
    AttributeName: 'ttl'
  }
}

// Analytics Table (write-heavy, different backup schedule)
{
  TableName: 'Analytics',
  BillingMode: 'PAY_PER_REQUEST',
  StreamSpecification: {
    StreamEnabled: true,
    StreamViewType: 'NEW_IMAGE'
  }
}

Common Anti-Patterns to Avoid

Anti-Pattern 1: Normalization

// WRONG: Normalized like SQL
// Users table
{ userId: '123', name: 'John' }

// Orders table
{ orderId: '001', userId: '123', total: 99.99 }

// OrderItems table
{ orderItemId: '1', orderId: '001', productId: '789' }

// Requires 3 queries to get order with items for a user!

// RIGHT: Denormalized
{
  PK: 'USER#123',
  SK: 'ORDER#001',
  orderId: '001',
  total: 99.99,
  items: [
    { productId: '789', quantity: 2, price: 49.99 }
  ]
}
// Single query gets everything

Anti-Pattern 2: Using Scan for Queries

// WRONG: Scanning entire table
const results = await dynamodb.scan({
  TableName: 'Products',
  FilterExpression: 'category = :cat',
  ExpressionAttributeValues: {
    ':cat': 'Electronics'
  }
});

// RIGHT: Query with proper key design
const results = await dynamodb.query({
  TableName: 'Products',
  KeyConditionExpression: 'PK = :pk',
  ExpressionAttributeValues: {
    ':pk': 'CATEGORY#Electronics'
  }
});

Anti-Pattern 3: Large Item Sizes

// WRONG: Storing large arrays (approaching 400KB limit)
{
  PK: 'USER#123',
  SK: 'METADATA',
  orderHistory: [/* 1000s of orders */]  // Too large!
}

// RIGHT: Separate items
{
  PK: 'USER#123',
  SK: 'ORDER#2024-01-15#001'
}
{
  PK: 'USER#123',
  SK: 'ORDER#2024-01-14#002'
}
// Query with pagination

Anti-Pattern 4: Not Using Sort Keys

// WRONG: Only using partition key
{
  PK: 'USER#123',
  // No SK - missing query capabilities
  name: 'John'
}

// RIGHT: Always use sort keys for flexibility
{
  PK: 'USER#123',
  SK: 'METADATA',
  name: 'John'
}

Data Modeling Workflow

<svg viewBox="0 0 800 900" xmlns="http://www.w3.org/2000/svg">
  <!-- Title -->
  <text x="400" y="30" font-size="18" font-weight="bold" text-anchor="middle" fill="#333">
    DynamoDB Data Modeling Workflow
  </text>

  <!-- Step 1 -->
  <rect x="250" y="60" width="300" height="80" fill="#e3f2fd" stroke="#1976d2" stroke-width="2" rx="5"/>
  <text x="400" y="90" font-size="14" font-weight="bold" text-anchor="middle" fill="#1976d2">
    Step 1: Identify Entities
  </text>
  <text x="400" y="115" font-size="12" text-anchor="middle" fill="#333">
    List all entity types in your application
  </text>
  <text x="400" y="130" font-size="11" text-anchor="middle" fill="#666">
    (Users, Orders, Products, etc.)
  </text>

  <path d="M 400 140 L 400 170" stroke="#666" stroke-width="2" fill="none" marker-end="url(#arrowhead)"/>

  <!-- Step 2 -->
  <rect x="250" y="170" width="300" height="80" fill="#fff3e0" stroke="#f57c00" stroke-width="2" rx="5"/>
  <text x="400" y="200" font-size="14" font-weight="bold" text-anchor="middle" fill="#f57c00">
    Step 2: Define Relationships
  </text>
  <text x="400" y="225" font-size="12" text-anchor="middle" fill="#333">
    Map entity relationships
  </text>
  <text x="400" y="240" font-size="11" text-anchor="middle" fill="#666">
    (1:1, 1:N, N:M)
  </text>

  <path d="M 400 250 L 400 280" stroke="#666" stroke-width="2" fill="none" marker-end="url(#arrowhead)"/>

  <!-- Step 3 -->
  <rect x="250" y="280" width="300" height="80" fill="#e8f5e9" stroke="#4caf50" stroke-width="2" rx="5"/>
  <text x="400" y="310" font-size="14" font-weight="bold" text-anchor="middle" fill="#4caf50">
    Step 3: List Access Patterns
  </text>
  <text x="400" y="335" font-size="12" text-anchor="middle" fill="#333">
    Document ALL read/write patterns
  </text>
  <text x="400" y="350" font-size="11" text-anchor="middle" fill="#666">
    Include frequency and latency requirements
  </text>

  <path d="M 400 360 L 400 390" stroke="#666" stroke-width="2" fill="none" marker-end="url(#arrowhead)"/>

  <!-- Step 4 -->
  <rect x="250" y="390" width="300" height="80" fill="#f3e5f5" stroke="#9c27b0" stroke-width="2" rx="5"/>
  <text x="400" y="420" font-size="14" font-weight="bold" text-anchor="middle" fill="#9c27b0">
    Step 4: Design Primary Keys
  </text>
  <text x="400" y="445" font-size="12" text-anchor="middle" fill="#333">
    Create PK/SK patterns for main patterns
  </text>
  <text x="400" y="460" font-size="11" text-anchor="middle" fill="#666">
    Support 1-to-N queries with sort keys
  </text>

  <path d="M 400 470 L 400 500" stroke="#666" stroke-width="2" fill="none" marker-end="url(#arrowhead)"/>

  <!-- Step 5 -->
  <rect x="250" y="500" width="300" height="80" fill="#fce4ec" stroke="#e91e63" stroke-width="2" rx="5"/>
  <text x="400" y="530" font-size="14" font-weight="bold" text-anchor="middle" fill="#e91e63">
    Step 5: Add GSIs for Inverted Access
  </text>
  <text x="400" y="555" font-size="12" text-anchor="middle" fill="#333">
    Design GSI keys for remaining patterns
  </text>
  <text x="400" y="570" font-size="11" text-anchor="middle" fill="#666">
    Minimize GSI count (max 20 per table)
  </text>

  <path d="M 400 580 L 400 610" stroke="#666" stroke-width="2" fill="none" marker-end="url(#arrowhead)"/>

  <!-- Step 6 -->
  <rect x="250" y="610" width="300" height="80" fill="#e0f2f1" stroke="#009688" stroke-width="2" rx="5"/>
  <text x="400" y="640" font-size="14" font-weight="bold" text-anchor="middle" fill="#009688">
    Step 6: Optimize Item Structure
  </text>
  <text x="400" y="665" font-size="12" text-anchor="middle" fill="#333">
    Denormalize, add computed fields
  </text>
  <text x="400" y="680" font-size="11" text-anchor="middle" fill="#666">
    Keep items under 400KB
  </text>

  <path d="M 400 690 L 400 720" stroke="#666" stroke-width="2" fill="none" marker-end="url(#arrowhead)"/>

  <!-- Step 7 -->
  <rect x="250" y="720" width="300" height="80" fill="#fff9c4" stroke="#fbc02d" stroke-width="2" rx="5"/>
  <text x="400" y="750" font-size="14" font-weight="bold" text-anchor="middle" fill="#f57f17">
    Step 7: Validate and Test
  </text>
  <text x="400" y="775" font-size="12" text-anchor="middle" fill="#333">
    Verify all access patterns are supported
  </text>
  <text x="400" y="790" font-size="11" text-anchor="middle" fill="#666">
    Test with realistic data volumes
  </text>

  <path d="M 400 800 L 400 830" stroke="#666" stroke-width="2" fill="none" marker-end="url(#arrowhead)"/>

  <!-- Final Result -->
  <ellipse cx="400" cy="860" rx="120" ry="30" fill="#4caf50" stroke="#2e7d32" stroke-width="2"/>
  <text x="400" y="870" font-size="13" font-weight="bold" text-anchor="middle" fill="#fff">
    Optimized Data Model
  </text>

  <!-- Arrow marker -->
  <defs>
    <marker id="arrowhead" markerWidth="10" markerHeight="10" refX="9" refY="3" orient="auto">
      <polygon points="0 0, 10 3, 0 6" fill="#666"/>
    </marker>
  </defs>
</svg>

Migration Strategies

Migrating from SQL to DynamoDB

// SQL Schema
CREATE TABLE users (
  user_id INT PRIMARY KEY,
  email VARCHAR(255) UNIQUE,
  name VARCHAR(255)
);

CREATE TABLE orders (
  order_id INT PRIMARY KEY,
  user_id INT,
  order_date DATE,
  total DECIMAL(10,2),
  FOREIGN KEY (user_id) REFERENCES users(user_id)
);

CREATE TABLE order_items (
  order_item_id INT PRIMARY KEY,
  order_id INT,
  product_id INT,
  quantity INT,
  FOREIGN KEY (order_id) REFERENCES orders(order_id)
);

// DynamoDB Single-Table Design

// Access patterns:
// 1. Get user by ID
// 2. Get user by email
// 3. Get user's orders
// 4. Get order by ID
// 5. Get order items

// Items:
{
  PK: 'USER#123',
  SK: 'METADATA',
  Type: 'User',
  email: '[email protected]',
  name: 'John Doe',
  GSI1PK: 'EMAIL#[email protected]',
  GSI1SK: 'USER#123'
}

{
  PK: 'USER#123',
  SK: 'ORDER#2024-01-15#001',
  Type: 'Order',
  orderId: '001',
  orderDate: '2024-01-15',
  total: 99.99,
  GSI1PK: 'ORDER#001',
  GSI1SK: 'METADATA'
}

{
  PK: 'ORDER#001',
  SK: 'ITEM#1',
  Type: 'OrderItem',
  orderItemId: '1',
  productId: '789',
  quantity: 2
}

Performance Optimization Through Data Modeling

1. Reduce Item Size

// BEFORE: Verbose attribute names (wastes space)
{
  PK: 'USER#123',
  SK: 'METADATA',
  user_identifier: '123',
  user_full_name: 'John Doe',
  user_email_address: '[email protected]',
  user_registration_date: '2024-01-01'
}

// AFTER: Shortened names (saves space, reduces costs)
{
  PK: 'USER#123',
  SK: 'METADATA',
  uid: '123',
  name: 'John Doe',
  email: '[email protected]',
  regDate: '2024-01-01'
}
// BEFORE: Multiple queries
const user = await getUser(userId);
const orders = await getOrders(userId);
const addresses = await getAddresses(userId);

// AFTER: Single query
const userData = await dynamodb.query({
  KeyConditionExpression: 'PK = :pk',
  ExpressionAttributeValues: {
    ':pk': `USER#${userId}`
  }
});

// Parse results
const user = userData.Items.find(i => i.SK === 'METADATA');
const orders = userData.Items.filter(i => i.SK.startsWith('ORDER#'));
const addresses = userData.Items.filter(i => i.SK.startsWith('ADDRESS#'));

3. Use Projection Expressions

// Retrieve only needed attributes
const result = await dynamodb.query({
  TableName: 'EcommerceApp',
  KeyConditionExpression: 'PK = :pk',
  ProjectionExpression: 'SK, #name, email',
  ExpressionAttributeNames: {
    '#name': 'name'
  },
  ExpressionAttributeValues: {
    ':pk': 'USER#123'
  }
});

Interview Questions and Answers

Question 1: What is single-table design and when should you use it?

Answer: Single-table design is a DynamoDB modeling approach where you store multiple entity types in one table using generic attribute names (PK, SK, GSI1PK, etc.) and differentiate entities using a Type attribute. Benefits:
  • Retrieve related items in a single query
  • Support transactions across entity types
  • Reduce operational complexity
  • Lower costs through fewer requests
Use when:
  • Entities have related access patterns
  • You need transactions across entity types
  • You want to minimize request count
  • Your access patterns are well-defined
Avoid when:
  • Entities have completely different scaling needs
  • Different security requirements per entity
  • Teams need separate ownership
  • Access patterns are unknown or highly variable

Question 2: How do you model a many-to-many relationship in DynamoDB?

Answer: Use the adjacency list pattern by creating bidirectional edges:
// Student-Class many-to-many

// Student -> Class edge
{
  PK: 'STUDENT#S001',
  SK: 'CLASS#C101',
  Type: 'Enrollment',
  studentId: 'S001',
  classId: 'C101',
  GSI1PK: 'CLASS#C101',  // Allows reverse lookup
  GSI1SK: 'STUDENT#S001'
}

// Class -> Student edge
{
  PK: 'CLASS#C101',
  SK: 'STUDENT#S001',
  Type: 'Enrollment',
  classId: 'C101',
  studentId: 'S001',
  GSI1PK: 'STUDENT#S001',  // Allows reverse lookup
  GSI1SK: 'CLASS#C101'
}

// Query all classes for a student
query({ PK: 'STUDENT#S001', SK begins_with 'CLASS#' })

// Query all students in a class
query({ PK: 'CLASS#C101', SK begins_with 'STUDENT#' })
This creates two items per relationship but enables efficient querying in both directions.

Question 3: Explain the difference between normalization in SQL and denormalization in DynamoDB.

Answer: SQL Normalization:
  • Eliminates redundancy by splitting data into multiple tables
  • Uses JOINs to recombine data at query time
  • Optimizes for write efficiency and consistency
  • Flexible querying but variable performance
DynamoDB Denormalization:
  • Duplicates data across items to avoid JOINs
  • Pre-joins data at write time
  • Optimizes for read efficiency and predictable performance
  • Fixed access patterns but consistent latency
Example:
// SQL (normalized)
users: { id: 1, name: 'John' }
orders: { id: 100, userId: 1, total: 99 }
// Requires JOIN to get user with orders

// DynamoDB (denormalized)
{
  PK: 'USER#1',
  SK: 'ORDER#100',
  userName: 'John',      // Denormalized
  orderId: '100',
  total: 99
}
// Single query gets user data with orders

Question 4: How do you avoid hot partitions in DynamoDB?

Answer: Strategies:
  1. Add Random Suffix:
const shardId = Math.floor(Math.random() * 100);
PK: `COUNTER#${shardId}`
  1. Use High-Cardinality Partition Keys:
// BAD: Low cardinality
PK: `STATUS#${status}`  // Only a few values

// GOOD: High cardinality
PK: `USER#${userId}`    // Millions of values
  1. Date-Based Partitioning:
PK: `METRICS#${date}#${randomSuffix}`
  1. Composite Keys:
PK: `CATEGORY#${category}#BRAND#${brand}`
  1. Write Sharding:
const shard = itemId % 10;
PK: `SHARD#${shard}#ITEM#${itemId}`

Question 5: What are the trade-offs between single-table and multi-table design?

Answer: Single-Table Design: Advantages:
  • Fewer requests (lower latency)
  • Transaction support across entities
  • Simpler operations
  • Lower costs
Disadvantages:
  • Requires known access patterns upfront
  • More complex schema
  • Harder to understand
  • Limited ad-hoc queries
Multi-Table Design: Advantages:
  • Clearer data organization
  • Independent scaling per table
  • Separate IAM policies
  • Easier to understand
Disadvantages:
  • More network requests
  • No transactions across tables
  • Higher operational overhead
  • Potentially higher costs
Decision criteria:
  • Use single-table when entities are tightly coupled
  • Use multi-table when entities have different requirements
  • Consider team structure and ownership
  • Evaluate security and compliance needs

Question 6: How do you handle hierarchical data in DynamoDB?

Answer: Use composite sort keys with hierarchy encoded in the SK:
// Organization hierarchy: Org -> Dept -> Team -> Employee

{
  PK: 'ORG#ACME',
  SK: 'METADATA',
  Type: 'Organization'
}

{
  PK: 'ORG#ACME',
  SK: 'DEPT#Engineering',
  Type: 'Department'
}

{
  PK: 'ORG#ACME',
  SK: 'DEPT#Engineering#TEAM#Backend',
  Type: 'Team'
}

{
  PK: 'ORG#ACME',
  SK: 'DEPT#Engineering#TEAM#Backend#EMP#123',
  Type: 'Employee'
}

// Query all levels:
// All depts: begins_with(SK, 'DEPT#')
// All teams in dept: begins_with(SK, 'DEPT#Engineering#TEAM#')
// All employees in team: begins_with(SK, 'DEPT#Engineering#TEAM#Backend#EMP#')
Benefits:
  • Single query retrieves any level of hierarchy
  • Natural ordering
  • Efficient range queries
  • No recursive queries needed

Question 7: Describe how to model time-series data efficiently in DynamoDB.

Answer: Key strategies:
  1. Time-based partition keys:
// Hour-based partitioning
{
  PK: 'SENSOR#S123#HOUR#2024-01-15-10',
  SK: 'TIMESTAMP#2024-01-15T10:30:45.123Z',
  temperature: 72.5
}
  1. GSI for cross-partition queries:
{
  PK: 'SENSOR#S123#HOUR#2024-01-15-10',
  SK: 'TIMESTAMP#2024-01-15T10:30:45.123Z',
  GSI1PK: 'SENSOR#S123',
  GSI1SK: 'TIMESTAMP#2024-01-15T10:30:45.123Z'
}
  1. Aggregations with TTL:
{
  PK: 'SENSOR#S123#AGG#HOURLY',
  SK: 'HOUR#2024-01-15-10',
  avgTemp: 72.3,
  minTemp: 70.1,
  maxTemp: 74.5,
  sampleCount: 120,
  ttl: 1737849600  // Expire old aggregations
}
  1. Cold data archival:
  • Use DynamoDB Streams to archive to S3
  • Keep recent data in DynamoDB
  • Query S3 for historical analysis

Question 8: How would you migrate from a relational database to DynamoDB?

Answer: Migration steps:
  1. Document access patterns:
Current SQL queries:
- SELECT * FROM users WHERE email = ?
- SELECT * FROM orders WHERE userId = ? ORDER BY date DESC
- SELECT oi.* FROM order_items oi JOIN orders o ON...
  1. Design DynamoDB schema:
// Map to DynamoDB access patterns
{
  PK: 'USER#123',
  SK: 'METADATA',
  GSI1PK: 'EMAIL#[email protected]'
}

{
  PK: 'USER#123',
  SK: 'ORDER#2024-01-15#001'
}
  1. Choose migration approach:
  • Gradual: Dual writes, gradual read cutover
  • Big bang: Full migration at once
  • Shadow mode: Run in parallel, compare results
  1. Implementation:
// Dual write example
async function createOrder(order) {
  // Write to both systems
  await sqlDb.insert(order);
  await dynamodb.putItem(transformToDynamo(order));
}
  1. Validation:
  • Compare query results
  • Monitor error rates
  • Performance testing
  1. Cutover:
  • Switch reads to DynamoDB
  • Stop SQL writes
  • Archive SQL data

Question 9: What is the adjacency list pattern?

Answer: The adjacency list pattern stores different entity types together using PK/SK to create relationships. It enables querying related items in a single request. Example - Social Network:
// User metadata
{
  PK: 'USER#alice',
  SK: 'METADATA',
  Type: 'User',
  name: 'Alice',
  email: '[email protected]'
}

// Alice's friend Bob
{
  PK: 'USER#alice',
  SK: 'FRIEND#bob',
  Type: 'Friendship',
  friendId: 'bob',
  friendName: 'Bob',
  since: '2024-01-01'
}

// Alice's post
{
  PK: 'USER#alice',
  SK: 'POST#2024-01-15#001',
  Type: 'Post',
  postId: '001',
  content: 'Hello world!',
  timestamp: '2024-01-15T10:00:00Z'
}

// Query gets user + friends + posts in one request
query({ PK: 'USER#alice' })
Benefits:
  • Single query for related data
  • Efficient 1-to-many relationships
  • Natural grouping of related items

Question 10: How do you handle schema evolution in DynamoDB?

Answer: Strategies:
  1. Schema versioning:
{
  PK: 'USER#123',
  SK: 'METADATA',
  schemaVersion: '2.0',
  // v2 fields
  name: 'John Doe',
  email: '[email protected]'
}

// Handle in application
function parseUser(item) {
  if (item.schemaVersion === '1.0') {
    return migrateV1ToV2(item);
  }
  return item;
}
  1. Additive changes:
// Just add new attributes
{
  PK: 'USER#123',
  SK: 'METADATA',
  name: 'John Doe',
  // New field added without breaking old code
  phoneNumber: '+1234567890'
}
  1. Lazy migration:
async function getUser(userId) {
  const item = await dynamodb.get({...});

  if (needsMigration(item)) {
    const migrated = migrate(item);
    await dynamodb.put(migrated);
    return migrated;
  }

  return item;
}
  1. GSI for new access patterns:
// Add GSI without changing existing items
{
  // Existing attributes
  PK: 'USER#123',
  SK: 'METADATA',

  // Add GSI keys for new pattern
  GSI2PK: 'COUNTRY#US',
  GSI2SK: 'USER#123'
}

Summary

Data modeling in DynamoDB requires a fundamental shift from traditional relational thinking: Key Principles:
  1. Access patterns drive design
  2. Denormalize for performance
  3. Use composite keys for flexibility
  4. Leverage GSIs for alternate access patterns
  5. Pre-compute and store aggregations
Best Practices:
  • Document all access patterns upfront
  • Prefer single-table design for related entities
  • Avoid hot partitions through sharding
  • Use sparse indexes efficiently
  • Version your schema for evolution
  • Test with realistic data volumes
Common Patterns:
  • Hierarchical data with composite sort keys
  • Many-to-many with adjacency lists
  • Time-series with date-based partitioning
  • Denormalized attributes for efficiency
Effective DynamoDB data modeling enables predictable performance, cost efficiency, and scalability at any scale.