Chapter 9: Deployment & Production

Deploying and running your NestJS app in production requires careful planning. This chapter covers Dockerization, CI/CD, environment management, health checks, logging, monitoring, scaling, and troubleshooting. We’ll walk through practical steps and explain how to make your app production-ready.

9.1 Preparing for Production

Before deploying, ensure your application is production-ready.

Production Checklist

Environment Configuration:
  • Set NODE_ENV=production
  • Use environment variables for all secrets
  • Remove hardcoded credentials
  • Validate all environment variables
Security:
  • Enable CORS with specific origins
  • Set secure HTTP headers (helmet)
  • Use HTTPS
  • Validate all inputs
  • Rate limiting enabled
Performance:
  • Build optimized bundle (npm run build)
  • Remove dev dependencies
  • Enable compression
  • Optimize database queries
  • Use connection pooling
Monitoring:
  • Health checks configured
  • Logging set up
  • Error tracking (Sentry, etc.)
  • Metrics collection
Testing:
  • All tests passing
  • Tested in staging environment
  • Load testing completed
  • Security audit done

Deployment Target Comparison

| Target | Setup Complexity | Scaling | Cost Model | Best For |
|---|---|---|---|---|
| VPS (DigitalOcean, Linode) | Low | Manual (PM2 cluster) | Fixed monthly | Small projects, tight budgets |
| PaaS (Railway, Render, Fly.io) | Very Low | Auto-scaling | Per-usage | Startups, quick deployment |
| AWS ECS / Fargate | Medium | Auto-scaling | Per-task-second | AWS shops, moderate scale |
| Kubernetes (EKS, GKE, AKS) | High | Full control, HPA | Per-node + overhead | Large teams, complex microservices |
| Serverless (Lambda + API GW) | Medium | Auto-scaling per request (within account concurrency limits) | Per-invocation | Event-driven, sporadic traffic |
Decision Framework:
How many developers will maintain infrastructure?
  0 (no DevOps) --> PaaS (Railway, Render) or Serverless
  1-2           --> Docker on VPS or AWS ECS
  3+            --> Kubernetes

How predictable is your traffic?
  Steady        --> VPS or containers (predictable cost)
  Spiky         --> Serverless or auto-scaling containers
  Unknown       --> Start with PaaS, migrate when you understand the pattern

Is cold start latency acceptable?
  YES           --> Serverless is fine
  NO            --> Containers (always running)
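The three questions above can be read as a small decision function. The sketch below is one possible way to combine them, not a rule from NestJS or any platform: the `ProjectProfile` type, the `recommendTarget` function, and the order in which the questions are applied are all assumptions made for illustration.

```typescript
// Hypothetical helper that encodes the decision framework above.
type Traffic = 'steady' | 'spiky' | 'unknown';

interface ProjectProfile {
  infraMaintainers: number;     // developers available to maintain infrastructure
  traffic: Traffic;             // how predictable the traffic is
  coldStartAcceptable: boolean; // can callers tolerate cold start latency?
}

function recommendTarget(p: ProjectProfile): string {
  // Unknown traffic: start simple and migrate once the pattern is understood.
  if (p.traffic === 'unknown') return 'PaaS';

  if (p.infraMaintainers === 0) {
    // Nobody to run infrastructure: serverless only when traffic is spiky
    // and cold starts are acceptable, otherwise PaaS.
    return p.traffic === 'spiky' && p.coldStartAcceptable ? 'Serverless' : 'PaaS';
  }

  if (p.infraMaintainers >= 3) return 'Kubernetes';

  // 1-2 maintainers: auto-scaling containers for spiky traffic, a VPS otherwise.
  return p.traffic === 'spiky' ? 'AWS ECS' : 'Docker on VPS';
}

console.log(recommendTarget({ infraMaintainers: 2, traffic: 'steady', coldStartAcceptable: false }));
```

Treat the output as a starting point for discussion; cost constraints and team experience often outweigh any of the three questions.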
NestJS and Serverless: NestJS can run in AWS Lambda using an adapter such as @codegenie/serverless-express (formerly published as @vendia/serverless-express) or @h4ad/serverless-adapter. Expect cold starts of roughly 1-3 seconds, because NestJS bootstraps the whole DI container on every cold start. For latency-sensitive APIs, use provisioned concurrency or stick with containers. Serverless NestJS works well for internal tools, webhooks, and background processing where cold starts are acceptable.

9.2 Dockerizing Your App

Containerization makes deployment consistent and portable. Docker packages your app and dependencies into a single image.

Basic Dockerfile

This Dockerfile uses multi-stage builds — a critical Docker optimization. The builder stage has all dev dependencies (TypeScript compiler, etc.) to build the app, but the final image only contains the compiled JavaScript and production dependencies. This typically reduces image size from ~500MB to ~150MB.
# Stage 1: Build -- install everything, compile TypeScript
FROM node:18-alpine AS builder

WORKDIR /app

# Copy package files first. Docker caches this layer, so if dependencies
# haven't changed, npm ci is skipped on rebuild -- saving minutes.
COPY package*.json ./

# npm ci is preferred over npm install for CI/Docker:
# - It installs exact versions from package-lock.json (deterministic)
# - It is faster because it skips the dependency resolution step
# - It errors if package-lock.json is out of sync (catches mistakes)
RUN npm ci

# Copy source code after dependencies (better layer caching)
COPY . .

# Build the TypeScript into JavaScript in the /dist folder
RUN npm run build

# Stage 2: Production -- minimal image with only what's needed to run
FROM node:18-alpine

WORKDIR /app

COPY package*.json ./

# --omit=dev skips devDependencies (TypeScript, Jest, etc.).
# It replaces the deprecated --only=production flag in current npm versions.
RUN npm ci --omit=dev

# Copy the compiled JavaScript from the builder stage
COPY --from=builder /app/dist ./dist

# SECURITY: Never run containers as root. Create a dedicated non-root user.
# If an attacker exploits a vulnerability, they get limited permissions.
RUN addgroup -g 1001 -S nodejs && \
    adduser -S nestjs -u 1001

USER nestjs

EXPOSE 3000

CMD ["node", "dist/main"]

Optimized Dockerfile

FROM node:18-alpine AS deps

WORKDIR /app

COPY package*.json ./
RUN npm ci --omit=dev && \
    npm cache clean --force

FROM node:18-alpine AS builder

WORKDIR /app

COPY package*.json ./
RUN npm ci

COPY . .
RUN npm run build

FROM node:18-alpine

WORKDIR /app

# Copy production dependencies
COPY --from=deps /app/node_modules ./node_modules

# Copy built application
COPY --from=builder /app/dist ./dist
COPY --from=builder /app/package*.json ./

# Security: Run as non-root
RUN addgroup -g 1001 -S nodejs && \
    adduser -S nestjs -u 1001 && \
    chown -R nestjs:nodejs /app

USER nestjs

EXPOSE 3000

HEALTHCHECK --interval=30s --timeout=3s --start-period=5s --retries=3 \
  CMD node -e "require('http').get('http://localhost:3000/health', (r) => {process.exit(r.statusCode === 200 ? 0 : 1)})"

CMD ["node", "dist/main"]

.dockerignore

node_modules
npm-debug.log
dist
.git
.gitignore
.env
.env.local
*.md
.vscode
.idea
coverage
.nyc_output
test
*.spec.ts

Docker Compose

# Example values only: load real credentials from an env file or a secret store.
version: '3.8'

services:
  app:
    build: .
    ports:
      - "3000:3000"
    environment:
      - NODE_ENV=production
      - DATABASE_URL=postgresql://user:password@db:5432/mydb
    depends_on:
      - db
    restart: unless-stopped

  db:
    image: postgres:14-alpine
    environment:
      - POSTGRES_USER=user
      - POSTGRES_PASSWORD=password
      - POSTGRES_DB=mydb
    volumes:
      - postgres_data:/var/lib/postgresql/data
    restart: unless-stopped

volumes:
  postgres_data:
Best Practices:
  • Use multi-stage builds for smaller images
  • Keep images minimal (alpine base, no dev dependencies)
  • Use .dockerignore to exclude unnecessary files
  • Run as non-root user
  • Add health checks
  • Use specific version tags

9.3 Environment Variables

Store secrets and configuration in environment variables. Never commit secrets to version control.

Using @nestjs/config

npm install @nestjs/config joi
// app.module.ts
import { Module } from '@nestjs/common';
import { ConfigModule } from '@nestjs/config';
import * as Joi from 'joi';

@Module({
  imports: [
    ConfigModule.forRoot({
      isGlobal: true,
      envFilePath: ['.env.local', '.env'],
      validationSchema: Joi.object({
        NODE_ENV: Joi.string()
          .valid('development', 'production', 'test')
          .default('development'),
        PORT: Joi.number().default(3000),
        DATABASE_URL: Joi.string().required(),
        JWT_SECRET: Joi.string().required(),
      }),
    }),
  ],
})
export class AppModule {}

Environment Files

# .env.example (committed)
NODE_ENV=development
PORT=3000
DATABASE_URL=postgresql://localhost:5432/mydb
JWT_SECRET=your-secret-key

# .env (not committed)
NODE_ENV=production
PORT=3000
DATABASE_URL=postgresql://user:password@db:5432/mydb
JWT_SECRET=super-secret-key-change-in-production

Using Config Service

import { Injectable } from '@nestjs/common';
import { ConfigService } from '@nestjs/config';

@Injectable()
export class AppService {
  constructor(private configService: ConfigService) {}

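  getDatabaseUrl(): string {
    // getOrThrow() fails fast when the variable is missing, so the result
    // is a string rather than string | undefined.
    return this.configService.getOrThrow<string>('DATABASE_URL');
  }

  getJwtSecret(): string {
    return this.configService.getOrThrow<string>('JWT_SECRET');
  }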
}
Tip: Use schema validation (e.g., with joi) to ensure required environment variables are set and validate their values.
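If you prefer not to add a schema library, the same checks can be written as a plain function. The following is a minimal sketch that mirrors the Joi schema shown earlier; the `AppEnv` interface and `validateEnv` function are hypothetical names, and the PORT range check is an added assumption.

```typescript
// Dependency-free sketch of environment validation.
interface AppEnv {
  NODE_ENV: 'development' | 'production' | 'test';
  PORT: number;
  DATABASE_URL: string;
  JWT_SECRET: string;
}

const NODE_ENVS: readonly string[] = ['development', 'production', 'test'];

function validateEnv(config: Record<string, unknown>): AppEnv {
  const errors: string[] = [];

  // Apply the same defaults as the Joi schema.
  const nodeEnv = String(config['NODE_ENV'] ?? 'development');
  if (!NODE_ENVS.includes(nodeEnv)) {
    errors.push(`NODE_ENV must be one of ${NODE_ENVS.join(', ')}`);
  }

  const port = Number(config['PORT'] ?? 3000);
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    errors.push('PORT must be an integer between 1 and 65535');
  }

  // Required secrets: fail when missing or empty.
  for (const key of ['DATABASE_URL', 'JWT_SECRET']) {
    const value = config[key];
    if (typeof value !== 'string' || value.length === 0) {
      errors.push(`${key} is required`);
    }
  }

  // Report every problem at once so a deploy does not fail one variable at a time.
  if (errors.length > 0) {
    throw new Error(`Invalid environment: ${errors.join('; ')}`);
  }

  return {
    NODE_ENV: nodeEnv as AppEnv['NODE_ENV'],
    PORT: port,
    DATABASE_URL: config['DATABASE_URL'] as string,
    JWT_SECRET: config['JWT_SECRET'] as string,
  };
}
```

ConfigModule.forRoot() accepts a `validate` option, so a function of this shape can be passed as `validate: validateEnv` in place of `validationSchema`. Either way, the app refuses to start with a broken configuration instead of failing on the first request.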

9.4 Health Checks

Health checks help load balancers and orchestrators know if your app is healthy.

Installing Terminus

npm install @nestjs/terminus

Health Check Controller

Health checks come in two flavors, and understanding the difference is critical for Kubernetes deployments:
  • Liveness: “Is the process alive?” If this fails, Kubernetes restarts the pod. Keep it simple — do not check external dependencies here, or a database outage will cascade into pod restart storms.
  • Readiness: “Can this instance handle traffic?” If this fails, Kubernetes removes the pod from the load balancer but does not restart it. Check database connectivity and other dependencies here.
// health/health.controller.ts
import { Controller, Get } from '@nestjs/common';
import { HealthCheck, HealthCheckService, TypeOrmHealthIndicator } from '@nestjs/terminus';

@Controller('health')
export class HealthController {
  constructor(
    private health: HealthCheckService,
    private db: TypeOrmHealthIndicator,
  ) {}

  // General health check -- used by simple load balancers
  @Get()
  @HealthCheck()
  check() {
    return this.health.check([
      () => this.db.pingCheck('database'),
    ]);
  }

  // Readiness probe: "Can I handle traffic?"
  // Checks dependencies. If the database is down, this returns unhealthy,
  // and the load balancer stops sending requests to this instance.
  @Get('readiness')
  @HealthCheck()
  readiness() {
    return this.health.check([
      () => this.db.pingCheck('database'),
    ]);
  }

  // Liveness probe: "Am I alive?"
  // Does NOT check dependencies. Just confirms the process is responsive.
  // If this fails, something is fundamentally wrong (deadlock, OOM).
  @Get('liveness')
  @HealthCheck()
  liveness() {
    return { status: 'ok', timestamp: new Date().toISOString() };
  }
}
Common Mistake: Including database checks in the liveness probe. If the database goes down temporarily, Kubernetes will restart all your pods simultaneously, creating a thundering herd that makes recovery harder. Only check the process itself in liveness probes.

Custom Health Indicators

import { Injectable } from '@nestjs/common';
import { HealthIndicator, HealthIndicatorResult, HealthCheckError } from '@nestjs/terminus';

@Injectable()
export class CustomHealthIndicator extends HealthIndicator {
  async isHealthy(key: string): Promise<HealthIndicatorResult> {
    const isHealthy = await this.checkExternalService();
    const result = this.getStatus(key, isHealthy);

    if (isHealthy) {
      return result;
    }
    throw new HealthCheckError('External service check failed', result);
  }

  private async checkExternalService(): Promise<boolean> {
    // Check external service
    return true;
  }
}
Diagram: Health Check Flow
Load Balancer/Orchestrator -> GET /health -> Health Check Service -> Check Database, External Services, etc. -> Return Status (healthy/unhealthy)
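The flow above can be sketched without the framework to show what the health check service does with individual results. This is a simplified illustration, not the Terminus implementation: the report shape is modeled loosely on the JSON that @nestjs/terminus returns, checks are synchronous only to keep the sketch short, and `runHealthChecks` and `httpStatusFor` are hypothetical names.

```typescript
type IndicatorResult = { status: 'up' | 'down'; message?: string };
type Check = () => IndicatorResult;

interface HealthReport {
  status: 'ok' | 'error';
  info: Record<string, IndicatorResult>;    // indicators that are up
  error: Record<string, IndicatorResult>;   // indicators that are down
  details: Record<string, IndicatorResult>; // every indicator
}

function runHealthChecks(checks: Record<string, Check>): HealthReport {
  const report: HealthReport = { status: 'ok', info: {}, error: {}, details: {} };

  for (const [name, check] of Object.entries(checks)) {
    let result: IndicatorResult;
    try {
      result = check();
    } catch (err) {
      // A check that throws is reported as down rather than crashing the endpoint.
      result = { status: 'down', message: err instanceof Error ? err.message : String(err) };
    }

    report.details[name] = result;
    if (result.status === 'up') {
      report.info[name] = result;
    } else {
      report.error[name] = result;
      report.status = 'error'; // one failing dependency marks the instance unhealthy
    }
  }
  return report;
}

// Load balancers usually look only at the status code:
// 200 keeps the instance in rotation, 503 takes it out.
function httpStatusFor(report: HealthReport): 200 | 503 {
  return report.status === 'ok' ? 200 : 503;
}
```

Calling `runHealthChecks` with an empty set of checks always yields `ok`, which matches the idea of a liveness probe that ignores external dependencies.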

9.5 Logging & Monitoring

Proper logging and monitoring are essential for production applications.

NestJS Logger

The built-in Logger class is context-aware — passing the class name to the constructor means every log line includes the service name, making it easy to filter logs in production.
import { Injectable, Logger } from '@nestjs/common';

@Injectable()
export class UsersService {
  // Passing UsersService.name as the context means logs appear as:
  // [Nest] 12345 - 04/10/2026 [UsersService] Creating user: alice@example.com
  // This is invaluable when you have 50 services and need to find
  // where a specific log message came from.
  private readonly logger = new Logger(UsersService.name);

  async create(dto: CreateUserDto) {
    this.logger.log(`Creating user: ${dto.email}`);
    
    try {
      const user = await this.userRepository.create(dto);
      this.logger.log(`User created successfully: ${user.id}`);
      return user;
    } catch (error) {
      // .error() takes the message as the first argument and the stack trace
      // as the second. Always include the stack trace -- without it, you only
      // know WHAT failed, not WHERE in the code it failed.
      this.logger.error(`Failed to create user: ${error.message}`, error.stack);
      throw error;
    }
  }
}
Production Tip: The default logger outputs human-readable text, which is great for local development but terrible for cloud log aggregation. In production, replace it with a JSON logger (Winston, Pino) so tools like ELK, Datadog, or CloudWatch can parse and index your log fields automatically.

Structured Logging

import { Injectable, Logger } from '@nestjs/common';

@Injectable()
export class LoggerService {
  private readonly logger = new Logger();

  log(message: string, context?: string, metadata?: any) {
    this.logger.log(JSON.stringify({
      message,
      context,
      metadata,
      timestamp: new Date().toISOString(),
    }));
  }

  error(message: string, trace?: string, context?: string) {
    this.logger.error(JSON.stringify({
      message,
      trace,
      context,
      timestamp: new Date().toISOString(),
    }));
  }
}

Winston Integration

npm install nest-winston winston
import { Module } from '@nestjs/common';
import { WinstonModule } from 'nest-winston';
import * as winston from 'winston';

@Module({
  imports: [
    WinstonModule.forRoot({
      transports: [
        new winston.transports.File({
          filename: 'error.log',
          level: 'error',
        }),
        new winston.transports.File({
          filename: 'combined.log',
        }),
      ],
    }),
  ],
})
export class AppModule {}

Error Tracking with Sentry

npm install @sentry/node @sentry/profiling-node
// main.ts
import * as Sentry from '@sentry/node';
import { nodeProfilingIntegration } from '@sentry/profiling-node';

Sentry.init({
  dsn: process.env.SENTRY_DSN,
  integrations: [nodeProfilingIntegration()],
  // 1.0 samples every transaction. Lower both rates in production to limit overhead and cost.
  tracesSampleRate: 1.0,
  profilesSampleRate: 1.0,
});
Best Practices:
  • Log errors and warnings
  • Use structured logs (JSON) for cloud platforms
  • Monitor logs and metrics in real time
  • Set up alerts for critical errors
  • Don’t log sensitive information
  • Use log levels appropriately
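Two of the practices above, structured JSON output and keeping sensitive data out of logs, can be combined in one small formatter. The sketch below is framework-independent; `redact`, `formatLogLine`, and the list of sensitive key fragments are assumptions for illustration, and it expects plain JSON-like metadata (objects, arrays, primitives).

```typescript
type LogLevel = 'log' | 'warn' | 'error';

// Key fragments treated as sensitive. This list is an assumption; extend it for your domain.
const SENSITIVE_KEY_PARTS = ['password', 'secret', 'token', 'authorization'];

// Recursively replace the values of sensitive-looking keys.
function redact(value: unknown): unknown {
  if (Array.isArray(value)) return value.map((item) => redact(item));
  if (value !== null && typeof value === 'object') {
    const out: Record<string, unknown> = {};
    for (const [key, inner] of Object.entries(value as Record<string, unknown>)) {
      const isSensitive = SENSITIVE_KEY_PARTS.some((part) => key.toLowerCase().includes(part));
      out[key] = isSensitive ? '[REDACTED]' : redact(inner);
    }
    return out;
  }
  return value;
}

// Produce one JSON object per line, which log aggregators can parse and index.
function formatLogLine(
  level: LogLevel,
  context: string,
  message: string,
  metadata: Record<string, unknown> = {},
  now: Date = new Date(),
): string {
  return JSON.stringify({
    timestamp: now.toISOString(),
    level,
    context,
    message,
    metadata: redact(metadata),
  });
}

console.log(formatLogLine('log', 'UsersService', 'User created', { userId: 42, password: 'hunter2' }));
```

Libraries such as Pino ship their own redaction options, so in practice you would likely configure those rather than maintain a helper like this.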

9.6 CI/CD Pipelines

Automate build, test, and deployment with CI/CD pipelines.

GitHub Actions Workflow

# .github/workflows/deploy.yml
name: Deploy

on:
  push:
    branches: [main]

jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - uses: actions/setup-node@v3
        with:
          node-version: '18'
          cache: 'npm'
      - run: npm ci
      - run: npm run lint
      - run: npm run test
      - run: npm run test:e2e

  build:
    needs: test
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
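      - name: Build Docker image
        # Docker Hub image names must be prefixed with the account namespace.
        run: docker build -t ${{ secrets.DOCKER_USERNAME }}/myapp:${{ github.sha }} .
      - name: Push to registry
        run: |
          echo "${{ secrets.DOCKER_PASSWORD }}" | docker login -u "${{ secrets.DOCKER_USERNAME }}" --password-stdin
          docker push ${{ secrets.DOCKER_USERNAME }}/myapp:${{ github.sha }}

  deploy:
    needs: build
    runs-on: ubuntu-latest
    steps:
      # Assumes the runner already has kubectl credentials for the cluster.
      - name: Deploy to production
        run: |
          kubectl set image deployment/nestjs-app app=${{ secrets.DOCKER_USERNAME }}/myapp:${{ github.sha }}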

GitLab CI

# .gitlab-ci.yml
stages:
  - test
  - build
  - deploy

test:
  stage: test
  script:
    - npm ci
    - npm run lint
    - npm run test
    - npm run test:e2e

build:
  stage: build
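  script:
    # CI_REGISTRY, CI_REGISTRY_USER, and CI_REGISTRY_PASSWORD are predefined by GitLab.
    - echo "$CI_REGISTRY_PASSWORD" | docker login -u "$CI_REGISTRY_USER" --password-stdin "$CI_REGISTRY"
    - docker build -t $CI_REGISTRY_IMAGE:$CI_COMMIT_SHA .
    - docker push $CI_REGISTRY_IMAGE:$CI_COMMIT_SHA

deploy:
  stage: deploy
  script:
    - kubectl set image deployment/nestjs-app app=$CI_REGISTRY_IMAGE:$CI_COMMIT_SHA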

9.7 Kubernetes Deployment

Deploy NestJS applications to Kubernetes for scalability and reliability.

Deployment Manifest

# k8s/deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nestjs-app
spec:
  replicas: 3
  selector:
    matchLabels:
      app: nestjs-app
  template:
    metadata:
      labels:
        app: nestjs-app
    spec:
      containers:
      - name: app
        image: myapp:latest  # placeholder: pin a version tag or the commit SHA in production
        ports:
        - containerPort: 3000
        env:
        - name: NODE_ENV
          value: "production"
        - name: DATABASE_URL
          valueFrom:
            secretKeyRef:
              name: app-secrets
              key: database-url
        livenessProbe:
          httpGet:
            path: /health/liveness
            port: 3000
          initialDelaySeconds: 30
          periodSeconds: 10
        readinessProbe:
          httpGet:
            path: /health/readiness
            port: 3000
          initialDelaySeconds: 5
          periodSeconds: 5
        resources:
          requests:
            memory: "256Mi"
            cpu: "250m"
          limits:
            memory: "512Mi"
            cpu: "500m"

Service Manifest

# k8s/service.yaml
apiVersion: v1
kind: Service
metadata:
  name: nestjs-app-service
spec:
  selector:
    app: nestjs-app
  ports:
  - port: 80
    targetPort: 3000
  type: LoadBalancer

9.8 Scaling & High Availability

Scale your application to handle increased load.

Horizontal Scaling

Run multiple instances behind a load balancer. This is the primary scaling strategy for NestJS applications.

Key Requirements for Horizontal Scaling: Your NestJS application must be stateless to scale horizontally. This means:
  • No in-memory sessions (use Redis for sessions)
  • No in-memory caches that are not shared (use Redis)
  • No file uploads stored on the local filesystem (use S3 or equivalent)
  • No WebSocket connections without a Redis adapter (Socket.io with @socket.io/redis-adapter)
| Scaling Strategy | Trigger | Tool | Typical Latency |
|---|---|---|---|
| Manual | Developer decision | PM2 cluster mode, pm2 start app -i max | N/A |
| Kubernetes HPA | CPU/memory threshold | kubectl autoscale deployment app --min=2 --max=10 | 30-60s |
| Cloud Auto-scaling | Request count, latency | AWS ALB + ECS/Fargate auto-scaling | 60-120s |
| Serverless | Per-request | AWS Lambda, Cloud Functions | 0s (always ready) to 1-3s (cold start) |
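To make the Kubernetes HPA row concrete, the autoscaler's documented core calculation is desiredReplicas = ceil(currentReplicas * currentMetric / targetMetric), skipped when the ratio is within a tolerance of 1.0 (0.1 by default) and then bounded by the configured minimum and maximum. The sketch below reproduces only that arithmetic; the function name and the example numbers are illustrative, and the real controller also applies stabilization windows and pod readiness rules that are omitted here.

```typescript
function desiredReplicas(
  currentReplicas: number,
  currentMetric: number, // e.g. average CPU utilization in percent
  targetMetric: number,  // e.g. target CPU utilization in percent
  min: number,
  max: number,
  tolerance = 0.1,
): number {
  const ratio = currentMetric / targetMetric;

  // Close enough to the target: keep the current replica count.
  const desired = Math.abs(ratio - 1) <= tolerance
    ? currentReplicas
    : Math.ceil(currentReplicas * ratio);

  // Never go outside the --min / --max bounds.
  return Math.min(max, Math.max(min, desired));
}

// 3 pods at 90% CPU with a 60% target and bounds 2..10
console.log(desiredReplicas(3, 90, 60, 2, 10)); // 5
```

The 30-60s latency in the table comes on top of this calculation: new pods still have to be scheduled, pull the image, and pass their readiness probe before they receive traffic.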

Vertical Scaling

Increase instance size (more CPU/memory). This is a valid first step before going horizontal — a single well-provisioned instance can handle thousands of requests per second. Only scale horizontally when vertical scaling becomes cost-prohibitive or you need fault tolerance.

Database Scaling

| Strategy | When to Use | Complexity |
|---|---|---|
| Connection Pooling | Always (default pool is often too small) | Low |
| Read Replicas | Read-heavy workloads (>80% reads) | Medium |
| Caching (Redis) | Hot data accessed repeatedly | Medium |
| Sharding | Very large datasets (100M+ rows) | High |
| CQRS with separate read DB | Different read/write performance needs | High |

Caching

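import { Module } from '@nestjs/common';
import { CacheModule } from '@nestjs/cache-manager';
// The store package must match your cache-manager major version; check its README for the exact import.
import { redisStore } from 'cache-manager-redis-store';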

@Module({
  imports: [
    CacheModule.register({
      store: redisStore,
      host: 'localhost',
      port: 6379,
    }),
  ],
})
export class AppModule {}
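What the cache module gives you is essentially the cache-aside pattern: look in the cache first, fall back to the slow source on a miss, and store the result with a time-to-live. Here is a minimal sketch of that pattern. An in-memory Map stands in for Redis so the example runs without external services, the loader is synchronous for brevity, and `TtlCache` and `getOrLoad` are hypothetical names.

```typescript
class TtlCache<V> {
  private readonly entries = new Map<string, { value: V; expiresAt: number }>();
  private readonly ttlMs: number;
  private readonly now: () => number;

  // The clock is injectable so expiry can be tested without waiting.
  constructor(ttlMs: number, now: () => number = Date.now) {
    this.ttlMs = ttlMs;
    this.now = now;
  }

  get(key: string): V | undefined {
    const entry = this.entries.get(key);
    if (entry === undefined) return undefined;
    if (this.now() >= entry.expiresAt) {
      this.entries.delete(key); // expired: evict and report a miss
      return undefined;
    }
    return entry.value;
  }

  set(key: string, value: V): void {
    this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs });
  }
}

// Cache-aside: return the cached value, or load it once and remember it.
function getOrLoad<V>(cache: TtlCache<V>, key: string, load: () => V): V {
  const cached = cache.get(key);
  if (cached !== undefined) return cached;
  const fresh = load();
  cache.set(key, fresh);
  return fresh;
}
```

A per-process Map like this is exactly what breaks horizontal scaling, because each instance would hold its own copy. That is why the module example above points the store at Redis.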

9.9 Performance Optimization

Optimize your application for production performance.

Enable Compression

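// main.ts
import { NestFactory } from '@nestjs/core';
import * as compression from 'compression';
import { AppModule } from './app.module';

async function bootstrap() {
  const app = await NestFactory.create(AppModule);
  // Compress response bodies before they are sent (requires: npm install compression)
  app.use(compression());
  await app.listen(3000);
}
bootstrap();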

Connection Pooling

TypeOrmModule.forRoot({
  // ... other options
  extra: {
    max: 10,
    min: 2,
    idleTimeoutMillis: 30000,
  },
})
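Pool size interacts with horizontal scaling: every instance opens its own pool, so the database sees instances times max connections. PostgreSQL's default max_connections is 100, which a modest fleet can exhaust. The arithmetic can be sketched as follows; the function names and the number of reserved connections are assumptions for illustration.

```typescript
// Connections the database sees when every instance fills its pool.
function totalConnections(instances: number, poolMax: number): number {
  return instances * poolMax;
}

// Largest per-instance pool size that keeps the whole fleet under the database limit.
// reservedConnections leaves headroom for migrations, admin tools, and monitoring.
function maxPoolPerInstance(
  dbMaxConnections: number,
  reservedConnections: number,
  instances: number,
): number {
  if (instances < 1) throw new Error('instances must be at least 1');
  return Math.max(0, Math.floor((dbMaxConnections - reservedConnections) / instances));
}

// 3 replicas with max: 10 use 30 connections, well under a limit of 100.
console.log(totalConnections(3, 10)); // 30
// If an autoscaler can grow to 10 replicas, each pool should stay at 9 or below.
console.log(maxPoolPerInstance(100, 10, 10)); // 9
```

Size the pool against the maximum replica count your autoscaler allows, not the current one, or use a connection pooler such as PgBouncer in front of the database.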

Query Optimization

  • Use indexes on frequently queried columns
  • Optimize N+1 queries
  • Use select to limit fields
  • Implement pagination
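For the pagination item, the part that is easy to get wrong is bounding what the client may request. A minimal sketch, assuming page-based pagination with 1-indexed pages; `toPageWindow` and its default and maximum limits are hypothetical choices.

```typescript
interface PageRequest {
  page?: number;  // 1-indexed page number from the query string
  limit?: number; // requested page size
}

interface PageWindow {
  skip: number; // rows to skip
  take: number; // rows to return
}

// Fall back when the value is missing, fractional, NaN, or below 1.
function positiveIntOr(value: number | undefined, fallback: number): number {
  return value !== undefined && Number.isInteger(value) && value >= 1 ? value : fallback;
}

function toPageWindow(req: PageRequest, defaultLimit = 20, maxLimit = 100): PageWindow {
  const page = positiveIntOr(req.page, 1);
  // Cap the page size so a client cannot request the whole table at once.
  const take = Math.min(positiveIntOr(req.limit, defaultLimit), maxLimit);
  return { skip: (page - 1) * take, take };
}

console.log(toPageWindow({ page: 3, limit: 20 })); // { skip: 40, take: 20 }
```

The returned object uses the same `skip` and `take` names as TypeORM's find options, so it can be spread into a `find()` call. For very deep pages, keyset (cursor) pagination usually performs better than large offsets.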

9.10 Production Edge Cases

Edge Case 1: Graceful shutdown and in-flight requests
When Kubernetes sends SIGTERM to your pod, your NestJS app needs to stop accepting new requests, finish processing in-flight requests, close database connections, and then exit. Without graceful shutdown, users get dropped connections and database transactions may be left in an inconsistent state.
// main.ts
import { NestFactory } from '@nestjs/core';
import { AppModule } from './app.module';

async function bootstrap() {
  const app = await NestFactory.create(AppModule);

  // Enable graceful shutdown hooks
  app.enableShutdownHooks();

  await app.listen(3000);
}
bootstrap();
NestJS will call onModuleDestroy() and onApplicationShutdown() lifecycle hooks on your providers. The PrismaService example in Chapter 5 uses onModuleDestroy() to close the database connection. Kubernetes gives you 30 seconds by default (configurable via terminationGracePeriodSeconds).

Edge Case 2: Memory leaks from event listeners
If your service subscribes to events (EventEmitter, RxJS subjects, WebSocket events) in the constructor but never unsubscribes, every hot reload in development and every new request-scoped instance in production leaks a listener. Implement OnModuleDestroy and clean up subscriptions.

Edge Case 3: Node.js single-thread and CPU-bound operations
NestJS runs on Node.js, which executes JavaScript on a single thread. If a service method does CPU-intensive work (JSON parsing a 50MB file, image processing, cryptographic operations beyond bcrypt), it blocks the entire event loop and all other requests freeze. Solutions: (1) Use worker threads (worker_threads module); (2) Offload to a separate microservice; (3) Use a job queue (Bull) that processes CPU-intensive work in a separate process.

Edge Case 4: Docker image size and startup time
A NestJS Docker image with all node_modules can be 500MB+. This matters for Kubernetes pod startup time (image pull takes 10-30 seconds) and serverless cold starts. The multi-stage Dockerfile in section 9.2 reduces this to ~150MB. Further optimization: use pnpm with --prod flag or node-prune to strip test files and documentation from node_modules.
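The listener leak from Edge Case 2 and its fix can be shown with Node's built-in EventEmitter alone. The sketch below is framework-independent: `OrderNotifier` and the `order.created` event are hypothetical, and in a real NestJS provider the class would declare `implements OnModuleDestroy` so the framework calls the cleanup method for you.

```typescript
import { EventEmitter } from 'node:events';

class OrderNotifier {
  readonly handled: string[] = [];
  private readonly bus: EventEmitter;

  // Keep a reference to the handler: off() needs the same function that on() received.
  private readonly onOrderCreated = (orderId: string): void => {
    this.handled.push(orderId);
  };

  constructor(bus: EventEmitter) {
    this.bus = bus;
    this.bus.on('order.created', this.onOrderCreated);
  }

  // Same name as the NestJS lifecycle hook; removes the listener so the
  // shared emitter no longer keeps this instance alive.
  onModuleDestroy(): void {
    this.bus.off('order.created', this.onOrderCreated);
  }
}

const bus = new EventEmitter();
const notifier = new OrderNotifier(bus);
console.log(bus.listenerCount('order.created')); // 1

notifier.onModuleDestroy();
console.log(bus.listenerCount('order.created')); // 0
```

Subscribing with an inline arrow function, as in `bus.on('order.created', (id) => ...)`, is the usual cause of the leak, because there is no function reference left to pass to `off()`.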

9.11 Troubleshooting & Maintenance

Monitor and maintain your production application.

Monitoring

  • Monitor CPU, memory, and response times
  • Track error rates
  • Monitor database performance
  • Set up alerts for anomalies

Logging

  • Centralize logs (ELK, CloudWatch, etc.)
  • Search and filter logs
  • Set up log retention policies
  • Monitor log volumes

Backup & Recovery

  • Regular database backups
  • Test restore procedures
  • Document recovery steps
  • Store backups securely

Updates

  • Regularly update dependencies
  • Test updates in staging
  • Use semantic versioning
  • Document breaking changes

9.12 Summary

You’ve learned how to deploy and maintain NestJS applications in production.
Key Concepts:
  • Docker: Containerize applications
  • CI/CD: Automate deployment
  • Health Checks: Monitor application health
  • Logging: Track application behavior
  • Monitoring: Observe production systems
  • Scaling: Handle increased load
  • Kubernetes: Orchestrate containers
Best Practices:
  • Use environment variables for configuration
  • Containerize with Docker
  • Implement health checks
  • Set up proper logging
  • Monitor production systems
  • Scale horizontally
  • Regular backups and updates
Next Chapter: Learn about advanced patterns like CQRS, GraphQL, WebSockets, and event sourcing.