Chapter 9: Deployment & Production
Deploying and running your NestJS app in production requires careful planning. This chapter covers Dockerization, CI/CD, environment management, health checks, logging, monitoring, scaling, and troubleshooting. We’ll walk through practical steps and explain how to make your app production-ready.
9.1 Preparing for Production
Before deploying, ensure your application is production-ready.
Production Checklist
Environment Configuration:
- Set NODE_ENV=production
- Use environment variables for all secrets
- Remove hardcoded credentials
- Validate all environment variables
Security:
- Enable CORS with specific origins
- Set secure HTTP headers (helmet)
- Use HTTPS
- Validate all inputs
- Rate limiting enabled
Performance:
- Build optimized bundle (npm run build)
- Remove dev dependencies
- Enable compression
- Optimize database queries
- Use connection pooling
Monitoring:
- Health checks configured
- Logging set up
- Error tracking (Sentry, etc.)
- Metrics collection
Testing:
- All tests passing
- Tested in staging environment
- Load testing completed
- Security audit done
Deployment Target Comparison
| Target | Setup Complexity | Scaling | Cost Model | Best For |
|---|---|---|---|---|
| VPS (DigitalOcean, Linode) | Low | Manual (PM2 cluster) | Fixed monthly | Small projects, tight budgets |
| PaaS (Railway, Render, Fly.io) | Very Low | Auto-scaling | Per-usage | Startups, quick deployment |
| AWS ECS / Fargate | Medium | Auto-scaling | Per-task-second | AWS shops, moderate scale |
| Kubernetes (EKS, GKE, AKS) | High | Full control, HPA | Per-node + overhead | Large teams, complex microservices |
| Serverless (Lambda + API GW) | Medium | Automatic, up to platform concurrency limits | Per-invocation | Event-driven, sporadic traffic |
Decision Framework:
How many developers will maintain infrastructure?
0 (no DevOps) --> PaaS (Railway, Render) or Serverless
1-2 --> Docker on VPS or AWS ECS
3+ --> Kubernetes
How predictable is your traffic?
Steady --> VPS or containers (predictable cost)
Spiky --> Serverless or auto-scaling containers
Unknown --> Start with PaaS, migrate when you understand the pattern
Is cold start latency acceptable?
YES --> Serverless is fine
NO --> Containers (always running)
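The three questions above can be read as a small lookup. The following is a minimal sketch of that reading; the type and function names (`DeploymentInputs`, `recommendTargets`) are hypothetical and exist only for illustration, and the three answers are reported separately because the framework does not say how to weigh them against each other.

```typescript
// Hypothetical encoding of the decision framework; not part of any library.
type Traffic = 'steady' | 'spiky' | 'unknown';

interface DeploymentInputs {
  infraMaintainers: number;     // developers available to maintain infrastructure
  traffic: Traffic;             // how predictable the load is
  coldStartAcceptable: boolean; // can requests tolerate cold-start latency?
}

interface Recommendation {
  byTeamSize: string[];
  byTraffic: string[];
  serverlessViable: boolean;
}

function recommendTargets(input: DeploymentInputs): Recommendation {
  // Question 1: how many developers will maintain infrastructure?
  let byTeamSize: string[];
  if (input.infraMaintainers === 0) {
    byTeamSize = ['PaaS', 'Serverless'];
  } else if (input.infraMaintainers <= 2) {
    byTeamSize = ['Docker on VPS', 'AWS ECS'];
  } else {
    byTeamSize = ['Kubernetes'];
  }

  // Question 2: how predictable is the traffic?
  const trafficOptions: Record<Traffic, string[]> = {
    steady: ['VPS', 'Containers'],
    spiky: ['Serverless', 'Auto-scaling containers'],
    unknown: ['PaaS'],
  };

  // Question 3: cold starts rule serverless in or out.
  return {
    byTeamSize,
    byTraffic: trafficOptions[input.traffic],
    serverlessViable: input.coldStartAcceptable,
  };
}
```

For a team with no DevOps capacity and unknown traffic, both lists point at PaaS, which matches the "start with PaaS, migrate later" advice.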
NestJS and Serverless: NestJS can run in AWS Lambda using @codegenie/serverless-express (the successor to @vendia/serverless-express), but expect cold starts of roughly 1-3 seconds because NestJS bootstraps the DI container on every cold start. For latency-sensitive APIs, use provisioned concurrency or stick with containers. Serverless NestJS works well for internal tools, webhooks, and background processing where cold starts are acceptable.
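Because the bootstrap cost is paid once per cold start, a common pattern is to cache the bootstrapped handler at module scope so warm invocations reuse it. The sketch below shows only that caching pattern: `bootstrapApp` is a hypothetical placeholder for whatever creates the NestJS app and wraps it with a serverless adapter, and `bootstrapCount` exists purely to make the behaviour observable.

```typescript
// Sketch only: bootstrapApp is a stand-in, not a real adapter API.
type Handler = (event: unknown) => Promise<{ statusCode: number }>;

let bootstrapCount = 0;

async function bootstrapApp(): Promise<Handler> {
  bootstrapCount += 1; // the expensive DI container setup would happen here
  return async () => ({ statusCode: 200 });
}

// Module-level cache: it survives across warm invocations of the same instance.
let cachedHandler: Promise<Handler> | undefined;

function getHandler(): Promise<Handler> {
  if (!cachedHandler) {
    cachedHandler = bootstrapApp();
  }
  return cachedHandler;
}

// Lambda-style entry point: only the first call on a fresh instance bootstraps.
async function handler(event: unknown): Promise<{ statusCode: number }> {
  const run = await getHandler();
  return run(event);
}
```

Caching the promise rather than the resolved handler means concurrent first calls share a single bootstrap instead of racing to start several.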
9.2 Dockerizing Your App
Containerization makes deployment consistent and portable. Docker packages your app and dependencies into a single image.
Basic Dockerfile
This Dockerfile uses multi-stage builds — a critical Docker optimization. The builder stage has all dev dependencies (TypeScript compiler, etc.) to build the app, but the final image only contains the compiled JavaScript and production dependencies. This typically reduces image size from ~500MB to ~150MB.
# Stage 1: Build -- install everything, compile TypeScript
FROM node:18-alpine AS builder
WORKDIR /app
# Copy package files first. Docker caches this layer, so if dependencies
# haven't changed, npm ci is skipped on rebuild -- saving minutes.
COPY package*.json ./
# npm ci is preferred over npm install for CI/Docker:
# - It installs exact versions from package-lock.json (deterministic)
# - It is faster because it skips the dependency resolution step
# - It errors if package-lock.json is out of sync (catches mistakes)
RUN npm ci
# Copy source code after dependencies (better layer caching)
COPY . .
# Build the TypeScript into JavaScript in the /dist folder
RUN npm run build
# Stage 2: Production -- minimal image with only what's needed to run
FROM node:18-alpine
WORKDIR /app
COPY package*.json ./
# --omit=dev skips devDependencies (TypeScript, Jest, etc.).
# It replaces the deprecated --only=production flag in current npm versions.
RUN npm ci --omit=dev
# Copy the compiled JavaScript from the builder stage
COPY --from=builder /app/dist ./dist
# SECURITY: Never run containers as root. Create a dedicated non-root user.
# If an attacker exploits a vulnerability, they get limited permissions.
RUN addgroup -g 1001 -S nodejs && \
adduser -S nestjs -u 1001
USER nestjs
EXPOSE 3000
CMD ["node", "dist/main"]
Optimized Dockerfile
FROM node:18-alpine AS deps
WORKDIR /app
COPY package*.json ./
RUN npm ci --omit=dev && \
    npm cache clean --force
FROM node:18-alpine AS builder
WORKDIR /app
COPY package*.json ./
RUN npm ci
COPY . .
RUN npm run build
FROM node:18-alpine
WORKDIR /app
# Copy production dependencies
COPY --from=deps /app/node_modules ./node_modules
# Copy built application
COPY --from=builder /app/dist ./dist
COPY --from=builder /app/package*.json ./
# Security: Run as non-root
RUN addgroup -g 1001 -S nodejs && \
adduser -S nestjs -u 1001 && \
chown -R nestjs:nodejs /app
USER nestjs
EXPOSE 3000
HEALTHCHECK --interval=30s --timeout=3s --start-period=5s --retries=3 \
CMD node -e "require('http').get('http://localhost:3000/health', (r) => {process.exit(r.statusCode === 200 ? 0 : 1)})"
CMD ["node", "dist/main"]
.dockerignore
node_modules
npm-debug.log
dist
.git
.gitignore
.env
.env.local
*.md
.vscode
.idea
coverage
.nyc_output
test
*.spec.ts
Docker Compose
version: '3.8'
services:
  app:
    build: .
    ports:
      - "3000:3000"
    environment:
      - NODE_ENV=production
      - DATABASE_URL=postgresql://user:password@db:5432/mydb
    depends_on:
      - db
    restart: unless-stopped
  db:
    image: postgres:14-alpine
    environment:
      - POSTGRES_USER=user
      - POSTGRES_PASSWORD=password
      - POSTGRES_DB=mydb
    volumes:
      - postgres_data:/var/lib/postgresql/data
    restart: unless-stopped
volumes:
  postgres_data:
Best Practices:
- Use multi-stage builds for smaller images
- Keep images minimal (alpine base, no dev dependencies)
- Use .dockerignore to exclude unnecessary files
- Run as non-root user
- Add health checks
- Use specific version tags
9.3 Environment Variables
Store secrets and configuration in environment variables. Never commit secrets to version control.
Using @nestjs/config
npm install @nestjs/config joi
// app.module.ts
import { Module } from '@nestjs/common';
import { ConfigModule } from '@nestjs/config';
// Joi provides the validation schema passed to ConfigModule below
import * as Joi from 'joi';
@Module({
imports: [
ConfigModule.forRoot({
isGlobal: true,
envFilePath: ['.env.local', '.env'],
validationSchema: Joi.object({
NODE_ENV: Joi.string()
.valid('development', 'production', 'test')
.default('development'),
PORT: Joi.number().default(3000),
DATABASE_URL: Joi.string().required(),
JWT_SECRET: Joi.string().required(),
}),
}),
],
})
export class AppModule {}
Environment Files
# .env.example (committed)
NODE_ENV=development
PORT=3000
DATABASE_URL=postgresql://localhost:5432/mydb
JWT_SECRET=your-secret-key
# .env (not committed)
NODE_ENV=production
PORT=3000
DATABASE_URL=postgresql://user:password@db:5432/mydb
JWT_SECRET=super-secret-key-change-in-production
Using Config Service
import { Injectable } from '@nestjs/common';
import { ConfigService } from '@nestjs/config';
@Injectable()
export class AppService {
constructor(private configService: ConfigService) {}
  getDatabaseUrl(): string {
    // getOrThrow fails fast when the variable is missing instead of returning undefined
    return this.configService.getOrThrow<string>('DATABASE_URL');
  }
  getJwtSecret(): string {
    return this.configService.getOrThrow<string>('JWT_SECRET');
  }
}
Tip: Use schema validation (e.g., with Joi) so the application refuses to start when a required environment variable is missing or has an invalid value.
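To make the validation rules concrete, here is a dependency-free sketch of the same checks the Joi schema in this section expresses. `AppConfig` and `validateEnv` are illustrative names, not part of @nestjs/config, and the PORT range check is an added assumption.

```typescript
// Illustrative only: mirrors the Joi schema without using Joi.
interface AppConfig {
  nodeEnv: 'development' | 'production' | 'test';
  port: number;
  databaseUrl: string;
  jwtSecret: string;
}

const NODE_ENVS: readonly string[] = ['development', 'production', 'test'];

function validateEnv(env: Record<string, string | undefined>): AppConfig {
  const errors: string[] = [];

  // NODE_ENV: one of three values, defaulting to 'development'
  const nodeEnv = env.NODE_ENV ?? 'development';
  if (!NODE_ENVS.includes(nodeEnv)) {
    errors.push(`NODE_ENV must be one of ${NODE_ENVS.join(', ')}`);
  }

  // PORT: numeric, defaulting to 3000 (range check is an assumption)
  const port = Number(env.PORT ?? '3000');
  if (!Number.isInteger(port) || port <= 0 || port > 65535) {
    errors.push('PORT must be an integer between 1 and 65535');
  }

  // DATABASE_URL and JWT_SECRET: required, no default
  for (const key of ['DATABASE_URL', 'JWT_SECRET']) {
    if (!env[key]) {
      errors.push(`${key} is required`);
    }
  }

  // Report every problem at once so a misconfigured deploy fails with a full list
  if (errors.length > 0) {
    throw new Error(`Invalid environment: ${errors.join('; ')}`);
  }

  return {
    nodeEnv: nodeEnv as AppConfig['nodeEnv'],
    port,
    databaseUrl: env.DATABASE_URL as string,
    jwtSecret: env.JWT_SECRET as string,
  };
}
```

Calling `validateEnv(process.env)` at startup would give the same fail-fast behaviour; taking the environment as a parameter keeps the function easy to test.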
9.4 Health Checks
Health checks help load balancers and orchestrators know if your app is healthy.
Installing Terminus
npm install @nestjs/terminus
Health Check Controller
Health checks come in two flavors, and understanding the difference is critical for Kubernetes deployments:
- Liveness: “Is the process alive?” If this fails, Kubernetes restarts the pod. Keep it simple — do not check external dependencies here, or a database outage will cascade into pod restart storms.
- Readiness: “Can this instance handle traffic?” If this fails, Kubernetes removes the pod from the load balancer but does not restart it. Check database connectivity and other dependencies here.
// health/health.controller.ts
import { Controller, Get } from '@nestjs/common';
import { HealthCheck, HealthCheckService, TypeOrmHealthIndicator } from '@nestjs/terminus';
@Controller('health')
export class HealthController {
constructor(
private health: HealthCheckService,
private db: TypeOrmHealthIndicator,
) {}
// General health check -- used by simple load balancers
@Get()
@HealthCheck()
check() {
return this.health.check([
() => this.db.pingCheck('database'),
]);
}
// Readiness probe: "Can I handle traffic?"
// Checks dependencies. If the database is down, this returns unhealthy,
// and the load balancer stops sending requests to this instance.
@Get('readiness')
@HealthCheck()
readiness() {
return this.health.check([
() => this.db.pingCheck('database'),
]);
}
// Liveness probe: "Am I alive?"
// Does NOT check dependencies. Just confirms the process is responsive.
// If this fails, something is fundamentally wrong (deadlock, OOM).
@Get('liveness')
@HealthCheck()
liveness() {
return { status: 'ok', timestamp: new Date().toISOString() };
}
}
Common Mistake: Including database checks in the liveness probe. If the database goes down temporarily, Kubernetes will restart all your pods simultaneously, creating a thundering herd that makes recovery harder. Only check the process itself in liveness probes.
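The difference between the two probes comes down to which inputs each one is allowed to look at. The following sketch uses hypothetical helpers (`DependencyState`, `livenessStatusCode`, `readinessStatusCode`) that are independent of @nestjs/terminus and only illustrate the rule.

```typescript
// Illustrative helpers; not part of @nestjs/terminus.
interface DependencyState {
  name: string;
  up: boolean;
}

// Liveness ignores dependencies: if this code runs at all, the process is responsive.
function livenessStatusCode(): number {
  return 200;
}

// Readiness returns 503 when any dependency is down, so the orchestrator
// stops routing traffic to this instance without restarting it.
function readinessStatusCode(deps: DependencyState[]): number {
  return deps.every((dep) => dep.up) ? 200 : 503;
}

// During a database outage the pod is taken out of rotation but left running.
const duringOutage: DependencyState[] = [{ name: 'database', up: false }];
console.log(livenessStatusCode(), readinessStatusCode(duringOutage));
```

Since liveness never receives the dependency list, a database outage cannot trigger a restart by construction.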
Custom Health Indicators
import { Injectable } from '@nestjs/common';
import { HealthIndicator, HealthIndicatorResult, HealthCheckError } from '@nestjs/terminus';
@Injectable()
export class CustomHealthIndicator extends HealthIndicator {
async isHealthy(key: string): Promise<HealthIndicatorResult> {
const isHealthy = await this.checkExternalService();
const result = this.getStatus(key, isHealthy);
if (isHealthy) {
return result;
}
throw new HealthCheckError('External service check failed', result);
}
private async checkExternalService(): Promise<boolean> {
// Check external service
return true;
}
}
Diagram: Health Check Flow
Load Balancer/Orchestrator
↓
GET /health
↓
Health Check Service
↓
Check Database, External Services, etc.
↓
Return Status (healthy/unhealthy)
9.5 Logging & Monitoring
Proper logging and monitoring are essential for production applications.
NestJS Logger
The built-in Logger class is context-aware — passing the class name to the constructor means every log line includes the service name, making it easy to filter logs in production.
import { Injectable, Logger } from '@nestjs/common';
@Injectable()
export class UsersService {
// Passing UsersService.name as the context means logs appear as:
// [Nest] 12345 - 04/10/2026 [UsersService] Creating user: alice@example.com
// This is invaluable when you have 50 services and need to find
// where a specific log message came from.
private readonly logger = new Logger(UsersService.name);
async create(dto: CreateUserDto) {
this.logger.log(`Creating user: ${dto.email}`);
try {
const user = await this.userRepository.create(dto);
this.logger.log(`User created successfully: ${user.id}`);
return user;
} catch (error) {
// .error() takes the message as the first argument and the stack trace
// as the second. Always include the stack trace -- without it, you only
// know WHAT failed, not WHERE in the code it failed.
this.logger.error(`Failed to create user: ${error.message}`, error.stack);
throw error;
}
}
}
Production Tip: The default logger outputs human-readable text, which is great for local development but terrible for cloud log aggregation. In production, replace it with a JSON logger (Winston, Pino) so tools like ELK, Datadog, or CloudWatch can parse and index your log fields automatically.
Structured Logging
import { Injectable, Logger } from '@nestjs/common';
@Injectable()
export class LoggerService {
private readonly logger = new Logger();
log(message: string, context?: string, metadata?: any) {
this.logger.log(JSON.stringify({
message,
context,
metadata,
timestamp: new Date().toISOString(),
}));
}
error(message: string, trace?: string, context?: string) {
this.logger.error(JSON.stringify({
message,
trace,
context,
timestamp: new Date().toISOString(),
}));
}
}
Winston Integration
npm install nest-winston winston
import { Module } from '@nestjs/common';
import { WinstonModule } from 'nest-winston';
import * as winston from 'winston';
@Module({
imports: [
WinstonModule.forRoot({
transports: [
new winston.transports.File({
filename: 'error.log',
level: 'error',
}),
new winston.transports.File({
filename: 'combined.log',
}),
],
}),
],
})
export class AppModule {}
Error Tracking with Sentry
npm install @sentry/node @sentry/profiling-node
// main.ts
import * as Sentry from '@sentry/node';
import { nodeProfilingIntegration } from '@sentry/profiling-node';
Sentry.init({
dsn: process.env.SENTRY_DSN,
integrations: [nodeProfilingIntegration()],
tracesSampleRate: 1.0,
profilesSampleRate: 1.0,
});
Best Practices:
- Log errors and warnings
- Use structured logs (JSON) for cloud platforms
- Monitor logs and metrics in real time
- Set up alerts for critical errors
- Don’t log sensitive information
- Use log levels appropriately
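Two of these practices, structured JSON output and keeping sensitive data out of logs, can be combined in one small formatter. This is a minimal sketch under stated assumptions: the key list in `SENSITIVE_KEYS` and the function names are illustrative, and redaction is shallow (nested objects are not inspected).

```typescript
// Illustrative key fragments; a real list depends on the application's data.
const SENSITIVE_KEYS = ['password', 'token', 'secret', 'authorization'];

// Replace the values of sensitive-looking top-level keys before logging.
function redact(metadata: Record<string, unknown>): Record<string, unknown> {
  const safe: Record<string, unknown> = {};
  for (const [key, value] of Object.entries(metadata)) {
    const isSensitive = SENSITIVE_KEYS.some((fragment) =>
      key.toLowerCase().includes(fragment),
    );
    safe[key] = isSensitive ? '[REDACTED]' : value;
  }
  return safe;
}

// Build one JSON log line; `now` is injectable so the output is testable.
function toJsonLogLine(
  level: 'log' | 'warn' | 'error',
  message: string,
  context: string,
  metadata: Record<string, unknown> = {},
  now: Date = new Date(),
): string {
  return JSON.stringify({
    level,
    message,
    context,
    metadata: redact(metadata),
    timestamp: now.toISOString(),
  });
}
```

A call such as `toJsonLogLine('log', 'User created', 'UsersService', { userId: 42, password: 'hunter2' })` keeps `userId` and replaces the password value with `[REDACTED]`.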
9.6 CI/CD Pipelines
Automate build, test, and deployment with CI/CD pipelines.
GitHub Actions Workflow
# .github/workflows/deploy.yml
name: Deploy
on:
  push:
    branches: [main]
jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - uses: actions/setup-node@v3
        with:
          node-version: '18'
          cache: 'npm'
      - run: npm ci
      - run: npm run lint
      - run: npm run test
      - run: npm run test:e2e
  build:
    needs: test
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - name: Build Docker image
        run: docker build -t myapp:${{ github.sha }} .
      - name: Push to registry
        run: |
          echo ${{ secrets.DOCKER_PASSWORD }} | docker login -u ${{ secrets.DOCKER_USERNAME }} --password-stdin
          docker push myapp:${{ github.sha }}
  deploy:
    needs: build
    runs-on: ubuntu-latest
    steps:
      - name: Deploy to production
        # Deployment and container names match the manifest in section 9.7
        run: |
          kubectl set image deployment/nestjs-app app=myapp:${{ github.sha }}
GitLab CI
# .gitlab-ci.yml
stages:
  - test
  - build
  - deploy
test:
  stage: test
  script:
    - npm ci
    - npm run lint
    - npm run test
    - npm run test:e2e
build:
  stage: build
  script:
    - docker build -t $CI_REGISTRY_IMAGE:$CI_COMMIT_SHA .
    - docker push $CI_REGISTRY_IMAGE:$CI_COMMIT_SHA
deploy:
  stage: deploy
  script:
    # Deployment and container names match the manifest in section 9.7
    - kubectl set image deployment/nestjs-app app=$CI_REGISTRY_IMAGE:$CI_COMMIT_SHA
9.7 Kubernetes Deployment
Deploy NestJS applications to Kubernetes for scalability and reliability.
Deployment Manifest
# k8s/deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nestjs-app
spec:
  replicas: 3
  selector:
    matchLabels:
      app: nestjs-app
  template:
    metadata:
      labels:
        app: nestjs-app
    spec:
      containers:
        - name: app
          image: myapp:latest
          ports:
            - containerPort: 3000
          env:
            - name: NODE_ENV
              value: "production"
            - name: DATABASE_URL
              valueFrom:
                secretKeyRef:
                  name: app-secrets
                  key: database-url
          livenessProbe:
            httpGet:
              path: /health/liveness
              port: 3000
            initialDelaySeconds: 30
            periodSeconds: 10
          readinessProbe:
            httpGet:
              path: /health/readiness
              port: 3000
            initialDelaySeconds: 5
            periodSeconds: 5
          resources:
            requests:
              memory: "256Mi"
              cpu: "250m"
            limits:
              memory: "512Mi"
              cpu: "500m"
Service Manifest
# k8s/service.yaml
apiVersion: v1
kind: Service
metadata:
  name: nestjs-app-service
spec:
  selector:
    app: nestjs-app
  ports:
    - port: 80
      targetPort: 3000
  type: LoadBalancer
9.8 Scaling & High Availability
Scale your application to handle increased load.
Horizontal Scaling
Run multiple instances behind a load balancer. This is the primary scaling strategy for NestJS applications.
Key Requirements for Horizontal Scaling:
Your NestJS application must be stateless to scale horizontally. This means:
- No in-memory sessions (use Redis for sessions)
- No in-memory caches that are not shared (use Redis)
- No file uploads stored on the local filesystem (use S3 or equivalent)
- No WebSocket connections without a Redis adapter (Socket.io with @socket.io/redis-adapter)
| Scaling Strategy | Trigger | Tool | Typical Latency |
|---|---|---|---|
| Manual | Developer decision | PM2 cluster mode, pm2 start app -i max | N/A |
| Kubernetes HPA | CPU/memory threshold | kubectl autoscale deployment app --min=2 --max=10 | 30-60s |
| Cloud Auto-scaling | Request count, latency | AWS ALB + ECS/Fargate auto-scaling | 60-120s |
| Serverless | Per-request | AWS Lambda, Cloud Functions | 0s (always ready) to 1-3s (cold start) |
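For the Kubernetes HPA row, the replica count follows the documented scaling rule: desired = ceil(current × currentMetric / targetMetric), clamped to the configured minimum and maximum. The sketch below implements only that core formula; it deliberately ignores the tolerance band and stabilization windows that the real controller applies, and the function name is illustrative.

```typescript
// Simplified HPA calculation: core ratio formula plus min/max clamping only.
function desiredReplicas(
  currentReplicas: number,
  currentMetric: number, // e.g. observed average CPU utilization (%)
  targetMetric: number,  // e.g. target average CPU utilization (%)
  minReplicas: number,
  maxReplicas: number,
): number {
  if (targetMetric <= 0) {
    throw new Error('targetMetric must be positive');
  }
  const raw = Math.ceil(currentReplicas * (currentMetric / targetMetric));
  return Math.min(maxReplicas, Math.max(minReplicas, raw));
}

// 3 pods at 90% CPU against a 60% target, with --min=2 --max=10
console.log(desiredReplicas(3, 90, 60, 2, 10));
```

With the `--min=2 --max=10` bounds from the table, three pods at 90% CPU against a 60% target scale to five, and a mostly idle deployment never drops below two.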
Vertical Scaling
Increase instance size (more CPU/memory). This is a valid first step before going horizontal — a single well-provisioned instance can handle thousands of requests per second. Only scale horizontally when vertical scaling becomes cost-prohibitive or you need fault tolerance.
Database Scaling
| Strategy | When to Use | Complexity |
|---|---|---|
| Connection Pooling | Always (default pool is often too small) | Low |
| Read Replicas | Read-heavy workloads (>80% reads) | Medium |
| Caching (Redis) | Hot data accessed repeatedly | Medium |
| Sharding | Very large datasets (100M+ rows) | High |
| CQRS with separate read DB | Different read/write performance needs | High |
Caching
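import { Module } from '@nestjs/common';
import { CacheModule } from '@nestjs/cache-manager';
// Note: the exact store options depend on the cache-manager and store versions installed
import { redisStore } from 'cache-manager-redis-store';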
@Module({
imports: [
CacheModule.register({
store: redisStore,
host: 'localhost',
port: 6379,
}),
],
})
export class AppModule {}
9.9 Performance Optimization
Optimize your application for production performance.
Enable Compression
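// main.ts
// Requires: npm install compression
import { NestFactory } from '@nestjs/core';
import * as compression from 'compression';
import { AppModule } from './app.module';
async function bootstrap() {
  const app = await NestFactory.create(AppModule);
  // Compress response bodies before they are sent to the client
  app.use(compression());
  await app.listen(3000);
}
bootstrap();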
Connection Pooling
TypeOrmModule.forRoot({
// ... other options
extra: {
max: 10,
min: 2,
idleTimeoutMillis: 30000,
},
})
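Pool size interacts with horizontal scaling: every instance opens its own pool, so the database sees instances × max connections. With `replicas: 3` from the Deployment manifest and `max: 10` here, that is up to 30 connections. The sketch below shows the arithmetic; the function names and the idea of reserving some connections for migrations and admin access are assumptions for illustration, and the database limit should be read from your own server (PostgreSQL's `max_connections` defaults to 100).

```typescript
// Total connections the database may see when every pool is full.
function totalConnections(instances: number, poolMax: number): number {
  return instances * poolMax;
}

// Largest per-instance pool that stays under the database limit,
// leaving `reserved` connections free for migrations, admin tools, etc.
function maxPoolPerInstance(
  dbMaxConnections: number,
  reserved: number,
  instances: number,
): number {
  if (instances <= 0) {
    throw new Error('instances must be positive');
  }
  return Math.max(0, Math.floor((dbMaxConnections - reserved) / instances));
}

console.log(totalConnections(3, 10));         // connections used by 3 replicas
console.log(maxPoolPerInstance(100, 10, 10)); // pool budget at 10 replicas
```

The second call shows why autoscaling needs a pool budget: at ten replicas the same database leaves room for only nine connections per instance.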
Query Optimization
- Use indexes on frequently queried columns
- Optimize N+1 queries
- Use select to limit fields
- Implement pagination
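For the pagination item, the main production concern is bounding the page size so a single request cannot ask for the whole table. A minimal sketch, assuming page and pageSize arrive as already-validated numbers; `toPageWindow`, the default of 20, and the cap of 100 are illustrative choices, while `skip` and `take` are named after the TypeORM find options they would feed.

```typescript
interface PageRequest {
  page?: number;     // 1-based page number
  pageSize?: number; // requested items per page
}

interface PageWindow {
  skip: number; // rows to skip (offset)
  take: number; // rows to return (limit)
}

// Clamp client-supplied paging values into a safe offset/limit pair.
function toPageWindow(req: PageRequest, maxPageSize = 100): PageWindow {
  const page = Math.max(1, Math.floor(req.page ?? 1));
  const take = Math.min(maxPageSize, Math.max(1, Math.floor(req.pageSize ?? 20)));
  return { skip: (page - 1) * take, take };
}

console.log(toPageWindow({ page: 3, pageSize: 25 }));
```

The result can be spread into a repository call, for example `repository.find({ ...toPageWindow(query), order: { id: 'ASC' } })`, where a stable sort order keeps pages from overlapping.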
9.10 Production Edge Cases
Edge Case 1: Graceful shutdown and in-flight requests
When Kubernetes sends SIGTERM to your pod, your NestJS app needs to stop accepting new requests, finish processing in-flight requests, close database connections, and then exit. Without graceful shutdown, users get dropped connections and database transactions may be left in an inconsistent state.
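// main.ts
import { NestFactory } from '@nestjs/core';
import { AppModule } from './app.module';
async function bootstrap() {
  const app = await NestFactory.create(AppModule);
  // Listen for shutdown signals (SIGTERM, SIGINT) so lifecycle hooks run before exit
  app.enableShutdownHooks();
  await app.listen(3000);
}
bootstrap();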
NestJS will call onModuleDestroy() and onApplicationShutdown() lifecycle hooks on your providers. The PrismaService example in Chapter 5 uses onModuleDestroy() to close the database connection. Kubernetes gives you 30 seconds by default (configurable via terminationGracePeriodSeconds).
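The shutdown sequence described above (stop accepting, drain, then exit) can be modelled as a small state machine. `InFlightTracker` below is a hypothetical helper written for illustration, not a NestJS API; in a real application the HTTP server and `enableShutdownHooks()` handle most of this, and a tracker like this would only be needed for work the framework cannot see, such as background jobs.

```typescript
// Hypothetical helper that models "stop accepting, drain, then exit".
class InFlightTracker {
  private inFlight = 0;
  private shuttingDown = false;

  // Returns false once shutdown has begun; the caller should reject the work.
  tryBegin(): boolean {
    if (this.shuttingDown) {
      return false;
    }
    this.inFlight += 1;
    return true;
  }

  // Mark one unit of work as finished.
  end(): void {
    if (this.inFlight > 0) {
      this.inFlight -= 1;
    }
  }

  // Would be called from a shutdown hook such as onApplicationShutdown().
  beginShutdown(): void {
    this.shuttingDown = true;
  }

  // Safe to close connections and exit only when nothing is still running.
  canExit(): boolean {
    return this.shuttingDown && this.inFlight === 0;
  }
}
```

Whatever waits on `canExit()` should also enforce a deadline shorter than `terminationGracePeriodSeconds`, because Kubernetes sends SIGKILL once the grace period runs out.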
Edge Case 2: Memory leaks from event listeners
If your service subscribes to events (EventEmitter, RxJS subjects, WebSocket events) in the constructor but never unsubscribes, every hot reload in development and every new request-scoped instance in production leaks a listener. Implement OnModuleDestroy and clean up subscriptions.
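A minimal sketch of the cleanup pattern, using Node's built-in EventEmitter. `OrderNotifier` and the `order.created` event are hypothetical, and the `OnModuleDestroy` interface is declared locally so the example runs without NestJS; in a real service it would be imported from `@nestjs/common`.

```typescript
import { EventEmitter } from 'events';

// Local stand-in for the NestJS lifecycle interface of the same name.
interface OnModuleDestroy {
  onModuleDestroy(): void;
}

class OrderNotifier implements OnModuleDestroy {
  public handled = 0;

  // Keep a stable function reference so the exact same listener can be removed.
  private readonly onOrderCreated = (): void => {
    this.handled += 1;
  };

  constructor(private readonly bus: EventEmitter) {
    this.bus.on('order.created', this.onOrderCreated);
  }

  // Called by NestJS when the module (or a request-scoped instance) is torn down.
  onModuleDestroy(): void {
    this.bus.off('order.created', this.onOrderCreated);
  }
}

const bus = new EventEmitter();
const notifier = new OrderNotifier(bus);
bus.emit('order.created');
notifier.onModuleDestroy();
console.log(bus.listenerCount('order.created')); // listeners left after cleanup
```

Storing the handler as a property matters: passing an inline arrow function to `on()` leaves no reference to hand to `off()`, so the listener could never be removed.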
Edge Case 3: Node.js single-thread and CPU-bound operations
NestJS runs on Node.js, which executes your JavaScript on a single main thread. If a service method does CPU-intensive work (parsing a 50MB JSON file, image processing, heavy synchronous cryptographic operations), it blocks the event loop and every other request stalls until the work finishes. Solutions: (1) Use worker threads (worker_threads module); (2) Offload to a separate microservice; (3) Use a job queue (Bull) that processes CPU-intensive work in a separate process.
Edge Case 4: Docker image size and startup time
A NestJS Docker image with all node_modules can be 500MB+. This matters for Kubernetes pod startup time (image pull takes 10-30 seconds) and serverless cold starts. The multi-stage Dockerfile in section 9.2 reduces this to ~150MB. Further optimization: use pnpm with --prod flag or node-prune to strip test files and documentation from node_modules.
9.11 Troubleshooting & Maintenance
Monitor and maintain your production application.
Monitoring
- Monitor CPU, memory, and response times
- Track error rates
- Monitor database performance
- Set up alerts for anomalies
Logging
- Centralize logs (ELK, CloudWatch, etc.)
- Search and filter logs
- Set up log retention policies
- Monitor log volumes
Backup & Recovery
- Regular database backups
- Test restore procedures
- Document recovery steps
- Store backups securely
Updates
- Regularly update dependencies
- Test updates in staging
- Use semantic versioning
- Document breaking changes
9.12 Summary
You’ve learned how to deploy and maintain NestJS applications in production:
Key Concepts:
- Docker: Containerize applications
- CI/CD: Automate deployment
- Health Checks: Monitor application health
- Logging: Track application behavior
- Monitoring: Observe production systems
- Scaling: Handle increased load
- Kubernetes: Orchestrate containers
Best Practices:
- Use environment variables for configuration
- Containerize with Docker
- Implement health checks
- Set up proper logging
- Monitor production systems
- Scale horizontally
- Regular backups and updates
Next Chapter: Learn about advanced patterns like CQRS, GraphQL, WebSockets, and event sourcing.