Performance Optimization & Caching
Performance isn’t just about speed; it’s also about efficiency, scalability, and cost. A well-optimized Node.js application can often handle several times more traffic on the same hardware, though the actual gain depends on where the bottleneck is.
| Bottleneck | Symptoms | Solution |
|---|---|---|
| CPU-bound | High CPU, slow responses | Clustering, worker threads |
| Memory-bound | High memory, OOM crashes | Memory optimization, streaming |
| I/O-bound | Waiting on DB, network | Caching, connection pooling |
| Event loop blocking | All requests slow down | Async operations, offload work |
Built-in Profiling
// Basic timing
console.time('operation');
// ... do something
console.timeEnd('operation'); // operation: 123ms
// Performance hooks
const { performance, PerformanceObserver } = require('perf_hooks');
const obs = new PerformanceObserver((items) => {
items.getEntries().forEach((entry) => {
console.log(`${entry.name}: ${entry.duration}ms`);
});
});
obs.observe({ entryTypes: ['measure'] });
performance.mark('start');
// ... operation
performance.mark('end');
performance.measure('My Operation', 'start', 'end');
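Event loop blocking, the last row of the bottleneck table, can also be measured with `perf_hooks`. The sketch below uses `monitorEventLoopDelay`; the busy loop is a hypothetical stand-in for blocking work, and the 10 ms resolution and the timer values are arbitrary choices for illustration.

```javascript
const { monitorEventLoopDelay } = require('node:perf_hooks');

// Sample the event loop roughly every 10 ms; recorded values are in nanoseconds.
const histogram = monitorEventLoopDelay({ resolution: 10 });
histogram.enable();

// Hypothetical blocking work: a synchronous busy loop.
function blockFor(ms) {
  const end = Date.now() + ms;
  while (Date.now() < end) { /* spin */ }
}

// Resolves with delay statistics converted to milliseconds.
const report = new Promise((resolve) => {
  // Let the sampler tick a few times, then block the loop for 200 ms.
  setTimeout(() => blockFor(200), 50);
  setTimeout(() => {
    histogram.disable();
    resolve({
      meanMs: histogram.mean / 1e6,
      maxMs: histogram.max / 1e6,
      p99Ms: histogram.percentile(99) / 1e6,
    });
  }, 400);
});

report.then((stats) => console.log('event loop delay:', stats));
```

A maximum far above the sampling resolution suggests that some synchronous code held the loop for that long.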
Load Testing with autocannon
npm install -g autocannon
# Basic load test
autocannon http://localhost:3000/api/users
# Custom options
autocannon -c 100 -d 30 -p 10 http://localhost:3000/api/users
# -c: connections (concurrent)
# -d: duration in seconds
# -p: pipelining factor
Memory Profiling
// Memory usage
const used = process.memoryUsage();
console.log({
rss: `${Math.round(used.rss / 1024 / 1024)} MB`, // Resident set size: all memory held by the process
heapTotal: `${Math.round(used.heapTotal / 1024 / 1024)} MB`, // V8 heap
heapUsed: `${Math.round(used.heapUsed / 1024 / 1024)} MB`, // Used heap
external: `${Math.round(used.external / 1024 / 1024)} MB` // C++ objects
});
// Expose for monitoring
app.get('/api/health', (req, res) => {
const memory = process.memoryUsage();
res.json({
uptime: process.uptime(),
memory: {
rss: memory.rss,
heapUsed: memory.heapUsed,
heapTotal: memory.heapTotal
}
});
});
Caching Strategies
In-Memory Cache with node-cache
In-memory caching stores frequently-accessed data directly in your Node.js process memory, eliminating the round-trip to your database entirely. It is like keeping a cheat sheet on your desk instead of walking to the filing cabinet every time. The trade-off: the cache disappears when the process restarts, and each server instance has its own separate cache (unlike Redis, which is shared).
const NodeCache = require('node-cache');
const cache = new NodeCache({
stdTTL: 300, // Default TTL: 5 minutes (data expires after this)
checkperiod: 60, // Garbage collection interval -- checks for expired keys every 60s
maxKeys: 1000 // Limit cache size to prevent unbounded memory growth
});
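// Cache middleware (GET requests only)
const cacheMiddleware = (duration) => (req, res, next) => {
if (req.method !== 'GET') {
return next();
}
const key = `__express__${req.originalUrl}`;
const cached = cache.get(key);
// node-cache returns undefined on a miss; comparing against undefined keeps falsy values (0, '', false) cacheable
if (cached !== undefined) {
return res.json(cached);
}
// Override res.json to cache the response
const originalJson = res.json.bind(res);
res.json = (data) => {
// Cache successful responses only. set() throws once maxKeys is reached, so a full cache must not break the response.
if (res.statusCode === 200) {
try {
cache.set(key, data, duration);
} catch (err) {
console.warn(`Cache write skipped for ${key}: ${err.message}`);
}
}
return originalJson(data);
};
next();
};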
// Usage
app.get('/api/products', cacheMiddleware(300), async (req, res) => {
const products = await Product.find();
res.json(products);
});
// Manual cache operations
cache.set('user:123', userData, 3600); // 1 hour
const user = cache.get('user:123');
cache.del('user:123');
cache.flushAll();
Redis Cache
Redis is an in-memory data store that runs as a separate server process. Unlike node-cache (which is local to a single process), Redis is shared across all your application instances. If you run 4 Node.js workers behind a load balancer, all 4 workers read from and write to the same Redis cache. This makes it the standard choice for production caching in any multi-process or multi-server setup.
const { createClient } = require('redis');
const redis = createClient({
url: process.env.REDIS_URL || 'redis://localhost:6379'
});
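// Always register an 'error' listener: without one, a client error is thrown and can crash the process
redis.on('error', (err) => console.error('Redis client error:', err));
// redis v4+ requires explicit connect() -- it does not auto-connect
redis.connect().catch((err) => console.error('Redis connection failed:', err));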
// Cache wrapper function
const cacheWrapper = async (key, ttl, fetchFn) => {
// Try to get from cache
const cached = await redis.get(key);
if (cached) {
return JSON.parse(cached);
}
// Fetch fresh data
const data = await fetchFn();
// Store in cache
await redis.setEx(key, ttl, JSON.stringify(data));
return data;
};
// Usage
app.get('/api/users/:id', async (req, res) => {
const user = await cacheWrapper(
`user:${req.params.id}`,
3600, // 1 hour
() => User.findById(req.params.id)
);
res.json(user);
});
// Cache invalidation
const invalidateUserCache = async (userId) => {
await redis.del(`user:${userId}`);
await redis.del('users:all');
};
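// Pattern-based invalidation
// Caution: KEYS is O(N) over the whole keyspace and blocks Redis while it runs.
// It is fine for small datasets; on large production instances, iterate with SCAN instead.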
const invalidatePattern = async (pattern) => {
const keys = await redis.keys(pattern);
if (keys.length > 0) {
await redis.del(keys);
}
};
// Usage: invalidatePattern('users:*');
Cache-Aside Pattern
The cache-aside pattern (also called “lazy loading”) is the most common caching strategy. The application checks the cache first; on a miss, it fetches from the database, stores the result in the cache, and returns it. The cache is only populated with data that is actually requested, so you do not waste memory caching data nobody reads.
The downside: the first request after a cache miss or expiration is slow. For hot-path data where even occasional slow responses are unacceptable, consider “write-through” caching (updating the cache immediately on every write).
class CacheService {
constructor(redis, defaultTTL = 3600) {
this.redis = redis;
this.defaultTTL = defaultTTL;
}
async get(key) {
const data = await this.redis.get(key);
return data ? JSON.parse(data) : null;
}
async set(key, value, ttl = this.defaultTTL) {
await this.redis.setEx(key, ttl, JSON.stringify(value));
}
async getOrSet(key, fetchFn, ttl = this.defaultTTL) {
let data = await this.get(key);
if (data === null) {
data = await fetchFn();
if (data !== null && data !== undefined) {
await this.set(key, data, ttl);
}
}
return data;
}
async invalidate(key) {
await this.redis.del(key);
}
async invalidateMany(keys) {
if (keys.length > 0) {
await this.redis.del(keys);
}
}
}
// Repository with caching
class UserRepository {
constructor(cacheService) {
this.cache = cacheService;
}
async findById(id) {
return this.cache.getOrSet(
`user:${id}`,
() => User.findById(id).lean(),
3600
);
}
async findAll() {
return this.cache.getOrSet(
'users:all',
() => User.find().lean(),
300
);
}
async update(id, data) {
const user = await User.findByIdAndUpdate(id, data, { new: true });
await this.cache.invalidate(`user:${id}`);
await this.cache.invalidate('users:all');
return user;
}
}
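The write-through alternative mentioned above can be sketched in the same style. This is an illustration under stated assumptions, not part of the original repository code: `WriteThroughCache` and `MemoryBackend` are hypothetical names, and in-memory Maps stand in for the database and Redis.

```javascript
// Write-through cache sketch. `store` and `cache` are hypothetical async
// key-value backends with get(key) and set(key, value).
class WriteThroughCache {
  constructor(store, cache) {
    this.store = store;
    this.cache = cache;
  }

  async write(key, value) {
    await this.store.set(key, value); // 1. persist to the source of truth
    await this.cache.set(key, value); // 2. refresh the cache as part of the same write
    return value;
  }

  async read(key) {
    const cached = await this.cache.get(key);
    if (cached !== undefined) return cached;  // cache hit
    const value = await this.store.get(key);  // miss: fall back to the store
    if (value !== undefined) await this.cache.set(key, value);
    return value;
  }
}

// Minimal in-memory backend with an async interface and a read counter.
class MemoryBackend {
  constructor() {
    this.data = new Map();
    this.reads = 0;
  }
  async get(key) {
    this.reads += 1;
    return this.data.get(key);
  }
  async set(key, value) {
    this.data.set(key, value);
  }
}

const db = new MemoryBackend();
const users = new WriteThroughCache(db, new MemoryBackend());

const demo = (async () => {
  await users.write('user:1', { name: 'Ada' });
  const user = await users.read('user:1'); // served from the cache
  return { user, storeReads: db.reads };
})();

demo.then(({ user, storeReads }) => console.log(user, 'store reads:', storeReads));
```

Because every write refreshes the cache, the read that follows never touches the store. The cost is an extra cache write on every update, including for keys nobody reads.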
HTTP Caching
Server-side caching (Redis, node-cache) reduces database load, but the request still hits your server. HTTP caching goes one step further by telling the browser or CDN to cache responses, so repeat requests may never reach your server at all. For read-heavy APIs this is often one of the most impactful optimizations.
There are two mechanisms: ETags (the server sends a fingerprint of the response; the client sends it back in the If-None-Match header on subsequent requests, and the server responds with “304 Not Modified” if the data has not changed) and Cache-Control headers (the server tells the client how long the response is valid, so the client does not make a request at all until that time elapses).
// ETags for conditional requests
const etag = require('etag');
app.get('/api/data', async (req, res) => {
const data = await fetchData();
const dataEtag = etag(JSON.stringify(data));
// Set the validators first so they are sent on both 200 and 304 responses
res.set('ETag', dataEtag);
res.set('Cache-Control', 'private, max-age=300');
// Check If-None-Match header
if (req.headers['if-none-match'] === dataEtag) {
return res.status(304).end(); // Not Modified
}
res.json(data);
});
// Cache-Control headers
app.get('/api/static-data', (req, res) => {
res.set('Cache-Control', 'public, max-age=86400'); // 24 hours
res.json(staticData);
});
// No cache for dynamic data
app.get('/api/user/profile', auth, (req, res) => {
res.set('Cache-Control', 'no-store');
res.json(req.user);
});
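The strict equality check in the ETag example covers the common case of a single tag. The If-None-Match header may also carry `*`, a comma-separated list of tags, or weak tags prefixed with `W/`. A minimal sketch of a more tolerant comparison, using only `node:crypto`, follows; `computeEtag` and `matchesIfNoneMatch` are hypothetical helper names, not part of the `etag` package.

```javascript
const crypto = require('node:crypto');

// Strong ETag: a quoted hash of the exact response body.
function computeEtag(body) {
  const hash = crypto.createHash('sha1').update(body).digest('base64');
  return `"${hash}"`;
}

// Conditional GET requests use weak comparison, so the W/ prefix is ignored.
function matchesIfNoneMatch(headerValue, etag) {
  if (!headerValue) return false;
  if (headerValue.trim() === '*') return true;
  const strip = (tag) => tag.trim().replace(/^W\//, '');
  return headerValue.split(',').map(strip).includes(strip(etag));
}

const body = JSON.stringify({ id: 1, name: 'Ada' });
const tag = computeEtag(body);

console.log(matchesIfNoneMatch(tag, tag));                 // true
console.log(matchesIfNoneMatch(`"other", W/${tag}`, tag)); // true
console.log(matchesIfNoneMatch('"other"', tag));           // false
console.log(matchesIfNoneMatch(undefined, tag));           // false
```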
Clustering
Node.js executes JavaScript on a single main thread, so by default one process uses only one CPU core, even if your server has 8 or 16 cores. The cluster module lets you fork multiple copies of your application, each a separate process that the OS can schedule on its own core, all sharing the same network port. It is like opening multiple checkout lanes at a grocery store: same store, more throughput.
By default the primary process accepts incoming connections and hands them to the workers in round-robin order (on every platform except Windows, where distribution is left to the operating system), so you do not need a separate load balancer for single-machine scaling.
const cluster = require('cluster');
const numCPUs = require('os').cpus().length;
if (cluster.isPrimary) {
console.log(`Primary ${process.pid} is running`);
console.log(`Forking ${numCPUs} workers...`);
// Fork workers
for (let i = 0; i < numCPUs; i++) {
cluster.fork();
}
// Handle worker death
cluster.on('exit', (worker, code, signal) => {
console.log(`Worker ${worker.process.pid} died. Restarting...`);
cluster.fork();
});
} else {
// Workers share the listening port; each incoming connection is handled by exactly one worker
const app = require('./app');
app.listen(3000, () => {
console.log(`Worker ${process.pid} started`);
});
}
PM2 Cluster Mode (Recommended)
# Start in cluster mode
pm2 start app.js -i max # Use all CPUs
pm2 start app.js -i 4 # Use 4 instances
# Zero-downtime reload
pm2 reload app
# Monitoring
pm2 monit
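Clustering spreads requests across processes, but a single CPU-heavy request still blocks the event loop of the worker that handles it. The bottleneck table lists worker threads for that case. A minimal sketch using `node:worker_threads` follows; the recursive Fibonacci is a hypothetical stand-in for CPU-bound work, and the worker source is passed as a string only so the example fits in one file.

```javascript
const { Worker } = require('node:worker_threads');

// Worker source as a string (eval mode) to keep the sketch self-contained.
const workerSource = `
  const { parentPort, workerData } = require('node:worker_threads');
  // Deliberately slow recursive Fibonacci as stand-in CPU-bound work
  function fib(n) { return n < 2 ? n : fib(n - 1) + fib(n - 2); }
  parentPort.postMessage(fib(workerData));
`;

// Runs fib(n) on a separate thread and resolves with the result.
function fibInWorker(n) {
  return new Promise((resolve, reject) => {
    const worker = new Worker(workerSource, { eval: true, workerData: n });
    worker.once('message', resolve);
    worker.once('error', reject);
    worker.once('exit', (code) => {
      if (code !== 0) reject(new Error(`Worker exited with code ${code}`));
    });
  });
}

const result = fibInWorker(25);
result.then((value) => console.log('fib(25) =', value)); // fib(25) = 75025
```

The main thread stays free to serve other requests while the worker computes. In a real service the worker code would normally live in its own file, and workers would be pooled rather than created per request.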
Database Optimization
Connection Pooling
// PostgreSQL with pg
const { Pool } = require('pg');
const pool = new Pool({
host: process.env.DB_HOST,
database: process.env.DB_NAME,
user: process.env.DB_USER,
password: process.env.DB_PASSWORD,
max: 20, // Max connections
idleTimeoutMillis: 30000,
connectionTimeoutMillis: 2000,
});
// Mongoose connection pooling
mongoose.connect(process.env.MONGO_URI, {
maxPoolSize: 10, // Max connections
minPoolSize: 2, // Min connections
serverSelectionTimeoutMS: 5000,
socketTimeoutMS: 45000,
});
Query Optimization
The most common database performance issue in web applications is the N+1 query problem. It happens when you fetch N items from one table, then make a separate query for each item to fetch related data — resulting in N+1 total queries where a single query would suffice. With 100 users, that is 101 database round-trips instead of 1 or 2.
// ❌ N+1 query problem -- 1 query for users + N queries for posts
const users = await User.find();
for (const user of users) {
const posts = await Post.find({ userId: user._id }); // Fires once PER user!
}
// ✅ Use population
const users = await User.find().populate('posts');
// ✅ Or use aggregation
const usersWithPosts = await User.aggregate([
{
$lookup: {
from: 'posts',
localField: '_id',
foreignField: 'userId',
as: 'posts'
}
}
]);
// ✅ Pagination
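const page = Math.max(parseInt(req.query.page, 10) || 1, 1); // Never below page 1
const limit = Math.min(parseInt(req.query.limit, 10) || 10, 100); // Cap page size so clients cannot request unbounded results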
const users = await User.find()
.skip((page - 1) * limit)
.limit(limit)
.lean(); // Return plain objects (faster)
// ✅ Select only needed fields
const users = await User.find()
.select('name email')
.lean();
// ✅ Use indexes
// In schema definition
UserSchema.index({ email: 1 }, { unique: true });
UserSchema.index({ createdAt: -1 });
UserSchema.index({ firstName: 1, lastName: 1 }); // Compound index
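When neither populate nor `$lookup` fits, the N+1 pattern can still be avoided with two queries: one for the users, one for all of their posts, joined in memory. The sketch below shows the in-memory join only; plain arrays stand in for the query results, and `groupBy` is a hypothetical helper.

```javascript
// Group related rows by their foreign key in a single pass.
function groupBy(items, keyFn) {
  const groups = new Map();
  for (const item of items) {
    const key = keyFn(item);
    if (!groups.has(key)) groups.set(key, []);
    groups.get(key).push(item);
  }
  return groups;
}

// With Mongoose the two queries would look roughly like:
//   const users = await User.find().lean();
//   const posts = await Post.find({ userId: { $in: users.map((u) => u._id) } }).lean();
// Plain arrays stand in for those results here.
const users = [
  { _id: 'u1', name: 'Ada' },
  { _id: 'u2', name: 'Linus' },
];
const posts = [
  { _id: 'p1', userId: 'u1', title: 'Hello' },
  { _id: 'p2', userId: 'u1', title: 'Again' },
  { _id: 'p3', userId: 'u2', title: 'Kernel notes' },
];

// String() makes ObjectId-like keys comparable by value rather than by reference.
const postsByUser = groupBy(posts, (post) => String(post.userId));
const usersWithPosts = users.map((user) => ({
  ...user,
  posts: postsByUser.get(String(user._id)) ?? [],
}));

console.log(usersWithPosts.map((u) => `${u.name}: ${u.posts.length}`));
// [ 'Ada: 2', 'Linus: 1' ]
```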
Response Compression
Compression typically reduces the size of HTTP responses by 60-80% for text-based content (JSON, HTML, CSS, JS). The server compresses the response body before sending it, and the browser decompresses it automatically. For a 500KB JSON response, compression can reduce the transfer to roughly 100KB, a significant difference for users on slow connections and for your bandwidth bill.
The trade-off is CPU time on the server. For most API servers the overhead is negligible, but for very high-throughput services you may want to offload compression to a reverse proxy (Nginx, Cloudflare) instead.
const compression = require('compression');
app.use(compression({
level: 6, // Compression level (1-9): 1 = fastest, 9 = smallest, 6 = good balance
threshold: 1024, // Only compress responses larger than 1KB (small responses are not worth it)
filter: (req, res) => {
// Allow clients to opt out (useful for debugging or streaming)
if (req.headers['x-no-compression']) {
return false;
}
return compression.filter(req, res);
}
}));
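The actual savings depend on the payload, so it is worth measuring on data shaped like your own. The sketch below uses `node:zlib` directly (the `compression` middleware relies on the same algorithms); the payload is a hypothetical list response made up for illustration.

```javascript
const zlib = require('node:zlib');

// Hypothetical payload: 2,000 similar records, typical of a list endpoint.
const payload = JSON.stringify(
  Array.from({ length: 2000 }, (_, i) => ({
    id: i,
    name: `user-${i}`,
    email: `user-${i}@example.com`,
    active: i % 2 === 0,
  }))
);

const original = Buffer.byteLength(payload);
const compressed = zlib.gzipSync(payload, { level: 6 }).length;
const saved = 1 - compressed / original;

console.log(`original: ${original} bytes, gzip: ${compressed} bytes`);
console.log(`saved: ${(saved * 100).toFixed(1)}%`);

// Decompressing returns the exact original bytes.
const roundTrip = zlib.gunzipSync(zlib.gzipSync(payload)).toString();
console.log(roundTrip === payload); // true
```

Repetitive JSON like this compresses very well. Already-compressed content such as images or archives gains little, which is one reason the default filter skips those content types.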
Async Optimization
One of the most common performance mistakes in Node.js is making async calls sequentially when they could run in parallel. If getUser takes 100ms, getPosts takes 150ms, and getComments takes 80ms, running them sequentially costs 330ms. Running them in parallel costs only 150ms (the slowest one). That cuts latency by about 55% (a 2.2x speedup) for changing three lines of code.
The rule of thumb: if two async operations do not depend on each other’s results, run them in parallel with Promise.all().
// ❌ Sequential (slow) -- each await pauses until the previous one finishes.
// Total time: 100ms + 150ms + 80ms = 330ms
const user = await getUser(id);
const posts = await getPosts(id);
const comments = await getComments(id);
// ✅ Parallel (fast) -- all three requests fire simultaneously.
// Total time: max(100ms, 150ms, 80ms) = 150ms
const [user, posts, comments] = await Promise.all([
getUser(id),
getPosts(id),
getComments(id)
]);
// ✅ With error handling
const results = await Promise.allSettled([
getUser(id),
getPosts(id),
getComments(id)
]);
const data = results.map((result, index) => {
if (result.status === 'fulfilled') {
return result.value;
}
console.error(`Request ${index} failed:`, result.reason);
return null;
});
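The timing claim is easy to check. In the sketch below, `getUser`, `getPosts`, and `getComments` are hypothetical stand-ins that resolve after the delays used in the text, and `timed` is a helper written for this example.

```javascript
const { performance } = require('node:perf_hooks');

// Hypothetical stand-ins: each resolves with a label after a fixed delay.
const delay = (ms, value) => new Promise((resolve) => setTimeout(resolve, ms, value));
const getUser = () => delay(100, 'user');
const getPosts = () => delay(150, 'posts');
const getComments = () => delay(80, 'comments');

// Runs fn and reports its result together with the elapsed wall-clock time.
async function timed(fn) {
  const start = performance.now();
  const value = await fn();
  return { value, ms: performance.now() - start };
}

const comparison = (async () => {
  // Each await finishes before the next call starts.
  const sequential = await timed(async () => [
    await getUser(),
    await getPosts(),
    await getComments(),
  ]);
  // All three calls start immediately; Promise.all waits for the slowest.
  const parallel = await timed(() => Promise.all([getUser(), getPosts(), getComments()]));
  return { sequential, parallel };
})();

comparison.then(({ sequential, parallel }) => {
  console.log(`sequential: ${Math.round(sequential.ms)} ms`);
  console.log(`parallel:   ${Math.round(parallel.ms)} ms`);
});
```

Exact numbers vary with timer precision, but the sequential run should land near the sum of the delays and the parallel run near the largest one.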
Memory Management
Memory leaks are the silent killer of Node.js applications. They do not crash your app immediately — they slowly consume more and more memory until the process runs out, triggers garbage collection pauses that freeze all requests, and eventually crashes with an out-of-memory error (usually at 3 AM on a Saturday). The most common cause is unbounded data structures that grow forever.
// ❌ Memory leak - unbounded array that grows with every request
// After 10,000 requests with 1MB payloads, this consumes 10GB of RAM
const cache = [];
app.get('/data', (req, res) => {
const data = fetchLargeData();
cache.push(data); // Never cleared -- grows forever until OOM crash!
res.json(data);
});
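// ✅ Use LRU cache with size limit
// Recent lru-cache releases export the class by name; older v7 releases used `const LRU = require('lru-cache')`
const { LRUCache } = require('lru-cache');
const cache = new LRUCache({
max: 100, // Max items
maxSize: 50 * 1024 * 1024, // 50MB
// Approximate size in bytes of the serialized value (string .length counts UTF-16 units, not bytes)
sizeCalculation: (value) => Buffer.byteLength(JSON.stringify(value)),
ttl: 1000 * 60 * 5 // 5 minutes
});
// ✅ Stream large data instead of loading into memory
const fs = require('fs');
const { pipeline } = require('stream');
app.get('/download', (req, res) => {
// pipeline() destroys both streams on error, unlike pipe(), which can leak the file handle
pipeline(fs.createReadStream('large-file.csv'), res, (err) => {
if (err) console.error('Download failed:', err);
});
});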
// ✅ Process large datasets in chunks
async function processLargeDataset() {
const cursor = Model.find().cursor();
for await (const doc of cursor) {
await processDocument(doc);
}
}
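To see why a bounded cache cannot leak the way the unbounded array does, it helps to look at how LRU eviction works. The sketch below is a teaching example, not a replacement for `lru-cache`: `TinyLRU` is a hypothetical class that limits entries by count only, with no TTL or size accounting.

```javascript
// Minimal LRU cache. A Map iterates in insertion order, so the first key is
// the least recently used one as long as every read re-inserts its key.
class TinyLRU {
  constructor(maxEntries) {
    this.maxEntries = maxEntries;
    this.map = new Map();
  }

  get(key) {
    if (!this.map.has(key)) return undefined;
    const value = this.map.get(key);
    this.map.delete(key); // move the key to the "most recent" end
    this.map.set(key, value);
    return value;
  }

  set(key, value) {
    this.map.delete(key);
    this.map.set(key, value);
    if (this.map.size > this.maxEntries) {
      const oldest = this.map.keys().next().value;
      this.map.delete(oldest); // evict the least recently used entry
    }
  }

  get size() {
    return this.map.size;
  }
}

const lru = new TinyLRU(2);
lru.set('a', 1);
lru.set('b', 2);
lru.get('a');    // touch 'a', so 'b' becomes the oldest entry
lru.set('c', 3); // over capacity: evicts 'b'

console.log(lru.get('a'), lru.get('b'), lru.get('c')); // 1 undefined 3
console.log(lru.size); // 2
```

However many times `set` is called, the cache never holds more than `maxEntries` items, so memory use stays bounded.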
Summary
- Measure first - Use profiling tools before optimizing
- Cache aggressively - Redis for distributed, in-memory for local
- Use connection pooling - Don’t create new connections per request
- Optimize queries - Avoid N+1, use indexes, select only needed fields
- Enable compression - Reduce response sizes
- Run parallel operations - Use Promise.all when possible
- Cluster for CPU scaling - Use PM2 in cluster mode
- Stream large data - Don’t load everything into memory
- Monitor continuously - Track performance metrics in production