
Express Horizontal Scaling

Introduction

As your web application grows in popularity, a single server instance may no longer be able to handle all incoming traffic effectively. This is where horizontal scaling comes into play: a strategy that adds more server instances to distribute the load, rather than upgrading a single server (vertical scaling).

In this guide, we'll explore how to horizontally scale Express.js applications to improve performance, increase reliability, and handle growing user demands efficiently.

What is Horizontal Scaling?

Horizontal scaling (or "scaling out") refers to adding more machines to your resource pool instead of upgrading existing ones. Think of it as adding more workers to a factory line rather than making one worker faster.

Horizontal vs. Vertical Scaling

Before diving deeper, let's understand the difference:

| Horizontal Scaling | Vertical Scaling |
| --- | --- |
| Adding more machines | Upgrading the existing machine |
| Virtually unlimited scaling potential | Limited by hardware maximums |
| Better fault tolerance | Single point of failure |
| Requires load balancing | Simpler to implement |
| Can be more cost-effective long-term | Often more expensive at scale |

Prerequisites for Horizontal Scaling

To implement horizontal scaling for your Express application, make sure you have:

  1. A stateless application design
  2. A load balancer
  3. A strategy for session management
  4. Database scaling considerations

Implementing Horizontal Scaling with Express

Step 1: Ensure Your Application is Stateless

For effective horizontal scaling, your Express application should be stateless, meaning no important data is stored in memory on a specific server instance.

❌ Problematic stateful code:

```javascript
// DON'T DO THIS in a horizontally scaled environment
const users = {}; // In-memory user store

app.post('/login', (req, res) => {
  // Store user in memory
  users[req.body.userId] = {
    name: req.body.name,
    loggedIn: true
  };
  res.send('Logged in');
});

app.get('/user/:id', (req, res) => {
  // This will only work if the request hits the same server
  const user = users[req.params.id];
  res.json(user || { error: 'User not found' });
});
```

✅ Better stateless approach:

```javascript
// Use a shared data store like Redis or a database
const redis = require('redis');
const client = redis.createClient({ url: process.env.REDIS_URL });
client.connect().catch(console.error);

app.post('/login', async (req, res) => {
  // Store user in Redis
  await client.set(`user:${req.body.userId}`, JSON.stringify({
    name: req.body.name,
    loggedIn: true
  }));
  res.send('Logged in');
});

app.get('/user/:id', async (req, res) => {
  // This works regardless of which server handles the request
  const userData = await client.get(`user:${req.params.id}`);
  if (!userData) {
    return res.json({ error: 'User not found' });
  }
  res.json(JSON.parse(userData));
});
```

Step 2: Set Up Session Management

If your application uses sessions, you can't rely on in-memory session storage. Instead, use a shared session store:

```javascript
const express = require('express');
const session = require('express-session');
const RedisStore = require('connect-redis').default;
const redis = require('redis');

const app = express();
const redisClient = redis.createClient({
  url: process.env.REDIS_URL
});
redisClient.connect().catch(console.error);

// Initialize RedisStore with the client
const redisStore = new RedisStore({
  client: redisClient
});

// Set up session middleware with Redis store
app.use(session({
  store: redisStore,
  secret: 'your-secret-key',
  resave: false,
  saveUninitialized: false,
  cookie: { secure: process.env.NODE_ENV === 'production' }
}));
```

Step 3: Set Up Clustering with PM2

Node.js is single-threaded, but you can use clustering to leverage multi-core systems. PM2 is a process manager that simplifies this:

First, install PM2:

```bash
npm install pm2 -g
```

Create an ecosystem.config.js file:

```javascript
module.exports = {
  apps: [{
    name: "express-app",
    script: "app.js",
    instances: "max", // Use maximum available CPU cores
    exec_mode: "cluster",
    env: {
      NODE_ENV: "development",
    },
    env_production: {
      NODE_ENV: "production",
    }
  }]
};
```

Start your application with PM2:

```bash
pm2 start ecosystem.config.js --env production
```

Step 4: Implement a Load Balancer

For load balancing, you can use a managed cloud service or run your own reverse proxy:

  • AWS Elastic Load Balancer
  • Google Cloud Load Balancing
  • Nginx (self-hosted)

For a simple Nginx load balancer configuration:

```nginx
http {
    upstream express_app {
        server 127.0.0.1:3000;
        server 127.0.0.1:3001;
        server 127.0.0.1:3002;
        # Add more servers as needed
    }

    server {
        listen 80;
        location / {
            proxy_pass http://express_app;
            proxy_http_version 1.1;
            proxy_set_header Upgrade $http_upgrade;
            proxy_set_header Connection 'upgrade';
            proxy_set_header Host $host;
            proxy_cache_bypass $http_upgrade;
        }
    }
}
```
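The upstream block above distributes requests round-robin by default. The strategy itself is simple enough to sketch in a few lines of plain JavaScript:

```javascript
// Round-robin backend selection, the default strategy for Nginx upstreams.
// The addresses mirror the upstream block above.
const backends = ['127.0.0.1:3000', '127.0.0.1:3001', '127.0.0.1:3002'];
let next = 0;

function pickBackend() {
  const backend = backends[next];
  next = (next + 1) % backends.length; // cycle through the pool
  return backend;
}
```

Each call returns the next instance in the pool, wrapping around after the last one; real load balancers layer health checks and weighting on top of this basic rotation.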

Real-World Example: Scaling a User Authentication Service

Let's build a simplified authentication service that can be horizontally scaled:

```javascript
const express = require('express');
const redis = require('redis');
const { v4: uuidv4 } = require('uuid');
const bcrypt = require('bcrypt');

// Initialize Express
const app = express();
app.use(express.json());

// Initialize Redis
const redisClient = redis.createClient({
  url: process.env.REDIS_URL || 'redis://localhost:6379'
});
redisClient.connect().catch(console.error);

// Routes
app.post('/register', async (req, res) => {
  try {
    const { username, password } = req.body;

    // Check if user exists
    const existingUser = await redisClient.get(`user:${username}`);
    if (existingUser) {
      return res.status(409).json({ error: 'Username already exists' });
    }

    // Hash password and store user
    const hashedPassword = await bcrypt.hash(password, 10);
    await redisClient.set(`user:${username}`, JSON.stringify({
      username,
      password: hashedPassword
    }));

    res.status(201).json({ message: 'User created successfully' });
  } catch (error) {
    res.status(500).json({ error: 'Internal server error' });
  }
});

app.post('/login', async (req, res) => {
  try {
    const { username, password } = req.body;

    // Get user
    const userData = await redisClient.get(`user:${username}`);
    if (!userData) {
      return res.status(401).json({ error: 'Invalid credentials' });
    }

    const user = JSON.parse(userData);

    // Verify password
    const passwordMatch = await bcrypt.compare(password, user.password);
    if (!passwordMatch) {
      return res.status(401).json({ error: 'Invalid credentials' });
    }

    // Create session
    const sessionId = uuidv4();
    await redisClient.set(`session:${sessionId}`, username, {
      EX: 86400 // Expire in 24 hours
    });

    res.status(200).json({ token: sessionId });
  } catch (error) {
    res.status(500).json({ error: 'Internal server error' });
  }
});

app.get('/me', async (req, res) => {
  try {
    const token = req.headers.authorization?.split(' ')[1];
    if (!token) {
      return res.status(401).json({ error: 'Unauthorized' });
    }

    const username = await redisClient.get(`session:${token}`);
    if (!username) {
      return res.status(401).json({ error: 'Unauthorized' });
    }

    const userData = await redisClient.get(`user:${username}`);
    const user = JSON.parse(userData);

    // Don't return the password
    delete user.password;

    res.status(200).json(user);
  } catch (error) {
    res.status(500).json({ error: 'Internal server error' });
  }
});

// Start server
const PORT = process.env.PORT || 3000;
app.listen(PORT, () => {
  console.log(`Server running on port ${PORT}`);
  console.log(`Worker PID: ${process.pid}`);
});
```

This example:

  1. Uses Redis to store user data and sessions
  2. Implements basic authentication endpoints
  3. Provides proper error handling
  4. Doesn't rely on in-memory storage
  5. Is completely stateless and can be scaled horizontally

Best Practices for Scaling Express Applications

  1. Use a process manager like PM2 to manage and monitor your application instances.

  2. Implement health checks to ensure load balancers don't route traffic to unhealthy instances:

```javascript
app.get('/health', (req, res) => {
  res.status(200).send('OK');
});
```
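Load balancers only need the 200 status, but returning a little instance metadata makes it easy to see which process answered. A hypothetical extension (`healthPayload` is just a name for this sketch, not a standard Express helper):

```javascript
// Build a health payload that identifies the responding instance
function healthPayload() {
  return {
    status: 'ok',
    pid: process.pid,                           // which worker answered
    uptimeSeconds: Math.floor(process.uptime()) // how long it has been up
  };
}

// app.get('/health', (req, res) => res.status(200).json(healthPayload()));
```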
  3. Use a shared cache like Redis for frequently accessed data:

```javascript
app.get('/products/:id', async (req, res) => {
  const productId = req.params.id;

  // Try to get product from cache
  const cachedProduct = await redisClient.get(`product:${productId}`);
  if (cachedProduct) {
    return res.json(JSON.parse(cachedProduct));
  }

  // If not in cache, fetch from database
  const product = await db.products.findById(productId);

  // Store in cache for 10 minutes
  await redisClient.set(`product:${productId}`, JSON.stringify(product), {
    EX: 600
  });

  res.json(product);
});
```
  4. Implement rate limiting to prevent abuse. Note that `express-rate-limit`'s default store is in-memory and counts per instance; pairing it with the `rate-limit-redis` store makes the limit apply across all instances:

```javascript
const rateLimit = require('express-rate-limit');
const { RedisStore } = require('rate-limit-redis'); // shared store (v4+ named export)

const limiter = rateLimit({
  windowMs: 15 * 60 * 1000, // 15 minutes
  max: 100, // limit each IP to 100 requests per windowMs
  standardHeaders: true,
  store: new RedisStore({
    // Forward commands to the node-redis client created earlier
    sendCommand: (...args) => redisClient.sendCommand(args),
    prefix: 'rate-limit:'
  })
});

app.use(limiter);
```
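To see why the shared store matters, here is a minimal fixed-window counter of the kind a rate limiter keeps per client (a simplified sketch, not the actual express-rate-limit internals). Because it lives in process memory, each instance would count separately; moving this state into Redis is what makes the limit global:

```javascript
// Fixed-window rate counter: allow up to `max` hits per `windowMs` per key.
function createCounter(windowMs, max) {
  const hits = new Map(); // key (e.g. client IP) -> { count, windowStart }
  return function allow(key, now = Date.now()) {
    const entry = hits.get(key);
    if (!entry || now - entry.windowStart >= windowMs) {
      hits.set(key, { count: 1, windowStart: now }); // start a fresh window
      return true;
    }
    entry.count += 1;
    return entry.count <= max; // reject once the window's budget is spent
  };
}
```

With `createCounter(60000, 2)`, the third request from the same key inside a minute is rejected, and a request in the next window is allowed again.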
  5. Optimize database queries and consider database scaling strategies like replication or sharding.

  6. Implement a robust logging system that centralizes logs from all instances.

Common Challenges and Solutions

Challenge: Inconsistent File Uploads

If your application allows file uploads, files saved to local disk won't be available across instances.

Solution: Use cloud storage like AWS S3, Google Cloud Storage, or Azure Blob Storage.

```javascript
const { S3Client, PutObjectCommand } = require('@aws-sdk/client-s3');
const multer = require('multer');
const upload = multer({ storage: multer.memoryStorage() });

const s3Client = new S3Client({
  region: process.env.AWS_REGION
});

app.post('/upload', upload.single('file'), async (req, res) => {
  try {
    const file = req.file;
    const key = `uploads/${Date.now()}-${file.originalname}`;

    const command = new PutObjectCommand({
      Bucket: process.env.S3_BUCKET,
      Key: key,
      Body: file.buffer,
      ContentType: file.mimetype
    });

    await s3Client.send(command);

    const fileUrl = `https://${process.env.S3_BUCKET}.s3.${process.env.AWS_REGION}.amazonaws.com/${key}`;
    res.json({ url: fileUrl });
  } catch (error) {
    res.status(500).json({ error: 'Upload failed' });
  }
});
```

Challenge: WebSocket Connections

WebSocket connections are stateful and long-lived, which makes them challenging in a horizontally scaled environment.

Solution: Use a library like Socket.IO with its Redis adapter, so events reach clients connected to any instance:

```javascript
const express = require('express');
const { createServer } = require('http');
const { Server } = require('socket.io');
const { createAdapter } = require('@socket.io/redis-adapter');
const redis = require('redis');

const app = express();
const httpServer = createServer(app);
const io = new Server(httpServer);

// Set up Redis publisher and subscriber
const pubClient = redis.createClient({ url: process.env.REDIS_URL });
const subClient = pubClient.duplicate();

Promise.all([
  pubClient.connect(),
  subClient.connect()
]).then(() => {
  io.adapter(createAdapter(pubClient, subClient));

  io.on('connection', (socket) => {
    console.log('Client connected', socket.id);

    socket.on('message', (data) => {
      // Broadcast to all clients across all server instances
      io.emit('message', data);
    });
  });

  httpServer.listen(3000);
});
```
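If your WebSocket handshake involves multiple HTTP requests (for example, Socket.IO's HTTP long-polling fallback), sticky sessions at the load balancer are a common complement to the Redis adapter: they pin each client to one instance so the handshake doesn't bounce between servers. In Nginx this can be done with `ip_hash` in the upstream block:

```nginx
upstream express_app {
    ip_hash;  # route each client IP to the same backend instance
    server 127.0.0.1:3000;
    server 127.0.0.1:3001;
    server 127.0.0.1:3002;
}
```

The trade-off is less even load distribution, since all requests from one client (or one NAT gateway) land on the same instance.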

Summary

Horizontally scaling Express applications involves:

  1. Making your application stateless
  2. Using shared stores for sessions and data
  3. Implementing proper load balancing
  4. Using process managers like PM2
  5. Optimizing database access and file handling

When properly implemented, horizontal scaling provides:

  • Higher availability and fault tolerance
  • Improved performance under heavy load
  • Flexible scaling based on demand
  • Better resource utilization

Exercises

  1. Convert a simple Express application that uses in-memory sessions to use Redis sessions.
  2. Set up a local environment with PM2 running multiple instances of your Express app.
  3. Implement a basic load test to verify your application scales horizontally.
  4. Create a Docker Compose setup with multiple Express containers and an NGINX load balancer.
  5. Add centralized logging to your horizontally scaled application.

