Express File Streams

When working with large files in Express applications, streams provide significant performance and memory benefits. Instead of loading an entire file into memory, a stream processes the data in chunks, which keeps memory usage low and makes your application more scalable.

What are File Streams?

In Node.js and Express, streams are objects that let you read data from a source or write data to a destination continuously. Think of them as channels where data flows piece by piece, rather than being loaded all at once.

For file operations, streams are particularly valuable when:

  • Processing large files that would consume too much memory if loaded entirely
  • Building real-time applications where data needs to be processed as it arrives
  • Creating efficient APIs that transfer files between clients and servers
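
To make the "piece by piece" idea concrete, here is a small standalone sketch (separate from the Express app we build below) that reads a file chunk by chunk, so only one chunk is in memory at any time. The file name big-file.txt is just a placeholder:

javascript
const fs = require('fs');

// Read a (hypothetical) large file chunk by chunk instead of all at once
const stream = fs.createReadStream('big-file.txt', { highWaterMark: 64 * 1024 }); // 64 KB chunks

let bytes = 0;

stream.on('data', (chunk) => {
  bytes += chunk.length; // only this chunk is held in memory right now
  console.log(`Received ${chunk.length} bytes (total so far: ${bytes})`);
});

stream.on('end', () => console.log('Done reading'));
stream.on('error', (err) => console.error('Read failed:', err));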

Types of Streams in Node.js

Before diving into Express implementations, let's understand the four fundamental types of streams:

  1. Readable - Sources from which data can be consumed (e.g., reading a file)
  2. Writable - Destinations to which data can be written (e.g., writing to a file)
  3. Duplex - Both readable and writable (e.g., network sockets)
  4. Transform - Duplex streams that modify data as it passes through (e.g., compression); the sketch after this list shows a Readable, a Transform, and a Writable working together
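
A compact way to see three of these types in one place is a gzip copy: fs.createReadStream() gives a Readable, zlib.createGzip() is a Transform, and fs.createWriteStream() is a Writable (a Duplex, such as a TCP socket from the net module, can sit anywhere both reading and writing are needed). A minimal sketch with placeholder file names:

javascript
const fs = require('fs');
const zlib = require('zlib');

// Readable (file) -> Transform (gzip) -> Writable (file)
fs.createReadStream('input.txt')                 // Readable: source of data
  .pipe(zlib.createGzip())                       // Transform: compresses each chunk as it passes through
  .pipe(fs.createWriteStream('input.txt.gz'))    // Writable: destination
  .on('finish', () => console.log('Compression finished'));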

Setting Up Express for File Streaming

First, let's create a basic Express application that we'll use to demonstrate file streaming:

javascript
const express = require('express');
const fs = require('fs');
const path = require('path');

const app = express();
const port = 3000;

// Basic middleware
app.use(express.json());

app.listen(port, () => {
  console.log(`Server running on port ${port}`);
});

Streaming Files for Download

One of the most common use cases for streams in Express is sending files to clients. Instead of loading the entire file into memory, we can stream it directly:

javascript
// Streaming a file download
app.get('/download/:filename', (req, res) => {
  const filename = req.params.filename;
  const filePath = path.join(__dirname, 'files', filename);

  // Check if file exists
  fs.access(filePath, fs.constants.F_OK, (err) => {
    if (err) {
      return res.status(404).send('File not found');
    }

    // Get file stats (including size)
    fs.stat(filePath, (err, stats) => {
      if (err) {
        return res.status(500).send('Error accessing file');
      }

      // Set appropriate headers
      res.setHeader('Content-Length', stats.size);
      res.setHeader('Content-Type', 'application/octet-stream');
      res.setHeader('Content-Disposition', `attachment; filename="${filename}"`);

      // Create read stream and pipe to response
      const fileStream = fs.createReadStream(filePath);
      fileStream.pipe(res);

      // Handle stream errors
      fileStream.on('error', (error) => {
        console.error('Stream error:', error);
        res.status(500).end('File stream error');
      });
    });
  });
});

How This Works:

  1. We create a route that accepts a filename parameter
  2. We check if the file exists using fs.access()
  3. We get the file's information using fs.stat()
  4. We set appropriate HTTP headers for the download
  5. We create a readable stream from the file using fs.createReadStream()
  6. We pipe that stream directly to the response object
  7. We add error handling to manage any streaming issues

This approach never loads the entire file into memory, making it efficient even for very large files.

Uploading Files with Streams

For file uploads, we can take a streaming approach using a library such as multer with disk storage (which writes the upload to disk as it arrives) or by handling the multipart stream ourselves:

javascript
const multer = require('multer');

const storage = multer.diskStorage({
  destination: (req, file, cb) => {
    cb(null, path.join(__dirname, 'uploads'));
  },
  filename: (req, file, cb) => {
    cb(null, Date.now() + '-' + file.originalname);
  }
});

const upload = multer({ storage });

app.post('/upload', upload.single('file'), (req, res) => {
  if (!req.file) {
    return res.status(400).send('No file uploaded');
  }

  res.send({
    message: 'File uploaded successfully',
    filename: req.file.filename,
    size: req.file.size
  });
});
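
To exercise this endpoint, a client just needs to send a multipart/form-data request whose field name matches upload.single('file'). Below is a hypothetical client sketch using the fetch, FormData, and Blob globals available in Node 18+; the file is read into memory here purely for brevity (it is the server side that streams to disk), and the file name is a placeholder:

javascript
const { readFile } = require('fs/promises');

// Hypothetical client for the /upload route above (Node 18+)
async function uploadExample(filePath) {
  const form = new FormData();
  // The field name 'file' must match upload.single('file') on the server
  form.append('file', new Blob([await readFile(filePath)]), 'example.txt');

  const response = await fetch('http://localhost:3000/upload', {
    method: 'POST',
    body: form
  });

  console.log(await response.json()); // { message, filename, size }
}

uploadExample('./example.txt').catch(console.error);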

Manual File Upload with Streams

For more control, you might want to handle the streaming manually:

javascript
const busboy = require('busboy');

app.post('/upload-stream', (req, res) => {
  // Create busboy instance with request headers
  const bb = busboy({ headers: req.headers });

  // Handle file streams (busboy v1+ passes an info object with filename, encoding, mimeType)
  bb.on('file', (fieldname, fileStream, info) => {
    const { filename } = info;
    console.log(`Processing upload: ${filename}`);

    // Create write stream (basename() guards against path traversal in the client-supplied name)
    const saveTo = path.join(__dirname, 'uploads', path.basename(filename));
    const writeStream = fs.createWriteStream(saveTo);

    // Pipe file data to write stream
    fileStream.pipe(writeStream);

    // Handle completion of the incoming file stream
    fileStream.on('end', () => {
      console.log(`Upload of ${filename} completed`);
    });

    // Handle write stream completion
    writeStream.on('close', () => {
      console.log(`File saved: ${saveTo}`);
    });
  });

  // Handle form field data
  bb.on('field', (fieldname, val) => {
    console.log(`Field [${fieldname}]: value: ${val}`);
  });

  // Handle upload completion ('close' fires once busboy has parsed the whole request)
  bb.on('close', () => {
    res.send('Upload processed successfully');
  });

  // Pipe request to busboy for processing
  req.pipe(bb);
});

Video Streaming Example

A classic use case for streams is video streaming. Here's how to implement a simple video stream endpoint:

javascript
app.get('/stream/video/:filename', (req, res) => {
  const filename = req.params.filename;
  const videoPath = path.join(__dirname, 'videos', filename);

  // Check if file exists
  fs.access(videoPath, fs.constants.F_OK, (err) => {
    if (err) {
      return res.status(404).send('Video not found');
    }

    // Get video stats
    const stat = fs.statSync(videoPath);
    const fileSize = stat.size;
    const range = req.headers.range;

    // Handle range request (partial content)
    if (range) {
      // Parse range header, e.g. "bytes=0-1023" or "bytes=32324-"
      const parts = range.replace(/bytes=/, '').split('-');
      const start = parseInt(parts[0], 10);
      const end = parts[1] ? parseInt(parts[1], 10) : fileSize - 1;

      // Reject ranges that fall outside the file
      if (Number.isNaN(start) || start >= fileSize) {
        return res.status(416).send('Requested range not satisfiable');
      }

      const chunkSize = (end - start) + 1;

      // Create read stream for the specific range
      const stream = fs.createReadStream(videoPath, { start, end });

      // Set headers for range response
      res.writeHead(206, {
        'Content-Range': `bytes ${start}-${end}/${fileSize}`,
        'Accept-Ranges': 'bytes',
        'Content-Length': chunkSize,
        'Content-Type': 'video/mp4',
      });

      // Pipe the video chunk
      stream.pipe(res);
    }
    // Handle full video request
    else {
      // Set headers for full response
      res.writeHead(200, {
        'Content-Length': fileSize,
        'Content-Type': 'video/mp4',
      });

      // Stream the full video
      fs.createReadStream(videoPath).pipe(res);
    }
  });
});

This streaming implementation supports both full video and range requests (partial content), which is essential for video players that allow seeking to different positions.
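
To see the range handling in action, a client can ask for a specific byte window with a Range header; video players do this automatically when the user seeks. A small sketch using Node 18+'s built-in fetch, assuming the server above is running locally and a file named sample.mp4 (a placeholder) exists in the videos directory:

javascript
// Request only the first kilobyte of the (hypothetical) sample.mp4
async function fetchFirstChunk() {
  const response = await fetch('http://localhost:3000/stream/video/sample.mp4', {
    headers: { Range: 'bytes=0-1023' }
  });

  console.log(response.status);                       // 206 (Partial Content)
  console.log(response.headers.get('content-range')); // e.g. "bytes 0-1023/10485760"

  const chunk = Buffer.from(await response.arrayBuffer());
  console.log(`Received ${chunk.length} bytes`);       // 1024
}

fetchFirstChunk().catch(console.error);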

Transform Streams: Processing Data On-the-fly

Transform streams are powerful for modifying data as it flows. Here's an example that converts a text file to uppercase while streaming:

javascript
const { Transform } = require('stream');

// Factory for an uppercase transform stream; a fresh instance is created per request
// because a stream cannot be reused once it has ended
const createUpperCaseTransform = () => new Transform({
  transform(chunk, encoding, callback) {
    // Convert buffer chunk to string, uppercase it, then back to buffer
    const upperChunk = chunk.toString().toUpperCase();
    this.push(Buffer.from(upperChunk));
    callback();
  }
});

app.get('/uppercase/:filename', (req, res) => {
  const filename = req.params.filename;
  const filePath = path.join(__dirname, 'files', filename);

  fs.access(filePath, fs.constants.F_OK, (err) => {
    if (err) {
      return res.status(404).send('File not found');
    }

    res.setHeader('Content-Type', 'text/plain');
    res.setHeader('Content-Disposition', `attachment; filename="uppercase-${filename}"`);

    // Create the pipeline: read file -> transform to uppercase -> send response
    const readStream = fs.createReadStream(filePath);
    readStream
      .pipe(createUpperCaseTransform())
      .pipe(res);

    readStream.on('error', (error) => {
      console.error('Stream error:', error);
      res.status(500).end('File stream error');
    });
  });
});

Error Handling in Streams

Proper error handling is crucial when working with streams. Here's a more complete example showing how to handle various stream errors:

javascript
app.get('/download-safe/:filename', (req, res) => {
  const filename = req.params.filename;
  const filePath = path.join(__dirname, 'files', filename);

  const readStream = fs.createReadStream(filePath);

  // Set content headers
  res.setHeader('Content-Disposition', `attachment; filename="${filename}"`);
  res.setHeader('Content-Type', 'application/octet-stream');

  // Pipe the file stream to response
  readStream.pipe(res);

  // Handle stream errors
  readStream.on('error', (error) => {
    console.error('Stream error:', error);

    // Check if headers have been sent
    if (!res.headersSent) {
      if (error.code === 'ENOENT') {
        return res.status(404).send('File not found');
      } else {
        return res.status(500).send('Internal server error');
      }
    } else {
      // If headers were already sent, we can only terminate the response
      res.end();
    }
  });

  // Handle client disconnect
  req.on('close', () => {
    readStream.destroy(); // Clean up the stream
    console.log('Client disconnected, stream destroyed');
  });
});

Streaming Large CSV Data Processing

Here's a practical example of streaming a large CSV file for processing:

javascript
const csv = require('csv-parser');

app.get('/process-csv/:filename', (req, res) => {
  const filename = req.params.filename;
  const filePath = path.join(__dirname, 'data', filename);

  const results = [];
  let rowCount = 0;

  res.setHeader('Content-Type', 'application/json');

  // Create readable stream
  const readStream = fs.createReadStream(filePath)
    .on('error', (error) => {
      if (error.code === 'ENOENT') {
        return res.status(404).send({ error: 'CSV file not found' });
      }
      return res.status(500).send({ error: 'Error reading CSV file' });
    });

  // Process the CSV stream
  readStream
    .pipe(csv())
    .on('data', (data) => {
      rowCount++;
      // For very large files, we might just want to count or process
      // without storing everything in memory
      if (results.length < 100) { // Store only the first 100 rows for a preview
        results.push(data);
      }
    })
    .on('end', () => {
      res.send({
        totalRows: rowCount,
        preview: results
      });
    })
    .on('error', (error) => {
      console.error('CSV parsing error:', error);
      res.status(500).send({ error: 'Error parsing CSV data' });
    });
});

Performance Considerations

When implementing file streams in Express, remember these best practices:

  1. Stream backpressure: Ensure your streams handle backpressure properly (when the destination can't process data as fast as the source produces it)
  2. Memory usage: Monitor memory usage during stream operations
  3. Stream cleanup: Always destroy streams when errors occur or connections close; the stream.pipeline() sketch after this list handles this automatically
  4. Chunk size: For custom stream implementations, choose appropriate chunk sizes for your use case
  5. Error handling: Implement comprehensive error handling for all stream events
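
Several of these points can be addressed at once with Node's built-in stream.pipeline(), which respects backpressure, propagates errors to a single callback, and destroys every stream in the chain when any of them fails or the client disconnects. Below is a minimal sketch that reworks the earlier download route around pipeline(); the route path /download-pipeline/:filename is just an illustrative name, not one of the routes defined above:

javascript
const { pipeline } = require('stream');

// Hypothetical variant of the download route built on stream.pipeline()
app.get('/download-pipeline/:filename', (req, res) => {
  const filePath = path.join(__dirname, 'files', req.params.filename);

  // Check existence up front so we can still send a clean 404
  fs.access(filePath, fs.constants.F_OK, (err) => {
    if (err) {
      return res.status(404).send('File not found');
    }

    res.setHeader('Content-Disposition', `attachment; filename="${req.params.filename}"`);
    res.setHeader('Content-Type', 'application/octet-stream');

    // pipeline() forwards backpressure and destroys both streams
    // if either side fails or the client disconnects mid-transfer
    pipeline(fs.createReadStream(filePath), res, (err) => {
      if (err) {
        console.error('Pipeline error:', err);
      }
    });
  });
});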

Summary

File streams in Express provide an efficient way to handle large files without overwhelming your server's memory. By processing data in chunks, you can build more scalable applications capable of handling files of any size.

We've covered:

  • Basic file streaming concepts
  • Implementing file downloads with streams
  • File uploads using streams
  • Video streaming with range support
  • Transform streams for on-the-fly data processing
  • Error handling patterns for robust applications
  • Real-world examples like CSV processing

Exercises

  1. Implement a file compression endpoint that uses streams to compress files on-the-fly using the zlib module
  2. Create a streaming image resizing service using the Sharp library
  3. Build a log file analyzer that streams large log files and extracts specific patterns
  4. Implement a stream-based file encryption/decryption service

By mastering file streams in Express, you'll be able to build high-performance applications that handle data of any size while keeping memory usage low.


