.NET File Compression

Introduction

File compression is an essential technique in modern applications that helps reduce storage requirements and improve data transfer efficiency. In .NET, the framework provides robust built-in libraries for compressing and decompressing files with minimal effort.

The compression formats covered here are lossless: they encode redundant data more compactly, so the decompressed output is byte-for-byte identical to the original. This is particularly useful when:

  • Storing large amounts of data
  • Transferring files over networks
  • Creating archives of multiple files
  • Reducing application size for distribution

In this tutorial, we'll explore how to use .NET's System.IO.Compression namespace to work with compressed files in various formats like GZip, Deflate, and ZIP.

Prerequisites

Before diving into file compression, ensure you have:

  • Basic knowledge of C# and .NET
  • .NET SDK installed on your system
  • Understanding of basic file operations in .NET

Understanding Compression in .NET

The .NET Framework and .NET Core both provide the System.IO.Compression namespace, which contains classes for compressing and decompressing files and streams. The most commonly used formats are:

  1. GZip - A popular compression format based on the DEFLATE algorithm
  2. Deflate - A compression algorithm that combines LZ77 and Huffman coding
  3. ZIP - An archive format that can contain multiple compressed files

Let's explore each of these techniques.

Compressing and Decompressing with GZip

GZip is one of the most widely used compression formats, particularly for single files.

Compressing Files with GZip

csharp
using System;
using System.IO;
using System.IO.Compression;

public class GZipCompression
{
public static void CompressFile(string sourceFile, string destinationFile)
{
try
{
using (FileStream originalFileStream = File.Open(sourceFile, FileMode.Open))
{
using (FileStream compressedFileStream = File.Create(destinationFile))
{
using (GZipStream compressionStream = new GZipStream(compressedFileStream, CompressionMode.Compress))
{
originalFileStream.CopyTo(compressionStream);
Console.WriteLine($"Compressed {sourceFile} to {destinationFile}");
}
}
}
}
catch (Exception ex)
{
Console.WriteLine($"Error compressing file: {ex.Message}");
}
}
}

Decompressing GZip Files

csharp
// Add this method to the GZipCompression class shown above
public static void DecompressFile(string compressedFile, string destinationFile)
{
try
{
using (FileStream compressedFileStream = File.Open(compressedFile, FileMode.Open))
{
using (FileStream outputFileStream = File.Create(destinationFile))
{
using (GZipStream decompressionStream = new GZipStream(compressedFileStream, CompressionMode.Decompress))
{
decompressionStream.CopyTo(outputFileStream);
Console.WriteLine($"Decompressed {compressedFile} to {destinationFile}");
}
}
}
}
catch (Exception ex)
{
Console.WriteLine($"Error decompressing file: {ex.Message}");
}
}

Example Usage

csharp
// Compress a text file
GZipCompression.CompressFile("sample.txt", "sample.txt.gz");

// Decompress the file
GZipCompression.DecompressFile("sample.txt.gz", "sample_decompressed.txt");
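
The same stream pattern also works for data that never touches the disk. The following is a minimal sketch, assuming a hypothetical helper class named GZipInMemory (not part of .NET), that compresses and decompresses byte arrays through MemoryStream. Note that the GZipStream is disposed before the compressed bytes are read.

```csharp
using System;
using System.IO;
using System.IO.Compression;

// Hypothetical helper for illustration; the class and method names are assumptions.
public static class GZipInMemory
{
    // Compresses a byte array and returns the GZip-formatted result.
    public static byte[] Compress(byte[] data)
    {
        using (var output = new MemoryStream())
        {
            using (var gzip = new GZipStream(output, CompressionMode.Compress))
            {
                gzip.Write(data, 0, data.Length);
            } // Disposing the GZipStream flushes the remaining data and the footer.

            // MemoryStream.ToArray() still works after the stream has been closed.
            return output.ToArray();
        }
    }

    // Decompresses GZip-formatted bytes back into the original data.
    public static byte[] Decompress(byte[] compressed)
    {
        using (var input = new MemoryStream(compressed))
        using (var gzip = new GZipStream(input, CompressionMode.Decompress))
        using (var output = new MemoryStream())
        {
            gzip.CopyTo(output);
            return output.ToArray();
        }
    }
}
```

Calling GZipInMemory.Decompress(GZipInMemory.Compress(bytes)) should return a byte array equal to the input.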

Working with Deflate Compression

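Deflate is the algorithm underlying GZip. DeflateStream writes the raw DEFLATE data without the GZip header and CRC-32 footer, so its output is slightly smaller, but it carries no checksum and standard tools such as gzip cannot open it. It's useful when your own code sits on both ends and you want basic compression without the GZip wrapper.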

csharp
using System;
using System.IO;
using System.IO.Compression;

public class DeflateCompression
{
public static void CompressWithDeflate(string sourceFile, string destinationFile)
{
try
{
using (FileStream originalFileStream = File.Open(sourceFile, FileMode.Open))
{
using (FileStream compressedFileStream = File.Create(destinationFile))
{
using (DeflateStream deflateStream = new DeflateStream(compressedFileStream, CompressionMode.Compress))
{
originalFileStream.CopyTo(deflateStream);
Console.WriteLine($"Compressed {sourceFile} using Deflate");
}
}
}
}
catch (Exception ex)
{
Console.WriteLine($"Compression error: {ex.Message}");
}
}

public static void DecompressWithDeflate(string compressedFile, string destinationFile)
{
try
{
using (FileStream compressedFileStream = File.Open(compressedFile, FileMode.Open))
{
using (FileStream outputFileStream = File.Create(destinationFile))
{
using (DeflateStream deflateStream = new DeflateStream(compressedFileStream, CompressionMode.Decompress))
{
deflateStream.CopyTo(outputFileStream);
Console.WriteLine($"Decompressed {compressedFile} using Deflate");
}
}
}
}
catch (Exception ex)
{
Console.WriteLine($"Decompression error: {ex.Message}");
}
}
}
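
To make the difference between the two formats visible, the sketch below compresses the same data with both stream types in memory and prints the resulting sizes. The class name FormatOverhead and its methods are assumptions made for this illustration. The GZip wrapper consists of a header of at least 10 bytes and an 8-byte footer, so the GZip output should come out slightly larger.

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Text;

// Hypothetical helper for illustration; not part of .NET.
public static class FormatOverhead
{
    // Compresses data in memory with either GZipStream or DeflateStream.
    public static byte[] Compress(byte[] data, bool useGZip)
    {
        using (var output = new MemoryStream())
        {
            Stream compressor = useGZip
                ? (Stream)new GZipStream(output, CompressionLevel.Optimal)
                : new DeflateStream(output, CompressionLevel.Optimal);

            using (compressor)
            {
                compressor.Write(data, 0, data.Length);
            }

            return output.ToArray();
        }
    }

    public static void Report()
    {
        // Highly repetitive sample data compresses very well.
        byte[] data = Encoding.UTF8.GetBytes(new string('a', 10000));
        byte[] gzip = Compress(data, useGZip: true);
        byte[] deflate = Compress(data, useGZip: false);

        Console.WriteLine($"GZip: {gzip.Length} bytes, Deflate: {deflate.Length} bytes");

        // Every GZip stream starts with the magic bytes 1F 8B.
        Console.WriteLine($"GZip magic bytes: {gzip[0]:X2} {gzip[1]:X2}");
    }
}
```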

Creating and Extracting ZIP Archives

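ZIP archives are particularly useful when you need to compress multiple files together. The ZipFile helper class used below lives in the System.IO.Compression.FileSystem assembly, so .NET Framework projects need a reference to it.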

Creating a ZIP Archive

csharp
using System;
using System.IO;
using System.IO.Compression;

public class ZipArchiveHelper
{
public static void CreateZipFromDirectory(string sourceDirectory, string zipFilePath)
{
try
{
// Create a zip archive from a directory
if (File.Exists(zipFilePath))
File.Delete(zipFilePath);

ZipFile.CreateFromDirectory(sourceDirectory, zipFilePath);
Console.WriteLine($"Created ZIP archive {zipFilePath} from directory {sourceDirectory}");
}
catch (Exception ex)
{
Console.WriteLine($"Error creating ZIP: {ex.Message}");
}
}

public static void ExtractZipToDirectory(string zipFilePath, string extractPath)
{
try
{
// Extract a zip archive to a directory
// Warning: any existing contents of extractPath are deleted first
if (Directory.Exists(extractPath))
Directory.Delete(extractPath, true);

Directory.CreateDirectory(extractPath);
ZipFile.ExtractToDirectory(zipFilePath, extractPath);
Console.WriteLine($"Extracted ZIP archive {zipFilePath} to {extractPath}");
}
catch (Exception ex)
{
Console.WriteLine($"Error extracting ZIP: {ex.Message}");
}
}
}

Example Usage

csharp
// Create a ZIP archive from a directory
ZipArchiveHelper.CreateZipFromDirectory("MyDocuments", "MyDocuments.zip");

// Extract a ZIP archive to a directory
ZipArchiveHelper.ExtractZipToDirectory("MyDocuments.zip", "ExtractedDocuments");
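
If deleting the target directory before extraction is too aggressive, one alternative is to overwrite only the files that the archive contains. The sketch below assumes a runtime where the three-argument ZipFile.ExtractToDirectory overload is available (.NET Core 2.0 or later, not .NET Framework). The class name ZipExtractHelper is hypothetical.

```csharp
using System;
using System.IO;
using System.IO.Compression;

// Hypothetical helper for illustration; not part of .NET.
public static class ZipExtractHelper
{
    // Extracts an archive into extractPath, overwriting files with the same name
    // but leaving unrelated files in the directory untouched.
    public static void ExtractOverwriting(string zipFilePath, string extractPath)
    {
        Directory.CreateDirectory(extractPath);

        // Assumption: this overload requires .NET Core 2.0 or later.
        ZipFile.ExtractToDirectory(zipFilePath, extractPath, overwriteFiles: true);
    }
}
```

Running the method twice against the same target should succeed, whereas the two-argument overload throws an IOException when a file already exists.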

Working with ZIP Archives Programmatically

Sometimes you need more control over the ZIP creation process, like adding specific files or setting compression levels:

csharp
public static void CreateCustomZipArchive(string[] filesToAdd, string zipFilePath)
{
try
{
using (FileStream zipToCreate = new FileStream(zipFilePath, FileMode.Create))
{
using (ZipArchive archive = new ZipArchive(zipToCreate, ZipArchiveMode.Create))
{
foreach (string file in filesToAdd)
{
// Get the filename without the path
string fileName = Path.GetFileName(file);

// Add file to the archive
archive.CreateEntryFromFile(file, fileName);
Console.WriteLine($"Added {fileName} to ZIP archive");
}
}
}

Console.WriteLine($"Successfully created custom ZIP archive: {zipFilePath}");
}
catch (Exception ex)
{
Console.WriteLine($"Error creating custom ZIP: {ex.Message}");
}
}
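
The example above uses the default compression level for every entry. As a sketch of the per-entry control mentioned earlier, the following variant picks a level per file and also writes one entry directly from memory. The class name, the extension list, and the README.txt entry are assumptions chosen for illustration.

```csharp
using System;
using System.IO;
using System.IO.Compression;

// Hypothetical helper for illustration; not part of .NET.
public static class ZipEntryLevels
{
    public static void CreateZipWithLevels(string[] filesToAdd, string zipFilePath)
    {
        using (FileStream zipToCreate = new FileStream(zipFilePath, FileMode.Create))
        using (ZipArchive archive = new ZipArchive(zipToCreate, ZipArchiveMode.Create))
        {
            foreach (string file in filesToAdd)
            {
                string extension = Path.GetExtension(file).ToLowerInvariant();

                // Already-compressed formats rarely shrink further, so store them as-is.
                bool alreadyCompressed =
                    extension == ".jpg" || extension == ".png" || extension == ".zip";

                CompressionLevel level = alreadyCompressed
                    ? CompressionLevel.NoCompression
                    : CompressionLevel.Optimal;

                archive.CreateEntryFromFile(file, Path.GetFileName(file), level);
            }

            // Entries can also be written from memory instead of from a file.
            ZipArchiveEntry readme = archive.CreateEntry("README.txt", CompressionLevel.Fastest);
            using (StreamWriter writer = new StreamWriter(readme.Open()))
            {
                writer.WriteLine($"Archive created {DateTime.UtcNow:O}");
            }
        }
    }
}
```

Because entry names here are derived with Path.GetFileName, two input files with the same name in different folders would produce duplicate entries.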

Reading Files from ZIP Archives

csharp
public static void ReadFilesFromZip(string zipFilePath)
{
try
{
using (ZipArchive archive = ZipFile.OpenRead(zipFilePath))
{
Console.WriteLine($"Contents of ZIP archive {zipFilePath}:");
foreach (ZipArchiveEntry entry in archive.Entries)
{
Console.WriteLine($" - {entry.FullName} (Size: {entry.Length} bytes, Compressed: {entry.CompressedLength} bytes)");
}
}
}
catch (Exception ex)
{
Console.WriteLine($"Error reading ZIP: {ex.Message}");
}
}
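
Listing entries is often followed by reading one of them. The sketch below, using a hypothetical ZipEntryReader class, opens a single entry by name and returns its text without extracting the archive to disk.

```csharp
using System;
using System.IO;
using System.IO.Compression;

// Hypothetical helper for illustration; not part of .NET.
public static class ZipEntryReader
{
    // Returns the text of one entry, or null if the archive has no such entry.
    public static string ReadEntryText(string zipFilePath, string entryName)
    {
        using (ZipArchive archive = ZipFile.OpenRead(zipFilePath))
        {
            ZipArchiveEntry entry = archive.GetEntry(entryName);
            if (entry == null)
                return null;

            // entry.Open() returns a stream that decompresses the data on the fly.
            using (Stream entryStream = entry.Open())
            using (StreamReader reader = new StreamReader(entryStream))
            {
                return reader.ReadToEnd();
            }
        }
    }
}
```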

Setting Compression Levels

.NET allows you to control the level of compression, balancing between speed and efficiency:

csharp
public static void CompressWithCustomLevel(string sourceFile, string destinationFile, CompressionLevel level)
{
try
{
using (FileStream originalFileStream = File.Open(sourceFile, FileMode.Open))
{
using (FileStream compressedFileStream = File.Create(destinationFile))
{
// Available levels: Optimal, Fastest, NoCompression (plus SmallestSize on .NET 6 and later)
using (GZipStream compressionStream = new GZipStream(
compressedFileStream,
level,
false))
{
originalFileStream.CopyTo(compressionStream);
Console.WriteLine($"Compressed {sourceFile} with {level} compression level");
}
}
}
}
catch (Exception ex)
{
Console.WriteLine($"Compression error: {ex.Message}");
}
}

Example Usage

csharp
// Compress with different compression levels
CompressWithCustomLevel("largefile.dat", "optimal.gz", CompressionLevel.Optimal); // Best compression (slower)
CompressWithCustomLevel("largefile.dat", "fastest.gz", CompressionLevel.Fastest); // Fastest compression
CompressWithCustomLevel("largefile.dat", "none.gz", CompressionLevel.NoCompression); // No compression (store only)
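
The trade-off between levels depends heavily on the data, so it is worth measuring on a representative file. The following is a rough sketch, with a hypothetical CompressionLevelComparison class, that compresses the same bytes at each level in memory and prints the size and elapsed time. It reads the whole file into memory, so it is meant for small test files only.

```csharp
using System;
using System.Diagnostics;
using System.IO;
using System.IO.Compression;

// Hypothetical helper for illustration; not part of .NET.
public static class CompressionLevelComparison
{
    // Returns the GZip-compressed size of data at the given level.
    public static long CompressedSize(byte[] data, CompressionLevel level)
    {
        using (var output = new MemoryStream())
        {
            // leaveOpen: true keeps the MemoryStream readable after the GZipStream is disposed.
            using (var gzip = new GZipStream(output, level, leaveOpen: true))
            {
                gzip.Write(data, 0, data.Length);
            }

            return output.Length;
        }
    }

    public static void Compare(string sourceFile)
    {
        byte[] data = File.ReadAllBytes(sourceFile);
        var levels = new[]
        {
            CompressionLevel.Optimal,
            CompressionLevel.Fastest,
            CompressionLevel.NoCompression
        };

        foreach (CompressionLevel level in levels)
        {
            Stopwatch timer = Stopwatch.StartNew();
            long size = CompressedSize(data, level);
            timer.Stop();

            Console.WriteLine($"{level}: {size} bytes in {timer.ElapsedMilliseconds} ms");
        }
    }
}
```

With NoCompression the output is slightly larger than the input, because the data is stored uncompressed inside the GZip wrapper.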

Real-World Applications

Example 1: Log File Compression

A common scenario in applications is compressing log files to save disk space:

csharp
public static void CompressOldLogFiles(string logsDirectory, int daysThreshold)
{
try
{
// Get all .log files older than the threshold (Select/Where require using System.Linq;)
var oldLogFiles = Directory.GetFiles(logsDirectory, "*.log")
.Select(path => new FileInfo(path))
.Where(file => file.LastWriteTime < DateTime.Now.AddDays(-daysThreshold))
.ToList();

foreach (var logFile in oldLogFiles)
{
string compressedPath = $"{logFile.FullName}.gz";

// Compress the log file
using (FileStream originalLogStream = logFile.OpenRead())
{
using (FileStream compressedStream = File.Create(compressedPath))
{
using (GZipStream gzStream = new GZipStream(compressedStream, CompressionLevel.Optimal))
{
originalLogStream.CopyTo(gzStream);
}
}
}

// Delete the original log file after successful compression
if (File.Exists(compressedPath))
{
File.Delete(logFile.FullName);
Console.WriteLine($"Compressed old log file: {logFile.Name}");
}
}
}
catch (Exception ex)
{
Console.WriteLine($"Error compressing log files: {ex.Message}");
}
}

Example 2: Creating a Backup Archive

Here's how you might create a backup utility for a specific folder:

csharp
public static void CreateBackup(string sourceDirectory, string backupDestination)
{
try
{
// Create a timestamped backup file name
string timestamp = DateTime.Now.ToString("yyyy-MM-dd-HHmmss");
string backupFileName = $"Backup-{timestamp}.zip";
string backupPath = Path.Combine(backupDestination, backupFileName);

// Ensure the backup directory exists
Directory.CreateDirectory(backupDestination);

// Create the backup archive
ZipFile.CreateFromDirectory(
sourceDirectory,
backupPath,
CompressionLevel.Optimal,
includeBaseDirectory: false);

Console.WriteLine($"Backup created successfully: {backupPath}");
}
catch (Exception ex)
{
Console.WriteLine($"Backup failed: {ex.Message}");
}
}

Performance Considerations

When working with file compression in .NET, keep these performance tips in mind:

  1. Large files: Use buffered operations or Stream.CopyTo for large files to avoid loading the entire file into memory.

  2. Compression level: Choose the appropriate compression level based on your needs:

    • CompressionLevel.Optimal for best compression (slower)
    • CompressionLevel.Fastest for quicker compression (larger files)
    • CompressionLevel.NoCompression when you just need archiving without size reduction

  3. Parallel compression: For multiple files, consider implementing parallel compression for better performance:

csharp
public static void CompressFilesInParallel(string[] files, string outputDirectory)
{
try
{
Directory.CreateDirectory(outputDirectory);

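// Parallel.ForEach requires using System.Threading.Tasks;
// exceptions thrown inside the loop surface as a single AggregateException
Parallel.ForEach(files, file =>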
{
string fileName = Path.GetFileName(file);
string outputPath = Path.Combine(outputDirectory, $"{fileName}.gz");

using (FileStream input = File.OpenRead(file))
using (FileStream output = File.Create(outputPath))
using (GZipStream gzipStream = new GZipStream(output, CompressionLevel.Optimal))
{
input.CopyTo(gzipStream);
}

Console.WriteLine($"Compressed: {fileName}");
});
}
catch (Exception ex)
{
Console.WriteLine($"Parallel compression error: {ex.Message}");
}
}
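
In the version above, a single failing file aborts the whole loop and the error message is generic. One possible refinement, sketched below with a hypothetical ParallelGZip class, catches errors per file, records them, and caps the number of concurrent workers so that disk I/O is not oversubscribed. The maxParallelism parameter is an assumption introduced for this example.

```csharp
using System;
using System.Collections.Concurrent;
using System.IO;
using System.IO.Compression;
using System.Threading.Tasks;

// Hypothetical helper for illustration; not part of .NET.
public static class ParallelGZip
{
    // Compresses each file to <outputDirectory>/<name>.gz.
    // Returns the files that could not be compressed, mapped to their error messages.
    public static ConcurrentDictionary<string, string> CompressAll(
        string[] files, string outputDirectory, int maxParallelism)
    {
        Directory.CreateDirectory(outputDirectory);

        var failures = new ConcurrentDictionary<string, string>();
        var options = new ParallelOptions { MaxDegreeOfParallelism = maxParallelism };

        Parallel.ForEach(files, options, file =>
        {
            try
            {
                string outputPath = Path.Combine(outputDirectory, Path.GetFileName(file) + ".gz");

                using (FileStream input = File.OpenRead(file))
                using (FileStream output = File.Create(outputPath))
                using (GZipStream gzip = new GZipStream(output, CompressionLevel.Optimal))
                {
                    input.CopyTo(gzip);
                }
            }
            catch (Exception ex)
            {
                // Record the failure and let the other files continue.
                failures[file] = ex.Message;
            }
        });

        return failures;
    }
}
```

As in the original example, input files that share a file name would write to the same output path, so the inputs are assumed to have unique names.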

Common Issues and Troubleshooting

1. Using Compressed Streams Correctly

A common mistake is not closing the GZipStream or DeflateStream properly, which can lead to incomplete or corrupted output:

csharp
// ❌ Incorrect: The compression stream is never disposed
using (FileStream compressedFileStream = File.Create("output.gz"))
{
    GZipStream compressionStream = new GZipStream(compressedFileStream, CompressionMode.Compress);
    // The file stream closes while data and the GZip footer are still buffered
    // inside compressionStream, so output.gz is left incomplete!
}

// ✅ Correct: Stacked using statements dispose the GZipStream first, then the file stream
using (FileStream compressedFileStream = File.Create("output.gz"))
using (GZipStream compressionStream = new GZipStream(compressedFileStream, CompressionMode.Compress))
{
    // Work with compression stream
}
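
The effect of disposal can be observed directly. The sketch below, using a hypothetical DisposalDemo class, writes to a GZipStream over a MemoryStream and records the output length before and after the GZipStream is disposed. The exact numbers depend on the runtime, but the length after disposal is always larger because the footer is only written at that point.

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Text;

// Hypothetical helper for illustration; not part of .NET.
public static class DisposalDemo
{
    // Returns the output length before and after the GZipStream is disposed.
    public static (long Before, long After) Measure()
    {
        byte[] data = Encoding.UTF8.GetBytes("some text to compress");

        using (var output = new MemoryStream())
        {
            long before;

            // leaveOpen: true keeps the MemoryStream usable after the GZipStream is disposed.
            using (var gzip = new GZipStream(output, CompressionMode.Compress, leaveOpen: true))
            {
                gzip.Write(data, 0, data.Length);
                before = output.Length; // Data may still be buffered inside the GZipStream.
            }

            long after = output.Length; // Now includes the flushed data and the GZip footer.
            return (before, after);
        }
    }
}
```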

2. Handling Path Issues in ZIP Archives

When creating ZIP archives, be careful with path handling:

csharp
// Create ZIP entries with proper relative paths
public static void AddDirectoryToZip(string directory, string zipPath)
{
    try
    {
        using (FileStream zipFile = new FileStream(zipPath, FileMode.Create))
        using (ZipArchive archive = new ZipArchive(zipFile, ZipArchiveMode.Create))
        {
            // Resolve to a full path so it matches the file paths enumerated below
            string baseDir = Path.GetFullPath(directory);

            foreach (string filePath in Directory.GetFiles(baseDir, "*", SearchOption.AllDirectories))
            {
                // Build the entry name relative to baseDir, with the forward slashes ZIP expects
                // (Path.GetRelativePath requires .NET Core 2.0 or later)
                string relativePath = Path.GetRelativePath(baseDir, filePath).Replace('\\', '/');
                archive.CreateEntryFromFile(filePath, relativePath);
            }
        }
    }
    catch (Exception ex)
    {
        Console.WriteLine($"Error adding directory to ZIP: {ex.Message}");
    }
}

Summary

In this tutorial, we've explored the various ways to work with file compression in .NET:

  • Using GZipStream for standard GZip compression
  • Implementing DeflateStream for basic compression needs
  • Working with ZIP archives for multiple files using ZipArchive and ZipFile
  • Setting different compression levels to balance speed and efficiency
  • Creating real-world applications like log rotation and backup solutions

File compression is an essential technique for any application that needs to handle large amounts of data efficiently. By leveraging .NET's built-in compression libraries, you can significantly reduce storage requirements and improve data transfer speeds.

Exercises

  1. Create a utility that compresses all text files in a directory to individual GZip files.
  2. Build a program that creates a ZIP archive of selected file types (e.g., all .jpg files) from a source directory.
  3. Implement a file backup solution that compresses files older than a specified date and moves them to an archive folder.
  4. Create a console application that can both compress and decompress files, letting the user specify the compression level.
  5. Extend the ZIP archive example to display compression ratios for each file in the archive.

Happy compressing!


