C# File Compression
Introduction
File compression reduces storage requirements and speeds up data transfer. In C#, the built-in System.IO.Compression libraries let you compress and decompress files with just a few lines of code.
In this tutorial, we'll explore how to work with file compression in C#. You'll learn to:
- Use the System.IO.Compression namespace
- Create ZIP archives
- Work with GZIP compression
- Implement practical compression scenarios
- Handle compressed data efficiently
Whether you're building applications that need to save storage space, improve download speeds, or create backup solutions, understanding file compression is a valuable skill for any C# developer.
Understanding Compression in .NET
The .NET Framework offers two primary compression formats:
- ZIP - For compressing multiple files and directories into a single archive
- GZIP - For compressing single files or data streams
Both formats are accessible through the System.IO.Compression namespace, which provides classes like ZipFile, ZipArchive, GZipStream, and DeflateStream.
Getting Started with ZIP Compression
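The System.IO.Compression namespace contains classes for working with ZIP archives. On .NET Framework, the ZipFile class lives in the System.IO.Compression.FileSystem assembly, so your project needs a reference to it. Let's start with a simple example of creating a ZIP file.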
Creating a ZIP Archive
using System;
using System.IO;
using System.IO.Compression;
using System.Linq; // Required for Sum() in GetDirectorySize

class Program
{
    static void Main(string[] args)
    {
        string startPath = @"C:\ExampleFiles"; // Directory to compress
        string zipPath = @"C:\Example.zip"; // Output ZIP file
        try
        {
            // Delete the file if it exists
            if (File.Exists(zipPath))
            {
                File.Delete(zipPath);
            }
            // Create the ZIP archive
            ZipFile.CreateFromDirectory(startPath, zipPath);
            Console.WriteLine($"Successfully created ZIP archive at {zipPath}");
            Console.WriteLine($"Original folder size: {GetDirectorySize(startPath)} bytes");
            Console.WriteLine($"ZIP file size: {new FileInfo(zipPath).Length} bytes");
        }
        catch (Exception ex)
        {
            Console.WriteLine($"An error occurred: {ex.Message}");
        }
    }

    // Helper method to calculate directory size (all files, including subdirectories)
    static long GetDirectorySize(string folderPath)
    {
        DirectoryInfo di = new DirectoryInfo(folderPath);
        return di.EnumerateFiles("*", SearchOption.AllDirectories).Sum(fi => fi.Length);
    }
}
Output:
Successfully created ZIP archive at C:\Example.zip
Original folder size: 25600 bytes
ZIP file size: 10240 bytes
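CreateFromDirectory also has an overload that takes a compression level and a flag that controls whether the source folder itself becomes the top-level entry in the archive. The following is a minimal sketch of that overload. It assumes a temporary folder rather than the C:\ paths used above, and the ZipOptionsDemo class and BuildArchive helper are names invented for this illustration:

```csharp
using System;
using System.IO;
using System.IO.Compression;

class ZipOptionsDemo
{
    // Hypothetical helper: zips sourceDir into zipPath and returns the entry names.
    public static string[] BuildArchive(string sourceDir, string zipPath, bool includeBaseDirectory)
    {
        if (File.Exists(zipPath))
        {
            File.Delete(zipPath);
        }

        // Overload with an explicit compression level and base-directory flag
        ZipFile.CreateFromDirectory(sourceDir, zipPath, CompressionLevel.Fastest, includeBaseDirectory);

        using (ZipArchive archive = ZipFile.OpenRead(zipPath))
        {
            string[] names = new string[archive.Entries.Count];
            for (int i = 0; i < names.Length; i++)
            {
                // Normalize separators so names look the same on every platform
                names[i] = archive.Entries[i].FullName.Replace('\\', '/');
            }
            return names;
        }
    }

    static void Main()
    {
        // Assumed layout: <temp>/<random>/ExampleFiles/notes.txt, archives written next to ExampleFiles
        string root = Path.Combine(Path.GetTempPath(), Path.GetRandomFileName());
        string sourceDir = Path.Combine(root, "ExampleFiles");
        Directory.CreateDirectory(sourceDir);
        File.WriteAllText(Path.Combine(sourceDir, "notes.txt"), "hello");

        foreach (string name in BuildArchive(sourceDir, Path.Combine(root, "flat.zip"), false))
        {
            Console.WriteLine($"Without base directory: {name}");
        }
        foreach (string name in BuildArchive(sourceDir, Path.Combine(root, "nested.zip"), true))
        {
            Console.WriteLine($"With base directory: {name}");
        }

        Directory.Delete(root, true);
    }
}
```

Keep the output archive outside the folder being compressed, as the sketch does; otherwise the archive would be written into the directory it is reading from.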
Extracting a ZIP Archive
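Once you've created a ZIP archive, you may need to extract its contents. ExtractToDirectory throws an IOException if a file it needs to write already exists in the destination, so extract into a new or empty directory: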
using System;
using System.IO;
using System.IO.Compression;

class Program
{
    static void Main(string[] args)
    {
        string zipPath = @"C:\Example.zip"; // ZIP file to extract
        string extractPath = @"C:\ExtractedFiles"; // Destination directory
        try
        {
            // Create the extraction directory if it doesn't exist
            if (!Directory.Exists(extractPath))
            {
                Directory.CreateDirectory(extractPath);
            }
            // Extract the ZIP archive
            ZipFile.ExtractToDirectory(zipPath, extractPath);
            Console.WriteLine($"Successfully extracted ZIP archive to {extractPath}");
        }
        catch (Exception ex)
        {
            Console.WriteLine($"An error occurred: {ex.Message}");
        }
    }
}
Working with ZIP Archives Programmatically
For more control over ZIP operations, you can use the ZipArchive class to manipulate ZIP files programmatically.
Adding Files to an Existing ZIP Archive
using System;
using System.IO;
using System.IO.Compression;

class Program
{
    static void Main(string[] args)
    {
        string zipPath = @"C:\Example.zip";
        string fileToAdd = @"C:\ExampleFiles\newfile.txt";
        try
        {
            using (FileStream zipToOpen = new FileStream(zipPath, FileMode.Open))
            {
                using (ZipArchive archive = new ZipArchive(zipToOpen, ZipArchiveMode.Update))
                {
                    // Create a new entry in the ZIP archive
                    string entryName = Path.GetFileName(fileToAdd);
                    ZipArchiveEntry entry = archive.CreateEntry(entryName);
                    // Write the file content to the entry
                    using (FileStream fileStream = new FileStream(fileToAdd, FileMode.Open))
                    {
                        using (Stream entryStream = entry.Open())
                        {
                            fileStream.CopyTo(entryStream);
                        }
                    }
                }
            }
            Console.WriteLine($"Successfully added {Path.GetFileName(fileToAdd)} to the ZIP archive.");
        }
        catch (Exception ex)
        {
            Console.WriteLine($"An error occurred: {ex.Message}");
        }
    }
}
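The same operation can be written more compactly with ZipFile.Open and the CreateEntryFromFile extension method, which copies the file into the entry for you. The sketch below is one possible variant, not the only way to do it. It assumes temporary paths, and the AddEntryDemo class and AddFile helper are hypothetical names. It also deletes any existing entry with the same name first, because CreateEntry and CreateEntryFromFile do not replace an existing entry:

```csharp
using System;
using System.IO;
using System.IO.Compression;

class AddEntryDemo
{
    // Hypothetical helper: adds fileToAdd to the archive at zipPath.
    public static void AddFile(string zipPath, string fileToAdd)
    {
        // In Update mode, ZipFile.Open opens an existing archive or creates a new one
        using (ZipArchive archive = ZipFile.Open(zipPath, ZipArchiveMode.Update))
        {
            string entryName = Path.GetFileName(fileToAdd);

            // Remove an entry with the same name so the archive has no duplicates
            ZipArchiveEntry existing = archive.GetEntry(entryName);
            if (existing != null)
            {
                existing.Delete();
            }

            // Extension method from ZipFileExtensions: creates the entry and copies the file
            archive.CreateEntryFromFile(fileToAdd, entryName);
        }
    }

    static void Main()
    {
        string root = Path.Combine(Path.GetTempPath(), Path.GetRandomFileName());
        Directory.CreateDirectory(root);
        string zipPath = Path.Combine(root, "Example.zip");
        string fileToAdd = Path.Combine(root, "newfile.txt");

        // Add the same file twice with different content
        File.WriteAllText(fileToAdd, "first version");
        AddFile(zipPath, fileToAdd);
        File.WriteAllText(fileToAdd, "second version");
        AddFile(zipPath, fileToAdd);

        using (ZipArchive archive = ZipFile.OpenRead(zipPath))
        {
            Console.WriteLine($"Entries in archive: {archive.Entries.Count}");
        }

        Directory.Delete(root, true);
    }
}
```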
Reading Files from a ZIP Archive Without Extraction
Sometimes you may want to read a file's content from a ZIP archive without extracting the entire archive:
using System;
using System.IO;
using System.IO.Compression;
using System.Text;

class Program
{
    static void Main(string[] args)
    {
        string zipPath = @"C:\Example.zip";
        string fileToRead = "example.txt"; // File within the ZIP archive
        try
        {
            using (FileStream zipToOpen = new FileStream(zipPath, FileMode.Open))
            {
                using (ZipArchive archive = new ZipArchive(zipToOpen, ZipArchiveMode.Read))
                {
                    // Find the entry in the archive
                    ZipArchiveEntry entry = archive.GetEntry(fileToRead);
                    if (entry != null)
                    {
                        // Read the content of the entry
                        using (StreamReader reader = new StreamReader(entry.Open()))
                        {
                            string content = reader.ReadToEnd();
                            Console.WriteLine($"Content of {fileToRead}:");
                            Console.WriteLine(content);
                        }
                    }
                    else
                    {
                        Console.WriteLine($"File {fileToRead} not found in the ZIP archive.");
                    }
                }
            }
        }
        catch (Exception ex)
        {
            Console.WriteLine($"An error occurred: {ex.Message}");
        }
    }
}
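If you don't know the entry name in advance, you can enumerate the archive's Entries collection and inspect each entry's metadata without extracting anything. This is a small sketch under assumed conditions: it builds its own throwaway archive in the temp folder, and ListEntriesDemo and Describe are hypothetical names used only here:

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.IO.Compression;

class ListEntriesDemo
{
    // Hypothetical helper: returns one descriptive line per entry in the archive.
    public static List<string> Describe(string zipPath)
    {
        var lines = new List<string>();
        using (ZipArchive archive = ZipFile.OpenRead(zipPath))
        {
            foreach (ZipArchiveEntry entry in archive.Entries)
            {
                // Length is the uncompressed size, CompressedLength the size inside the archive
                lines.Add($"{entry.FullName}: {entry.Length} bytes ({entry.CompressedLength} compressed)");
            }
        }
        return lines;
    }

    static void Main()
    {
        string zipPath = Path.Combine(Path.GetTempPath(), Path.GetRandomFileName() + ".zip");

        // Build a small sample archive with one highly compressible entry
        using (ZipArchive archive = ZipFile.Open(zipPath, ZipArchiveMode.Create))
        {
            ZipArchiveEntry entry = archive.CreateEntry("example.txt");
            using (StreamWriter writer = new StreamWriter(entry.Open()))
            {
                writer.Write(new string('a', 1000));
            }
        }

        foreach (string line in Describe(zipPath))
        {
            Console.WriteLine(line);
        }

        File.Delete(zipPath);
    }
}
```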
GZIP Compression for Single Files
While ZIP is great for multiple files, GZIP is often more suitable for compressing individual files or data streams. Let's see how to use GZIP compression:
Compressing a File with GZIP
using System;
using System.IO;
using System.IO.Compression;

class Program
{
    static void Main(string[] args)
    {
        string originalFile = @"C:\ExampleFiles\example.txt";
        string compressedFile = @"C:\ExampleFiles\example.gz";
        try
        {
            // Compress the file
            using (FileStream originalFileStream = File.Open(originalFile, FileMode.Open))
            {
                using (FileStream compressedFileStream = File.Create(compressedFile))
                {
                    using (GZipStream compressionStream = new GZipStream(compressedFileStream, CompressionMode.Compress))
                    {
                        originalFileStream.CopyTo(compressionStream);
                    }
                }
            }
            FileInfo originalInfo = new FileInfo(originalFile);
            FileInfo compressedInfo = new FileInfo(compressedFile);
            Console.WriteLine($"Original file size: {originalInfo.Length} bytes");
            Console.WriteLine($"Compressed file size: {compressedInfo.Length} bytes");
            Console.WriteLine($"Compression ratio: {100.0 * (1.0 - ((double)compressedInfo.Length / originalInfo.Length)):0.##}%");
        }
        catch (Exception ex)
        {
            Console.WriteLine($"An error occurred: {ex.Message}");
        }
    }
}
Output:
Original file size: 15360 bytes
Compressed file size: 5120 bytes
Compression ratio: 66.67%
Decompressing a GZIP File
using System;
using System.IO;
using System.IO.Compression;

class Program
{
    static void Main(string[] args)
    {
        string compressedFile = @"C:\ExampleFiles\example.gz";
        string decompressedFile = @"C:\ExampleFiles\example_decompressed.txt";
        try
        {
            // Decompress the file
            using (FileStream compressedFileStream = File.Open(compressedFile, FileMode.Open))
            {
                using (FileStream outputFileStream = File.Create(decompressedFile))
                {
                    using (GZipStream decompressionStream = new GZipStream(compressedFileStream, CompressionMode.Decompress))
                    {
                        decompressionStream.CopyTo(outputFileStream);
                    }
                }
            }
            Console.WriteLine($"Successfully decompressed file to {decompressedFile}");
        }
        catch (Exception ex)
        {
            Console.WriteLine($"An error occurred: {ex.Message}");
        }
    }
}
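GZipStream wraps any Stream, not only files, so the same pattern works for data held in memory (for example, a payload you are about to send over the network). The following sketch round-trips a string through a MemoryStream. The GZipMemoryDemo class and its Compress and Decompress helpers are illustrative names, and UTF-8 is an assumed encoding:

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Text;

class GZipMemoryDemo
{
    // Hypothetical helper: compresses a string to GZIP bytes in memory.
    public static byte[] Compress(string text)
    {
        byte[] raw = Encoding.UTF8.GetBytes(text);
        using (MemoryStream output = new MemoryStream())
        {
            // Disposing the GZipStream flushes the remaining data and the GZIP trailer
            using (GZipStream gzip = new GZipStream(output, CompressionMode.Compress))
            {
                gzip.Write(raw, 0, raw.Length);
            }
            // ToArray still works after the GZipStream has closed the MemoryStream
            return output.ToArray();
        }
    }

    // Hypothetical helper: restores the original string from GZIP bytes.
    public static string Decompress(byte[] compressed)
    {
        using (MemoryStream input = new MemoryStream(compressed))
        using (GZipStream gzip = new GZipStream(input, CompressionMode.Decompress))
        using (MemoryStream output = new MemoryStream())
        {
            gzip.CopyTo(output);
            return Encoding.UTF8.GetString(output.ToArray());
        }
    }

    static void Main()
    {
        string original = new string('x', 5000) + " end of sample";
        byte[] compressed = Compress(original);
        string restored = Decompress(compressed);

        Console.WriteLine($"Original: {Encoding.UTF8.GetByteCount(original)} bytes");
        Console.WriteLine($"Compressed: {compressed.Length} bytes");
        Console.WriteLine($"Round trip matches: {restored == original}");
    }
}
```

Note that the compressed bytes are read only after the GZipStream has been disposed; reading them earlier can return a truncated stream.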
Practical Example: Compressing Log Files
Let's create a practical example that automatically compresses log files when they reach a certain size. This is a common requirement for applications that generate large log files:
using System;
using System.IO;
using System.IO.Compression;

class LogCompressor
{
    private readonly string _logDirectory;
    private readonly long _maxSizeBytes;

    public LogCompressor(string logDirectory, long maxSizeBytes)
    {
        _logDirectory = logDirectory;
        _maxSizeBytes = maxSizeBytes;
    }

    public void CompressOversizedLogs()
    {
        // Get all log files in the directory
        string[] logFiles = Directory.GetFiles(_logDirectory, "*.log");
        foreach (string logFile in logFiles)
        {
            FileInfo fileInfo = new FileInfo(logFile);
            // Check if the file is larger than the maximum size
            if (fileInfo.Length > _maxSizeBytes)
            {
                string timestamp = DateTime.Now.ToString("yyyyMMddHHmmss");
                string compressedFileName = $"{Path.GetFileNameWithoutExtension(logFile)}_{timestamp}.gz";
                string compressedFilePath = Path.Combine(_logDirectory, compressedFileName);
                // Compress the log file
                CompressFile(logFile, compressedFilePath);
                // Delete the original log file
                File.Delete(logFile);
                Console.WriteLine($"Compressed {fileInfo.Name} to {compressedFileName}");
                Console.WriteLine($"Original size: {fileInfo.Length} bytes");
                Console.WriteLine($"Compressed size: {new FileInfo(compressedFilePath).Length} bytes");
            }
        }
    }

    private void CompressFile(string sourcePath, string destinationPath)
    {
        using (FileStream sourceStream = new FileStream(sourcePath, FileMode.Open, FileAccess.Read))
        {
            using (FileStream destinationStream = new FileStream(destinationPath, FileMode.Create))
            {
                using (GZipStream compressionStream = new GZipStream(destinationStream, CompressionLevel.Optimal))
                {
                    sourceStream.CopyTo(compressionStream);
                }
            }
        }
    }
}

class Program
{
    static void Main(string[] args)
    {
        string logDirectory = @"C:\Logs";
        long maxSizeBytes = 1024 * 1024; // 1MB
        LogCompressor compressor = new LogCompressor(logDirectory, maxSizeBytes);
        compressor.CompressOversizedLogs();
        Console.WriteLine("Log compression process completed.");
    }
}
Compression Levels
.NET allows you to specify compression levels when using GZIP or Deflate compression:
using System;
using System.IO;
using System.IO.Compression;
using System.Diagnostics;

class Program
{
    static void Main(string[] args)
    {
        string sourceFile = @"C:\ExampleFiles\largefile.txt";
        // Test different compression levels
        TestCompression(sourceFile, CompressionLevel.Fastest);
        TestCompression(sourceFile, CompressionLevel.Optimal);
        TestCompression(sourceFile, CompressionLevel.NoCompression);
    }

    static void TestCompression(string sourceFile, CompressionLevel compressionLevel)
    {
        string destinationFile = $@"C:\ExampleFiles\compressed_{compressionLevel}.gz";
        // Create a stopwatch to measure performance
        Stopwatch stopwatch = new Stopwatch();
        stopwatch.Start();
        // Compress the file
        using (FileStream sourceStream = new FileStream(sourceFile, FileMode.Open))
        {
            using (FileStream destinationStream = new FileStream(destinationFile, FileMode.Create))
            {
                using (GZipStream compressionStream = new GZipStream(destinationStream, compressionLevel))
                {
                    sourceStream.CopyTo(compressionStream);
                }
            }
        }
        stopwatch.Stop();
        // Get file sizes
        long originalSize = new FileInfo(sourceFile).Length;
        long compressedSize = new FileInfo(destinationFile).Length;
        Console.WriteLine($"Compression Level: {compressionLevel}");
        Console.WriteLine($"Original Size: {originalSize:N0} bytes");
        Console.WriteLine($"Compressed Size: {compressedSize:N0} bytes");
        Console.WriteLine($"Compression Ratio: {100.0 * (1.0 - ((double)compressedSize / originalSize)):0.##}%");
        Console.WriteLine($"Compression Time: {stopwatch.ElapsedMilliseconds} ms");
        Console.WriteLine();
    }
}
Output:
Compression Level: Fastest
Original Size: 10,485,760 bytes
Compressed Size: 4,329,301 bytes
Compression Ratio: 58.71%
Compression Time: 453 ms
Compression Level: Optimal
Original Size: 10,485,760 bytes
Compressed Size: 3,812,289 bytes
Compression Ratio: 63.64%
Compression Time: 1372 ms
Compression Level: NoCompression
Original Size: 10,485,760 bytes
Compressed Size: 10,486,074 bytes
Compression Ratio: -0%
Compression Time: 128 ms
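The benchmark above needs a large file on disk. To try the same comparison without one, you can compress a synthetic buffer in memory. This sketch makes several assumptions: the input is artificial, highly repetitive log-like text, so the ratios will be far better than for real data, and LevelCompareDemo and CompressedSize are hypothetical names:

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Linq;
using System.Text;

class LevelCompareDemo
{
    // Hypothetical helper: returns the GZIP-compressed size of data at the given level.
    public static long CompressedSize(byte[] data, CompressionLevel level)
    {
        using (MemoryStream output = new MemoryStream())
        {
            using (GZipStream gzip = new GZipStream(output, level))
            {
                gzip.Write(data, 0, data.Length);
            }
            // Measure only after the GZipStream is disposed, so all data has been flushed
            return output.ToArray().Length;
        }
    }

    static void Main()
    {
        // Synthetic input: the same log-style line repeated many times
        string line = "2024-01-01 INFO request handled\n";
        byte[] data = Encoding.UTF8.GetBytes(string.Concat(Enumerable.Repeat(line, 10000)));

        CompressionLevel[] levels =
        {
            CompressionLevel.Fastest,
            CompressionLevel.Optimal,
            CompressionLevel.NoCompression
        };

        Console.WriteLine($"Original Size: {data.Length:N0} bytes");
        foreach (CompressionLevel level in levels)
        {
            Console.WriteLine($"{level}: {CompressedSize(data, level):N0} bytes");
        }
    }
}
```

As in the file-based output above, NoCompression produces a result slightly larger than the input, because the GZIP header and trailer are still written.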
Memory-Efficient Compression with Streaming
For large files, memory efficiency is crucial. Because compression works on streams, you never need to load the whole file into memory. Here's how to read and compress a large file in fixed-size chunks:
using System;
using System.IO;
using System.IO.Compression;

class Program
{
    static void Main(string[] args)
    {
        string sourceFile = @"C:\ExampleFiles\verylargefile.dat";
        string compressedFile = @"C:\ExampleFiles\compressed.gz";
        try
        {
            CompressFileInChunks(sourceFile, compressedFile, 4096); // 4KB chunks
            Console.WriteLine($"Successfully compressed {sourceFile} to {compressedFile}");
            FileInfo originalInfo = new FileInfo(sourceFile);
            FileInfo compressedInfo = new FileInfo(compressedFile);
            Console.WriteLine($"Original file size: {originalInfo.Length:N0} bytes");
            Console.WriteLine($"Compressed file size: {compressedInfo.Length:N0} bytes");
        }
        catch (Exception ex)
        {
            Console.WriteLine($"An error occurred: {ex.Message}");
        }
    }

    static void CompressFileInChunks(string sourcePath, string destinationPath, int bufferSize)
    {
        using (FileStream sourceStream = new FileStream(sourcePath, FileMode.Open, FileAccess.Read))
        {
            using (FileStream destinationStream = new FileStream(destinationPath, FileMode.Create))
            {
                using (GZipStream compressionStream = new GZipStream(destinationStream, CompressionLevel.Optimal))
                {
                    byte[] buffer = new byte[bufferSize];
                    int bytesRead;
                    // Read and compress the file in chunks
                    while ((bytesRead = sourceStream.Read(buffer, 0, buffer.Length)) > 0)
                    {
                        compressionStream.Write(buffer, 0, bytesRead);
                    }
                }
            }
        }
    }
}
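The manual read/write loop is useful when you want to do extra work per chunk, such as reporting progress. When you only need to control the buffer size, Stream.CopyTo has an overload that accepts one and performs the same chunked copy internally. A minimal sketch of that alternative follows; it assumes temporary files rather than the paths above, and CopyToBufferDemo and CompressFile are hypothetical names:

```csharp
using System;
using System.IO;
using System.IO.Compression;

class CopyToBufferDemo
{
    // Hypothetical helper: compresses sourcePath to destinationPath using a fixed-size copy buffer.
    public static void CompressFile(string sourcePath, string destinationPath, int bufferSize)
    {
        using (FileStream source = new FileStream(sourcePath, FileMode.Open, FileAccess.Read))
        using (FileStream destination = new FileStream(destinationPath, FileMode.Create))
        using (GZipStream gzip = new GZipStream(destination, CompressionLevel.Optimal))
        {
            // Copies in bufferSize chunks; the whole file is never held in memory
            source.CopyTo(gzip, bufferSize);
        }
    }

    static void Main()
    {
        string root = Path.Combine(Path.GetTempPath(), Path.GetRandomFileName());
        Directory.CreateDirectory(root);
        string sourceFile = Path.Combine(root, "verylargefile.dat");
        string compressedFile = Path.Combine(root, "compressed.gz");

        // Sample input: 1 MB of zero bytes stands in for a large file
        File.WriteAllBytes(sourceFile, new byte[1024 * 1024]);

        CompressFile(sourceFile, compressedFile, 4096); // 4KB chunks

        Console.WriteLine($"Original file size: {new FileInfo(sourceFile).Length:N0} bytes");
        Console.WriteLine($"Compressed file size: {new FileInfo(compressedFile).Length:N0} bytes");

        Directory.Delete(root, true);
    }
}
```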
Summary
In this tutorial, we've covered the essential aspects of file compression in C#:
- ZIP Compression
  - Creating ZIP archives
  - Extracting ZIP archives
  - Programmatically working with ZIP files
- GZIP Compression
  - Compressing individual files
  - Decompressing GZIP files
  - Setting compression levels
- Practical Applications
  - Log file compression
  - Memory-efficient streaming compression
  - Performance comparisons
File compression is a powerful technique for reducing storage requirements and improving data transfer speeds. The .NET Framework provides comprehensive tools for working with compressed files, making it accessible even for beginners.
Exercises
- Create a program that compresses all text files in a directory and its subdirectories into a single ZIP archive.
- Implement a backup utility that compresses modified files daily and maintains a log of compressed files.
- Write a program that reads CSV data from a GZIP file without fully decompressing it to disk.
- Create a benchmark comparing compression performance and ratios between GZipStream, DeflateStream, and BrotliStream (available in .NET Core 2.1 and later).
- Implement a compression method that preserves the directory structure when compressing multiple folders.
By mastering file compression techniques in C#, you'll be able to create more efficient applications that optimize storage and data transfer requirements.