C# CSV Processing
Introduction
CSV (Comma-Separated Values) files are a common format for storing and exchanging tabular data. They're simple text files where each line represents a row of data, and values are separated by commas (or other delimiters). Processing CSV files is a fundamental skill for any programmer, as it allows you to work with data from various sources like spreadsheets, databases, or other applications.
In this tutorial, we'll cover how to:
- Read data from CSV files
- Parse and manipulate CSV data
- Write data to CSV files
- Handle common challenges with CSV processing
By the end of this guide, you'll be comfortable working with CSV files in your C# applications.
Basic CSV Structure
Before diving into the code, let's understand what a CSV file looks like:
Name,Age,Email
John,25,[email protected]
Alice,30,[email protected]
Bob,28,[email protected]
Each line is a record, and each value is separated by a comma. The first line is typically a header row that describes each column.
Reading CSV Files in C#
Method 1: Using the File Class (Simple Approach)
For basic CSV reading, we can use the File.ReadAllLines()
method:
using System;
using System.IO;
class Program
{
static void Main()
{
string filePath = "data.csv";
try
{
// Read all lines from the CSV file
string[] lines = File.ReadAllLines(filePath);
// Skip the header row and process each data row
bool isFirstLine = true;
foreach (string line in lines)
{
if (isFirstLine)
{
Console.WriteLine($"Headers: {line}");
isFirstLine = false;
continue;
}
// Split the line by comma
string[] values = line.Split(',');
// Process the values
Console.WriteLine($"Name: {values[0]}, Age: {values[1]}, Email: {values[2]}");
}
}
catch (Exception ex)
{
Console.WriteLine($"An error occurred: {ex.Message}");
}
}
}
Output:
Headers: Name,Age,Email
Name: John, Age: 25, Email: [email protected]
Name: Alice, Age: 30, Email: [email protected]
Name: Bob, Age: 28, Email: [email protected]
Method 2: Using StreamReader (More Efficient for Large Files)
For larger files, reading line by line is more efficient:
using System;
using System.IO;
class Program
{
static void Main()
{
string filePath = "data.csv";
try
{
using (StreamReader reader = new StreamReader(filePath))
{
// Read and display header
string headerLine = reader.ReadLine();
Console.WriteLine($"Headers: {headerLine}");
// Read each data line
while (!reader.EndOfStream)
{
string line = reader.ReadLine();
string[] values = line.Split(',');
// Process the values
Console.WriteLine($"Name: {values[0]}, Age: {values[1]}, Email: {values[2]}");
}
}
}
catch (Exception ex)
{
Console.WriteLine($"An error occurred: {ex.Message}");
}
}
}
Handling CSV Data Challenges
Challenge 1: Values Containing Commas
CSV data gets tricky when the values themselves contain commas. In standard CSV, these values are enclosed in quotes:
Name,Description,Price
"Laptop","High-performance laptop, with SSD",999.99
"Phone","Smartphone, latest model",499.99
Here's how to handle quoted fields:
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;
class Program
{
static void Main()
{
string csvLine = "\"Laptop\",\"High-performance laptop, with SSD\",999.99";
// Parse the CSV line properly handling quoted values
List<string> values = ParseCSVLine(csvLine);
foreach (string value in values)
{
Console.WriteLine($"Value: {value}");
}
}
static List<string> ParseCSVLine(string line)
{
List<string> result = new List<string>();
// Regular expression to match CSV fields
// Matches either a quoted field (handling embedded commas) or a non-quoted field
Regex csvRegex = new Regex(",(?=(?:[^\"]*\"[^\"]*\")*(?![^\"]*\"))");
string[] fields = csvRegex.Split(line);
// Clean up quotes from the result
for (int i = 0; i < fields.Length; i++)
{
fields[i] = fields[i].Trim();
// Remove surrounding quotes if present
if (fields[i].StartsWith("\"") && fields[i].EndsWith("\""))
{
fields[i] = fields[i].Substring(1, fields[i].Length - 2);
}
result.Add(fields[i]);
}
return result;
}
}
Output:
Value: Laptop
Value: High-performance laptop, with SSD
Value: 999.99
Challenge 2: Different Delimiters
Some CSV files use delimiters other than commas, such as semicolons or tabs:
using System;
using System.IO;
class Program
{
static void Main()
{
string fileContent = "Name;Age;Email\nJohn;25;[email protected]";
// Split by lines
string[] lines = fileContent.Split('\n');
foreach (string line in lines)
{
// Split by semicolon delimiter
string[] values = line.Split(';');
foreach (string value in values)
{
Console.Write($"{value}\t");
}
Console.WriteLine();
}
}
}
Output:
Name Age Email
John 25 [email protected]
Writing CSV Files
Creating CSV files is straightforward in C#:
using System;
using System.IO;
using System.Text;
class Program
{
static void Main()
{
string filePath = "output.csv";
try
{
// Create a list of users
var users = new[]
{
new { Name = "John", Age = 25, Email = "[email protected]" },
new { Name = "Alice", Age = 30, Email = "[email protected]" },
new { Name = "Bob", Age = 28, Email = "[email protected]" }
};
// Create a StringBuilder for building our CSV content
StringBuilder csvContent = new StringBuilder();
// Add header row
csvContent.AppendLine("Name,Age,Email");
// Add data rows
foreach (var user in users)
{
csvContent.AppendLine($"{user.Name},{user.Age},{user.Email}");
}
// Write the CSV content to a file
File.WriteAllText(filePath, csvContent.ToString());
Console.WriteLine($"CSV file written successfully to {filePath}");
}
catch (Exception ex)
{
Console.WriteLine($"An error occurred: {ex.Message}");
}
}
}
Handling Special Characters When Writing CSV
When writing CSV files, you need to handle special characters properly:
using System;
using System.IO;
using System.Text;
class Program
{
static void Main()
{
string filePath = "output.csv";
try
{
// Create a list of products with descriptions containing commas
var products = new[]
{
new { Name = "Laptop", Description = "High-performance laptop, with SSD", Price = 999.99 },
new { Name = "Phone", Description = "Smartphone, latest model", Price = 499.99 }
};
// Create a StringBuilder for building our CSV content
StringBuilder csvContent = new StringBuilder();
// Add header row
csvContent.AppendLine("Name,Description,Price");
// Add data rows
foreach (var product in products)
{
// Escape fields with commas by enclosing them in quotes
string name = EscapeCsvField(product.Name);
string description = EscapeCsvField(product.Description);
csvContent.AppendLine($"{name},{description},{product.Price}");
}
// Write the CSV content to a file
File.WriteAllText(filePath, csvContent.ToString());
Console.WriteLine($"CSV file written successfully to {filePath}");
}
catch (Exception ex)
{
Console.WriteLine($"An error occurred: {ex.Message}");
}
}
static string EscapeCsvField(string field)
{
// If the field contains commas, quotes, or newlines, enclose it in quotes
if (field.Contains(",") || field.Contains("\"") || field.Contains("\n"))
{
// Double up any quotes inside the field
field = field.Replace("\"", "\"\"");
// Enclose the field in quotes
return $"\"{field}\"";
}
return field;
}
}
Using Third-Party Libraries
For more complex CSV handling, consider using third-party libraries like CsvHelper:
// First, install CsvHelper via NuGet:
// Install-Package CsvHelper
using System;
using System.Collections.Generic;
using System.Globalization;
using System.IO;
using CsvHelper;
class Program
{
public class Person
{
public string Name { get; set; }
public int Age { get; set; }
public string Email { get; set; }
}
static void Main()
{
string filePath = "data.csv";
try
{
// Reading CSV with CsvHelper
using (var reader = new StreamReader(filePath))
using (var csv = new CsvReader(reader, CultureInfo.InvariantCulture))
{
var records = csv.GetRecords<Person>();
foreach (var record in records)
{
Console.WriteLine($"Name: {record.Name}, Age: {record.Age}, Email: {record.Email}");
}
}
// Writing CSV with CsvHelper
var people = new List<Person>
{
new Person { Name = "John", Age = 25, Email = "[email protected]" },
new Person { Name = "Alice", Age = 30, Email = "[email protected]" },
new Person { Name = "Bob", Age = 28, Email = "[email protected]" }
};
using (var writer = new StreamWriter("output.csv"))
using (var csv = new CsvWriter(writer, CultureInfo.InvariantCulture))
{
csv.WriteRecords(people);
}
}
catch (Exception ex)
{
Console.WriteLine($"An error occurred: {ex.Message}");
}
}
}
Real-World Application: Data Analysis
Let's create a more practical example where we analyze sales data from a CSV file:
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
class Program
{
class SalesRecord
{
public string Product { get; set; }
public double Price { get; set; }
public int Quantity { get; set; }
public DateTime Date { get; set; }
public double Total => Price * Quantity;
}
static void Main()
{
string filePath = "sales.csv";
try
{
// Read and parse sales data
List<SalesRecord> salesData = new List<SalesRecord>();
string[] lines = File.ReadAllLines(filePath);
bool isFirstLine = true;
foreach (string line in lines)
{
if (isFirstLine)
{
isFirstLine = false;
continue; // Skip header
}
string[] values = line.Split(',');
SalesRecord record = new SalesRecord
{
Product = values[0],
Price = double.Parse(values[1]),
Quantity = int.Parse(values[2]),
Date = DateTime.Parse(values[3])
};
salesData.Add(record);
}
// Analyze sales data
double totalRevenue = salesData.Sum(s => s.Total);
var bestSellingProduct = salesData
.GroupBy(s => s.Product)
.OrderByDescending(g => g.Sum(s => s.Quantity))
.First()
.Key;
var monthlySales = salesData
.GroupBy(s => new { s.Date.Year, s.Date.Month })
.Select(g => new
{
YearMonth = $"{g.Key.Year}-{g.Key.Month:D2}",
TotalSales = g.Sum(s => s.Total)
})
.OrderBy(m => m.YearMonth);
// Output results
Console.WriteLine($"Total Revenue: ${totalRevenue:F2}");
Console.WriteLine($"Best Selling Product: {bestSellingProduct}");
Console.WriteLine("\nMonthly Sales:");
foreach (var month in monthlySales)
{
Console.WriteLine($"{month.YearMonth}: ${month.TotalSales:F2}");
}
// Export monthly sales report to CSV
using (StreamWriter writer = new StreamWriter("monthly_sales_report.csv"))
{
// Write header
writer.WriteLine("Month,Sales");
// Write data
foreach (var month in monthlySales)
{
writer.WriteLine($"{month.YearMonth},{month.TotalSales:F2}");
}
}
Console.WriteLine("\nMonthly sales report exported to monthly_sales_report.csv");
}
catch (Exception ex)
{
Console.WriteLine($"An error occurred: {ex.Message}");
}
}
}
For this example, you would need a sales.csv file with the following structure:
Product,Price,Quantity,Date
Laptop,999.99,2,2023-01-15
Phone,499.99,5,2023-01-20
Headphones,99.99,10,2023-02-05
Laptop,999.99,3,2023-02-10
Phone,499.99,8,2023-03-01
Summary
In this tutorial, we've covered:
- Basic CSV file reading using
File.ReadAllLines()
andStreamReader
- Handling common CSV challenges like quoted values and different delimiters
- Writing CSV files with proper escaping of special characters
- Using the popular CsvHelper library for more advanced scenarios
- Creating a real-world application that analyzes sales data from a CSV file
CSV processing is a fundamental skill for any C# developer, as it allows you to work with data from various sources. Whether you're building data analysis tools, importing/exporting data, or integrating with other systems, understanding CSV processing will be extremely valuable.
Additional Resources
- CsvHelper Documentation
- RFC 4180: Common Format and MIME Type for CSV Files
- Microsoft Docs: File.ReadAllLines Method
Exercises
- Create a program that reads a CSV file containing student grades and calculates the average grade for each student.
- Modify the sales analysis program to find the best-selling product for each month.
- Create a CSV file merger that combines multiple CSV files with the same header structure.
- Build a program that converts between CSV and JSON formats.
- Create a CSV validator that checks if a CSV file follows proper formatting rules and reports any issues.
Happy coding!
If you spot any mistakes on this website, please let me know at [email protected]. I’d greatly appreciate your feedback! :)