.NET Task Parallel Library

Introduction

The Task Parallel Library (TPL) is a set of public types and APIs in the System.Threading and System.Threading.Tasks namespaces that simplifies writing concurrent and parallel code. Introduced with .NET Framework 4, the TPL gives you a higher-level abstraction over threads and asynchronous operations, so you can focus on business logic rather than the details of thread management.

Most computers today have multiple processor cores. Sequential code uses only one core at a time, whereas parallel code can distribute work across cores and, for suitable workloads, improve performance significantly. The TPL helps you achieve this parallelism with minimal complexity.

Why Use the Task Parallel Library?

Before diving into the details, let's understand why you might want to use the TPL:

  • Improved Performance: Utilize all available CPU cores for compute-intensive operations
  • Responsive Applications: Keep your UI thread free while performing heavy operations
  • Simplified Code: Abstract away the complexity of thread management
  • Scalability: Your code can automatically scale based on available resources
  • Better Resource Utilization: Make efficient use of system resources

Core Components of TPL

The Task Parallel Library, together with the closely related parallel programming APIs that shipped alongside it, consists of several key components:

  1. Task Class: Represents an asynchronous operation
  2. Parallel Class: Provides parallel versions of common loops and operations
  3. PLINQ (Parallel LINQ): Enables parallel execution of LINQ queries
  4. Concurrent Collections: Thread-safe collection classes for concurrent access

Let's explore each of these components in detail.

Working with Tasks

At the heart of TPL is the Task class, which represents an asynchronous operation. Think of a Task as a "promise" to complete some work in the future.

Creating and Running Tasks

The simplest way to create and start a task is the Task.Run method, which queues the work to the thread pool:

csharp
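using System;
using System.Threading.Tasks;

// Queue the work on the thread pool; Task.Run returns immediately
Task task = Task.Run(() =>
{
    Console.WriteLine("Task is running...");
    // Do some work here
});

// Block the calling thread until the task completes
task.Wait();
Console.WriteLine("Task has completed.");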

Output:

Task is running...
Task has completed.

Returning Values from Tasks

Tasks can also return values using the generic Task<TResult> type:

csharp
using System;
using System.Threading;
using System.Threading.Tasks;

Task<int> calculateTask = Task.Run(() =>
{
    Console.WriteLine("Calculating...");
    // Simulate work
    Thread.Sleep(2000);
    return 42;
});

// Reading Result blocks the calling thread until the task completes
int result = calculateTask.Result;
Console.WriteLine($"The answer is: {result}");

Output:

Calculating...
The answer is: 42

Working with Multiple Tasks

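Task.WhenAll returns a task that completes when all of the supplied tasks have completed. The example blocks on it with Wait() for brevity; in asynchronous code you would await it instead: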

csharp
using System;
using System.Threading;
using System.Threading.Tasks;

Task task1 = Task.Run(() =>
{
    Thread.Sleep(1000);
    Console.WriteLine("Task 1 completed");
});

Task task2 = Task.Run(() =>
{
    Thread.Sleep(2000);
    Console.WriteLine("Task 2 completed");
});

// Wait for both tasks to complete
Task.WhenAll(task1, task2).Wait();
Console.WriteLine("All tasks completed");

Output:

Task 1 completed
Task 2 completed
All tasks completed

To continue as soon as the first task completes, use Task.WhenAny, which returns a task whose result is the task that finished first:

csharp
using System;
using System.Threading.Tasks;

Task<string> task1 = Task.Run(async () =>
{
    await Task.Delay(2000);
    return "Task 1 result";
});

Task<string> task2 = Task.Run(async () =>
{
    await Task.Delay(1000);
    return "Task 2 result";
});

// WhenAny completes when either task does; its result is the winning task
Task<Task<string>> firstCompletedTask = Task.WhenAny(task1, task2);
Task<string> winnerTask = firstCompletedTask.Result;
string result = winnerTask.Result;

Console.WriteLine($"First completed: {result}");

Output:

First completed: Task 2 result
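
In asynchronous code the same pattern reads more naturally with await, which also avoids blocking a thread on .Result. The following is a sketch, assuming it runs in an async context such as a C# top-level program; the delays and the names slow and fast are illustrative, not part of the original example:

```csharp
using System;
using System.Threading.Tasks;

Task<string> slow = Task.Run(async () =>
{
    await Task.Delay(1000);
    return "slow result";
});

Task<string> fast = Task.Run(async () =>
{
    await Task.Delay(50);
    return "fast result";
});

// Awaiting WhenAny yields the task that finished first...
Task<string> winner = await Task.WhenAny(slow, fast);
// ...and awaiting that task yields its result (or rethrows its exception).
string first = await winner;

Console.WriteLine($"First completed: {first}");
```

Task.WhenAny does not cancel the remaining tasks; the slower task keeps running in the background after the first one completes.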

Parallel Class

The Parallel class provides parallelized versions of common programming constructs like loops. This makes it easy to distribute work across multiple cores.

Parallel.For

The Parallel.For method is a parallel version of a traditional for loop:

csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Sequential for loop
Console.WriteLine("Sequential loop:");
for (int i = 0; i < 5; i++)
{
    Console.WriteLine($"Sequential iteration {i} on thread {Thread.CurrentThread.ManagedThreadId}");
}

// Parallel for loop: the upper bound (5) is exclusive
Console.WriteLine("\nParallel loop:");
Parallel.For(0, 5, i =>
{
    Console.WriteLine($"Parallel iteration {i} on thread {Thread.CurrentThread.ManagedThreadId}");
});

Output (may vary):

Sequential loop:
Sequential iteration 0 on thread 1
Sequential iteration 1 on thread 1
Sequential iteration 2 on thread 1
Sequential iteration 3 on thread 1
Sequential iteration 4 on thread 1

Parallel loop:
Parallel iteration 1 on thread 4
Parallel iteration 0 on thread 5
Parallel iteration 2 on thread 6
Parallel iteration 4 on thread 7
Parallel iteration 3 on thread 8

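Notice that the parallel iterations run on several threads and in no guaranteed order. Parallel.For blocks until every iteration has finished, and the calling thread may execute some of the iterations itself.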
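
Because iteration order is not guaranteed, a parallel loop is easiest to reason about when each iteration writes only to its own slot. The following is a minimal sketch of that idea; the squares array and the range size are illustrative choices, not part of the original example:

```csharp
using System;
using System.Threading.Tasks;

// Each iteration writes only to squares[i], so no two iterations
// touch the same element and no locking is needed.
int[] squares = new int[10];

Parallel.For(0, squares.Length, i =>
{
    squares[i] = i * i;
});

// Parallel.For has returned, so every slot is filled in.
// The contents are the same no matter which thread ran which iteration.
Console.WriteLine(string.Join(", ", squares));
// Prints: 0, 1, 4, 9, 16, 25, 36, 49, 64, 81
```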

Parallel.ForEach

Similarly, Parallel.ForEach parallelizes a foreach loop:

csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

List<string> items = new List<string> { "Item1", "Item2", "Item3", "Item4", "Item5" };

Parallel.ForEach(items, item =>
{
    Console.WriteLine($"Processing {item} on thread {Thread.CurrentThread.ManagedThreadId}");
    // Simulate work
    Thread.Sleep(100);
});

Output (may vary):

Processing Item2 on thread 4
Processing Item1 on thread 5
Processing Item3 on thread 6
Processing Item4 on thread 7
Processing Item5 on thread 8

Parallel.Invoke

The Parallel.Invoke method allows you to execute multiple actions in parallel:

csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Invoke returns only after all three actions have finished
Parallel.Invoke(
    () =>
    {
        Console.WriteLine($"Action 1 running on thread {Thread.CurrentThread.ManagedThreadId}");
        Thread.Sleep(1000);
    },
    () =>
    {
        Console.WriteLine($"Action 2 running on thread {Thread.CurrentThread.ManagedThreadId}");
        Thread.Sleep(1000);
    },
    () =>
    {
        Console.WriteLine($"Action 3 running on thread {Thread.CurrentThread.ManagedThreadId}");
        Thread.Sleep(1000);
    }
);

Console.WriteLine("All actions completed");

Output (may vary):

Action 1 running on thread 4
Action 2 running on thread 5
Action 3 running on thread 6
All actions completed

Parallel LINQ (PLINQ)

PLINQ is a parallel implementation of LINQ that enables you to execute LINQ queries in parallel:

csharp
using System;
using System.Linq;

// Sequential LINQ query
var numbers = Enumerable.Range(1, 10000000);
var sequentialResult = numbers.Where(n => n % 2 == 0).Count();

// Parallel LINQ query: AsParallel() opts the rest of the query into PLINQ
var parallelResult = numbers.AsParallel().Where(n => n % 2 == 0).Count();

Console.WriteLine($"Sequential count: {sequentialResult}");
Console.WriteLine($"Parallel count: {parallelResult}");

Output:

Sequential count: 5000000
Parallel count: 5000000

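The results are the same. For a predicate as cheap as n % 2 == 0, the cost of partitioning the data and merging the results can outweigh the gain, so a parallel query pays off mainly when each element requires real work. To measure the difference with a more expensive predicate: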

csharp
using System;
using System.Diagnostics;
using System.Linq;

var numbers = Enumerable.Range(1, 10000000);

// Time sequential execution
var stopwatch = Stopwatch.StartNew();
var sequentialResult = numbers.Where(n => IsPrime(n)).Count();
stopwatch.Stop();
var sequentialTime = stopwatch.ElapsedMilliseconds;

// Time parallel execution
stopwatch.Restart();
var parallelResult = numbers.AsParallel().Where(n => IsPrime(n)).Count();
stopwatch.Stop();
var parallelTime = stopwatch.ElapsedMilliseconds;

Console.WriteLine($"Sequential execution took: {sequentialTime} ms");
Console.WriteLine($"Parallel execution took: {parallelTime} ms");
Console.WriteLine($"Speedup: {(double)sequentialTime / parallelTime:F2}x");

// Helper method to check if a number is prime (deliberately inefficient for demonstration)
bool IsPrime(int n)
{
    if (n < 2) return false;
    for (int i = 2; i <= Math.Sqrt(n); i++)
    {
        if (n % i == 0) return false;
    }
    return true;
}
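
PLINQ does not preserve the source order by default. If the order of the results matters, AsOrdered() asks PLINQ to keep it, at some cost in performance. A small sketch of this, in which the range and the projection are arbitrary illustrative choices:

```csharp
using System;
using System.Linq;

int[] orderedSquares = Enumerable.Range(1, 20)
    .AsParallel()
    .AsOrdered()                 // keep results in source order
    .Where(n => n % 2 == 0)
    .Select(n => n * n)
    .ToArray();

// The same query without AsParallel, for comparison
int[] sequentialSquares = Enumerable.Range(1, 20)
    .Where(n => n % 2 == 0)
    .Select(n => n * n)
    .ToArray();

Console.WriteLine(string.Join(", ", orderedSquares));
Console.WriteLine($"Matches sequential order: {orderedSquares.SequenceEqual(sequentialSquares)}");
```

Without AsOrdered() the same elements would still be produced, but their order in the array could differ from run to run.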

Concurrent Collections

Regular .NET collections such as List<T> and Dictionary<TKey, TValue> are not thread-safe: modifying them from multiple threads at the same time can corrupt them or lose updates. .NET provides thread-safe collection classes in the System.Collections.Concurrent namespace:

  • ConcurrentBag<T>: For unordered collections
  • ConcurrentQueue<T>: For FIFO (First-In-First-Out) collections
  • ConcurrentStack<T>: For LIFO (Last-In-First-Out) collections
  • ConcurrentDictionary<TKey, TValue>: For key-value pairs

Here's an example using ConcurrentBag<T>:

csharp
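using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

// List<T>.Add is not thread-safe, so the same loop with a regular List could
// lose items or throw; ConcurrentBag<T> supports concurrent adds
ConcurrentBag<int> bag = new ConcurrentBag<int>();

Parallel.For(0, 1000, i =>
{
    bag.Add(i);
});

Console.WriteLine($"Bag contains {bag.Count} items");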

Output:

Bag contains 1000 items
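
ConcurrentDictionary<TKey, TValue> is a natural fit when several threads update per-key values. The sketch below counts words from a small in-memory array (the sample words are made up for illustration) using AddOrUpdate, which retries internally so that no increment is lost:

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

string[] words = { "apple", "pear", "apple", "fig", "pear", "apple" };
var counts = new ConcurrentDictionary<string, int>();

Parallel.ForEach(words, word =>
{
    // Insert 1 for a new key, otherwise add 1 to the current value.
    counts.AddOrUpdate(word, 1, (key, current) => current + 1);
});

// Enumeration order of the dictionary is not guaranteed.
foreach (var pair in counts)
{
    Console.WriteLine($"{pair.Key}: {pair.Value}");
}
```

The update delegate can run more than once when threads contend for the same key, so it should be free of side effects.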

Real-World Example: Parallel Image Processing

Let's look at a practical example of how TPL can be used to improve performance in a real-world scenario. Imagine we need to apply image filters to a large collection of images:

csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Threading;
using System.Threading.Tasks;

// Usage (in a single file, top-level statements must come before type declarations)
List<string> images = new List<string>
{
    "photo1.jpg", "photo2.jpg", "photo3.jpg", "photo4.jpg",
    "photo5.jpg", "photo6.jpg", "photo7.jpg", "photo8.jpg"
};

var processor = new ImageProcessor();

var stopwatch = Stopwatch.StartNew();
processor.ProcessImagesSequentially(images);
stopwatch.Stop();
Console.WriteLine($"Sequential processing took: {stopwatch.ElapsedMilliseconds} ms");

stopwatch.Restart();
processor.ProcessImagesInParallel(images);
stopwatch.Stop();
Console.WriteLine($"Parallel processing took: {stopwatch.ElapsedMilliseconds} ms");

public class ImageProcessor
{
    public void ProcessImagesSequentially(List<string> imagePaths)
    {
        foreach (var path in imagePaths)
        {
            ApplyFilter(path);
        }
    }

    public void ProcessImagesInParallel(List<string> imagePaths)
    {
        Parallel.ForEach(imagePaths, path =>
        {
            ApplyFilter(path);
        });
    }

    private void ApplyFilter(string imagePath)
    {
        Console.WriteLine($"Processing image: {imagePath} on thread {Thread.CurrentThread.ManagedThreadId}");
        // Simulate image processing
        Thread.Sleep(500);
        // In a real application, this would load the image,
        // apply filters, and save it back to disk
    }
}

Output (may vary):

Processing image: photo1.jpg on thread 1
Processing image: photo2.jpg on thread 1
...
Processing image: photo8.jpg on thread 1
Sequential processing took: 4000 ms

Processing image: photo1.jpg on thread 5
Processing image: photo2.jpg on thread 6
Processing image: photo3.jpg on thread 7
Processing image: photo4.jpg on thread 8
Processing image: photo5.jpg on thread 4
Processing image: photo6.jpg on thread 9
Processing image: photo7.jpg on thread 10
Processing image: photo8.jpg on thread 11
Parallel processing took: 512 ms

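On a multi-core machine, the parallel version can be substantially faster for CPU-intensive work. The figures above are illustrative: the actual speedup depends on the number of cores and on how quickly the thread pool adds threads. Note that Thread.Sleep only simulates work here; for genuinely I/O-bound operations, asynchronous code with async/await is usually a better fit than Parallel.ForEach, because it does not tie up a thread while waiting.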

Best Practices and Considerations

When using TPL, keep these guidelines in mind:

  1. Not Everything Should Be Parallel: Parallelism involves overhead. For small workloads, sequential processing might be faster.

  2. Avoid Shared State: Minimize shared state between tasks to avoid contention and the need for synchronization.

  3. Handle Exceptions: A task stores any exception thrown by its delegate. Wait() and the Result property rethrow it wrapped in an AggregateException, while await rethrows the original exception directly.

  4. Avoid Blocking: In UI applications, avoid blocking the UI thread with .Wait() or .Result. Use async/await instead.

  5. Be Careful with Nested Parallelism: Nested parallelism (e.g., a Parallel.For inside another Parallel.For) can oversubscribe the thread pool and add scheduling overhead. Parallelizing only the outer loop is often enough.

  6. Measure Performance: Always benchmark your code to ensure parallelism is actually improving performance.
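
As one illustration of guideline 2, the Parallel.For overload that takes thread-local state lets each worker accumulate a private subtotal and touch the shared variable only once at the end. This is a sketch rather than a recommended recipe for every case; the range and the variable names are illustrative:

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

long total = 0;

Parallel.For<long>(
    0, 1000,
    () => 0L,                                  // localInit: each worker starts at 0
    (i, loopState, subtotal) => subtotal + i,  // body: updates the private subtotal only
    subtotal => Interlocked.Add(ref total, subtotal)); // localFinally: merge once per worker

Console.WriteLine($"Sum of 0..999: {total}");  // 499500
```

Compared with calling Interlocked.Add inside every iteration, this design keeps contention on the shared variable to one atomic operation per worker.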

Exception Handling in TPL

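Parallel loops collect the exceptions thrown by their iterations and rethrow them wrapped in a single AggregateException once the loop has stopped. Here's how to handle exceptions in parallel code: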

csharp
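using System;
using System.Threading.Tasks;

try
{
    Parallel.For(0, 100, i =>
    {
        if (i == 50)
        {
            throw new InvalidOperationException($"Error at iteration {i}");
        }
        // Process i
    });
}
catch (AggregateException ae)
{
    // InnerExceptions holds every exception thrown by the loop's iterations
    foreach (var ex in ae.InnerExceptions)
    {
        Console.WriteLine($"Exception: {ex.Message}");
    }
}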
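
The same wrapping applies to individual tasks, with one difference worth knowing: Wait() and Result throw an AggregateException, whereas await rethrows the original exception directly. A sketch of the difference, where ParsePositive is a hypothetical helper invented for this illustration:

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical helper that fails for non-positive input.
int ParsePositive(string text)
{
    int value = int.Parse(text);
    if (value <= 0)
        throw new ArgumentOutOfRangeException(nameof(text), "Value must be positive.");
    return value;
}

Task<int> failing = Task.Run(() => ParsePositive("-5"));

string seenByWait = "";
string seenByAwait = "";

try
{
    failing.Wait();
}
catch (AggregateException ae)
{
    // The original exception is wrapped; Flatten() unwraps nested aggregates.
    seenByWait = ae.Flatten().InnerExceptions[0].GetType().Name;
}

try
{
    await failing;
}
catch (ArgumentOutOfRangeException ex)
{
    // await surfaces the original exception without the wrapper.
    seenByAwait = ex.GetType().Name;
}

Console.WriteLine($"Wait() surfaced: {seenByWait}");
Console.WriteLine($"await surfaced:  {seenByAwait}");
```

Both lines report ArgumentOutOfRangeException, but only the Wait() path had to dig it out of an AggregateException first.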

Integration with Async/Await

TPL works seamlessly with the async/await pattern, providing a powerful way to write asynchronous and parallel code:

csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

// The methods below are class members; the containing class is omitted for brevity

public async Task ProcessFilesAsync(List<string> filePaths)
{
    // Create a list of tasks
    List<Task> fileTasks = new List<Task>();

    foreach (var path in filePaths)
    {
        // Start each task and add it to the list
        fileTasks.Add(ProcessFileAsync(path));
    }

    // Asynchronously wait for all tasks to complete, without blocking a thread
    await Task.WhenAll(fileTasks);
    Console.WriteLine("All files processed");
}

private async Task ProcessFileAsync(string filePath)
{
    Console.WriteLine($"Starting to process {filePath}");

    // Simulate async file processing
    await Task.Delay(1000);

    Console.WriteLine($"Finished processing {filePath}");
}

// Usage
public async Task RunExample()
{
    List<string> files = new List<string>
    {
        "file1.txt", "file2.txt", "file3.txt", "file4.txt"
    };

    await ProcessFilesAsync(files);
}
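
When the individual operations return values, Task.WhenAll gathers them into an array in the same order as the input tasks, regardless of which task finished first. A sketch of this, where GetLengthAsync is a hypothetical stand-in for real asynchronous work and the file names are placeholders:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

// Hypothetical async operation: pretends to do I/O, then returns a value.
async Task<int> GetLengthAsync(string name)
{
    await Task.Delay(100);
    return name.Length;
}

var names = new List<string> { "file1.txt", "report.txt", "a.txt" };

// Select creates one task per name; ToList forces them all to start now.
List<Task<int>> pending = names.Select(name => GetLengthAsync(name)).ToList();

// The result array lines up with the input order: lengths[i] belongs to names[i].
int[] lengths = await Task.WhenAll(pending);

for (int i = 0; i < names.Count; i++)
{
    Console.WriteLine($"{names[i]}: {lengths[i]}");
}
```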

Summary

The Task Parallel Library is a powerful toolkit for writing concurrent and parallel code in .NET. It offers several major advantages:

  • Easy parallelization of loops and operations with the Parallel class
  • Task-based asynchronous programming with the Task class
  • Parallel LINQ queries with PLINQ
  • Thread-safe collections for concurrent access

By using TPL, you can take full advantage of modern multi-core processors to improve the performance of your applications while maintaining readable and maintainable code. Remember that parallelism adds complexity, so always measure performance gains and use parallel programming judiciously.

Exercises

  1. Create a parallel program that counts the frequency of words in multiple text files simultaneously.
  2. Implement a parallel image processing application that applies different filters to an image collection.
  3. Write a program that downloads multiple web pages concurrently using Task.
  4. Create a simple multi-threaded producer-consumer application using a ConcurrentQueue.
  5. Compare the performance of sequential LINQ vs. PLINQ for different sizes of datasets.
