Skip to main content

C# Sets

Introduction

Sets are specialized collections in C# that store unique elements with no duplicates. They implement mathematical set operations like union, intersection, and difference. Unlike lists or arrays, sets don't maintain insertion order (except for SortedSet) and focus on fast lookups and eliminating duplicates.

In C#, sets are primarily represented by two collection types:

  • HashSet<T>: An unordered collection of unique elements
  • SortedSet<T>: An ordered collection of unique elements

In this guide, we'll explore how to work with sets in C#, providing practical examples and applications.

HashSet Overview

HashSet<T> is the most commonly used set implementation in C#. It provides:

  • Fast operations (near O(1) for add, remove, and contains)
  • Unordered storage of unique elements
  • Implementation of standard set operations

Creating a HashSet

Let's start by creating and using a simple HashSet:

csharp
using System;
using System.Collections.Generic;

// Creating an empty HashSet
HashSet<string> fruits = new HashSet<string>();

// Adding elements
fruits.Add("Apple");
fruits.Add("Banana");
fruits.Add("Orange");
fruits.Add("Apple"); // This won't be added (duplicate)

// Displaying elements
Console.WriteLine("Fruits in the set:");
foreach (string fruit in fruits)
{
Console.WriteLine(fruit);
}

// Output:
// Fruits in the set:
// Apple
// Banana
// Orange

Notice how the duplicate "Apple" wasn't added to the set. This automatic duplicate removal is a key feature of sets.

Creating a HashSet from an Existing Collection

You can initialize a HashSet with values from another collection:

csharp
// Creating from an array
string[] fruitArray = { "Apple", "Banana", "Orange", "Apple", "Grape" };
HashSet<string> fruitSet = new HashSet<string>(fruitArray);

Console.WriteLine($"Number of elements in array: {fruitArray.Length}"); // 5
Console.WriteLine($"Number of elements in set: {fruitSet.Count}"); // 4

Basic HashSet Operations

Checking if an Element Exists

csharp
HashSet<string> fruits = new HashSet<string> { "Apple", "Banana", "Orange" };

bool containsApple = fruits.Contains("Apple");
bool containsMango = fruits.Contains("Mango");

Console.WriteLine($"Contains Apple? {containsApple}"); // True
Console.WriteLine($"Contains Mango? {containsMango}"); // False

Removing Elements

csharp
HashSet<string> fruits = new HashSet<string> { "Apple", "Banana", "Orange", "Grape" };

// Remove a specific element
bool removed = fruits.Remove("Banana");
Console.WriteLine($"Banana removed: {removed}"); // True

// Try removing an element that doesn't exist
removed = fruits.Remove("Mango");
Console.WriteLine($"Mango removed: {removed}"); // False

// Displaying remaining elements
Console.WriteLine("Remaining fruits:");
foreach (var fruit in fruits)
{
Console.WriteLine(fruit);
}

// Output:
// Apple
// Orange
// Grape

Clearing a HashSet

csharp
HashSet<string> fruits = new HashSet<string> { "Apple", "Banana", "Orange" };
Console.WriteLine($"Count before clear: {fruits.Count}"); // 3

fruits.Clear();
Console.WriteLine($"Count after clear: {fruits.Count}"); // 0

Set Operations

One of the most powerful aspects of HashSet<T> is its built-in support for mathematical set operations.

Union of Sets

The union combines all elements from two sets, removing duplicates.

csharp
HashSet<int> set1 = new HashSet<int> { 1, 2, 3, 4 };
HashSet<int> set2 = new HashSet<int> { 3, 4, 5, 6 };

// Method 1: Using UnionWith (modifies set1)
set1.UnionWith(set2);
Console.WriteLine("Union using UnionWith:");
foreach (var item in set1)
{
Console.Write($"{item} ");
}
// Output: 1 2 3 4 5 6

// Method 2: Using LINQ (creates a new set)
HashSet<int> set3 = new HashSet<int> { 1, 2, 3, 4 };
HashSet<int> set4 = new HashSet<int> { 3, 4, 5, 6 };
var unionSet = new HashSet<int>(set3);
unionSet.UnionWith(set4);

Console.WriteLine("\nUnion using LINQ:");
foreach (var item in unionSet)
{
Console.Write($"{item} ");
}
// Output: 1 2 3 4 5 6

Intersection of Sets

Intersection keeps only elements that appear in both sets.

csharp
HashSet<int> set1 = new HashSet<int> { 1, 2, 3, 4 };
HashSet<int> set2 = new HashSet<int> { 3, 4, 5, 6 };

// Using IntersectWith (modifies set1)
set1.IntersectWith(set2);
Console.WriteLine("Intersection:");
foreach (var item in set1)
{
Console.Write($"{item} ");
}
// Output: 3 4

Difference of Sets

The difference contains elements from the first set that don't appear in the second set.

csharp
HashSet<int> set1 = new HashSet<int> { 1, 2, 3, 4 };
HashSet<int> set2 = new HashSet<int> { 3, 4, 5, 6 };

// Using ExceptWith (modifies set1)
set1.ExceptWith(set2);
Console.WriteLine("Difference (set1 - set2):");
foreach (var item in set1)
{
Console.Write($"{item} ");
}
// Output: 1 2

Symmetric Difference

Symmetric difference contains elements that are in either set but not in both.

csharp
HashSet<int> set1 = new HashSet<int> { 1, 2, 3, 4 };
HashSet<int> set2 = new HashSet<int> { 3, 4, 5, 6 };

// Using SymmetricExceptWith (modifies set1)
set1.SymmetricExceptWith(set2);
Console.WriteLine("Symmetric Difference:");
foreach (var item in set1)
{
Console.Write($"{item} ");
}
// Output: 1 2 5 6

Checking Subset and Superset Relationships

csharp
HashSet<int> smallSet = new HashSet<int> { 2, 3 };
HashSet<int> largeSet = new HashSet<int> { 1, 2, 3, 4, 5 };

bool isSubset = smallSet.IsSubsetOf(largeSet);
bool isProperSubset = smallSet.IsProperSubsetOf(largeSet);
bool isSuperset = largeSet.IsSupersetOf(smallSet);

Console.WriteLine($"smallSet is a subset of largeSet: {isSubset}"); // True
Console.WriteLine($"smallSet is a proper subset of largeSet: {isProperSubset}"); // True
Console.WriteLine($"largeSet is a superset of smallSet: {isSuperset}"); // True

SortedSet Overview

SortedSet<T> is similar to HashSet<T> but keeps elements in sorted order. This is useful when you need a collection of unique elements that are automatically sorted.

Creating a SortedSet

csharp
using System;
using System.Collections.Generic;

// Creating a SortedSet
SortedSet<int> numbers = new SortedSet<int>();

// Adding elements in random order
numbers.Add(5);
numbers.Add(2);
numbers.Add(10);
numbers.Add(1);
numbers.Add(5); // Duplicate won't be added

// Elements are automatically sorted
Console.WriteLine("Numbers in sorted order:");
foreach (var number in numbers)
{
Console.Write($"{number} ");
}
// Output: 1 2 5 10

Working with SortedSet

SortedSet<T> supports the same operations as HashSet<T> but maintains elements in sorted order:

csharp
SortedSet<string> sortedFruits = new SortedSet<string> 
{
"Orange", "Apple", "Banana", "Grape"
};

// Elements are automatically sorted alphabetically
Console.WriteLine("Sorted fruits:");
foreach (var fruit in sortedFruits)
{
Console.WriteLine(fruit);
}

// Output:
// Apple
// Banana
// Grape
// Orange

Range Operations with SortedSet

One advantage of SortedSet<T> is its ability to extract ranges:

csharp
SortedSet<int> sortedNumbers = new SortedSet<int> { 1, 3, 5, 7, 9, 11, 13, 15 };

// Get a subset view between 5 and 13 inclusive
var subset = sortedNumbers.GetViewBetween(5, 13);

Console.WriteLine("Numbers between 5 and 13:");
foreach (var number in subset)
{
Console.Write($"{number} ");
}
// Output: 5 7 9 11 13

Custom Objects in Sets

When storing custom objects in a set, you need to ensure that object equality and hash codes are properly defined. Otherwise, duplicate objects might not be detected correctly.

Example with Default Implementation (Flawed)

csharp
class Person
{
public string Name { get; set; }
public int Age { get; set; }
}

// Creating a set of Person objects
HashSet<Person> people = new HashSet<Person>();

// Adding seemingly duplicate objects
people.Add(new Person { Name = "John", Age = 30 });
people.Add(new Person { Name = "John", Age = 30 });

// Both objects are added because they're different instances
Console.WriteLine($"Count: {people.Count}"); // Output: 2

Solution: Implementing Equality

To fix this, we need to override Equals and GetHashCode in our custom class:

csharp
class Person : IEquatable<Person>
{
public string Name { get; set; }
public int Age { get; set; }

// Override Equals for IEquatable<Person>
public bool Equals(Person other)
{
if (other == null) return false;
return Name == other.Name && Age == other.Age;
}

// Override object.Equals
public override bool Equals(object obj)
{
return Equals(obj as Person);
}

// Override GetHashCode (important for HashSet!)
public override int GetHashCode()
{
return (Name, Age).GetHashCode();
}
}

// Now duplicates will be properly detected
HashSet<Person> people = new HashSet<Person>();
people.Add(new Person { Name = "John", Age = 30 });
people.Add(new Person { Name = "John", Age = 30 });

Console.WriteLine($"Count: {people.Count}"); // Output: 1

Real-world Applications of Sets

Sets are incredibly useful in many practical programming scenarios:

1. Removing Duplicates from a Collection

csharp
// Removing duplicates from a list
List<string> namesWithDuplicates = new List<string>
{
"Alice", "Bob", "Charlie", "Alice", "David", "Bob"
};

HashSet<string> uniqueNames = new HashSet<string>(namesWithDuplicates);

Console.WriteLine($"Original count: {namesWithDuplicates.Count}"); // 6
Console.WriteLine($"Unique count: {uniqueNames.Count}"); // 4

// Convert back to list if needed
List<string> uniqueNamesList = uniqueNames.ToList();

2. Tracking Visited Items

csharp
HashSet<string> visitedUrls = new HashSet<string>();

// Simulate web crawler
void CrawlPage(string url)
{
if (!visitedUrls.Add(url))
{
// URL was already in the set, we've visited it before
Console.WriteLine($"Already visited: {url}");
return;
}

Console.WriteLine($"Crawling: {url}");
// Process page...
}

// Usage
CrawlPage("https://example.com");
CrawlPage("https://example.com/about");
CrawlPage("https://example.com"); // This won't be processed again

3. Finding Common Elements

csharp
// Imagine tracking skills across job applicants
HashSet<string> requiredSkills = new HashSet<string>
{
"C#", "SQL", "JavaScript", "HTML", "CSS"
};

HashSet<string> candidateSkills = new HashSet<string>
{
"Java", "C#", "Python", "JavaScript"
};

// Find matching skills
HashSet<string> matchingSkills = new HashSet<string>(requiredSkills);
matchingSkills.IntersectWith(candidateSkills);

Console.WriteLine("Matching skills:");
foreach (var skill in matchingSkills)
{
Console.WriteLine(skill);
}
// Output: C#, JavaScript

// Find missing skills
HashSet<string> missingSkills = new HashSet<string>(requiredSkills);
missingSkills.ExceptWith(candidateSkills);

Console.WriteLine("Missing skills:");
foreach (var skill in missingSkills)
{
Console.WriteLine(skill);
}
// Output: SQL, HTML, CSS

4. Implementing a Dictionary Spell Checker

csharp
// Create a dictionary
HashSet<string> dictionary = new HashSet<string>(StringComparer.OrdinalIgnoreCase)
{
"apple", "banana", "orange", "grape", "programming", "computer"
};

bool CheckSpelling(string word)
{
return dictionary.Contains(word);
}

// Usage
Console.WriteLine($"Is 'apple' spelled correctly? {CheckSpelling("apple")}"); // True
Console.WriteLine($"Is 'Apple' spelled correctly? {CheckSpelling("Apple")}"); // True (case insensitive)
Console.WriteLine($"Is 'aple' spelled correctly? {CheckSpelling("aple")}"); // False

Performance Considerations

Sets in C# offer excellent performance for many operations:

  • HashSet<T>:

    • Add, Remove, Contains: O(1) average case
    • Union, Intersection, etc.: O(n) where n is the size of the smaller set
  • SortedSet<T>:

    • Add, Remove, Contains: O(log n)
    • Preserves order but is slightly slower than HashSet<T>

Choose HashSet<T> when you need maximum performance and don't care about order. Use SortedSet<T> when you need elements to be automatically sorted.

Summary

Sets in C# provide powerful tools for working with unique elements and performing set operations:

  • HashSet<T> offers fast, unordered storage of unique elements
  • SortedSet<T> maintains elements in sorted order
  • Both support mathematical set operations like union, intersection, and difference
  • Custom objects must override Equals and GetHashCode to work properly in sets
  • Sets excel in scenarios involving uniqueness, removing duplicates, and set operations

Sets are often the best choice when you need to:

  • Eliminate duplicates from a collection
  • Check for existence quickly
  • Find common or unique elements between collections
  • Maintain a unique collection of items

Additional Resources

Exercises

  1. Create a program that takes two lists of integers and finds the intersection and union of them using sets.
  2. Implement a custom class Student with properties Id, Name, and GPA that works correctly in a HashSet.
  3. Write a program that reads a text file and counts the number of unique words using a set.
  4. Create a spell checker that suggests corrections for misspelled words by finding words in a dictionary within an edit distance of 1.
  5. Implement a program that finds all anagrams of a given word in a dictionary using sets.


If you spot any mistakes on this website, please let me know at [email protected]. I’d greatly appreciate your feedback! :)