C# Sets
Introduction
Sets are specialized collections in C# that store unique elements with no duplicates. They implement mathematical set operations like union, intersection, and difference. Unlike lists or arrays, sets don't maintain insertion order (except for SortedSet
) and focus on fast lookups and eliminating duplicates.
In C#, sets are primarily represented by two collection types:
HashSet<T>
: An unordered collection of unique elementsSortedSet<T>
: An ordered collection of unique elements
In this guide, we'll explore how to work with sets in C#, providing practical examples and applications.
HashSet Overview
HashSet<T>
is the most commonly used set implementation in C#. It provides:
- Fast operations (near O(1) for add, remove, and contains)
- Unordered storage of unique elements
- Implementation of standard set operations
Creating a HashSet
Let's start by creating and using a simple HashSet
:
using System;
using System.Collections.Generic;
// Creating an empty HashSet
HashSet<string> fruits = new HashSet<string>();
// Adding elements
fruits.Add("Apple");
fruits.Add("Banana");
fruits.Add("Orange");
fruits.Add("Apple"); // This won't be added (duplicate)
// Displaying elements
Console.WriteLine("Fruits in the set:");
foreach (string fruit in fruits)
{
Console.WriteLine(fruit);
}
// Output:
// Fruits in the set:
// Apple
// Banana
// Orange
Notice how the duplicate "Apple" wasn't added to the set. This automatic duplicate removal is a key feature of sets.
Creating a HashSet from an Existing Collection
You can initialize a HashSet
with values from another collection:
// Creating from an array
string[] fruitArray = { "Apple", "Banana", "Orange", "Apple", "Grape" };
HashSet<string> fruitSet = new HashSet<string>(fruitArray);
Console.WriteLine($"Number of elements in array: {fruitArray.Length}"); // 5
Console.WriteLine($"Number of elements in set: {fruitSet.Count}"); // 4
Basic HashSet Operations
Checking if an Element Exists
HashSet<string> fruits = new HashSet<string> { "Apple", "Banana", "Orange" };
bool containsApple = fruits.Contains("Apple");
bool containsMango = fruits.Contains("Mango");
Console.WriteLine($"Contains Apple? {containsApple}"); // True
Console.WriteLine($"Contains Mango? {containsMango}"); // False
Removing Elements
HashSet<string> fruits = new HashSet<string> { "Apple", "Banana", "Orange", "Grape" };
// Remove a specific element
bool removed = fruits.Remove("Banana");
Console.WriteLine($"Banana removed: {removed}"); // True
// Try removing an element that doesn't exist
removed = fruits.Remove("Mango");
Console.WriteLine($"Mango removed: {removed}"); // False
// Displaying remaining elements
Console.WriteLine("Remaining fruits:");
foreach (var fruit in fruits)
{
Console.WriteLine(fruit);
}
// Output:
// Apple
// Orange
// Grape
Clearing a HashSet
HashSet<string> fruits = new HashSet<string> { "Apple", "Banana", "Orange" };
Console.WriteLine($"Count before clear: {fruits.Count}"); // 3
fruits.Clear();
Console.WriteLine($"Count after clear: {fruits.Count}"); // 0
Set Operations
One of the most powerful aspects of HashSet<T>
is its built-in support for mathematical set operations.
Union of Sets
The union combines all elements from two sets, removing duplicates.
HashSet<int> set1 = new HashSet<int> { 1, 2, 3, 4 };
HashSet<int> set2 = new HashSet<int> { 3, 4, 5, 6 };
// Method 1: Using UnionWith (modifies set1)
set1.UnionWith(set2);
Console.WriteLine("Union using UnionWith:");
foreach (var item in set1)
{
Console.Write($"{item} ");
}
// Output: 1 2 3 4 5 6
// Method 2: Using LINQ (creates a new set)
HashSet<int> set3 = new HashSet<int> { 1, 2, 3, 4 };
HashSet<int> set4 = new HashSet<int> { 3, 4, 5, 6 };
var unionSet = new HashSet<int>(set3);
unionSet.UnionWith(set4);
Console.WriteLine("\nUnion using LINQ:");
foreach (var item in unionSet)
{
Console.Write($"{item} ");
}
// Output: 1 2 3 4 5 6
Intersection of Sets
Intersection keeps only elements that appear in both sets.
HashSet<int> set1 = new HashSet<int> { 1, 2, 3, 4 };
HashSet<int> set2 = new HashSet<int> { 3, 4, 5, 6 };
// Using IntersectWith (modifies set1)
set1.IntersectWith(set2);
Console.WriteLine("Intersection:");
foreach (var item in set1)
{
Console.Write($"{item} ");
}
// Output: 3 4
Difference of Sets
The difference contains elements from the first set that don't appear in the second set.
HashSet<int> set1 = new HashSet<int> { 1, 2, 3, 4 };
HashSet<int> set2 = new HashSet<int> { 3, 4, 5, 6 };
// Using ExceptWith (modifies set1)
set1.ExceptWith(set2);
Console.WriteLine("Difference (set1 - set2):");
foreach (var item in set1)
{
Console.Write($"{item} ");
}
// Output: 1 2
Symmetric Difference
Symmetric difference contains elements that are in either set but not in both.
HashSet<int> set1 = new HashSet<int> { 1, 2, 3, 4 };
HashSet<int> set2 = new HashSet<int> { 3, 4, 5, 6 };
// Using SymmetricExceptWith (modifies set1)
set1.SymmetricExceptWith(set2);
Console.WriteLine("Symmetric Difference:");
foreach (var item in set1)
{
Console.Write($"{item} ");
}
// Output: 1 2 5 6
Checking Subset and Superset Relationships
HashSet<int> smallSet = new HashSet<int> { 2, 3 };
HashSet<int> largeSet = new HashSet<int> { 1, 2, 3, 4, 5 };
bool isSubset = smallSet.IsSubsetOf(largeSet);
bool isProperSubset = smallSet.IsProperSubsetOf(largeSet);
bool isSuperset = largeSet.IsSupersetOf(smallSet);
Console.WriteLine($"smallSet is a subset of largeSet: {isSubset}"); // True
Console.WriteLine($"smallSet is a proper subset of largeSet: {isProperSubset}"); // True
Console.WriteLine($"largeSet is a superset of smallSet: {isSuperset}"); // True
SortedSet Overview
SortedSet<T>
is similar to HashSet<T>
but keeps elements in sorted order. This is useful when you need a collection of unique elements that are automatically sorted.
Creating a SortedSet
using System;
using System.Collections.Generic;
// Creating a SortedSet
SortedSet<int> numbers = new SortedSet<int>();
// Adding elements in random order
numbers.Add(5);
numbers.Add(2);
numbers.Add(10);
numbers.Add(1);
numbers.Add(5); // Duplicate won't be added
// Elements are automatically sorted
Console.WriteLine("Numbers in sorted order:");
foreach (var number in numbers)
{
Console.Write($"{number} ");
}
// Output: 1 2 5 10
Working with SortedSet
SortedSet<T>
supports the same operations as HashSet<T>
but maintains elements in sorted order:
SortedSet<string> sortedFruits = new SortedSet<string>
{
"Orange", "Apple", "Banana", "Grape"
};
// Elements are automatically sorted alphabetically
Console.WriteLine("Sorted fruits:");
foreach (var fruit in sortedFruits)
{
Console.WriteLine(fruit);
}
// Output:
// Apple
// Banana
// Grape
// Orange
Range Operations with SortedSet
One advantage of SortedSet<T>
is its ability to extract ranges:
SortedSet<int> sortedNumbers = new SortedSet<int> { 1, 3, 5, 7, 9, 11, 13, 15 };
// Get a subset view between 5 and 13 inclusive
var subset = sortedNumbers.GetViewBetween(5, 13);
Console.WriteLine("Numbers between 5 and 13:");
foreach (var number in subset)
{
Console.Write($"{number} ");
}
// Output: 5 7 9 11 13
Custom Objects in Sets
When storing custom objects in a set, you need to ensure that object equality and hash codes are properly defined. Otherwise, duplicate objects might not be detected correctly.
Example with Default Implementation (Flawed)
class Person
{
public string Name { get; set; }
public int Age { get; set; }
}
// Creating a set of Person objects
HashSet<Person> people = new HashSet<Person>();
// Adding seemingly duplicate objects
people.Add(new Person { Name = "John", Age = 30 });
people.Add(new Person { Name = "John", Age = 30 });
// Both objects are added because they're different instances
Console.WriteLine($"Count: {people.Count}"); // Output: 2
Solution: Implementing Equality
To fix this, we need to override Equals
and GetHashCode
in our custom class:
class Person : IEquatable<Person>
{
public string Name { get; set; }
public int Age { get; set; }
// Override Equals for IEquatable<Person>
public bool Equals(Person other)
{
if (other == null) return false;
return Name == other.Name && Age == other.Age;
}
// Override object.Equals
public override bool Equals(object obj)
{
return Equals(obj as Person);
}
// Override GetHashCode (important for HashSet!)
public override int GetHashCode()
{
return (Name, Age).GetHashCode();
}
}
// Now duplicates will be properly detected
HashSet<Person> people = new HashSet<Person>();
people.Add(new Person { Name = "John", Age = 30 });
people.Add(new Person { Name = "John", Age = 30 });
Console.WriteLine($"Count: {people.Count}"); // Output: 1
Real-world Applications of Sets
Sets are incredibly useful in many practical programming scenarios:
1. Removing Duplicates from a Collection
// Removing duplicates from a list
List<string> namesWithDuplicates = new List<string>
{
"Alice", "Bob", "Charlie", "Alice", "David", "Bob"
};
HashSet<string> uniqueNames = new HashSet<string>(namesWithDuplicates);
Console.WriteLine($"Original count: {namesWithDuplicates.Count}"); // 6
Console.WriteLine($"Unique count: {uniqueNames.Count}"); // 4
// Convert back to list if needed
List<string> uniqueNamesList = uniqueNames.ToList();
2. Tracking Visited Items
HashSet<string> visitedUrls = new HashSet<string>();
// Simulate web crawler
void CrawlPage(string url)
{
if (!visitedUrls.Add(url))
{
// URL was already in the set, we've visited it before
Console.WriteLine($"Already visited: {url}");
return;
}
Console.WriteLine($"Crawling: {url}");
// Process page...
}
// Usage
CrawlPage("https://example.com");
CrawlPage("https://example.com/about");
CrawlPage("https://example.com"); // This won't be processed again
3. Finding Common Elements
// Imagine tracking skills across job applicants
HashSet<string> requiredSkills = new HashSet<string>
{
"C#", "SQL", "JavaScript", "HTML", "CSS"
};
HashSet<string> candidateSkills = new HashSet<string>
{
"Java", "C#", "Python", "JavaScript"
};
// Find matching skills
HashSet<string> matchingSkills = new HashSet<string>(requiredSkills);
matchingSkills.IntersectWith(candidateSkills);
Console.WriteLine("Matching skills:");
foreach (var skill in matchingSkills)
{
Console.WriteLine(skill);
}
// Output: C#, JavaScript
// Find missing skills
HashSet<string> missingSkills = new HashSet<string>(requiredSkills);
missingSkills.ExceptWith(candidateSkills);
Console.WriteLine("Missing skills:");
foreach (var skill in missingSkills)
{
Console.WriteLine(skill);
}
// Output: SQL, HTML, CSS
4. Implementing a Dictionary Spell Checker
// Create a dictionary
HashSet<string> dictionary = new HashSet<string>(StringComparer.OrdinalIgnoreCase)
{
"apple", "banana", "orange", "grape", "programming", "computer"
};
bool CheckSpelling(string word)
{
return dictionary.Contains(word);
}
// Usage
Console.WriteLine($"Is 'apple' spelled correctly? {CheckSpelling("apple")}"); // True
Console.WriteLine($"Is 'Apple' spelled correctly? {CheckSpelling("Apple")}"); // True (case insensitive)
Console.WriteLine($"Is 'aple' spelled correctly? {CheckSpelling("aple")}"); // False
Performance Considerations
Sets in C# offer excellent performance for many operations:
-
HashSet<T>
:- Add, Remove, Contains: O(1) average case
- Union, Intersection, etc.: O(n) where n is the size of the smaller set
-
SortedSet<T>
:- Add, Remove, Contains: O(log n)
- Preserves order but is slightly slower than
HashSet<T>
Choose HashSet<T>
when you need maximum performance and don't care about order. Use SortedSet<T>
when you need elements to be automatically sorted.
Summary
Sets in C# provide powerful tools for working with unique elements and performing set operations:
HashSet<T>
offers fast, unordered storage of unique elementsSortedSet<T>
maintains elements in sorted order- Both support mathematical set operations like union, intersection, and difference
- Custom objects must override
Equals
andGetHashCode
to work properly in sets - Sets excel in scenarios involving uniqueness, removing duplicates, and set operations
Sets are often the best choice when you need to:
- Eliminate duplicates from a collection
- Check for existence quickly
- Find common or unique elements between collections
- Maintain a unique collection of items
Additional Resources
Exercises
- Create a program that takes two lists of integers and finds the intersection and union of them using sets.
- Implement a custom class
Student
with propertiesId
,Name
, andGPA
that works correctly in aHashSet
. - Write a program that reads a text file and counts the number of unique words using a set.
- Create a spell checker that suggests corrections for misspelled words by finding words in a dictionary within an edit distance of 1.
- Implement a program that finds all anagrams of a given word in a dictionary using sets.
If you spot any mistakes on this website, please let me know at [email protected]. I’d greatly appreciate your feedback! :)