C# Sets
Introduction
Sets are specialized collections in C# that store unique elements with no duplicates. They implement mathematical set operations like union, intersection, and difference. Unlike lists or arrays, sets don't maintain insertion order (except for SortedSet) and focus on fast lookups and eliminating duplicates.
In C#, sets are primarily represented by two collection types:
HashSet<T>: An unordered collection of unique elementsSortedSet<T>: An ordered collection of unique elements
In this guide, we'll explore how to work with sets in C#, providing practical examples and applications.
HashSet Overview
HashSet<T> is the most commonly used set implementation in C#. It provides:
- Fast operations (near O(1) for add, remove, and contains)
- Unordered storage of unique elements
- Implementation of standard set operations
Creating a HashSet
Let's start by creating and using a simple HashSet:
using System;
using System.Collections.Generic;
// Creating an empty HashSet
HashSet<string> fruits = new HashSet<string>();
// Adding elements
fruits.Add("Apple");
fruits.Add("Banana");
fruits.Add("Orange");
fruits.Add("Apple"); // This won't be added (duplicate)
// Displaying elements
Console.WriteLine("Fruits in the set:");
foreach (string fruit in fruits)
{
Console.WriteLine(fruit);
}
// Output:
// Fruits in the set:
// Apple
// Banana
// Orange
Notice how the duplicate "Apple" wasn't added to the set. This automatic duplicate removal is a key feature of sets.
Creating a HashSet from an Existing Collection
You can initialize a HashSet with values from another collection:
// Creating from an array
string[] fruitArray = { "Apple", "Banana", "Orange", "Apple", "Grape" };
HashSet<string> fruitSet = new HashSet<string>(fruitArray);
Console.WriteLine($"Number of elements in array: {fruitArray.Length}"); // 5
Console.WriteLine($"Number of elements in set: {fruitSet.Count}"); // 4
Basic HashSet Operations
Checking if an Element Exists
HashSet<string> fruits = new HashSet<string> { "Apple", "Banana", "Orange" };
bool containsApple = fruits.Contains("Apple");
bool containsMango = fruits.Contains("Mango");
Console.WriteLine($"Contains Apple? {containsApple}"); // True
Console.WriteLine($"Contains Mango? {containsMango}"); // False
Removing Elements
HashSet<string> fruits = new HashSet<string> { "Apple", "Banana", "Orange", "Grape" };
// Remove a specific element
bool removed = fruits.Remove("Banana");
Console.WriteLine($"Banana removed: {removed}"); // True
// Try removing an element that doesn't exist
removed = fruits.Remove("Mango");
Console.WriteLine($"Mango removed: {removed}"); // False
// Displaying remaining elements
Console.WriteLine("Remaining fruits:");
foreach (var fruit in fruits)
{
Console.WriteLine(fruit);
}
// Output:
// Apple
// Orange
// Grape
Clearing a HashSet
HashSet<string> fruits = new HashSet<string> { "Apple", "Banana", "Orange" };
Console.WriteLine($"Count before clear: {fruits.Count}"); // 3
fruits.Clear();
Console.WriteLine($"Count after clear: {fruits.Count}"); // 0
Set Operations
One of the most powerful aspects of HashSet<T> is its built-in support for mathematical set operations.
Union of Sets
The union combines all elements from two sets, removing duplicates.
HashSet<int> set1 = new HashSet<int> { 1, 2, 3, 4 };
HashSet<int> set2 = new HashSet<int> { 3, 4, 5, 6 };
// Method 1: Using UnionWith (modifies set1)
set1.UnionWith(set2);
Console.WriteLine("Union using UnionWith:");
foreach (var item in set1)
{
Console.Write($"{item} ");
}
// Output: 1 2 3 4 5 6
// Method 2: Using LINQ (creates a new set)
HashSet<int> set3 = new HashSet<int> { 1, 2, 3, 4 };
HashSet<int> set4 = new HashSet<int> { 3, 4, 5, 6 };
var unionSet = new HashSet<int>(set3);
unionSet.UnionWith(set4);
Console.WriteLine("\nUnion using LINQ:");
foreach (var item in unionSet)
{
Console.Write($"{item} ");
}
// Output: 1 2 3 4 5 6
Intersection of Sets
Intersection keeps only elements that appear in both sets.
HashSet<int> set1 = new HashSet<int> { 1, 2, 3, 4 };
HashSet<int> set2 = new HashSet<int> { 3, 4, 5, 6 };
// Using IntersectWith (modifies set1)
set1.IntersectWith(set2);
Console.WriteLine("Intersection:");
foreach (var item in set1)
{
Console.Write($"{item} ");
}
// Output: 3 4
Difference of Sets
The difference contains elements from the first set that don't appear in the second set.
HashSet<int> set1 = new HashSet<int> { 1, 2, 3, 4 };
HashSet<int> set2 = new HashSet<int> { 3, 4, 5, 6 };
// Using ExceptWith (modifies set1)
set1.ExceptWith(set2);
Console.WriteLine("Difference (set1 - set2):");
foreach (var item in set1)
{
Console.Write($"{item} ");
}
// Output: 1 2
Symmetric Difference
Symmetric difference contains elements that are in either set but not in both.
HashSet<int> set1 = new HashSet<int> { 1, 2, 3, 4 };
HashSet<int> set2 = new HashSet<int> { 3, 4, 5, 6 };
// Using SymmetricExceptWith (modifies set1)
set1.SymmetricExceptWith(set2);
Console.WriteLine("Symmetric Difference:");
foreach (var item in set1)
{
Console.Write($"{item} ");
}
// Output: 1 2 5 6
Checking Subset and Superset Relationships
HashSet<int> smallSet = new HashSet<int> { 2, 3 };
HashSet<int> largeSet = new HashSet<int> { 1, 2, 3, 4, 5 };
bool isSubset = smallSet.IsSubsetOf(largeSet);
bool isProperSubset = smallSet.IsProperSubsetOf(largeSet);
bool isSuperset = largeSet.IsSupersetOf(smallSet);
Console.WriteLine($"smallSet is a subset of largeSet: {isSubset}"); // True
Console.WriteLine($"smallSet is a proper subset of largeSet: {isProperSubset}"); // True
Console.WriteLine($"largeSet is a superset of smallSet: {isSuperset}"); // True
SortedSet Overview
SortedSet<T> is similar to HashSet<T> but keeps elements in sorted order. This is useful when you need a collection of unique elements that are automatically sorted.
Creating a SortedSet
using System;
using System.Collections.Generic;
// Creating a SortedSet
SortedSet<int> numbers = new SortedSet<int>();
// Adding elements in random order
numbers.Add(5);
numbers.Add(2);
numbers.Add(10);
numbers.Add(1);
numbers.Add(5); // Duplicate won't be added
// Elements are automatically sorted
Console.WriteLine("Numbers in sorted order:");
foreach (var number in numbers)
{
Console.Write($"{number} ");
}
// Output: 1 2 5 10
Working with SortedSet
SortedSet<T> supports the same operations as HashSet<T> but maintains elements in sorted order:
SortedSet<string> sortedFruits = new SortedSet<string>
{
"Orange", "Apple", "Banana", "Grape"
};
// Elements are automatically sorted alphabetically
Console.WriteLine("Sorted fruits:");
foreach (var fruit in sortedFruits)
{
Console.WriteLine(fruit);
}
// Output:
// Apple
// Banana
// Grape
// Orange
Range Operations with SortedSet
One advantage of SortedSet<T> is its ability to extract ranges:
SortedSet<int> sortedNumbers = new SortedSet<int> { 1, 3, 5, 7, 9, 11, 13, 15 };
// Get a subset view between 5 and 13 inclusive
var subset = sortedNumbers.GetViewBetween(5, 13);
Console.WriteLine("Numbers between 5 and 13:");
foreach (var number in subset)
{
Console.Write($"{number} ");
}
// Output: 5 7 9 11 13
Custom Objects in Sets
When storing custom objects in a set, you need to ensure that object equality and hash codes are properly defined. Otherwise, duplicate objects might not be detected correctly.
Example with Default Implementation (Flawed)
class Person
{
public string Name { get; set; }
public int Age { get; set; }
}
// Creating a set of Person objects
HashSet<Person> people = new HashSet<Person>();
// Adding seemingly duplicate objects
people.Add(new Person { Name = "John", Age = 30 });
people.Add(new Person { Name = "John", Age = 30 });
// Both objects are added because they're different instances
Console.WriteLine($"Count: {people.Count}"); // Output: 2
Solution: Implementing Equality
To fix this, we need to override Equals and GetHashCode in our custom class:
class Person : IEquatable<Person>
{
public string Name { get; set; }
public int Age { get; set; }
// Override Equals for IEquatable<Person>
public bool Equals(Person other)
{
if (other == null) return false;
return Name == other.Name && Age == other.Age;
}
// Override object.Equals
public override bool Equals(object obj)
{
return Equals(obj as Person);
}
// Override GetHashCode (important for HashSet!)
public override int GetHashCode()
{
return (Name, Age).GetHashCode();
}
}
// Now duplicates will be properly detected
HashSet<Person> people = new HashSet<Person>();
people.Add(new Person { Name = "John", Age = 30 });
people.Add(new Person { Name = "John", Age = 30 });
Console.WriteLine($"Count: {people.Count}"); // Output: 1
Real-world Applications of Sets
Sets are incredibly useful in many practical programming scenarios:
1. Removing Duplicates from a Collection
// Removing duplicates from a list
List<string> namesWithDuplicates = new List<string>
{
"Alice", "Bob", "Charlie", "Alice", "David", "Bob"
};
HashSet<string> uniqueNames = new HashSet<string>(namesWithDuplicates);
Console.WriteLine($"Original count: {namesWithDuplicates.Count}"); // 6
Console.WriteLine($"Unique count: {uniqueNames.Count}"); // 4
// Convert back to list if needed
List<string> uniqueNamesList = uniqueNames.ToList();
2. Tracking Visited Items
HashSet<string> visitedUrls = new HashSet<string>();
// Simulate web crawler
void CrawlPage(string url)
{
if (!visitedUrls.Add(url))
{
// URL was already in the set, we've visited it before
Console.WriteLine($"Already visited: {url}");
return;
}
Console.WriteLine($"Crawling: {url}");
// Process page...
}
// Usage
CrawlPage("https://example.com");
CrawlPage("https://example.com/about");
CrawlPage("https://example.com"); // This won't be processed again
3. Finding Common Elements
// Imagine tracking skills across job applicants
HashSet<string> requiredSkills = new HashSet<string>
{
"C#", "SQL", "JavaScript", "HTML", "CSS"
};
HashSet<string> candidateSkills = new HashSet<string>
{
"Java", "C#", "Python", "JavaScript"
};
// Find matching skills
HashSet<string> matchingSkills = new HashSet<string>(requiredSkills);
matchingSkills.IntersectWith(candidateSkills);
Console.WriteLine("Matching skills:");
foreach (var skill in matchingSkills)
{
Console.WriteLine(skill);
}
// Output: C#, JavaScript
// Find missing skills
HashSet<string> missingSkills = new HashSet<string>(requiredSkills);
missingSkills.ExceptWith(candidateSkills);
Console.WriteLine("Missing skills:");
foreach (var skill in missingSkills)
{
Console.WriteLine(skill);
}
// Output: SQL, HTML, CSS
4. Implementing a Dictionary Spell Checker
// Create a dictionary
HashSet<string> dictionary = new HashSet<string>(StringComparer.OrdinalIgnoreCase)
{
"apple", "banana", "orange", "grape", "programming", "computer"
};
bool CheckSpelling(string word)
{
return dictionary.Contains(word);
}
// Usage
Console.WriteLine($"Is 'apple' spelled correctly? {CheckSpelling("apple")}"); // True
Console.WriteLine($"Is 'Apple' spelled correctly? {CheckSpelling("Apple")}"); // True (case insensitive)
Console.WriteLine($"Is 'aple' spelled correctly? {CheckSpelling("aple")}"); // False
Performance Considerations
Sets in C# offer excellent performance for many operations:
-
HashSet<T>:- Add, Remove, Contains: O(1) average case
- Union, Intersection, etc.: O(n) where n is the size of the smaller set
-
SortedSet<T>:- Add, Remove, Contains: O(log n)
- Preserves order but is slightly slower than
HashSet<T>
Choose HashSet<T> when you need maximum performance and don't care about order. Use SortedSet<T> when you need elements to be automatically sorted.
Summary
Sets in C# provide powerful tools for working with unique elements and performing set operations:
HashSet<T>offers fast, unordered storage of unique elementsSortedSet<T>maintains elements in sorted order- Both support mathematical set operations like union, intersection, and difference
- Custom objects must override
EqualsandGetHashCodeto work properly in sets - Sets excel in scenarios involving uniqueness, removing duplicates, and set operations
Sets are often the best choice when you need to:
- Eliminate duplicates from a collection
- Check for existence quickly
- Find common or unique elements between collections
- Maintain a unique collection of items
Additional Resources
Exercises
- Create a program that takes two lists of integers and finds the intersection and union of them using sets.
- Implement a custom class
Studentwith propertiesId,Name, andGPAthat works correctly in aHashSet. - Write a program that reads a text file and counts the number of unique words using a set.
- Create a spell checker that suggests corrections for misspelled words by finding words in a dictionary within an edit distance of 1.
- Implement a program that finds all anagrams of a given word in a dictionary using sets.
💡 Found a typo or mistake? Click "Edit this page" to suggest a correction. Your feedback is greatly appreciated!