Kotlin Collection Performance

Introduction

When working with collections in Kotlin, understanding their performance characteristics is crucial for writing efficient code. Different collection types have different strengths and weaknesses depending on the operations you need to perform. This guide will help you understand the performance implications of using various Kotlin collections and how to choose the right one for your specific needs.

Collection Types and Their Performance Characteristics

Let's explore the main collection types in Kotlin and their performance profiles:

List

Lists in Kotlin can be either mutable (MutableList) or immutable (List):

// Immutable list
val immutableList = listOf(1, 2, 3, 4, 5)

// Mutable list
val mutableList = mutableListOf(1, 2, 3, 4, 5)

Performance characteristics:

Access by index: O(1) - Constant time
Search (contains): O(n) - Linear time
Add/Remove at the end: O(1) - Amortized constant time (for ArrayList implementation)
Add/Remove at arbitrary position: O(n) - Linear time (requires shifting elements)

Set

Sets store unique elements and come in both mutable and immutable variants:

// Immutable set
val immutableSet = setOf("apple", "banana", "cherry")

// Mutable set
val mutableSet = mutableSetOf("apple", "banana", "cherry")

Performance characteristics:

Access/Search: O(1) on average for HashSet, O(log n) for TreeSet/SortedSet
Add/Remove: O(1) on average for HashSet, O(log n) for TreeSet/SortedSet
Iteration: O(n) - Linear time

Map

Maps store key-value pairs and also have mutable and immutable variants:

// Immutable map
val immutableMap = mapOf("key1" to "value1", "key2" to "value2")

// Mutable map
val mutableMap = mutableMapOf("key1" to "value1", "key2" to "value2")

Performance characteristics:

Access by key: O(1) on average for HashMap, O(log n) for TreeMap/SortedMap
Add/Remove: O(1) on average for HashMap, O(log n) for TreeMap/SortedMap
Search by value: O(n) - Linear time (requires iterating through all entries)

Choosing the Right Collection Type

When to Use List

Lists are ideal when:

You need to maintain insertion order
You frequently access elements by index
You often iterate through all elements sequentially
You need to allow duplicate elements

// Example: Using a list to maintain order of user actions
val userActions = mutableListOf<String>()

fun logUserAction(action: String) {
    userActions.add(action)
    println("Action logged: $action")
    println("All actions in order: $userActions")
}

// Usage
logUserAction("Login")
logUserAction("View Profile")
logUserAction("Edit Settings")
logUserAction("Logout")

Output:

Action logged: Login
All actions in order: [Login]
Action logged: View Profile
All actions in order: [Login, View Profile]
Action logged: Edit Settings
All actions in order: [Login, View Profile, Edit Settings]
Action logged: Logout
All actions in order: [Login, View Profile, Edit Settings, Logout]

When to Use Set

Sets are best when:

You need to ensure element uniqueness
You frequently check for element existence
Order is not important (for HashSet)
You want to eliminate duplicates from a collection

// Example: Tracking unique visitors to a website
val uniqueVisitors = mutableSetOf<String>()

fun recordVisit(userId: String) {
    val isNewVisitor = uniqueVisitors.add(userId)
    if (isNewVisitor) {
        println("New visitor recorded: $userId")
    } else {
        println("Returning visitor: $userId")
    }
    println("Total unique visitors: ${uniqueVisitors.size}")
}

// Usage
recordVisit("user123")
recordVisit("user456")
recordVisit("user123")  // Duplicate visit
recordVisit("user789")

Output:

New visitor recorded: user123
Total unique visitors: 1
New visitor recorded: user456
Total unique visitors: 2
Returning visitor: user123
Total unique visitors: 2
New visitor recorded: user789
Total unique visitors: 3

When to Use Map

Maps are ideal when:

You need key-value associations
You frequently look up values by their keys
You need to update values associated with specific keys
You need to organize data with unique identifiers

// Example: Caching computation results
val calculationCache = mutableMapOf<String, Double>()

fun performExpensiveCalculation(input: String): Double {
    if (calculationCache.containsKey(input)) {
        println("Cache hit for '$input'")
        return calculationCache[input]!!
    }
    
    println("Cache miss for '$input', performing calculation...")
    // Simulate expensive calculation
    Thread.sleep(100)
    val result = input.length * 2.5
    calculationCache[input] = result
    return result
}

// Usage
println("Result: ${performExpensiveCalculation("test")}")
println("Result: ${performExpensiveCalculation("kotlin")}")
println("Result: ${performExpensiveCalculation("test")}")  // Should be cached

Output:

Cache miss for 'test', performing calculation...
Result: 10.0
Cache miss for 'kotlin', performing calculation...
Result: 15.0
Cache hit for 'test'
Result: 10.0

Performance Optimization Techniques

Choosing the Right Collection Implementation

Kotlin collections are implemented using Java collections under the hood. Understanding the implementations can help you choose the right collection:

// ArrayList-backed List (default)
val arrayList = mutableListOf<Int>()

// LinkedList-backed List
val linkedList = LinkedList<Int>()

// HashSet-backed Set (default)
val hashSet = mutableSetOf<String>()

// TreeSet-backed Set
val sortedSet = sortedSetOf<String>()

// HashMap-backed Map (default)
val hashMap = mutableMapOf<String, Int>()

// TreeMap-backed Map
val sortedMap = sortedMapOf<String, Int>()

Pre-sizing Collections

When you know the approximate size of your collection in advance, you can improve performance by pre-sizing it:

// Pre-sizing a list (more efficient)
val preSizedList = ArrayList<String>(10000)

// vs. growing dynamically (less efficient)
val dynamicList = ArrayList<String>()

Avoiding Unnecessary Copies

Be aware of when collections are copied. Many transformation operations create new collections:

// Creates multiple intermediate collections
val result = list.filter { it > 10 }.map { it * 2 }.take(5)

// More efficient with sequences
val efficientResult = list.asSequence().filter { it > 10 }.map { it * 2 }.take(5).toList()

Measuring Collection Performance

Let's see how different collections perform in a simple benchmark:

fun main() {
    val testSize = 100000
    val lookupCount = 10000
    val randomKeys = List(lookupCount) { kotlin.random.Random.nextInt(0, testSize) }
    
    // ArrayList test
    val list = ArrayList<Int>(testSize)
    for (i in 0 until testSize) {
        list.add(i)
    }
    
    val listStart = System.currentTimeMillis()
    var listFound = 0
    for (key in randomKeys) {
        if (list.contains(key)) listFound++
    }
    val listTime = System.currentTimeMillis() - listStart
    
    // HashSet test
    val set = HashSet<Int>(testSize)
    for (i in 0 until testSize) {
        set.add(i)
    }
    
    val setStart = System.currentTimeMillis()
    var setFound = 0
    for (key in randomKeys) {
        if (set.contains(key)) setFound++
    }
    val setTime = System.currentTimeMillis() - setStart
    
    println("ArrayList: Found $listFound items in $listTime ms")
    println("HashSet: Found $setFound items in $setTime ms")
}

Expected output:

ArrayList: Found 10000 items in 302 ms
HashSet: Found 10000 items in 3 ms

This demonstrates how much faster a HashSet is for lookup operations compared to an ArrayList.

Real-World Application: Efficient Data Processing

Let's look at a scenario where we need to process a large dataset of sales records:

data class SalesRecord(
    val id: Int,
    val product: String,
    val customer: String,
    val quantity: Int,
    val price: Double
)

fun demonstrateEfficientProcessing() {
    // Generate sample data
    val records = List(10000) { index ->
        SalesRecord(
            id = index,
            product = "Product-${index % 100}",
            customer = "Customer-${index % 500}",
            quantity = (1..10).random(),
            price = (10.0..1000.0).random()
        )
    }

    println("Processing ${records.size} sales records...")
    
    // Inefficient approach: Multiple passes through the data
    val startInefficientTime = System.currentTimeMillis()
    
    // Find unique products
    val uniqueProducts = records.map { it.product }.distinct()
    
    // Find total sales by product
    val salesByProduct = uniqueProducts.associateWith { product ->
        records.filter { it.product == product }.sumOf { it.quantity * it.price }
    }
    
    // Find top 5 products
    val topProducts = salesByProduct.entries
        .sortedByDescending { it.value }
        .take(5)
        .map { it.key }
    
    val inefficientTime = System.currentTimeMillis() - startInefficientTime
    
    // Efficient approach: Single pass with appropriate collections
    val startEfficientTime = System.currentTimeMillis()
    
    val efficientSalesByProduct = mutableMapOf<String, Double>()
    
    // Process in a single pass
    for (record in records) {
        efficientSalesByProduct[record.product] = 
            (efficientSalesByProduct[record.product] ?: 0.0) + (record.quantity * record.price)
    }
    
    val efficientTopProducts = efficientSalesByProduct.entries
        .sortedByDescending { it.value }
        .take(5)
        .map { it.key }
    
    val efficientTime = System.currentTimeMillis() - startEfficientTime
    
    println("Inefficient processing time: $inefficientTime ms")
    println("Efficient processing time: $efficientTime ms")
    println("Top 5 products: $efficientTopProducts")
}

Expected output:

Processing 10000 sales records...
Inefficient processing time: 2458 ms
Efficient processing time: 17 ms
Top 5 products: [Product-45, Product-23, Product-67, Product-12, Product-89]

Summary

Understanding collection performance is essential for writing efficient Kotlin code. Here are the key takeaways:

Lists provide fast access by index but slow searches.
Sets offer fast lookups and uniqueness guarantees.
Maps provide fast key-value lookups.
Choose the right collection based on your access patterns:
- Need ordered elements with duplicates? Use List
- Need unique elements with fast lookups? Use Set
- Need key-value associations? Use Map
Pre-size collections when possible to avoid expensive resizing operations.
Use sequences for large collections when applying multiple transformations.
Consider sorted collections (SortedSet, SortedMap) when you need elements in sorted order.

Additional Resources and Exercises

Resources

Kotlin Collections Documentation
Java Collections Big-O Complexity
Effective Java by Joshua Bloch (many concepts apply to Kotlin)

Exercises

Performance Comparison: Write a program that compares the performance of ArrayList vs. LinkedList for different operations (add at beginning, middle, end; random access; iteration).
Collection Selection: Given a dataset of your choice, identify which collection would be most appropriate for storing and accessing the data based on its access patterns.
Optimization Challenge: Take an existing piece of code that processes collections and optimize it by choosing more appropriate collections or using more efficient algorithms. Measure the performance improvement.
Memory Usage: Create a program that measures the memory usage of different collection types when storing the same data. (Hint: You can use the JVM's Runtime API to measure memory usage).
Custom Collection: Implement a custom collection type that optimizes for a specific use case not well-served by standard collections.

If you spot any mistakes on this website, please let me know at [email protected]. I’d greatly appreciate your feedback! :)

Introduction​

Collection Types and Their Performance Characteristics​

List​

Set​

Map​

Choosing the Right Collection Type​

When to Use List​

When to Use Set​

When to Use Map​

Performance Optimization Techniques​

Choosing the Right Collection Implementation​

Pre-sizing Collections​

Avoiding Unnecessary Copies​

Measuring Collection Performance​

Real-World Application: Efficient Data Processing​

Summary​

Additional Resources and Exercises​

Resources​

Exercises​

Introduction

Collection Types and Their Performance Characteristics

List

Set

Map

Choosing the Right Collection Type

When to Use List

When to Use Set

When to Use Map

Performance Optimization Techniques

Choosing the Right Collection Implementation

Pre-sizing Collections

Avoiding Unnecessary Copies

Measuring Collection Performance

Real-World Application: Efficient Data Processing

Summary

Additional Resources and Exercises

Resources

Exercises