Swift Copy-on-Write

Copy-on-write (CoW) is a memory optimization technique used by Swift to efficiently handle value types. It's an essential concept to understand if you want to write memory-efficient Swift code.

What is Copy-on-Write?

In Swift, when you assign a value type (like an array, dictionary, or struct) to a new variable, conceptually, a copy is created. However, creating a full copy each time could be expensive, especially for large collections.

This is where copy-on-write comes in:

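  1. When you assign a copy-on-write type such as an array to a new variable, Swift doesn't immediately create a full copy
  2. Instead, both variables reference the same underlying storage
  3. When you modify one of the variables, Swift checks whether the storage is shared and only then creates a separate copy

This approach gives you the safety of value types with performance closer to reference types in many cases. Note that CoW is not automatic for every struct: it applies to types that keep their data in reference-counted storage, as the standard library collections do.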

How Copy-on-Write Works in Swift

Let's see copy-on-write in action with arrays:

swift
// Create an array
var originalArray = [1, 2, 3, 4, 5]

// Create a "copy" by assignment
var copiedArray = originalArray

// Both arrays are logically separate value types
print("Original: \(originalArray)")
print("Copy: \(copiedArray)")

// Behind the scenes, they share the same storage until modification

Output:

Original: [1, 2, 3, 4, 5]
Copy: [1, 2, 3, 4, 5]

When we modify one of the arrays, Swift creates an actual copy:

swift
// Modify the copied array
copiedArray.append(6)

print("After modification:")
print("Original: \(originalArray)")
print("Copy: \(copiedArray)")

Output:

After modification:
Original: [1, 2, 3, 4, 5]
Copy: [1, 2, 3, 4, 5, 6]

Demonstrating Copy-on-Write with Memory Addresses

We can demonstrate what's happening behind the scenes by checking memory addresses:

swift
// Returns the address of the array's element buffer as a hex string.
// The non-mutating withUnsafeBufferPointer is used on purpose: the mutable
// variant would force a unique copy and hide the sharing.
func storageAddress(of array: [Int]) -> String {
    return array.withUnsafeBufferPointer { buffer in
        "0x" + String(Int(bitPattern: buffer.baseAddress), radix: 16)
    }
}

let array1 = [1, 2, 3, 4, 5]
print("Array1 storage address: \(storageAddress(of: array1))")

// Create a "copy"; no elements are copied yet
var array2 = array1
print("Array2 storage address before modification: \(storageAddress(of: array2))")
// Same address as array1: both arrays share storage

// Now modify the second array, which triggers the actual copy
array2.append(6)
print("Array2 storage address after modification: \(storageAddress(of: array2))")
// Different address: array2 now has its own storage

The first two addresses printed are identical, showing that both arrays share storage after assignment. The third address differs, showing that the copy happens only when array2 is mutated.

Built-in Value Types with Copy-on-Write

Swift implements copy-on-write for many of its standard library value types, and Foundation does the same for some of its own:

  • Arrays
  • Dictionaries
  • Sets
  • Strings
  • Data (from Foundation)

These types all benefit from this optimization automatically.
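
As a small illustration of the list above, the sketch below assigns and then mutates a dictionary, a set, and a string. The variable names are made up for this example. It only demonstrates the observable value semantics; whether storage is actually shared before the mutation is an implementation detail (short strings, for instance, are typically stored inline with no shared buffer at all).

```swift
// Dictionary: the assignment is cheap, the insert affects only the new variable
let baseScores = ["alice": 1, "bob": 2]
var updatedScores = baseScores
updatedScores["carol"] = 3

// Set: same behaviour
let baseTags: Set = ["swift", "cow"]
var moreTags = baseTags
moreTags.insert("arc")

// String: same behaviour
let greeting = "Hello"
var longerGreeting = greeting
longerGreeting += ", world"

print(baseScores.count, updatedScores.count) // 2 3
print(baseTags.count, moreTags.count)        // 2 3
print(greeting)                              // Hello
print(longerGreeting)                        // Hello, world
```

In every case the original value is untouched, which is exactly what you would get from an eager copy, only without paying for the copy up front.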

Implementing Copy-on-Write in Custom Types

You can implement copy-on-write for your own custom types. Here's a simple example:

swift
// A wrapper class to store our actual data
final class Storage<T> {
    var value: T

    init(_ value: T) {
        self.value = value
    }
}

// Our custom struct that uses CoW
struct Box<T> {
    // Private reference to our storage
    private var storage: Storage<T>

    init(_ value: T) {
        storage = Storage(value)
    }

    // Read-only access doesn't need to copy
    var value: T {
        get {
            return storage.value
        }
        set {
            // Only create a new storage if there are multiple references
            if isKnownUniquelyReferenced(&storage) {
                // We uniquely own this storage, so modify in place
                storage.value = newValue
            } else {
                // Create a new copy of storage
                storage = Storage(newValue)
            }
        }
    }
}

// Let's test our implementation
var box1 = Box("Hello")
var box2 = box1 // No copy is made yet

// Modify box2
box2.value = "World"

print(box1.value) // "Hello"
print(box2.value) // "World"

The key function here is isKnownUniquelyReferenced(_:), which tells us if our reference is the only one to that storage.
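
The Box setter replaces the whole value, so it never needs the old contents. A type with mutating methods usually has to copy the existing contents before changing them. Here is a minimal sketch of that pattern; WordCounter, CounterStorage, makeStorageUnique, and sharesStorage are hypothetical names invented for this illustration, not part of any library.

```swift
// Reference-typed backing storage (hypothetical example type)
final class CounterStorage {
    var counts: [String: Int]

    init(counts: [String: Int] = [:]) {
        self.counts = counts
    }

    // Creates an independent storage object with the same contents
    func copy() -> CounterStorage {
        return CounterStorage(counts: counts)
    }
}

struct WordCounter {
    private var storage = CounterStorage()

    init() {}

    // Copy the storage only if another value still references it
    private mutating func makeStorageUnique() {
        if !isKnownUniquelyReferenced(&storage) {
            storage = storage.copy()
        }
    }

    mutating func record(_ word: String) {
        makeStorageUnique() // must run before every mutation
        storage.counts[word, default: 0] += 1
    }

    func count(of word: String) -> Int {
        return storage.counts[word] ?? 0
    }

    // Exposed only so the example can observe sharing
    func sharesStorage(with other: WordCounter) -> Bool {
        return storage === other.storage
    }
}

var first = WordCounter()
first.record("swift")            // storage is unique, mutated in place

var second = first               // both values share one storage object
let sharedBeforeWrite = first.sharesStorage(with: second)

second.record("swift")           // storage is shared, so it is copied first
let sharedAfterWrite = first.sharesStorage(with: second)

print(sharedBeforeWrite, sharedAfterWrite)                  // true false
print(first.count(of: "swift"), second.count(of: "swift"))  // 1 2
```

Funnelling every mutation through one helper like makeStorageUnique keeps the uniqueness check in a single place, which makes it harder to forget in a newly added mutating method.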

When Copy-on-Write Happens

A copy is actually made only when both of the following are true:

  1. The operation mutates the value (like append, remove, or changing elements)
  2. The value shares its storage with at least one other variable

Read operations, and mutations of storage that is uniquely referenced, never trigger a copy.
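
These conditions can be observed by comparing buffer addresses at each step. The following is a sketch, with bufferAddress as a hypothetical helper written for this example; the exact addresses will vary between runs, only the comparisons matter.

```swift
// Hypothetical helper: address of the array's element buffer as an integer.
// Non-mutating access, so calling it does not itself force a copy.
func bufferAddress(_ array: [Int]) -> Int {
    return array.withUnsafeBufferPointer { Int(bitPattern: $0.baseAddress) }
}

let original = Array(1...1_000)
var copy = original

// Assignment alone: storage is shared
let sharedAfterAssignment = bufferAddress(original) == bufferAddress(copy)

// Reading does not trigger a copy
let total = copy.reduce(0, +)
let sharedAfterRead = bufferAddress(original) == bufferAddress(copy)

// First mutation while shared: the buffer is copied
copy[0] = 42
let sharedAfterFirstWrite = bufferAddress(original) == bufferAddress(copy)

// Second mutation: storage is now unique, so it is changed in place
let addressBeforeSecondWrite = bufferAddress(copy)
copy[1] = 43
let secondWriteInPlace = bufferAddress(copy) == addressBeforeSecondWrite

print(total)                 // 500500
print(sharedAfterAssignment) // true
print(sharedAfterRead)       // true
print(sharedAfterFirstWrite) // false
print(secondWriteInPlace)    // true
```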

Performance Implications

Copy-on-write provides significant performance benefits:

  • Reduced Memory Usage: Multiple copies of the same data share storage until modification
  • Lazy Copying: Expensive copy operations happen only when needed
  • Efficiency: Read operations don't trigger copies

However, there are considerations:

  • There's a small overhead for reference counting
  • When modification happens, copying can be expensive for large collections
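  • The copy is paid again every time a value is re-shared (assigned or stored somewhere else) and then mutated, so interleaving copies and mutations can be much slower than mutating a uniquely owned value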
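
The last consideration is easy to trip over in loops. The sketch below (bufferAddress, log, and snapshots are names assumed for this example) stores a snapshot of an array on every iteration and then mutates it, which forces a fresh copy each time.

```swift
// Hypothetical helper: address of the array's element buffer as an integer
func bufferAddress(_ array: [Int]) -> Int {
    return array.withUnsafeBufferPointer { Int(bitPattern: $0.baseAddress) }
}

var log = [0]
var snapshots: [[Int]] = []
var copiesObserved = 0

for step in 1...3 {
    snapshots.append(log)             // re-shares log's storage with the snapshot
    let addressBefore = bufferAddress(log)
    log[0] = step                     // storage is shared again, so it is copied
    if bufferAddress(log) != addressBefore {
        copiesObserved += 1
    }
}

print(copiesObserved) // 3
print(snapshots)      // [[0], [1], [2]]
print(log)            // [3]
```

With a one-element array the cost is negligible, but the same pattern on a large collection copies the whole buffer on every iteration.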

Best Practices

To get the most out of copy-on-write:

  1. Batch modifications while the value is uniquely referenced, so that at most one copy is made
  2. Consider inout parameters for functions that modify large value types
  3. Be aware of hidden copies in loops and iterations
  4. Use value semantics confidently, knowing Swift optimizes them
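
To make the second practice concrete, here is a sketch comparing an inout function with one that returns a modified copy. The function names and the bufferAddress helper are assumptions made for this example.

```swift
// Hypothetical helper: address of the array's element buffer as an integer
func bufferAddress(_ array: [Int]) -> Int {
    return array.withUnsafeBufferPointer { Int(bitPattern: $0.baseAddress) }
}

// Mutates the caller's array directly through inout
func doubleInPlace(_ values: inout [Int]) {
    for index in values.indices {
        values[index] *= 2
    }
}

// Returns a modified copy; the caller still holds the input,
// so the first write inside the function copies the buffer
func doubled(_ values: [Int]) -> [Int] {
    var result = values
    for index in result.indices {
        result[index] *= 2
    }
    return result
}

var samples = Array(1...1_000)

let addressBefore = bufferAddress(samples)
doubleInPlace(&samples)
let mutatedInPlace = bufferAddress(samples) == addressBefore

let doubledSamples = doubled(samples)
let returnedNewBuffer = bufferAddress(doubledSamples) != bufferAddress(samples)

print(mutatedInPlace)    // true
print(returnedNewBuffer) // true
print(samples[0], doubledSamples[0]) // 2 4
```

Neither version is wrong. The returning form is the right choice when the caller needs to keep the original; the inout form avoids the copy when it does not.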

Practical Example: Processing a Large Dataset

Here's a real-world example where copy-on-write can make a difference:

swift
import Foundation

// Minimal User model assumed for this example
struct User {
    var lastLoginDate: Date
    var loginCount: Int
}

func processUserData(users: [User]) -> [User] {
    var result = users // Shares storage with the users parameter; nothing is copied yet
    let now = Date()

    // Mutate in place: the first write copies the buffer once,
    // and every later write reuses the now-unique storage
    for index in result.indices {
        result[index].lastLoginDate = now
        result[index].loginCount += 1
    }

    return result
}

By understanding copy-on-write, we know:

  1. The initial assignment creates no copy
  2. Only one copy operation happens, on the first write to result
  3. The function returns efficiently even with large datasets

Summary

Swift's copy-on-write mechanism provides an elegant balance between the safety of value types and the performance of reference types. Key points to remember:

  • Copy-on-write delays copying until mutation occurs
  • It's automatic for standard library types like Array, Dictionary, and String
  • You can implement it for your own custom types using isKnownUniquelyReferenced
  • It optimizes memory usage by sharing storage when possible
  • Understanding CoW helps you write more efficient Swift code

Exercises

  1. Create a simple benchmark that compares the performance of copy-on-write vs. always copying for arrays of different sizes
  2. Implement a custom collection type that uses copy-on-write
  3. Analyze a Swift application and identify places where copy-on-write might be happening
  4. Experiment with different batch sizes for operations on large arrays to find the optimal approach

By mastering copy-on-write, you'll be able to write Swift code that's both safe and efficient, getting the best of both worlds from Swift's type system.


