PyTorch Kornia

Introduction

Kornia is an open-source computer vision library built on top of PyTorch. It provides differentiable implementations of classical computer vision algorithms, so you can use them as building blocks inside your deep learning models. Because its operators are written as PyTorch tensor operations, gradients can flow through them, which lets you embed them in a neural network and train it end to end.

In this guide, we'll explore the basics of Kornia, its key features, and how to use it for various computer vision tasks.

Getting Started with Kornia

Installation

Before we dive into using Kornia, let's install it:

bash
pip install kornia

Basic Imports

To start working with Kornia, import it along with PyTorch and the plotting and image-loading helpers used throughout this guide:

python
import torch
import kornia
import matplotlib.pyplot as plt
import numpy as np
from PIL import Image

Core Features of Kornia

Kornia provides several modules that can be used for different computer vision tasks:

  1. Image Processing: Basic operations like color conversion, filtering, and geometric transformations
  2. Feature Detection and Description: Edge detection, corners, and feature descriptors
  3. Augmentation: Data augmentation for training deep learning models
  4. Depth and 3D Geometry: Utilities for working with depth maps, camera models, and epipolar geometry
  5. Losses and Metrics: Differentiable loss functions and image-quality metrics for vision tasks

Let's explore some of these features in more detail.

Image Processing with Kornia

Loading and Converting Images

Let's start by loading an image and converting it to the format expected by Kornia:

python
import requests
from io import BytesIO

import numpy as np
import torch
import kornia as K
import matplotlib.pyplot as plt
from PIL import Image

# Download a sample image
url = "https://raw.githubusercontent.com/kornia/kornia/master/docs/source/_static/img/lena.png"
response = requests.get(url)
response.raise_for_status()  # fail early if the download did not succeed
img_pil = Image.open(BytesIO(response.content)).convert("RGB")

# Convert the PIL image to a tensor
# keepdim=False adds a batch dimension: HWC -> 1 x C x H x W
img_tensor = K.image_to_tensor(np.array(img_pil), keepdim=False)
img_tensor = img_tensor.float() / 255.0  # scale to [0, 1]

print(f"Image tensor shape: {img_tensor.shape}")

# Visualize the image
plt.imshow(K.tensor_to_image(img_tensor))
plt.axis('off')
plt.show()

Output:

Image tensor shape: torch.Size([1, 3, 512, 512])

Basic Image Transformations

Kornia provides many basic image transformations. Let's see some examples:

Grayscale Conversion

python
# Convert to grayscale
gray_tensor = K.color.rgb_to_grayscale(img_tensor)

plt.figure(figsize=(10, 5))
plt.subplot(1, 2, 1)
plt.imshow(K.tensor_to_image(img_tensor))
plt.title("Original Image")
plt.axis('off')

plt.subplot(1, 2, 2)
plt.imshow(K.tensor_to_image(gray_tensor), cmap='gray')
plt.title("Grayscale Image")
plt.axis('off')
plt.show()

print(f"Grayscale tensor shape: {gray_tensor.shape}")

Output:

Grayscale tensor shape: torch.Size([1, 1, 512, 512])

Rotation and Flipping

python
# Rotate image by 45 degrees
rotated = K.geometry.rotate(img_tensor, torch.tensor([45.0]))

# Flip image horizontally
flipped = K.geometry.hflip(img_tensor)

plt.figure(figsize=(15, 5))
plt.subplot(1, 3, 1)
plt.imshow(K.tensor_to_image(img_tensor))
plt.title("Original")
plt.axis('off')

plt.subplot(1, 3, 2)
plt.imshow(K.tensor_to_image(rotated))
plt.title("Rotated 45°")
plt.axis('off')

plt.subplot(1, 3, 3)
plt.imshow(K.tensor_to_image(flipped))
plt.title("Horizontal Flip")
plt.axis('off')
plt.show()

Image Filtering and Edge Detection

Kornia makes it easy to apply various filters to your images and perform edge detection:

Blur and Sharpening

python
# Apply Gaussian blur
blurred = K.filters.gaussian_blur2d(img_tensor, (15, 15), (7.5, 7.5))

# Create a sharpening kernel; filter2d expects a kernel of shape (1, kH, kW)
sharpen_kernel = torch.tensor([
    [0, -1, 0],
    [-1, 5, -1],
    [0, -1, 0]
]).float().unsqueeze(0)

# filter2d applies the same kernel to every channel, so no per-channel copy is needed.
# Clamp to [0, 1] because sharpening can overshoot the valid display range.
sharpened = K.filters.filter2d(img_tensor, sharpen_kernel, border_type='replicate').clamp(0, 1)

plt.figure(figsize=(15, 5))
plt.subplot(1, 3, 1)
plt.imshow(K.tensor_to_image(img_tensor))
plt.title("Original")
plt.axis('off')

plt.subplot(1, 3, 2)
plt.imshow(K.tensor_to_image(blurred))
plt.title("Gaussian Blur")
plt.axis('off')

plt.subplot(1, 3, 3)
plt.imshow(K.tensor_to_image(sharpened))
plt.title("Sharpened")
plt.axis('off')
plt.show()

Edge Detection

python
# Convert to grayscale for edge detection
gray_tensor = K.color.rgb_to_grayscale(img_tensor)

# Apply Sobel filter
sobel_xy = K.filters.spatial_gradient(gray_tensor, mode='sobel', order=1)
sobel_mag = torch.sqrt(sobel_xy[:, :, 0] ** 2 + sobel_xy[:, :, 1] ** 2)

# Apply Canny edge detector; it returns (gradient magnitude, thresholded edge map)
_, canny_edges = K.filters.canny(gray_tensor)

plt.figure(figsize=(15, 5))
plt.subplot(1, 3, 1)
plt.imshow(K.tensor_to_image(img_tensor))
plt.title("Original")
plt.axis('off')

plt.subplot(1, 3, 2)
plt.imshow(K.tensor_to_image(sobel_mag), cmap='viridis')
plt.title("Sobel Edges")
plt.axis('off')

plt.subplot(1, 3, 3)
plt.imshow(K.tensor_to_image(canny_edges), cmap='gray')
plt.title("Canny Edges")
plt.axis('off')
plt.show()

Data Augmentation with Kornia

One of the most powerful features of Kornia is differentiable data augmentation. The augmentations are `nn.Module`s that operate on batched tensors, so they can run on the same device as your model and become part of its training pipeline:

python
import kornia.augmentation as KA  # separate alias so K keeps referring to kornia

# Create a batch of images by duplicating our image 4 times
batch_tensor = img_tensor.repeat(4, 1, 1, 1)
print(f"Batch tensor shape: {batch_tensor.shape}")

# Define a set of augmentations
augmentations = KA.AugmentationSequential(
    KA.RandomHorizontalFlip(p=0.5),
    KA.RandomRotation(degrees=30.0, p=0.5),
    KA.ColorJitter(brightness=0.2, contrast=0.2, saturation=0.2, hue=0.1, p=0.5),
    KA.RandomPerspective(distortion_scale=0.2, p=0.5),
    data_keys=["input"]
)

# Apply augmentations
augmented_batch = augmentations(batch_tensor)

# Display augmented images
plt.figure(figsize=(15, 8))
for i in range(4):
    plt.subplot(2, 2, i+1)
    plt.imshow(K.tensor_to_image(augmented_batch[i]))
    plt.title(f"Augmented Image {i+1}")
    plt.axis('off')
plt.tight_layout()
plt.show()

Output:

Batch tensor shape: torch.Size([4, 3, 512, 512])

Practical Example: Image Classification with Kornia Augmentations

Let's see how to use Kornia augmentations in a PyTorch classification pipeline:

python
import torch
import torch.nn as nn
import torch.optim as optim
import torchvision.models as models
import kornia.augmentation as KA  # separate alias so K keeps referring to kornia

# Define a simple classification model
# (torchvision >= 0.13 uses the weights argument; older versions use pretrained=True)
model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(model.parameters(), lr=0.001, momentum=0.9)

# Define augmentation pipeline with Kornia
train_aug = KA.AugmentationSequential(
    KA.RandomHorizontalFlip(p=0.5),
    KA.RandomRotation(degrees=10.0),
    KA.ColorJitter(brightness=0.1, contrast=0.1, saturation=0.1),
    KA.RandomErasing(p=0.2),
    data_keys=["input"]
)

# Example training loop (simplified)
def train_batch(model, images, labels):
    model.train()

    # Apply augmentations to the whole batch at once
    images_aug = train_aug(images)

    # Forward pass
    outputs = model(images_aug)
    loss = criterion(outputs, labels)

    # Backward pass and optimize
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

    return loss.item()

# Example usage (would be part of a larger training loop)
# train_batch(model, batch_of_images, batch_of_labels)

Homography and Perspective Transformations

A homography is a 3x3 matrix that maps points between two views of the same plane. Kornia can compute one from four point correspondences and use it to warp an image:

python
import kornia as K  # make sure K refers to the top-level kornia package

# Define source points (4 corners of the image)
h, w = img_tensor.shape[-2:]
points_src = torch.tensor([[
    [0, 0],
    [w-1, 0],
    [w-1, h-1],
    [0, h-1]
]], dtype=torch.float32)

# Define destination points (move corners to create perspective effect)
points_dst = torch.tensor([[
    [w*0.1, h*0.1],  # Top-left
    [w*0.9, h*0.2],  # Top-right
    [w*0.8, h*0.8],  # Bottom-right
    [w*0.2, h*0.9]   # Bottom-left
]], dtype=torch.float32)

# Compute homography
H = K.geometry.transform.get_perspective_transform(points_src, points_dst)

# Apply perspective transformation
warped_img = K.geometry.transform.warp_perspective(img_tensor, H, (h, w))

plt.figure(figsize=(12, 6))
plt.subplot(1, 2, 1)
plt.imshow(K.tensor_to_image(img_tensor))
plt.title("Original Image")
plt.axis('off')

plt.subplot(1, 2, 2)
plt.imshow(K.tensor_to_image(warped_img))
plt.title("Perspective Transform")
plt.axis('off')
plt.show()

Summary

In this guide, we've explored Kornia, a powerful computer vision library built on top of PyTorch. We've seen how it provides differentiable implementations of various image processing and computer vision algorithms, making it ideal for deep learning workflows.

Key features we've covered include:

  1. Basic image processing (color conversion, geometric transformations)
  2. Image filtering and edge detection
  3. Data augmentation for deep learning
  4. Homography and perspective transformations

The main advantage of Kornia is that all operations are differentiable, which means they can be part of your neural network's computation graph and allow for end-to-end training.

Exercises

  1. Load your own image and apply different Kornia transformations such as rotations, color changes, and edge detection.
  2. Create a custom data augmentation pipeline using Kornia and apply it to a dataset.
  3. Implement a simple image classification model with PyTorch that uses Kornia for data augmentation.
  4. Explore Kornia's feature detection modules and try to match keypoints between two similar images.
  5. Try to create a simple image stylization pipeline using Kornia's filtering capabilities.

