PyTorch Tensor Operations
Introduction
Tensors are the fundamental data structures in PyTorch, similar to NumPy arrays but with additional capabilities for GPU acceleration and automatic differentiation. Being able to manipulate tensors effectively is crucial for building and training neural networks. In this tutorial, we'll explore the various operations you can perform on PyTorch tensors, from basic arithmetic to more complex transformations.
Basic Tensor Operations
Let's start by importing PyTorch and creating some tensors to work with.
import torch
# Create some tensors
a = torch.tensor([1, 2, 3])
b = torch.tensor([4, 5, 6])
print(f"Tensor a: {a}")
print(f"Tensor b: {b}")
Output:
Tensor a: tensor([1, 2, 3])
Tensor b: tensor([4, 5, 6])
Arithmetic Operations
PyTorch supports all standard arithmetic operations on tensors.
# Addition
c = a + b # Equivalent to torch.add(a, b)
print(f"a + b = {c}")
# Subtraction
d = a - b # Equivalent to torch.sub(a, b)
print(f"a - b = {d}")
# Multiplication (element-wise)
e = a * b # Equivalent to torch.mul(a, b)
print(f"a * b = {e}")
# Division (element-wise)
f = a / b # Equivalent to torch.div(a, b)
print(f"a / b = {f}")
# Power
g = a ** 2 # Equivalent to torch.pow(a, 2)
print(f"a^2 = {g}")
Output:
a + b = tensor([5, 7, 9])
a - b = tensor([-3, -3, -3])
a * b = tensor([4, 10, 18])
a / b = tensor([0.2500, 0.4000, 0.5000])
a^2 = tensor([1, 4, 9])
In-place Operations
PyTorch allows for in-place operations that modify the tensor directly, which can be more memory-efficient. These operations are denoted with a trailing underscore.
# Create a copy of 'a' to preserve the original
a_copy = a.clone()
print(f"Original a_copy: {a_copy}")
# In-place addition
a_copy.add_(b)
print(f"After a_copy.add_(b): {a_copy}")
# Reset
a_copy = a.clone()
# In-place multiplication
a_copy.mul_(2)
print(f"After a_copy.mul_(2): {a_copy}")
Output:
Original a_copy: tensor([1, 2, 3])
After a_copy.add_(b): tensor([5, 7, 9])
After a_copy.mul_(2): tensor([2, 4, 6])
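One caveat worth knowing (a sketch, not part of the example above): in-place operations do not always play well with autograd. Modifying a leaf tensor that requires gradients in place raises an error, so in-place variants are best reserved for tensors outside the computation graph or for updates done under torch.no_grad().
# Hypothetical illustration of the autograd restriction on in-place ops
w = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)
# w.add_(1.0)  # would raise a RuntimeError: a leaf tensor that requires grad
#              # cannot be modified by an in-place operation
with torch.no_grad():
    w.add_(1.0)  # allowed here, since autograd is not tracking inside no_grad()
print(w)  # tensor([2., 3., 4.], requires_grad=True)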
Tensor Reshaping and Manipulation
Reshaping Tensors
You can change the shape of a tensor using various methods:
# Create a tensor
h = torch.tensor([1, 2, 3, 4, 5, 6])
print(f"Original tensor h: {h}, shape: {h.shape}")
# Reshape using .view() method
h_reshaped = h.view(2, 3)
print(f"Reshaped h: {h_reshaped}, shape: {h_reshaped.shape}")
# Reshape using .reshape() method
h_reshaped2 = h.reshape(3, 2)
print(f"Reshaped h (alternate): {h_reshaped2}, shape: {h_reshaped2.shape}")
# Using -1 for automatic dimension calculation
h_reshaped3 = h.view(-1, 2) # Automatically determines the first dimension
print(f"Auto-reshaped h: {h_reshaped3}, shape: {h_reshaped3.shape}")
Output:
Original tensor h: tensor([1, 2, 3, 4, 5, 6]), shape: torch.Size([6])
Reshaped h: tensor([[1, 2, 3],
[4, 5, 6]]), shape: torch.Size([2, 3])
Reshaped h (alternate): tensor([[1, 2],
[3, 4],
[5, 6]]), shape: torch.Size([3, 2])
Auto-reshaped h: tensor([[1, 2],
[3, 4],
[5, 6]]), shape: torch.Size([3, 2])
Note: The main difference between view() and reshape() is that view() returns a tensor that shares the same underlying data with the original tensor, while reshape() may return a copy if necessary.
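To make the data-sharing behaviour concrete, here is a small sketch (the tensors below are new, introduced only for illustration): writes through a view modify the original tensor, and view() fails on a non-contiguous tensor where reshape() still works by copying.
base = torch.arange(6)   # tensor([0, 1, 2, 3, 4, 5])
v = base.view(2, 3)      # shares storage with 'base'
v[0, 0] = 99
print(base)              # tensor([99, 1, 2, 3, 4, 5]) -- the original changed
t = v.t()                # a transposed tensor is not contiguous
# t.view(6)              # would raise a RuntimeError suggesting .reshape() instead
print(t.reshape(6))      # tensor([99, 3, 1, 4, 2, 5]) -- reshape copies when needed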
Squeezing and Unsqueezing
These operations are used to add or remove dimensions of size 1.
# Create a tensor with an extra dimension
i = torch.tensor([[1, 2, 3]])
print(f"Original tensor i: {i}, shape: {i.shape}")
# Remove dimensions of size 1
i_squeezed = i.squeeze()
print(f"Squeezed i: {i_squeezed}, shape: {i_squeezed.shape}")
# Add a dimension
j = torch.tensor([1, 2, 3])
j_unsqueezed = j.unsqueeze(0) # Add dimension at index 0
print(f"Original j: {j}, shape: {j.shape}")
print(f"Unsqueezed j: {j_unsqueezed}, shape: {j_unsqueezed.shape}")
# Add dimension at a different position
j_unsqueezed1 = j.unsqueeze(1) # Add dimension at index 1
print(f"j unsqueezed at dim 1: {j_unsqueezed1}, shape: {j_unsqueezed1.shape}")
Output:
Original tensor i: tensor([[1, 2, 3]]), shape: torch.Size([1, 3])
Squeezed i: tensor([1, 2, 3]), shape: torch.Size([3])
Original j: tensor([1, 2, 3]), shape: torch.Size([3])
Unsqueezed j: tensor([[1, 2, 3]]), shape: torch.Size([1, 3])
j unsqueezed at dim 1: tensor([[1],
[2],
[3]]), shape: torch.Size([3, 1])
Concatenation and Stacking
You can combine multiple tensors using concatenation or stacking operations.
# Concatenate tensors along a dimension
tensor1 = torch.tensor([[1, 2], [3, 4]])
tensor2 = torch.tensor([[5, 6], [7, 8]])
cat_dim0 = torch.cat((tensor1, tensor2), dim=0)
cat_dim1 = torch.cat((tensor1, tensor2), dim=1)
print(f"Tensor 1:\n{tensor1}")
print(f"Tensor 2:\n{tensor2}")
print(f"Concatenated along dim 0:\n{cat_dim0}")
print(f"Concatenated along dim 1:\n{cat_dim1}")
# Stack tensors to create a new dimension
stacked = torch.stack([tensor1, tensor2])
print(f"Stacked tensors:\n{stacked}")
print(f"Stacked shape: {stacked.shape}")
Output:
Tensor 1:
tensor([[1, 2],
[3, 4]])
Tensor 2:
tensor([[5, 6],
[7, 8]])
Concatenated along dim 0:
tensor([[1, 2],
[3, 4],
[5, 6],
[7, 8]])
Concatenated along dim 1:
tensor([[1, 2, 5, 6],
[3, 4, 7, 8]])
Stacked tensors:
tensor([[[1, 2],
[3, 4]],
[[5, 6],
[7, 8]]])
Stacked shape: torch.Size([2, 2, 2])
Tensor Indexing and Slicing
PyTorch tensors support similar indexing and slicing operations as NumPy arrays.
# Create a tensor
k = torch.tensor([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
print(f"Original tensor k:\n{k}")
# Accessing elements
print(f"First row: {k[0]}")
print(f"Element at position (1, 2): {k[1, 2]}")
# Slicing
print(f"First two rows:\n{k[:2]}")
print(f"Last two columns:\n{k[:, 1:]}")
print(f"Sub-matrix (top-left 2x2):\n{k[:2, :2]}")
Output:
Original tensor k:
tensor([[1, 2, 3],
[4, 5, 6],
[7, 8, 9]])
First row: tensor([1, 2, 3])
Element at position (1, 2): 6
First two rows:
tensor([[1, 2, 3],
[4, 5, 6]])
Last two columns:
tensor([[2, 3],
[5, 6],
[8, 9]])
Sub-matrix (top-left 2x2):
tensor([[1, 2],
[4, 5]])
Advanced Indexing
PyTorch also supports boolean and integer array indexing.
# Boolean indexing
mask = k > 5
print(f"Mask where k > 5:\n{mask}")
print(f"Elements where k > 5: {k[mask]}")
# Integer array indexing
indices = torch.tensor([0, 2])
print(f"First and third rows:\n{k[indices]}")
Output:
Mask where k > 5:
tensor([[False, False, False],
[False, False, True],
[True, True, True]])
Elements where k > 5: tensor([6, 7, 8, 9])
First and third rows:
tensor([[1, 2, 3],
[7, 8, 9]])
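Boolean masks can also appear on the left-hand side of an assignment to update elements selectively; a quick sketch (using a clone so the original k is preserved):
k_clipped = k.clone()
k_clipped[k_clipped > 5] = 5   # clamp every element above 5 down to 5
print(k_clipped)
# tensor([[1, 2, 3],
#         [4, 5, 5],
#         [5, 5, 5]])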
Mathematical Operations
PyTorch provides a variety of mathematical functions for tensor operations.
Statistical Functions
# Create a tensor
m = torch.tensor([[1, 2, 3], [4, 5, 6]])
print(f"Tensor m:\n{m}")
# Sum
print(f"Sum of all elements: {m.sum()}")
print(f"Sum along rows (dim=0):\n{m.sum(dim=0)}")
print(f"Sum along columns (dim=1): {m.sum(dim=1)}")
# Mean (mean() requires a floating-point tensor, so cast the integer tensor first)
print(f"Mean of all elements: {m.float().mean()}")
print(f"Mean along rows (dim=0):\n{m.float().mean(dim=0)}")
# Min and Max
print(f"Minimum value: {m.min()}")
print(f"Maximum value: {m.max()}")
print(f"Minimum values along rows (dim=0):\n{m.min(dim=0)}")
Output:
Tensor m:
tensor([[1, 2, 3],
[4, 5, 6]])
Sum of all elements: 21
Sum along rows (dim=0):
tensor([5, 7, 9])
Sum along columns (dim=1): tensor([6, 15])
Mean of all elements: 3.5
Mean along rows (dim=0):
tensor([2.5000, 3.5000, 4.5000])
Minimum value: 1
Maximum value: 6
Minimum values along rows (dim=0):
torch.return_types.min(
values=tensor([1, 2, 3]),
indices=tensor([0, 0, 0]))
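Reductions such as min() and max() with a dim argument return both the values and their indices, which can be unpacked directly; for instance:
min_values, min_indices = m.min(dim=0)
print(f"Column minima: {min_values}")                # tensor([1, 2, 3])
print(f"Row index of each minimum: {min_indices}")   # tensor([0, 0, 0])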
Matrix Operations
PyTorch supports various matrix operations essential for deep learning.
# Matrix multiplication
p = torch.tensor([[1, 2], [3, 4]])
q = torch.tensor([[5, 6], [7, 8]])
print(f"Tensor p:\n{p}")
print(f"Tensor q:\n{q}")
# Matrix multiplication using matmul
mat_mul = torch.matmul(p, q)
print(f"Matrix multiplication (p @ q):\n{mat_mul}")
# Element-wise multiplication
element_mul = p * q
print(f"Element-wise multiplication (p * q):\n{element_mul}")
# Transpose
p_transpose = p.t()
print(f"Transpose of p:\n{p_transpose}")
# Dot product
vec1 = torch.tensor([1, 2, 3])
vec2 = torch.tensor([4, 5, 6])
dot_product = torch.dot(vec1, vec2)
print(f"Dot product of vectors: {dot_product}")
Output:
Tensor p:
tensor([[1, 2],
[3, 4]])
Tensor q:
tensor([[5, 6],
[7, 8]])
Matrix multiplication (p @ q):
tensor([[19, 22],
[43, 50]])
Element-wise multiplication (p * q):
tensor([[ 5, 12],
[21, 32]])
Transpose of p:
tensor([[1, 3],
[2, 4]])
Dot product of vectors: 32
Practical Examples
Let's explore some practical applications of tensor operations in deep learning contexts.
Example 1: Data Normalization
Normalizing input data is a common preprocessing step in deep learning.
# Create a batch of sample data
batch_data = torch.tensor([
[0.5, 1.0, 2.0],
[0.1, 0.8, 1.5],
[0.3, 1.2, 0.9]
])
print(f"Original batch data:\n{batch_data}")
# Compute mean and standard deviation along the batch dimension (dim=0)
batch_mean = batch_data.mean(dim=0)
batch_std = batch_data.std(dim=0)
print(f"Batch mean: {batch_mean}")
print(f"Batch standard deviation: {batch_std}")
# Normalize the data
normalized_data = (batch_data - batch_mean) / batch_std
print(f"Normalized data:\n{normalized_data}")
# Verify that the normalized data has mean ≈ 0 and std ≈ 1
print(f"Normalized mean: {normalized_data.mean(dim=0)}")
print(f"Normalized std: {normalized_data.std(dim=0)}")
Output:
Original batch data:
tensor([[0.5000, 1.0000, 2.0000],
[0.1000, 0.8000, 1.5000],
[0.3000, 1.2000, 0.9000]])
Batch mean: tensor([0.3000, 1.0000, 1.4667])
Batch standard deviation: tensor([0.2000, 0.2000, 0.5508])
Normalized data:
tensor([[ 1.0000, 0.0000, 0.9682],
[-1.0000, -1.0000, 0.0606],
[ 0.0000, 1.0000, -1.0288]])
Normalized mean: tensor([-0.0000, 0.0000, 0.0000])
Normalized std: tensor([1.0000, 1.0000, 1.0000])
Example 2: Computing Gradients with Tensor Operations
Let's simulate a simple gradient computation for a linear regression model.
# Create inputs and weights
x = torch.tensor([1.0, 2.0, 3.0, 4.0], requires_grad=True)
w = torch.tensor([0.5, 0.3, 0.2, 0.1], requires_grad=True)
b = torch.tensor([0.5], requires_grad=True)
# Forward pass
y_pred = torch.sum(w * x) + b
print(f"Prediction: {y_pred}")
# Define target and loss
y_target = torch.tensor([2.0])
loss = (y_pred - y_target) ** 2
print(f"Loss: {loss}")
# Backward pass (compute gradients)
loss.backward()
# Access gradients
print(f"Gradient of w: {w.grad}")
print(f"Gradient of x: {x.grad}")
print(f"Gradient of b: {b.grad}")
Output:
Prediction: tensor([2.6000], grad_fn=<AddBackward0>)
Loss: tensor([0.3600], grad_fn=<PowBackward0>)
Gradient of w: tensor([1.2000, 2.4000, 3.6000, 4.8000])
Gradient of x: tensor([0.6000, 0.3600, 0.2400, 0.1200])
Gradient of b: tensor([1.2000])
Example 3: Implementing a Simple Neural Network Layer
Let's implement a basic neural network layer using tensor operations.
# Define a simple fully connected layer
def linear_layer(inputs, weights, bias):
return torch.matmul(inputs, weights) + bias
# Create inputs, weights, and bias
batch_size = 3
input_size = 4
output_size = 2
inputs = torch.randn(batch_size, input_size) # Random input data
weights = torch.randn(input_size, output_size) # Random weights
bias = torch.randn(output_size) # Random bias
print(f"Inputs shape: {inputs.shape}")
print(f"Weights shape: {weights.shape}")
print(f"Bias shape: {bias.shape}")
# Apply the layer
outputs = linear_layer(inputs, weights, bias)
print(f"Layer output:\n{outputs}")
print(f"Output shape: {outputs.shape}")
# Apply activation function (ReLU)
activated = torch.relu(outputs)
print(f"After ReLU activation:\n{activated}")
Output (Note: actual values will vary due to random initialization):
Inputs shape: torch.Size([3, 4])
Weights shape: torch.Size([4, 2])
Bias shape: torch.Size([2])
Layer output:
tensor([[-0.1543, 1.2706],
[-1.7396, -0.0524],
[-1.0339, 0.2408]])
Output shape: torch.Size([3, 2])
After ReLU activation:
tensor([[0.0000, 1.2706],
[0.0000, 0.0000],
[0.0000, 0.2408]])
Summary
In this tutorial, we've covered the fundamental tensor operations in PyTorch:
- Basic Arithmetic Operations: Addition, subtraction, multiplication, division, and power operations
- In-place Operations: Memory-efficient modifications with trailing underscore methods
- Tensor Reshaping and Manipulation: Reshaping, squeezing/unsqueezing, concatenation, and stacking
- Indexing and Slicing: Accessing and manipulating tensor elements
- Mathematical Operations: Statistical functions and matrix operations
- Practical Examples: Data normalization, gradient computation, and implementing a neural network layer
These operations form the foundation of deep learning computations in PyTorch. By mastering these techniques, you'll be well-equipped to implement complex neural network architectures and training procedures.
Exercises
- Create a 3x3 tensor of random numbers and normalize it to have a mean of 0 and a standard deviation of 1.
- Implement a function that takes a batch of images (represented as tensors) and performs a horizontal flip on each image.
- Write a function that computes the softmax activation of a given tensor along a specified dimension.
- Create a function that performs batch matrix multiplication between two sets of tensors.
- Implement a simple 2D convolution operation using only basic tensor operations (without using torch.nn.functional.conv2d).
By completing these exercises, you'll strengthen your understanding and practical skills with PyTorch tensor operations.