PyTorch Tensor Operations
Introduction
Tensors are the fundamental data structures in PyTorch, similar to NumPy arrays but with additional capabilities for GPU acceleration and automatic differentiation. Being able to manipulate tensors effectively is crucial for building and training neural networks. In this tutorial, we'll explore the various operations you can perform on PyTorch tensors, from basic arithmetic to more complex transformations.
Basic Tensor Operations
Let's start by importing PyTorch and creating some tensors to work with.
import torch
# Create some tensors
a = torch.tensor([1, 2, 3])
b = torch.tensor([4, 5, 6])
print(f"Tensor a: {a}")
print(f"Tensor b: {b}")
Output:
Tensor a: tensor([1, 2, 3])
Tensor b: tensor([4, 5, 6])
Arithmetic Operations
PyTorch supports all standard arithmetic operations on tensors.
# Addition
c = a + b # Equivalent to torch.add(a, b)
print(f"a + b = {c}")
# Subtraction
d = a - b # Equivalent to torch.sub(a, b)
print(f"a - b = {d}")
# Multiplication (element-wise)
e = a * b # Equivalent to torch.mul(a, b)
print(f"a * b = {e}")
# Division (element-wise)
f = a / b # Equivalent to torch.div(a, b)
print(f"a / b = {f}")
# Power
g = a ** 2 # Equivalent to torch.pow(a, 2)
print(f"a^2 = {g}")
Output:
a + b = tensor([5, 7, 9])
a - b = tensor([-3, -3, -3])
a * b = tensor([4, 10, 18])
a / b = tensor([0.2500, 0.4000, 0.5000])
a^2 = tensor([1, 4, 9])
In-place Operations
PyTorch allows for in-place operations that modify the tensor directly, which can be more memory-efficient. These operations are denoted with a trailing underscore.
# Create a copy of 'a' to preserve the original
a_copy = a.clone()
print(f"Original a_copy: {a_copy}")
# In-place addition
a_copy.add_(b)
print(f"After a_copy.add_(b): {a_copy}")
# Reset
a_copy = a.clone()
# In-place multiplication
a_copy.mul_(2)
print(f"After a_copy.mul_(2): {a_copy}")
Output:
Original a_copy: tensor([1, 2, 3])
After a_copy.add_(b): tensor([5, 7, 9])
After a_copy.mul_(2): tensor([2, 4, 6])
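One caveat worth knowing (a sketch, not part of the example above): in-place operations do not always play well with autograd. Modifying a leaf tensor that requires gradients in place raises an error, so in-place variants are best reserved for tensors outside the computation graph or for updates done under torch.no_grad().
# Hypothetical illustration of the autograd restriction on in-place ops
w = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)
# w.add_(1.0)  # would raise a RuntimeError: a leaf tensor that requires grad
#              # cannot be modified by an in-place operation
with torch.no_grad():
    w.add_(1.0)  # allowed here, since autograd is not tracking inside no_grad()
print(w)  # tensor([2., 3., 4.], requires_grad=True)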
Tensor Reshaping and Manipulation
Reshaping Tensors
You can change the shape of a tensor using various methods:
# Create a tensor
h = torch.tensor([1, 2, 3, 4, 5, 6])
print(f"Original tensor h: {h}, shape: {h.shape}")
# Reshape using .view() method
h_reshaped = h.view(2, 3)
print(f"Reshaped h: {h_reshaped}, shape: {h_reshaped.shape}")
# Reshape using .reshape() method
h_reshaped2 = h.reshape(3, 2)
print(f"Reshaped h (alternate): {h_reshaped2}, shape: {h_reshaped2.shape}")
# Using -1 for automatic dimension calculation
h_reshaped3 = h.view(-1, 2) # Automatically determines the first dimension
print(f"Auto-reshaped h: {h_reshaped3}, shape: {h_reshaped3.shape}")
Output:
Original tensor h: tensor([1, 2, 3, 4, 5, 6]), shape: torch.Size([6])
Reshaped h: tensor([[1, 2, 3],
[4, 5, 6]]), shape: torch.Size([2, 3])
Reshaped h (alternate): tensor([[1, 2],
[3, 4],
[5, 6]]), shape: torch.Size([3, 2])
Auto-reshaped h: tensor([[1, 2],
[3, 4],
[5, 6]]), shape: torch.Size([3, 2])
Note: The main difference between view() and reshape() is that view() returns a tensor that shares the same underlying data with the original tensor, while reshape() may return a copy if necessary.
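To make the data-sharing behaviour concrete, here is a small sketch (the tensors below are new, introduced only for illustration): writes through a view modify the original tensor, and view() fails on a non-contiguous tensor where reshape() still works by copying.
base = torch.arange(6)   # tensor([0, 1, 2, 3, 4, 5])
v = base.view(2, 3)      # shares storage with 'base'
v[0, 0] = 99
print(base)              # tensor([99, 1, 2, 3, 4, 5]) -- the original changed
t = v.t()                # a transposed tensor is not contiguous
# t.view(6)              # would raise a RuntimeError suggesting .reshape() instead
print(t.reshape(6))      # tensor([99, 3, 1, 4, 2, 5]) -- reshape copies when needed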
Squeezing and Unsqueezing
These operations are used to add or remove dimensions of size 1.
# Create a tensor with an extra dimension
i = torch.tensor([[1, 2, 3]])
print(f"Original tensor i: {i}, shape: {i.shape}")
# Remove dimensions of size 1
i_squeezed = i.squeeze()
print(f"Squeezed i: {i_squeezed}, shape: {i_squeezed.shape}")
# Add a dimension
j = torch.tensor([1, 2, 3])
j_unsqueezed = j.unsqueeze(0) # Add dimension at index 0
print(f"Original j: {j}, shape: {j.shape}")
print(f"Unsqueezed j: {j_unsqueezed}, shape: {j_unsqueezed.shape}")
# Add dimension at a different position
j_unsqueezed1 = j.unsqueeze(1) # Add dimension at index 1
print(f"j unsqueezed at dim 1: {j_unsqueezed1}, shape: {j_unsqueezed1.shape}")
Output:
Original tensor i: tensor([[1, 2, 3]]), shape: torch.Size([1, 3])
Squeezed i: tensor([1, 2, 3]), shape: torch.Size([3])
Original j: tensor([1, 2, 3]), shape: torch.Size([3])
Unsqueezed j: tensor([[1, 2, 3]]), shape: torch.Size([1, 3])
j unsqueezed at dim 1: tensor([[1],
[2],
[3]]), shape: torch.Size([3, 1])
Concatenation and Stacking
You can combine multiple tensors using concatenation or stacking operations.
# Concatenate tensors along a dimension
tensor1 = torch.tensor([[1, 2], [3, 4]])
tensor2 = torch.tensor([[5, 6], [7, 8]])
cat_dim0 = torch.cat((tensor1, tensor2), dim=0)
cat_dim1 = torch.cat((tensor1, tensor2), dim=1)
print(f"Tensor 1:\n{tensor1}")
print(f"Tensor 2:\n{tensor2}")
print(f"Concatenated along dim 0:\n{cat_dim0}")
print(f"Concatenated along dim 1:\n{cat_dim1}")
# Stack tensors to create a new dimension
stacked = torch.stack([tensor1, tensor2])
print(f"Stacked tensors:\n{stacked}")
print(f"Stacked shape: {stacked.shape}")
Output:
Tensor 1:
tensor([[1, 2],
[3, 4]])
Tensor 2:
tensor([[5, 6],
[7, 8]])
Concatenated along dim 0:
tensor([[1, 2],
[3, 4],
[5, 6],
[7, 8]])
Concatenated along dim 1:
tensor([[1, 2, 5, 6],
[3, 4, 7, 8]])
Stacked tensors:
tensor([[[1, 2],
[3, 4]],
[[5, 6],
[7, 8]]])
Stacked shape: torch.Size([2, 2, 2])
Tensor Indexing and Slicing
PyTorch tensors support similar indexing and slicing operations as NumPy arrays.
# Create a tensor
k = torch.tensor([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
print(f"Original tensor k:\n{k}")
# Accessing elements
print(f"First row: {k[0]}")
print(f"Element at position (1, 2): {k[1, 2]}")
# Slicing
print(f"First two rows:\n{k[:2]}")
print(f"Last two columns:\n{k[:, 1:]}")
print(f"Sub-matrix (top-left 2x2):\n{k[:2, :2]}")
Output:
Original tensor k:
tensor([[1, 2, 3],
[4, 5, 6],
[7, 8, 9]])
First row: tensor([1, 2, 3])
Element at position (1, 2): 6
First two rows:
tensor([[1, 2, 3],
[4, 5, 6]])
Last two columns:
tensor([[2, 3],
[5, 6],
[8, 9]])
Sub-matrix (top-left 2x2):
tensor([[1, 2],
[4, 5]])
Advanced Indexing
PyTorch also supports boolean and integer array indexing.
# Boolean indexing
mask = k > 5
print(f"Mask where k > 5:\n{mask}")
print(f"Elements where k > 5: {k[mask]}")
# Integer array indexing
indices = torch.tensor([0, 2])
print(f"First and third rows:\n{k[indices]}")
Output:
Mask where k > 5:
tensor([[False, False, False],
[False, False, True],
[True, True, True]])
Elements where k > 5: tensor([6, 7, 8, 9])
First and third rows:
tensor([[1, 2, 3],
[7, 8, 9]])
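Boolean masks can also appear on the left-hand side of an assignment to update elements selectively; a quick sketch (using a clone so the original k is preserved):
k_clipped = k.clone()
k_clipped[k_clipped > 5] = 5   # clamp every element above 5 down to 5
print(k_clipped)
# tensor([[1, 2, 3],
#         [4, 5, 5],
#         [5, 5, 5]])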
Mathematical Operations
PyTorch provides a variety of mathematical functions for tensor operations.
Statistical Functions
# Create a tensor
m = torch.tensor([[1, 2, 3], [4, 5, 6]])
print(f"Tensor m:\n{m}")
# Sum
print(f"Sum of all elements: {m.sum()}")
print(f"Sum along rows (dim=0):\n{m.sum(dim=0)}")
print(f"Sum along columns (dim=1): {m.sum(dim=1)}")
# Mean (mean() requires a floating-point tensor, so cast the integer tensor first)
print(f"Mean of all elements: {m.float().mean()}")
print(f"Mean along rows (dim=0):\n{m.float().mean(dim=0)}")
# Min and Max
print(f"Minimum value: {m.min()}")
print(f"Maximum value: {m.max()}")
print(f"Minimum values along rows (dim=0):\n{m.min(dim=0)}")
Output:
Tensor m:
tensor([[1, 2, 3],
[4, 5, 6]])
Sum of all elements: 21
Sum along rows (dim=0):
tensor([5, 7, 9])
Sum along columns (dim=1): tensor([6, 15])
Mean of all elements: 3.5
Mean along rows (dim=0):
tensor([2.5000, 3.5000, 4.5000])
Minimum value: 1
Maximum value: 6
Minimum values along rows (dim=0):
torch.return_types.min(
values=tensor([1, 2, 3]),
indices=tensor([0, 0, 0]))
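Reductions such as min() and max() with a dim argument return both the values and their indices, which can be unpacked directly; for instance:
min_values, min_indices = m.min(dim=0)
print(f"Column minima: {min_values}")                # tensor([1, 2, 3])
print(f"Row index of each minimum: {min_indices}")   # tensor([0, 0, 0])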
Matrix Operations
PyTorch supports various matrix operations essential for deep learning.
# Matrix multiplication
p = torch.tensor([[1, 2], [3, 4]])
q = torch.tensor([[5, 6], [7, 8]])
print(f"Tensor p:\n{p}")
print(f"Tensor q:\n{q}")
# Matrix multiplication using matmul
mat_mul = torch.matmul(p, q)
print(f"Matrix multiplication (p @ q):\n{mat_mul}")
# Element-wise multiplication
element_mul = p * q
print(f"Element-wise multiplication (p * q):\n{element_mul}")
# Transpose
p_transpose = p.t()
print(f"Transpose of p:\n{p_transpose}")
# Dot product
vec1 = torch.tensor([1, 2, 3])
vec2 = torch.tensor([4, 5, 6])
dot_product = torch.dot(vec1, vec2)
print(f"Dot product of vectors: {dot_product}")
Output:
Tensor p:
tensor([[1, 2],
[3, 4]])
Tensor q:
tensor([[5, 6],
[7, 8]])
Matrix multiplication (p @ q):
tensor([[19, 22],
[43, 50]])
Element-wise multiplication (p * q):
tensor([[ 5, 12],
[21, 32]])
Transpose of p:
tensor([[1, 3],
[2, 4]])
Dot product of vectors: 32
Practical Examples
Let's explore some practical applications of tensor operations in deep learning contexts.
Example 1: Data Normalization
Normalizing input data is a common preprocessing step in deep learning.
# Create a batch of sample data
batch_data = torch.tensor([
[0.5, 1.0, 2.0],
[0.1, 0.8, 1.5],
[0.3, 1.2, 0.9]
])
print(f"Original batch data:\n{batch_data}")
# Compute mean and standard deviation along the batch dimension (dim=0)
batch_mean = batch_data.mean(dim=0)
batch_std = batch_data.std(dim=0)
print(f"Batch mean: {batch_mean}")
print(f"Batch standard deviation: {batch_std}")
# Normalize the data
normalized_data = (batch_data - batch_mean) / batch_std
print(f"Normalized data:\n{normalized_data}")
# Verify that the normalized data has mean ≈ 0 and std ≈ 1
print(f"Normalized mean: {normalized_data.mean(dim=0)}")
print(f"Normalized std: {normalized_data.std(dim=0)}")
Output:
Original batch data:
tensor([[0.5000, 1.0000, 2.0000],
[0.1000, 0.8000, 1.5000],
[0.3000, 1.2000, 0.9000]])
Batch mean: tensor([0.3000, 1.0000, 1.4667])
Batch standard deviation: tensor([0.2000, 0.2000, 0.5508])
Normalized data:
tensor([[ 1.0000, 0.0000, 0.9682],
[-1.0000, -1.0000, 0.0606],
[ 0.0000, 1.0000, -1.0288]])
Normalized mean: tensor([-0.0000, 0.0000, 0.0000])
Normalized std: tensor([1.0000, 1.0000, 1.0000])
Example 2: Computing Gradients with Tensor Operations
Let's simulate a simple gradient computation for a linear regression model.
# Create inputs and weights
x = torch.tensor([1.0, 2.0, 3.0, 4.0], requires_grad=True)
w = torch.tensor([0.5, 0.3, 0.2, 0.1], requires_grad=True)
b = torch.tensor([0.5], requires_grad=True)
# Forward pass
y_pred = torch.sum(w * x) + b
print(f"Prediction: {y_pred}")
# Define target and loss
y_target = torch.tensor([2.0])
loss = (y_pred - y_target) ** 2
print(f"Loss: {loss}")
# Backward pass (compute gradients)
loss.backward()
# Access gradients
print(f"Gradient of w: {w.grad}")
print(f"Gradient of x: {x.grad}")
print(f"Gradient of b: {b.grad}")
Output:
Prediction: tensor([2.6000], grad_fn=<AddBackward0>)
Loss: tensor([0.3600], grad_fn=<PowBackward0>)
Gradient of w: tensor([1.2000, 2.4000, 3.6000, 4.8000])
Gradient of x: tensor([0.6000, 0.3600, 0.2400, 0.1200])
Gradient of b: tensor([1.2000])
Example 3: Implementing a Simple Neural Network Layer
Let's implement a basic neural network layer using tensor operations.
# Define a simple fully connected layer
def linear_layer(inputs, weights, bias):
return torch.matmul(inputs, weights) + bias
# Create inputs, weights, and bias
batch_size = 3
input_size = 4
output_size = 2
inputs = torch.randn(batch_size, input_size) # Random input data
weights = torch.randn(input_size, output_size) # Random weights
bias = torch.randn(output_size) # Random bias
print(f"Inputs shape: {inputs.shape}")
print(f"Weights shape: {weights.shape}")
print(f"Bias shape: {bias.shape}")
# Apply the layer
outputs = linear_layer(inputs, weights, bias)
print(f"Layer output:\n{outputs}")
print(f"Output shape: {outputs.shape}")
# Apply activation function (ReLU)
activated = torch.relu(outputs)
print(f"After ReLU activation:\n{activated}")
Output (Note: actual values will vary due to random initialization):
Inputs shape: torch.Size([3, 4])
Weights shape: torch.Size([4, 2])
Bias shape: torch.Size([2])
Layer output:
tensor([[-0.1543, 1.2706],
[-1.7396, -0.0524],
[-1.0339, 0.2408]])
Output shape: torch.Size([3, 2])
After ReLU activation:
tensor([[0.0000, 1.2706],
[0.0000, 0.0000],
[0.0000, 0.2408]])
Summary
In this tutorial, we've covered the fundamental tensor operations in PyTorch:
- Basic Arithmetic Operations: Addition, subtraction, multiplication, division, and power operations
- In-place Operations: Memory-efficient modifications with trailing underscore methods
- Tensor Reshaping and Manipulation: Reshaping, squeezing/unsqueezing, concatenation, and stacking
- Indexing and Slicing: Accessing and manipulating tensor elements
- Mathematical Operations: Statistical functions and matrix operations
- Practical Examples: Data normalization, gradient computation, and implementing a neural network layer
These operations form the foundation of deep learning computations in PyTorch. By mastering these techniques, you'll be well-equipped to implement complex neural network architectures and training procedures.
Exercises
- Create a 3x3 tensor of random numbers and normalize it to have a mean of 0 and a standard deviation of 1.
- Implement a function that takes a batch of images (represented as tensors) and performs a horizontal flip on each image.
- Write a function that computes the softmax activation of a given tensor along a specified dimension.
- Create a function that performs batch matrix multiplication between two sets of tensors.
- Implement a simple 2D convolution operation using only basic tensor operations (without using torch.nn.functional.conv2d).
By completing these exercises, you'll strengthen your understanding and practical skills with PyTorch tensor operations.