PyTorch Tensor Creation
Tensors are the fundamental building blocks of PyTorch, and understanding how to create them is essential for any deep learning project. In this guide, we'll explore the various ways to create tensors in PyTorch, from simple initialization to converting data from other sources.
Introduction to PyTorch Tensors
Before diving into tensor creation, let's understand what tensors are. In PyTorch, tensors are multi-dimensional arrays similar to NumPy arrays but with additional features that make them suitable for deep learning:
- Tensors can be used on GPUs to accelerate numerical computations
- Tensors can track gradients for backpropagation when created with requires_grad=True
- Tensors integrate seamlessly with PyTorch's neural network library
Let's start by importing PyTorch:
import torch
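As a minimal sketch of the gradient-tracking feature listed above: tracking is opt-in, requested with requires_grad=True. The variable names here are illustrative only.

```python
import torch

# Gradient tracking is opt-in: request it with requires_grad=True.
x = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)
plain = torch.tensor([1.0, 2.0, 3.0])  # default: requires_grad=False

# y = x0^2 + x1^2 + x2^2, so dy/dx = 2 * x
y = (x ** 2).sum()
y.backward()  # populates x.grad

print(f"plain.requires_grad: {plain.requires_grad}")
print(f"x.grad: {x.grad}")
```

Only floating-point (and complex) tensors can require gradients, which is why the values are written as floats.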
Basic Tensor Creation Methods
Empty Tensors
To create a tensor with uninitialized values:
# Create an empty tensor of size 3x3
empty_tensor = torch.empty(3, 3)
print(empty_tensor)
Output:
tensor([[4.6536e-05, 0.0000e+00, 0.0000e+00],
[0.0000e+00, 0.0000e+00, 0.0000e+00],
[0.0000e+00, 0.0000e+00, 0.0000e+00]])
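The values you see will differ: torch.empty() allocates memory without initializing it, so the tensor contains whatever happened to be in that memory. Always write to an empty tensor before reading from it.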
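A short sketch of how an uninitialized tensor is typically used: allocate it, then overwrite every element before reading. It also shows torch.full(), which allocates and fills in one call. The names buffer and sevens are illustrative.

```python
import torch

# Allocate without initializing, then overwrite every element before reading.
buffer = torch.empty(3, 3)
buffer.fill_(0.5)  # in-place fill; the trailing underscore marks in-place ops

# torch.full allocates and fills in a single step.
sevens = torch.full((2, 3), 7.0)

print(f"Filled buffer:\n{buffer}")
print(f"Full tensor:\n{sevens}")
```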
Zero and One Tensors
Creating tensors filled with zeros or ones:
# Create a tensor of zeros
zeros_tensor = torch.zeros(2, 3)
print(f"Zeros tensor:\n{zeros_tensor}")
# Create a tensor of ones
ones_tensor = torch.ones(2, 3)
print(f"Ones tensor:\n{ones_tensor}")
Output:
Zeros tensor:
tensor([[0., 0., 0.],
[0., 0., 0.]])
Ones tensor:
tensor([[1., 1., 1.],
[1., 1., 1.]])
Random Tensors
Generate tensors with random values:
# Random values drawn uniformly from the interval [0, 1)
random_tensor = torch.rand(2, 2)
print(f"Random tensor (0-1):\n{random_tensor}")
# Random values from a normal distribution (mean=0, std=1)
randn_tensor = torch.randn(2, 2)
print(f"Random normal tensor:\n{randn_tensor}")
Output:
Random tensor (0-1):
tensor([[0.8236, 0.5928],
[0.4565, 0.7828]])
Random normal tensor:
tensor([[ 0.1863, -1.5903],
[-0.0840, 0.4738]])
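Because these values change on every run, it can help to fix the random seed when you need reproducible results. The sketch below assumes a seed of 0 (any integer works) and also shows torch.randint() for random integers.

```python
import torch

# Seeding the generator makes the random sequence repeatable.
torch.manual_seed(0)
first = torch.rand(2, 2)

torch.manual_seed(0)
second = torch.rand(2, 2)

print(f"Same values after re-seeding: {torch.equal(first, second)}")

# Random integers: low is inclusive, high is exclusive (here 1..6).
dice = torch.randint(1, 7, (5,))
print(f"Dice rolls: {dice}")
```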
Creating Tensors with Specific Values
# Create a tensor with specific values
tensor_from_list = torch.tensor([1, 2, 3, 4])
print(f"Tensor from list: {tensor_from_list}")
# Create a 2D tensor from a nested list
matrix_tensor = torch.tensor([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
print(f"Matrix tensor:\n{matrix_tensor}")
Output:
Tensor from list: tensor([1, 2, 3, 4])
Matrix tensor:
tensor([[1, 2, 3],
[4, 5, 6],
[7, 8, 9]])
Range and Sequential Tensors
Create tensors with sequential values:
# Create a tensor with values from 0 to 9
range_tensor = torch.arange(10)
print(f"Range tensor: {range_tensor}")
# Create a tensor with 5 values evenly spaced between 0 and 1
linspace_tensor = torch.linspace(0, 1, 5)
print(f"Linspace tensor: {linspace_tensor}")
# Create a tensor with values from 1 to 9 with step 2
step_tensor = torch.arange(1, 10, 2)
print(f"Step tensor: {step_tensor}")
Output:
Range tensor: tensor([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
Linspace tensor: tensor([0.0000, 0.2500, 0.5000, 0.7500, 1.0000])
Step tensor: tensor([1, 3, 5, 7, 9])
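One detail worth noting in the outputs above: torch.arange() excludes its end value, while torch.linspace() includes both endpoints. A small sketch with illustrative values:

```python
import torch

# arange: end value is excluded, so 10 does not appear.
stepped = torch.arange(0, 10, 5)

# linspace: both endpoints are included.
spaced = torch.linspace(0, 10, 3)

print(f"arange(0, 10, 5): {stepped.tolist()}")
print(f"linspace(0, 10, 3): {spaced.tolist()}")
print(f"dtypes: {stepped.dtype}, {spaced.dtype}")
```

With integer arguments, arange() returns an integer tensor, whereas linspace() returns a floating-point tensor by default.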
Specifying Data Types
PyTorch allows you to specify the data type (dtype) of a tensor:
# Integer tensor
int_tensor = torch.tensor([1, 2, 3], dtype=torch.int32)
print(f"Integer tensor: {int_tensor}, Type: {int_tensor.dtype}")
# Float tensor
float_tensor = torch.tensor([1, 2, 3], dtype=torch.float32)
print(f"Float tensor: {float_tensor}, Type: {float_tensor.dtype}")
# Double precision tensor
double_tensor = torch.tensor([1, 2, 3], dtype=torch.float64)
print(f"Double tensor: {double_tensor}, Type: {double_tensor.dtype}")
# Boolean tensor
bool_tensor = torch.tensor([True, False, True], dtype=torch.bool)
print(f"Boolean tensor: {bool_tensor}, Type: {bool_tensor.dtype}")
Output:
Integer tensor: tensor([1, 2, 3], dtype=torch.int32), Type: torch.int32
Float tensor: tensor([1., 2., 3.]), Type: torch.float32
Double tensor: tensor([1., 2., 3.], dtype=torch.float64), Type: torch.float64
Boolean tensor: tensor([ True, False, True]), Type: torch.bool
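When no dtype is given, torch.tensor() infers one from the data, and an existing tensor can be converted afterwards. The sketch below assumes PyTorch's default floating-point dtype (float32) has not been changed.

```python
import torch

# dtype is inferred from the Python values when not specified.
ints = torch.tensor([1, 2, 3])          # Python ints -> torch.int64
floats = torch.tensor([1.0, 2.0, 3.0])  # Python floats -> torch.float32

# .to() returns a converted tensor; the original keeps its dtype.
converted = ints.to(torch.float32)

print(f"ints: {ints.dtype}")
print(f"floats: {floats.dtype}")
print(f"converted: {converted.dtype}, original still: {ints.dtype}")
```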
Converting from Other Data Types
From NumPy Arrays
NumPy arrays can be directly converted to PyTorch tensors:
import numpy as np
# Create a NumPy array
numpy_array = np.array([[1, 2], [3, 4]])
print(f"NumPy array:\n{numpy_array}")
# Convert to PyTorch tensor
tensor_from_numpy = torch.from_numpy(numpy_array)
print(f"Tensor from NumPy:\n{tensor_from_numpy}")
# Note: These tensors share the same memory!
numpy_array[0, 0] = 100
print(f"Modified NumPy array:\n{numpy_array}")
print(f"Tensor after NumPy modification:\n{tensor_from_numpy}")
Output:
NumPy array:
[[1 2]
[3 4]]
Tensor from NumPy:
tensor([[1, 2],
[3, 4]])
Modified NumPy array:
[[100 2]
[ 3 4]]
Tensor after NumPy modification:
tensor([[100, 2],
[ 3, 4]])
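If the shared memory is not what you want, torch.tensor() copies the data instead. The following sketch contrasts the two and shows the reverse conversion with .numpy(); the variable names are illustrative.

```python
import numpy as np
import torch

source = np.array([1, 2, 3])

shared = torch.from_numpy(source)  # shares memory with the array
copied = torch.tensor(source)      # makes an independent copy

source[0] = 100

print(f"Shared tensor sees the change: {shared[0].item()}")
print(f"Copied tensor is unaffected: {copied[0].item()}")

# Converting back: .numpy() on a CPU tensor also shares memory.
back = shared.numpy()
print(f"Round trip shares memory: {np.shares_memory(back, source)}")
```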
From Python Lists and Other Iterables
# From a list
list_tensor = torch.tensor([1, 2, 3, 4, 5])
print(f"Tensor from list: {list_tensor}")
# From a tuple
tuple_tensor = torch.tensor((5, 4, 3, 2, 1))
print(f"Tensor from tuple: {tuple_tensor}")
Output:
Tensor from list: tensor([1, 2, 3, 4, 5])
Tensor from tuple: tensor([5, 4, 3, 2, 1])
Tensor Shape Manipulation During Creation
Creating tensors with specific shapes
# Create a 3x3 identity matrix
identity_matrix = torch.eye(3)
print(f"Identity matrix:\n{identity_matrix}")
# Reshape an existing tensor
range_tensor = torch.arange(9)
reshaped_tensor = range_tensor.view(3, 3)
print(f"Original tensor: {range_tensor}")
print(f"Reshaped tensor (3x3):\n{reshaped_tensor}")
# Alternative approach using reshape (returns a view when possible, otherwise a copy)
reshaped_tensor_2 = range_tensor.reshape(3, 3)
print(f"Reshaped tensor using reshape():\n{reshaped_tensor_2}")
Output:
Identity matrix:
tensor([[1., 0., 0.],
[0., 1., 0.],
[0., 0., 1.]])
Original tensor: tensor([0, 1, 2, 3, 4, 5, 6, 7, 8])
Reshaped tensor (3x3):
tensor([[0, 1, 2],
[3, 4, 5],
[6, 7, 8]])
Reshaped tensor using reshape():
tensor([[0, 1, 2],
[3, 4, 5],
[6, 7, 8]])
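The two methods give the same result here, but they are not interchangeable. As a rough sketch: view() never copies and fails on a tensor whose memory layout is incompatible, while reshape() falls back to copying. A transposed tensor is used below as an example of a non-contiguous layout.

```python
import torch

grid = torch.arange(6).reshape(2, 3)
transposed = grid.t()  # shape (3, 2), no longer contiguous in memory

# view() requires a compatible memory layout and raises otherwise.
try:
    transposed.view(6)
    view_failed = False
except RuntimeError:
    view_failed = True

# reshape() copies the data when a view is not possible.
flat = transposed.reshape(6)

# A view shares memory with its source, so writes are visible in both.
alias = grid.view(3, 2)
alias[0, 0] = 99

print(f"view() failed on transposed tensor: {view_failed}")
print(f"reshape() result: {flat.tolist()}")
print(f"grid[0, 0] after writing through the view: {grid[0, 0].item()}")
```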
Creating Tensors Like Other Tensors
PyTorch provides functions to create tensors with the same shape (and, by default, the same dtype and device) as an existing tensor:
# Create a template tensor
template_tensor = torch.tensor([[1, 2, 3], [4, 5, 6]])
# Create zeros with same shape
zeros_like_template = torch.zeros_like(template_tensor)
print(f"Zeros like template:\n{zeros_like_template}")
# Create ones with same shape
ones_like_template = torch.ones_like(template_tensor)
print(f"Ones like template:\n{ones_like_template}")
# Create random tensor with same shape
rand_like_template = torch.rand_like(template_tensor.float()) # Need to convert to float for rand_like
print(f"Random tensor like template:\n{rand_like_template}")
Output:
Zeros like template:
tensor([[0, 0, 0],
[0, 0, 0]])
Ones like template:
tensor([[1, 1, 1],
[1, 1, 1]])
Random tensor like template:
tensor([[0.3672, 0.1426, 0.5862],
[0.7351, 0.1170, 0.4162]])
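The *_like functions also accept a dtype argument, which avoids converting the template first. A brief sketch, with torch.full_like() included for arbitrary fill values:

```python
import torch

template = torch.tensor([[1, 2, 3], [4, 5, 6]])  # dtype: torch.int64

# By default the result inherits the template's dtype.
zeros = torch.zeros_like(template)

# Overriding dtype lets rand_like work from an integer template.
noise = torch.rand_like(template, dtype=torch.float32)

# full_like fills with an arbitrary value.
fives = torch.full_like(template, 5)

print(f"zeros dtype: {zeros.dtype}")
print(f"noise dtype: {noise.dtype}, shape: {noise.shape}")
print(f"fives:\n{fives}")
```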
Real-World Applications
Let's explore a few practical examples where tensor creation is important:
Example 1: Creating a Feature Matrix for Machine Learning
# Creating a feature matrix for a machine learning model
# Let's say we have 5 data points, each with 3 features
features = torch.tensor([
    [1.2, 4.5, 3.1],  # Data point 1
    [0.8, 3.2, 2.7],  # Data point 2
    [1.5, 5.0, 0.9],  # Data point 3
    [2.3, 1.1, 4.8],  # Data point 4
    [0.2, 4.4, 1.6]   # Data point 5
])
# Labels for our data points (binary classification)
labels = torch.tensor([0, 1, 0, 1, 0])
print(f"Feature matrix shape: {features.shape}")
print(f"Labels shape: {labels.shape}")
print(f"First data point features: {features[0]}")
Output:
Feature matrix shape: torch.Size([5, 3])
Labels shape: torch.Size([5])
First data point features: tensor([1.2000, 4.5000, 3.1000])
Example 2: Creating an Image Tensor for Computer Vision
# Creating a batch of grayscale images (tensor shape: [batch_size, height, width])
batch_size = 3
height = 28
width = 28
# Create a batch of random grayscale images
gray_images = torch.rand(batch_size, height, width)
print(f"Grayscale images tensor shape: {gray_images.shape}")
# For colored images, we add a channel dimension [batch_size, channels, height, width]
colored_images = torch.rand(batch_size, 3, height, width) # 3 channels for RGB
print(f"Colored images tensor shape: {colored_images.shape}")
# Access the first image in the batch
first_image = colored_images[0]
print(f"First image shape: {first_image.shape}")
# Access the red channel of the first image
red_channel = colored_images[0, 0]
print(f"Red channel shape: {red_channel.shape}")
Output:
Grayscale images tensor shape: torch.Size([3, 28, 28])
Colored images tensor shape: torch.Size([3, 3, 28, 28])
First image shape: torch.Size([3, 28, 28])
Red channel shape: torch.Size([28, 28])
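Many image layers expect an explicit channel dimension even for grayscale input. As a sketch under that assumption, unsqueeze() adds a dimension of size 1 and permute() reorders dimensions, for example to the channels-last layout used by some plotting libraries.

```python
import torch

gray = torch.rand(3, 28, 28)      # [batch_size, height, width]
color = torch.rand(3, 3, 28, 28)  # [batch_size, channels, height, width]

# Add a channel dimension of size 1 at position 1.
gray_with_channel = gray.unsqueeze(1)

# Remove it again.
restored = gray_with_channel.squeeze(1)

# Reorder to [batch_size, height, width, channels].
channels_last = color.permute(0, 2, 3, 1)

print(f"With channel dimension: {gray_with_channel.shape}")
print(f"Restored: {restored.shape}")
print(f"Channels last: {channels_last.shape}")
```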
Example 3: Creating Embedding Vectors for NLP
# Creating word embedding vectors for a small vocabulary
vocab_size = 10 # Number of words in our vocabulary
embedding_dim = 5 # Dimensionality of our embeddings
# Random initialization of embedding vectors
word_embeddings = torch.randn(vocab_size, embedding_dim)
print(f"Word embedding matrix shape: {word_embeddings.shape}")
print("Embedding vector for word at index 3:")
print(word_embeddings[3])
# Creating a one-hot encoded tensor for words
one_hot = torch.zeros(vocab_size, vocab_size)
for i in range(vocab_size):
    one_hot[i, i] = 1
print("\nOne-hot encoded vectors:")
print(one_hot)
Output:
Word embedding matrix shape: torch.Size([10, 5])
Embedding vector for word at index 3:
tensor([-0.5987, 0.1920, -0.8355, -0.6013, 1.2724])
One-hot encoded vectors:
tensor([[1., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
[0., 1., 0., 0., 0., 0., 0., 0., 0., 0.],
[0., 0., 1., 0., 0., 0., 0., 0., 0., 0.],
[0., 0., 0., 1., 0., 0., 0., 0., 0., 0.],
[0., 0., 0., 0., 1., 0., 0., 0., 0., 0.],
[0., 0., 0., 0., 0., 1., 0., 0., 0., 0.],
[0., 0., 0., 0., 0., 0., 1., 0., 0., 0.],
[0., 0., 0., 0., 0., 0., 0., 1., 0., 0.],
[0., 0., 0., 0., 0., 0., 0., 0., 1., 0.],
[0., 0., 0., 0., 0., 0., 0., 0., 0., 1.]])
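The loop above can be avoided. One possible alternative is torch.eye(), which builds the same matrix directly; another is torch.nn.functional.one_hot(), which encodes only the indices you need. The word_ids values below are made up for illustration.

```python
import torch
import torch.nn.functional as F

vocab_size = 10

# Loop-based construction, as in the example above.
loop_one_hot = torch.zeros(vocab_size, vocab_size)
for i in range(vocab_size):
    loop_one_hot[i, i] = 1

# Equivalent: the identity matrix is the full one-hot table.
identity = torch.eye(vocab_size)
print(f"Same as loop result: {torch.equal(loop_one_hot, identity)}")

# Encode only selected word indices (hypothetical ids).
word_ids = torch.tensor([3, 0, 7])
encoded = F.one_hot(word_ids, num_classes=vocab_size)
print(f"Encoded shape: {encoded.shape}, dtype: {encoded.dtype}")
```

Note that F.one_hot() returns an integer tensor, so call .float() on the result if you need to combine it with floating-point tensors.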
Specifying Device for Tensor Creation
PyTorch allows you to create tensors directly on a specific device (CPU or GPU):
# Create a tensor on CPU (default)
cpu_tensor = torch.tensor([1, 2, 3])
print(f"CPU tensor device: {cpu_tensor.device}")
# Create a tensor on GPU if available
if torch.cuda.is_available():
    gpu_tensor = torch.tensor([1, 2, 3], device='cuda')
    print(f"GPU tensor device: {gpu_tensor.device}")
else:
    print("CUDA is not available, cannot create GPU tensor")
# Alternative way to specify device
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
tensor_on_device = torch.tensor([1, 2, 3], device=device)
print(f"Tensor created on: {tensor_on_device.device}")
Output (may vary depending on your system):
CPU tensor device: cpu
CUDA is not available, cannot create GPU tensor
Tensor created on: cpu
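An existing tensor can also be moved between devices after creation. The sketch below uses the same device-selection pattern as above, so it runs on machines without a GPU as well; in that case .to() and .cpu() simply return the tensor on the CPU.

```python
import torch

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

data = torch.tensor([1.0, 2.0, 3.0])  # created on the CPU

# .to() returns a tensor on the target device.
on_device = data.to(device)

# Compute on that device, then bring the result back to the CPU.
result = (on_device * 2).cpu()

print(f"Computed on: {on_device.device}")
print(f"Result device: {result.device}, values: {result.tolist()}")
```

Tensors must be on the same device before they can be combined in an operation, which is why the result is moved back explicitly.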
Summary
In this guide, we've explored the various methods of creating tensors in PyTorch:
- Basic tensor creation methods: torch.empty(), torch.zeros(), torch.ones(), torch.rand()
- Creating tensors with specific values using torch.tensor()
- Sequential tensors with torch.arange() and torch.linspace()
- Specifying data types for tensors
- Converting from other data types like NumPy arrays
- Reshaping tensors during creation
- Creating tensors with shapes similar to other tensors
- Real-world applications of tensor creation in machine learning, computer vision, and NLP
- Specifying devices for tensor creation (CPU or GPU)
Understanding these tensor creation methods is fundamental to working effectively with PyTorch for deep learning and machine learning tasks.
Exercises
- Create a 4x4 matrix of random integers between 1 and 10.
- Create a tensor representing a batch of 5 RGB images of size 32x32.
- Convert a list of lists into a PyTorch tensor and change its data type to float32.
- Create a diagonal matrix with values [3, 5, 7, 9] on the diagonal.
- Create a tensor on GPU (if available) and move it back to CPU.
Additional Resources
- PyTorch Official Documentation on Tensors
- PyTorch Cheat Sheet for Tensor Operations
- Deep Learning with PyTorch: A 60 Minute Blitz
Keep exploring PyTorch tensors as they form the foundation of all operations in the PyTorch ecosystem!