Git LFS (Large File Storage)
Introduction
When working with Git repositories, you might encounter situations where you need to track large files like images, audio files, datasets, or other binary files. Git wasn't originally designed to handle such files efficiently, which can lead to bloated repositories and slow performance.
Git Large File Storage (LFS) is an extension to Git that addresses this limitation by replacing large files with text pointers in the Git repository, while storing the actual file contents on a remote server. This approach significantly improves repository performance when working with large files.
Why Use Git LFS?
Traditional Git stores the complete history of every file, including all versions of binary files. This can cause several problems:
- Repository bloat: The .git directory grows excessively large
- Slow cloning and fetching: Downloading the entire history of large files takes time
- Wasted bandwidth and storage: Developers download all versions of all files, even if they only need the latest
- Performance degradation: Git operations become slower as the repository grows
Git LFS solves these issues by handling large files differently from regular code files.
How Git LFS Works
When you use Git LFS:
- Large files are stored on a separate LFS server
- Small pointer files (under 1KB) remain in your Git repository
- The Git LFS client plugs into Git through filters and hooks, so ordinary commands such as git add, git checkout, and git push handle LFS objects transparently
- When you clone or pull, the LFS client downloads the actual files from the LFS server
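The pointer that stays in the repository is a small text file with a fixed three-line layout: a version line, the SHA-256 of the content, and the size in bytes. The following sketch builds such a pointer by hand to show what Git stores in place of the real content. It assumes a system with `sha256sum` (on macOS, `shasum -a 256` would be the equivalent), and the helper name `make_pointer` is made up for illustration. In practice the Git LFS client generates pointers for you.

```shell
#!/bin/sh
# Sketch: print the pointer text Git LFS would commit in place of a file.
# make_pointer is a hypothetical helper, not part of Git LFS.
make_pointer() {
  file="$1"
  oid=$(sha256sum "$file" | cut -d' ' -f1)   # SHA-256 of the file content
  size=$(wc -c < "$file" | tr -d ' ')        # size in bytes
  printf 'version https://git-lfs.github.com/spec/v1\n'
  printf 'oid sha256:%s\n' "$oid"
  printf 'size %s\n' "$size"
}

# Demo on a throwaway file standing in for a large asset.
demo=$(mktemp)
printf 'pretend this is a 200MB PSD' > "$demo"
make_pointer "$demo"
rm -f "$demo"
```

Whatever the size of the original file, the pointer stays at three short lines, which is why the Git repository itself remains small.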
Getting Started with Git LFS
Installation
First, you need to install Git LFS:
For macOS (using Homebrew):
brew install git-lfs
For Windows:
# Using Chocolatey
choco install git-lfs
# Or download and install from https://git-lfs.github.com/
For Linux (Ubuntu/Debian):
sudo apt-get install git-lfs
After installation, you need to set up Git LFS:
git lfs install
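This command registers the Git LFS filters (clean, smudge, and process) in your global Git configuration. If you run it inside a repository, it also installs the Git LFS hooks, such as pre-push, in that repository. You only need to run it once per user account.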
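One way to confirm that the setup took effect is to check for the `git-lfs` binary and for the `filter.lfs.clean` entry that `git lfs install` writes to the global configuration. The sketch below wraps both checks in a helper; the name `lfs_status` and its messages are invented for this example.

```shell
#!/bin/sh
# Sketch: report whether Git LFS is installed and enabled for this user.
# lfs_status is a hypothetical helper, not a Git LFS command.
lfs_status() {
  if ! command -v git-lfs >/dev/null 2>&1; then
    echo "git-lfs: not found on PATH"
    return 1
  fi
  git lfs version
  if git config --global --get filter.lfs.clean >/dev/null; then
    echo "global LFS filters: configured"
  else
    echo "global LFS filters: missing (run 'git lfs install')"
    return 1
  fi
}

# Print the status without aborting the shell if a check fails.
lfs_status || true
```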
Basic Usage
Step 1: Initialize a Git Repository
If you don't already have a repository, create one:
mkdir my-project
cd my-project
git init
Step 2: Track Files with Git LFS
Tell Git LFS which file types to manage:
# Track all .psd files
git lfs track "*.psd"
# Track all .zip files
git lfs track "*.zip"
# Track a specific file
git lfs track "large-file.data"
These commands create or modify a .gitattributes file, which you should commit:
git add .gitattributes
git commit -m "Configure Git LFS tracking"
The .gitattributes file will contain entries like:
*.psd filter=lfs diff=lfs merge=lfs -text
*.zip filter=lfs diff=lfs merge=lfs -text
large-file.data filter=lfs diff=lfs merge=lfs -text
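To check which paths a tracking rule actually covers, you can ask Git directly with `git check-attr`, which reads .gitattributes and works even for files that do not exist yet. The sketch below is a minimal illustration in a throwaway repository; `is_lfs_path` is a hypothetical helper, and the rule is written by hand so the example runs with plain Git.

```shell
#!/bin/sh
# Sketch: ask Git which paths the LFS filter would apply to.
# is_lfs_path is a hypothetical helper built on `git check-attr`.
is_lfs_path() {
  git check-attr filter -- "$1" | grep -q ': filter: lfs$'
}

# Throwaway repository with the same rule that tracking "*.psd" produces.
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q
printf '*.psd filter=lfs diff=lfs merge=lfs -text\n' > .gitattributes

for path in design.psd design/mockups.psd src/index.js; do
  if is_lfs_path "$path"; then
    echo "$path -> Git LFS"
  else
    echo "$path -> regular Git"
  fi
done
```

A pattern without a slash, such as `*.psd`, matches at any directory depth, so both PSD paths are reported as LFS files while the JavaScript file is not.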
Step 3: Add and Commit Files as Usual
Now you can add and commit large files as you normally would:
# Add a large PSD file
git add design.psd
git commit -m "Add design file"
# Push to remote repository
git push origin main
Behind the scenes, Git LFS:
- Replaces the actual file with a pointer file in the Git repository
- Stores the file content in the LFS cache
- Uploads the file content to the LFS server when you push
Managing LFS Files
Viewing Tracked Files
To see which files are being tracked by Git LFS:
git lfs ls-files
Example output:
89ab2c1d12 * design.psd
76fe43b221 * images/background.png
Fetching LFS Files
When you clone a repository with LFS files:
git clone https://github.com/username/repo.git
By default, Git LFS downloads the LFS objects needed for the commit that gets checked out. If you want to clone without downloading LFS content:
GIT_LFS_SKIP_SMUDGE=1 git clone https://github.com/username/repo.git
Later, you can fetch specific LFS files:
git lfs pull --include="*.psd"
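If you find yourself passing the same patterns on every pull, Git LFS can read them from configuration instead. This is a minimal sketch, assuming the `lfs.fetchinclude` and `lfs.fetchexclude` settings described in the Git LFS configuration documentation; the patterns and the throwaway repository are placeholders.

```shell
#!/bin/sh
# Sketch: store default include/exclude patterns for LFS downloads
# in a repository's local Git configuration.
repo=$(mktemp -d)
git init -q "$repo"

# Only download design files by default; never download build artifacts.
git -C "$repo" config lfs.fetchinclude "*.psd,design/**"
git -C "$repo" config lfs.fetchexclude "builds/**"

# Show what is configured.
git -C "$repo" config --get-regexp '^lfs\.fetch'
```

Exclude patterns win over include patterns, so an exclude of `*` would block every download.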
LFS File Locking
Git LFS supports file locking to prevent conflicts on binary files that can't be merged:
# Lock a file to prevent others from editing it
git lfs lock design.psd
# Unlock when finished
git lfs unlock design.psd
# List all locked files
git lfs locks
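Locking works best when the files that need a lock are marked as such. Git LFS uses a `lockable` attribute for this, which `git lfs track --lockable "*.psd"` adds to .gitattributes; lockable files are then kept read-only in the working tree until you lock them. The sketch below writes the attribute by hand in a throwaway repository, so it runs with plain Git and no server, and then checks it. Treat the exact attribute line as an assumption about what the command writes.

```shell
#!/bin/sh
# Sketch: mark PSD files as lockable and confirm Git sees the attribute.
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q

# Assumed to match what `git lfs track --lockable "*.psd"` would write.
printf '*.psd filter=lfs diff=lfs merge=lfs -text lockable\n' > .gitattributes

# Report the lockable attribute for a design file and a source file.
git check-attr lockable -- design.psd src/index.js
```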
Real-World Examples
Example 1: Managing Design Assets in a Web Project
Imagine you're working on a web project with both code and design assets:
my-website/
├── src/
│   ├── index.js
│   └── components/
├── public/
│   ├── images/
│   │   ├── hero.jpg       # 8MB image
│   │   └── gallery/       # Multiple high-res images
│   └── fonts/
└── design/
    ├── mockups.psd        # 200MB Photoshop file
    └── ui-kit.sketch      # 150MB Sketch file
Configure Git LFS to track the large binary files:
git lfs track "*.jpg"
git lfs track "*.png"
git lfs track "*.psd"
git lfs track "*.sketch"
git add .gitattributes
git commit -m "Configure Git LFS for design assets"
Now you can commit and collaborate on both code and large files efficiently.
Example 2: Game Development with Assets
In game development, you often have code, textures, models, and sound files:
my-game/
├── src/
│   ├── main.cpp
│   └── engine/
├── assets/
│   ├── models/            # 3D models
│   ├── textures/          # High-resolution textures
│   └── audio/             # Music and sound effects
└── builds/                # Compiled game versions
Configure Git LFS:
git lfs track "assets/models/**"
git lfs track "assets/textures/**"
git lfs track "assets/audio/**"
git lfs track "builds/**"
git add .gitattributes
git commit -m "Set up Git LFS for game assets"
This setup keeps your repository efficient while still tracking all necessary assets.
Best Practices
- Track selectively: Only use Git LFS for files that truly need it (typically binary files over 5MB)
- Commit .gitattributes first: Always commit this file before adding any LFS-tracked files
- Use patterns wisely: Be specific with tracking patterns to avoid unnecessary LFS usage
- Consider bandwidth limitations: Be aware that LFS might use more bandwidth during pushes and pulls
- Set up file locking: Use locking for binary files that can't be merged
- Check hosting limits: Be aware of storage and bandwidth limits on your Git hosting provider
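To decide what is worth tracking, it helps to list the files in the working tree that exceed a size threshold. The sketch below uses `find` with a byte count; the helper name `find_large_files`, the scratch directory, and the 5 MiB threshold are illustrative choices, not Git LFS defaults.

```shell
#!/bin/sh
# Sketch: list working-tree files larger than a byte threshold,
# skipping Git's own .git directory.
# find_large_files is a hypothetical helper.
find_large_files() {
  dir="$1"
  min_bytes="$2"
  find "$dir" -type f -size +"${min_bytes}c" ! -path '*/.git/*'
}

# Demo: one 6 MiB file and one tiny file in a scratch directory.
scratch=$(mktemp -d)
dd if=/dev/zero of="$scratch/mockups.psd" bs=1024 count=6144 2>/dev/null
printf 'console.log("hi");\n' > "$scratch/index.js"

# 5 MiB = 5 * 1024 * 1024 bytes; only mockups.psd is above it.
find_large_files "$scratch" $((5 * 1024 * 1024))
```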
Migrating Existing Repositories to Git LFS
If you already have large files in your repository, you can migrate them to Git LFS:
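# Make sure Git LFS is enabled (the migrate command is built into Git LFS)
git lfs install
# Report the file types that have files above 10MB in your history
git lfs migrate info --above=10MB
# Migrate matching files to LFS (--above cannot be combined with --include)
git lfs migrate import --include="*.psd,*.zip"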
Warning: This rewrites history, so coordinate with your team before performing migration.
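Before migrating, it can help to see which objects in the existing history are actually large. `git lfs migrate info` summarizes this per file type; the sketch below is an alternative that uses only core Git plumbing to list the largest blobs across all refs. The helper name `largest_blobs` and the demo repository are invented for illustration.

```shell
#!/bin/sh
# Sketch: print "<size-in-bytes> <path>" for the largest blobs in history.
# largest_blobs is a hypothetical helper; its argument is the row limit.
largest_blobs() {
  git rev-list --objects --all |
    git cat-file --batch-check='%(objecttype) %(objectsize) %(rest)' |
    sed -n 's/^blob //p' |
    sort -rn |
    head -n "${1:-10}"
}

# Demo in a throwaway repository with one larger and one small file.
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q
dd if=/dev/zero of=mockups.psd bs=1024 count=2048 2>/dev/null
printf 'hello\n' > README.txt
git add .
git -c user.name=demo -c user.email=demo@example.com commit -q -m "Add files"

largest_blobs 5
```

Because the listing walks every ref, it also surfaces large files that were deleted from the working tree but still live in history, which are the ones a migration is meant to move.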
Troubleshooting Common Issues
Issue: "Encountered X file(s) that should have been pointers, but weren't"
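This typically happens when a file that matches an LFS tracking pattern was committed as a regular Git object, for example by someone who did not have Git LFS installed.
Solution: Re-stage the affected files so they are converted to pointers, then commit:
git add --renormalize .
git commit -m "Convert files to Git LFS pointers"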
Issue: "batch request: not found"
This usually means the LFS server is not properly configured.
Solution: Check your remote URLs and LFS endpoint configuration:
git lfs env
Issue: "Error downloading object"
This might happen due to network issues or server problems.
Solution: Retry with:
git lfs pull --include="path/to/file"
Summary
Git LFS is a powerful extension that enables efficient management of large files in Git repositories. By storing pointers in the repository and actual file contents on a separate server, Git LFS maintains Git's performance while supporting collaboration on projects with large assets.
Key benefits:
- Smaller repository size
- Faster cloning and fetching
- Efficient handling of binary files
- Better performance for many everyday Git operations
When to use Git LFS:
- Projects with large binary assets (images, audio, video)
- Repositories with frequently changing large files
- Teams working with design files, datasets, or other large content
Exercises
- Initialize Git LFS in a new repository and configure it to track PNG and PSD files.
- Add several large files to your repository and examine the .git directory size compared to the actual files.
- Clone your repository on another machine and observe how Git LFS handles the large files.
- Try locking and unlocking a file, then have a team member attempt to modify it.
- Convert an existing repository with large files to use Git LFS.