Ubuntu Piping
Introduction
Piping is one of the most powerful concepts in the Ubuntu terminal (and Linux systems in general). It allows you to connect multiple commands together, where the output of one command becomes the input to the next. This concept embodies the Unix philosophy of creating small, focused tools that can be combined to solve complex problems.
The pipe operator in Ubuntu is represented by the vertical bar symbol (|). When you see this symbol between commands, it means "take the output of the command on the left and feed it as input to the command on the right."
Basic Syntax
The basic syntax for piping is:
command1 | command2
This means: run command1, take its output, and use it as input for command2.
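As a minimal sketch of this syntax, the following uses printf to produce three unsorted lines and pipes them into sort. The sample words are made up for illustration.

```shell
# printf writes three lines to stdout; sort reads them from stdin.
sorted=$(printf 'banana\napple\ncherry\n' | sort)
echo "$sorted"
```

Neither command knows about the other: printf only writes, sort only reads, and the pipe connects them.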
Simple Examples
Let's start with some simple examples to understand the concept:
Example 1: Counting Files in a Directory
ls | wc -l
Input:
$ ls | wc -l
Output:
42
Explanation:
- ls: lists all files and directories in the current directory
- wc -l: counts the number of lines in the input
- Together, this command tells you how many files and directories are in your current directory
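One detail worth checking is that ls skips hidden files (names starting with a dot) by default, so the count above excludes them. The sketch below uses a throwaway directory with hypothetical file names to make the difference visible.

```shell
# Work in a throwaway directory so the count is predictable.
demo_dir=$(mktemp -d)
touch "$demo_dir/a.txt" "$demo_dir/b.txt" "$demo_dir/.hidden"

# ls skips dotfiles by default, so only the two visible files are counted.
visible=$(ls "$demo_dir" | wc -l)

# ls -A includes dotfiles (but not . and ..).
all=$(ls -A "$demo_dir" | wc -l)

echo "visible: $visible, including hidden: $all"
rm -r "$demo_dir"
```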
Example 2: Finding a Specific File
ls | grep ".txt"
Input:
$ ls | grep ".txt"
Output:
notes.txt
report.txt
todo.txt
Explanation:
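- ls: lists all files and directories
- grep ".txt": filters the input, showing only lines that match the pattern ".txt". In a regular expression the dot matches any character, so a stricter pattern for file extensions is '\.txt$'
- The result is a list of only the text files in your directory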
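Because grep treats its pattern as a regular expression, an unescaped dot matches any character. The sketch below, using hypothetical file names in a temporary directory, compares the loose pattern with one that escapes the dot and anchors it to the end of the name.

```shell
demo_dir=$(mktemp -d)
touch "$demo_dir/notes.txt" "$demo_dir/report.txt" "$demo_dir/mytxtfile"

# Unescaped dot: the "ytxt" in mytxtfile also matches ".txt".
loose=$(ls "$demo_dir" | grep ".txt")

# Escaped dot plus the $ anchor: only names ending in .txt match.
strict=$(ls "$demo_dir" | grep '\.txt$')

echo "$strict"
rm -r "$demo_dir"
```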
Chaining Multiple Pipes
You can chain multiple commands together using pipes, creating a pipeline of operations:
command1 | command2 | command3
Example: Finding the 5 Largest Directories
du -h /home/user | sort -rh | head -5
Input:
$ du -h /home/user | sort -rh | head -5
Output:
1.2G /home/user
800M /home/user/Videos
500M /home/user/Downloads
200M /home/user/Documents
150M /home/user/Pictures
Explanation:
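- du -h /home/user: reports the disk usage of /home/user and of each directory beneath it in human-readable format (individual files are listed only with the -a option)
- sort -rh: sorts the output in reverse order (largest first) and interprets human-readable sizes
- head -5: shows only the first 5 lines of output
- The result is a list of the 5 largest directories under your home directory. The first line is the home directory itself, because its total includes everything below it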
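The -h flag on sort matters here. A small sketch with made-up sizes in the format du -h prints shows what happens with a plain numeric sort instead, assuming GNU sort as shipped with Ubuntu.

```shell
# Sample sizes in the format du -h prints (made up for illustration).
sizes=$(printf '200M\t./Documents\n1.2G\t.\n800M\t./Videos\n150M\t./Pictures\n')

# -n compares only the leading number and ignores the unit, so 800M sorts above 1.2G.
numeric_top=$(printf '%s\n' "$sizes" | sort -rn | head -1)

# -h understands the K/M/G suffixes, so 1.2G correctly comes first.
human_top=$(printf '%s\n' "$sizes" | sort -rh | head -1)

echo "sort -rn puts first: $numeric_top"
echo "sort -rh puts first: $human_top"
```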
Common Piping Patterns
Filtering Output
One of the most common uses of piping is filtering output:
command | grep "pattern"
Example: Finding Running Python Processes
ps aux | grep "python"
Input:
$ ps aux | grep "python"
Output:
user 1234 0.5 1.2 123456 12345 ? S 10:30 0:25 python3 app.py
user 2345 0.0 0.3 45678 3456 pts/0 S+ 11:42 0:00 grep --color=auto python
Explanation:
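- ps aux: lists all running processes
- grep "python": filters to show only lines containing "python". The grep process itself also contains "python" in its command line, which is why it appears as the second line of the output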
Counting and Summarizing
Piping is excellent for quick statistics:
command | wc -l # Count lines
command | sort | uniq # Find unique values
Example: Counting Unique Users Logged In
who | cut -d' ' -f1 | sort | uniq | wc -l
Input:
$ who | cut -d' ' -f1 | sort | uniq | wc -l
Output:
3
Explanation:
- who: shows who is logged in
- cut -d' ' -f1: extracts just the username field
- sort: sorts the usernames
- uniq: removes duplicate usernames
- wc -l: counts the number of unique usernames
- The result tells you how many different users are currently logged in
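The sort step is not optional: uniq only collapses duplicate lines that are adjacent. The sketch below replaces who with simulated output (hypothetical users) so the result is reproducible, and uses sort -u, which sorts and de-duplicates in one step.

```shell
# Simulated "who" output (hypothetical users) so the result is reproducible.
sessions='alice tty1
bob pts/0
alice pts/1
carol pts/2'

# uniq only collapses adjacent duplicates, so without sort alice is counted twice.
unsorted=$(printf '%s\n' "$sessions" | cut -d' ' -f1 | uniq | wc -l)

# sort -u sorts and de-duplicates in one step, equivalent to sort | uniq.
unique=$(printf '%s\n' "$sessions" | cut -d' ' -f1 | sort -u | wc -l)

echo "without sort: $unsorted, with sort -u: $unique"
```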
Processing and Transforming Data
Pipes excel at data transformation:
command | sed 's/old/new/g' # Replace text
command | awk '{print $2}' # Extract specific fields
Example: Extracting the Second Column from a CSV File
cat data.csv | cut -d',' -f2
Input:
$ cat data.csv | cut -d',' -f2
Output:
Name
John
Sarah
Michael
Explanation:
- cat data.csv: outputs the contents of the CSV file
- cut -d',' -f2: extracts just the second column, using comma as a delimiter
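A small variation, sketched below with a made-up CSV file, lets cut read the file directly and drops the header row with tail. Note that cut splits on every comma, so this approach assumes no field contains a quoted comma.

```shell
csv_file=$(mktemp)
printf 'ID,Name,Age\n1,John,30\n2,Sarah,25\n3,Michael,41\n' > "$csv_file"

# cut can read the file itself; tail -n +2 starts at line 2, skipping the header.
names=$(cut -d',' -f2 "$csv_file" | tail -n +2)

echo "$names"
rm "$csv_file"
```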
Real-World Applications
Application 1: Log Analysis
Finding error messages in a large log file:
cat /var/log/syslog | grep "ERROR" | tail -10
Explanation: This pipeline finds the last 10 error messages in the system log.
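The same idea can be tried safely on a sample file. The log lines below are invented for illustration; the sketch also shows that grep can read the file directly and count matches with -c.

```shell
log_file=$(mktemp)
printf '%s\n' \
  'Jan 1 10:00:01 host app: INFO started' \
  'Jan 1 10:00:02 host app: ERROR disk full' \
  'Jan 1 10:00:03 host app: INFO retrying' \
  'Jan 1 10:00:04 host app: ERROR write failed' \
  'Jan 1 10:00:05 host app: ERROR giving up' > "$log_file"

# grep reads the file itself, so cat is not needed; tail keeps the last 2 matches.
last_errors=$(grep "ERROR" "$log_file" | tail -n 2)

# grep -c counts matching lines without a separate wc -l.
error_count=$(grep -c "ERROR" "$log_file")

echo "$error_count errors, most recent:"
echo "$last_errors"
rm "$log_file"
```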
Application 2: System Monitoring
Monitoring top memory-consuming processes:
ps aux | sort -nrk 4 | head -5
Explanation: This shows the 5 processes using the most memory (sorted by the 4th column of ps output).
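To see how the sort flags behave, the sketch below runs the same sort on a simplified stand-in for ps aux output. The column layout is shortened and the values are made up; only the position of %MEM in column 4 matches the real command.

```shell
# Simplified stand-in for ps aux output: USER PID %CPU %MEM COMMAND (made-up values).
procs='USER PID %CPU %MEM COMMAND
user 101 0.5 1.2 python3
user 102 2.0 12.5 firefox
root 103 0.1 0.3 sshd
user 104 1.0 4.8 code'

# -k 4 starts the sort key at the 4th field, -n compares numerically, -r reverses the order.
top_two=$(printf '%s\n' "$procs" | sort -nrk 4 | head -2)
echo "$top_two"
```

The header line has no number in column 4, so the numeric sort treats it as zero and it sinks to the bottom of the reversed output.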
Application 3: File Processing
Counting word frequency in a document:
cat document.txt | tr -s ' ' '\n' | sort | uniq -c | sort -nr | head -10
Explanation: This pipeline counts and displays the 10 most frequent words in a document:
- cat document.txt: outputs the document
- tr -s ' ' '\n': replaces spaces with newlines, putting each word on its own line
- sort: sorts all words alphabetically
- uniq -c: counts occurrences of each word
- sort -nr: sorts numerically in reverse order (highest count first)
- head -10: shows only the top 10 results
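The pipeline above treats "The" and "the" as different words and keeps punctuation attached to them. One possible refinement, sketched here on a made-up two-line document, splits on anything that is not a letter and folds everything to lowercase first.

```shell
doc=$(mktemp)
printf 'The cat and the dog.\nThe bird and the cat!\n' > "$doc"

top_words=$(
  tr -cs '[:alpha:]' '\n' < "$doc" |   # every run of non-letters becomes one newline
  tr '[:upper:]' '[:lower:]' |         # fold case so "The" and "the" are counted together
  sort | uniq -c | sort -nr | head -3
)

echo "$top_words"
rm "$doc"
```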
Advanced Piping Concepts
Named Pipes (FIFOs)
Beyond the basic pipe operator, Ubuntu supports named pipes (FIFOs) for more complex scenarios:
mkfifo mypipe
command1 > mypipe &
command2 < mypipe
This creates a named pipe that can be used between processes that aren't directly connected.
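A runnable sketch of the same pattern follows, with printf standing in for command1 and tr for command2. The temporary directory and the message text are assumptions made for the demo.

```shell
work_dir=$(mktemp -d)
fifo="$work_dir/mypipe"
mkfifo "$fifo"

# The writer blocks until a reader opens the pipe, so run it in the background.
printf 'hello from the writer\n' > "$fifo" &

# The reader opens the other end; the data passes through without being stored on disk.
received=$(tr '[:lower:]' '[:upper:]' < "$fifo")

wait                 # let the background writer finish
rm -r "$work_dir"    # a FIFO is a filesystem entry and must be removed like a file
echo "$received"
```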
Process Substitution
Process substitution is a related feature of Bash (it is not available in plain sh) that allows the output of a process to appear as a file:
diff <(ls dir1) <(ls dir2)
Explanation: This compares the directory listings of dir1 and dir2 without creating temporary files.
tee Command with Pipes
The tee command allows you to save pipeline output to a file while also sending it to the next command:
command1 | tee output.txt | command2
Example:
ls | tee files.txt | grep ".txt"
Explanation: This lists all files, saves the complete listing to files.txt, and then filters to show only .txt files on screen.
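The sketch below reproduces this with a fixed list of hypothetical file names in place of ls, so that both the saved copy and the filtered output can be inspected.

```shell
work_dir=$(mktemp -d)

# tee writes every line to files.txt and also passes it on to grep.
shown=$(printf 'notes.txt\nphoto.png\ntodo.txt\n' | tee "$work_dir/files.txt" | grep '\.txt$')

# The saved file holds the complete, unfiltered listing.
saved_lines=$(wc -l < "$work_dir/files.txt")

echo "saved $saved_lines lines, displayed:"
echo "$shown"
rm -r "$work_dir"
```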
Visualizing Pipelines
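Data flows through a pipeline from left to right: the standard output (stdout) of each command is connected to the standard input (stdin) of the next, and the output of the last command goes to the terminal unless it is redirected.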
Common Pitfalls and Tips
Avoiding Common Mistakes
- Error Output Not Piped: By default, pipes only connect standard output (stdout) to standard input (stdin). Error messages (stderr) are not piped. To include error output:
  command1 2>&1 | command2
- Binary Data in Pipes: Be careful when piping binary data; some commands expect text input.
- Pipeline Efficiency: Remember that all parts of a pipeline run simultaneously, not sequentially.
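The first pitfall is easy to verify. In the sketch below, emit is a hypothetical helper that writes one line to stdout and one to stderr; counting lines on the receiving side shows which stream reached the pipe.

```shell
# Hypothetical helper that writes one line to each stream.
emit() {
  echo "normal output"
  echo "error output" >&2
}

# Only stdout enters the pipe; stderr bypasses wc (discarded here to keep the demo quiet).
stdout_only=$(emit 2>/dev/null | wc -l)

# 2>&1 merges stderr into stdout before the pipe, so wc sees both lines.
both=$(emit 2>&1 | wc -l)

echo "piped without 2>&1: $stdout_only line, with 2>&1: $both lines"
```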
Debugging Pipelines
For complex pipelines, build them step by step:
# Instead of:
command1 | command2 | command3 | command4
# Debug by adding steps:
command1 > step1.txt
cat step1.txt | command2 > step2.txt
cat step2.txt | command3 > step3.txt
cat step3.txt | command4
Summary
Piping is a fundamental concept in Ubuntu and Unix-like systems that allows you to:
- Connect multiple commands together
- Process and transform data efficiently
- Build complex workflows from simple tools
- Automate repetitive tasks
- Analyze and filter large amounts of information
By mastering the pipe operator, you can dramatically increase your productivity in the terminal and solve complex problems with elegant command chains.
Exercises
- List all running processes, then pipe the output to grep to find processes containing "bash".
- Find all .log files in /var/log, sort them by size, and display the 3 largest log files.
- Count how many packages are installed on your system using dpkg -l and appropriate pipes.
- Create a pipeline that finds the 5 largest subdirectories in your home directory.
- Use the history command with pipes to find how many times you've used the ls command.