Ubuntu Piping

Introduction

Piping is one of the most powerful concepts in the Ubuntu terminal (and Linux systems in general). It allows you to connect multiple commands together, where the output of one command becomes the input to the next. This concept embodies the Unix philosophy of creating small, focused tools that can be combined to solve complex problems.

The pipe operator in Ubuntu is represented by the vertical bar symbol (|). When you see this symbol between commands, it means "take the output of the command on the left and feed it as input to the command on the right."

Basic Syntax

The basic syntax for piping is:

bash
command1 | command2

This means: run command1, take its output, and use it as input for command2.
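
A minimal example you can try anywhere: echo writes a string to stdout, and the pipe hands that text to wc -w, which counts the words it receives on stdin:

```bash
# echo writes to stdout; the pipe feeds that text to wc -w, which counts words
echo "hello world from ubuntu" | wc -w   # prints 4
```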

Simple Examples

Let's start with some simple examples to understand the concept:

Example 1: Counting Files in a Directory

bash
ls | wc -l

Input:

$ ls | wc -l

Output:

42

Explanation:

  • ls lists all files and directories in the current directory (when its output goes to a pipe rather than a terminal, ls prints one name per line)
  • wc -l counts the number of lines in the input
  • Together, this command tells you how many files and directories are in your current directory
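
One caveat worth knowing: plain ls skips hidden "dotfiles", so the count above excludes them. The -A flag includes hidden entries (but not the . and .. pseudo-entries):

```bash
# -A includes hidden (dot) files in the listing, except . and ..
ls -A | wc -l
```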

Example 2: Finding a Specific File

bash
ls | grep ".txt"

Input:

$ ls | grep ".txt"

Output:

notes.txt
report.txt
todo.txt

Explanation:

  • ls lists all files and directories
  • grep ".txt" filters the input, showing only lines containing ".txt"
  • The result is a list of only the text files in your directory
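
One subtlety: grep patterns are regular expressions, and "." matches any single character, so ".txt" would also match a file named mytxt. Escaping the dot and anchoring the pattern to the end of the line matches the extension exactly:

```bash
# \. matches a literal dot, and $ anchors the match to the end of the line,
# so only names that truly end in ".txt" are printed
ls | grep '\.txt$'
```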

Chaining Multiple Pipes

You can chain multiple commands together using pipes, creating a pipeline of operations:

bash
command1 | command2 | command3

Example: Finding the 5 Largest Directories

bash
du -h /home/user | sort -rh | head -5

Input:

$ du -h /home/user | sort -rh | head -5

Output:

1.2G    /home/user
800M    /home/user/Videos
500M    /home/user/Downloads
200M    /home/user/Documents
150M    /home/user/Pictures

Explanation:

  1. du -h /home/user reports the disk usage of each directory under /home/user in human-readable format (add -a to include individual files)
  2. sort -rh sorts the output in reverse order (largest first) and interprets human-readable sizes
  3. head -5 shows only the first 5 lines of output
  4. The result is a list of the 5 largest directories or files in your home directory
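
A common variant restricts the report to your immediate subdirectories: -s prints one summary line per argument, and the trailing slash in the */ glob makes it match directories only (the ~ here stands for your home directory):

```bash
# -s gives one summary line per argument; */ expands to immediate
# subdirectories only, so nested paths don't clutter the output
du -sh ~/*/ | sort -rh | head -5
```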

Common Piping Patterns

Filtering Output

One of the most common uses of piping is filtering output:

bash
command | grep "pattern"

Example: Finding Running Python Processes

bash
ps aux | grep "python"

Input:

$ ps aux | grep "python"

Output:

user     1234  0.5  1.2  123456  12345 ?      S    10:30   0:25 python3 app.py
user     2345  0.0  0.3   45678   3456 pts/0  S+   11:42   0:00 grep --color=auto python

Explanation:

  • ps aux lists all running processes
  • grep "python" filters to show only processes with "python" in their description
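
Notice in the output above that grep finds its own process, because its command line also contains the word "python". A common trick wraps one character of the pattern in a bracket expression: the regex still matches the same text, but the literal string "python" no longer appears in grep's own command line, so grep filters itself out:

```bash
# [p]ython still matches "python" in process listings, but the string
# "python" no longer appears in grep's own command line, so the grep
# process itself drops out of the results
ps aux | grep "[p]ython"
```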

Counting and Summarizing

Piping is excellent for quick statistics:

bash
command | wc -l       # Count lines
command | sort | uniq # Find unique values

Example: Counting Unique Users Logged In

bash
who | cut -d' ' -f1 | sort | uniq | wc -l

Input:

$ who | cut -d' ' -f1 | sort | uniq | wc -l

Output:

3

Explanation:

  1. who shows who is logged in
  2. cut -d' ' -f1 extracts just the username field
  3. sort sorts the usernames
  4. uniq removes duplicate usernames
  5. wc -l counts the number of unique usernames
  6. The result tells you how many different users are currently logged in

Processing and Transforming Data

Pipes excel at data transformation:

bash
command | sed 's/old/new/g'   # Replace text
command | awk '{print $2}' # Extract specific fields

Example: Extracting the Second Column from a CSV File

bash
cat data.csv | cut -d',' -f2

Input:

$ cat data.csv | cut -d',' -f2

Output:

Name
John
Sarah
Michael

Explanation:

  • cat data.csv outputs the contents of the CSV file
  • cut -d',' -f2 extracts just the second column, using comma as a delimiter
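
As an aside, cut (like most filters) can read a file named on its command line, so the cat stage isn't strictly needed; this shorter form behaves identically:

```bash
# cut reads the file directly; no cat stage required
cut -d',' -f2 data.csv
```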

Real-World Applications

Application 1: Log Analysis

Finding error messages in a large log file:

bash
cat /var/log/syslog | grep "ERROR" | tail -10

Explanation: This pipeline finds the last 10 error messages in the system log.

Application 2: System Monitoring

Monitoring top memory-consuming processes:

bash
ps aux | sort -nrk 4 | head -5

Explanation: This shows the 5 processes using the most memory (sorted by the 4th column of ps output).
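
On Linux, the procps version of ps can also do the sorting itself via --sort, which avoids one pipeline stage and keeps the header row on top:

```bash
# --sort=-%mem asks ps for descending memory order (procps-ng ps on Linux);
# head -6 keeps the header line plus the top 5 processes
ps aux --sort=-%mem | head -6
```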

Application 3: File Processing

Counting word frequency in a document:

bash
cat document.txt | tr -s ' ' '\n' | sort | uniq -c | sort -nr | head -10

Explanation: This pipeline counts and displays the 10 most frequent words in a document:

  1. cat document.txt outputs the document
  2. tr -s ' ' '\n' squeezes runs of spaces and translates them to newlines, putting each word on its own line
  3. sort sorts all words alphabetically
  4. uniq -c counts occurrences of each word
  5. sort -nr sorts numerically in reverse order (highest count first)
  6. head -10 shows only the top 10 results

Advanced Piping Concepts

Named Pipes (FIFOs)

Beyond the basic pipe operator, Ubuntu supports named pipes (FIFOs) for more complex scenarios:

bash
mkfifo mypipe
command1 > mypipe &
command2 < mypipe

This creates a named pipe that can be used between processes that aren't directly connected.

Process Substitution

Process substitution is a related feature that allows the output of a process to appear as a file:

bash
diff <(ls dir1) <(ls dir2)

Explanation: This compares the directory listings of dir1 and dir2 without creating temporary files.
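
Another common use, sketched here with two hypothetical files a.txt and b.txt: comm requires sorted input, and process substitution supplies the sorted views on the fly:

```bash
# comm -12 prints only lines present in both inputs (which must be sorted);
# <(sort ...) provides each sorted view without a temporary file
comm -12 <(sort a.txt) <(sort b.txt)
```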

tee Command with Pipes

The tee command allows you to save pipeline output to a file while also sending it to the next command:

bash
command1 | tee output.txt | command2

Example:

bash
ls | tee files.txt | grep ".txt"

Explanation: This lists all files, saves the complete listing to files.txt, and then filters to show only .txt files on screen.

Visualizing Pipelines

Here's a simple diagram of how data flows through a pipeline:

command1 --stdout--> [pipe] --stdin--> command2 --stdout--> [pipe] --stdin--> command3

Common Pitfalls and Tips

Avoiding Common Mistakes

  1. Error Output Not Piped: By default, pipes only connect standard output (stdout) to standard input (stdin). Error messages (stderr) are not piped. To include error output:

    bash
    command1 2>&1 | command2
  2. Binary Data in Pipes: Be careful when piping binary data; some commands expect text input.

  3. Pipeline Efficiency: Remember that all parts of a pipeline run simultaneously, not sequentially.
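
To see the stderr behavior from point 1 concretely (using a path that presumably doesn't exist):

```bash
# Only stdout flows through a pipe; ls's error goes straight to the terminal
ls /nonexistent | wc -l          # wc receives nothing and prints 0
# Merging stderr into stdout first lets the pipe carry the error text too
ls /nonexistent 2>&1 | wc -l     # wc now counts the error message line
```

In bash 4 and later, `command1 |& command2` is a shorthand for `command1 2>&1 | command2`.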

Debugging Pipelines

For complex pipelines, build them step by step:

bash
# Instead of:
command1 | command2 | command3 | command4

# Debug by adding steps:
command1 > step1.txt
cat step1.txt | command2 > step2.txt
cat step2.txt | command3 > step3.txt
cat step3.txt | command4
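
When debugging in bash, it also helps to know which stage failed: the PIPESTATUS array records the exit status of every stage of the most recent pipeline (a pipeline's own exit status normally reflects only its last command):

```bash
# PIPESTATUS holds one exit status per stage of the last pipeline
true | false | true
echo "${PIPESTATUS[@]}"   # prints: 0 1 0 (the middle stage failed)
# set -o pipefail makes the pipeline's overall status reflect any failed stage
```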

Summary

Piping is a fundamental concept in Ubuntu and Unix-like systems that allows you to:

  • Connect multiple commands together
  • Process and transform data efficiently
  • Build complex workflows from simple tools
  • Automate repetitive tasks
  • Analyze and filter large amounts of information

By mastering the pipe operator, you can dramatically increase your productivity in the terminal and solve complex problems with elegant command chains.

Exercises

  1. List all running processes, then pipe the output to grep to find processes containing "bash".
  2. Find all .log files in /var/log, sort them by size, and display the top 3 largest log files.
  3. Count how many packages are installed on your system using dpkg -l and appropriate pipes.
  4. Create a pipeline that finds the 5 largest subdirectories in your home directory.
  5. Use the history command with pipes to find how many times you've used the ls command.
