Ubuntu Server Monitoring

Introduction

Server monitoring is essential to keeping an Ubuntu server healthy and reliable. It lets you track system performance, catch problems before they become critical, and keep your services running well. This guide introduces the fundamentals of Ubuntu server monitoring, the essential tools, and practical setups that help you build a robust monitoring strategy.

Server monitoring involves tracking various metrics and aspects of your system:

Resource utilization: CPU, memory, disk, and network usage
Service availability: Ensuring critical services remain operational
Performance metrics: Response times and throughput
Security monitoring: Detecting suspicious activities
Log analysis: Tracking system and application logs

Whether you're managing a single server or a fleet of machines, understanding these monitoring concepts and tools will help you maintain a reliable infrastructure.

Basic System Monitoring Tools

Ubuntu comes with several built-in tools that provide immediate insights into your system's health.

The `top` Command

The top command provides a real-time, dynamic view of system performance:

top

Example output:

top - 14:23:45 up 23 days, 21:52, 1 user, load average: 0.08, 0.03, 0.01
Tasks: 128 total,   1 running, 127 sleeping,   0 stopped,   0 zombie
%Cpu(s):  1.7 us,  0.3 sy,  0.0 ni, 97.9 id,  0.0 wa,  0.0 hi,  0.1 si,  0.0 st
MiB Mem :   3959.7 total,    190.1 free,   1536.3 used,   2233.3 buff/cache
MiB Swap:   2048.0 total,   2047.7 free,      0.3 used.   2138.6 avail Mem 

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
 1022 root      20   0  725204  41672  21592 S   0.7   1.0 107:34.83 systemd-journald
 1536 mysql     20   0 1755972 352072  41464 S   0.7   8.7  60:24.51 mysqld
 2104 www-data  20   0  705984  68464  42416 S   0.7   1.7   1:12.64 apache2

Understanding this output:

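Load average: The average number of processes that are running, waiting to run, or blocked in uninterruptible I/O over the last 1, 5, and 15 minutes. Values that stay above the number of CPU cores point to a saturated system.
CPU usage: Broken down by user processes (us), system processes (sy), idle time (id), I/O wait (wa), and so on
Memory usage: Total, free, used, and buffer/cache memory; avail Mem estimates how much memory new applications can claim without swapping
Process information: PID, user, CPU and memory usage per process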
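A load average only means something relative to the number of CPU cores. The following is a minimal sketch of that comparison; the function name `load_status` and the 0.7 and 1.0 per-core cut-offs are illustrative assumptions rather than fixed rules.

```shell
# Classify a load average against the number of CPU cores.
# The 0.7 and 1.0 per-core cut-offs are illustrative assumptions.
load_status() {
  local load=$1 cores=$2
  awk -v l="$load" -v c="$cores" 'BEGIN {
    per_core = l / c
    if (per_core >= 1.0)      print "overloaded"
    else if (per_core >= 0.7) print "busy"
    else                      print "ok"
  }'
}

# 1-minute load from the sample top output, on a hypothetical 4-core host.
load_status 0.08 4    # prints "ok"
```

On a live system, the inputs can come from the kernel and coreutils: `load_status "$(cut -d' ' -f1 /proc/loadavg)" "$(nproc)"`.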

The `htop` Tool

A more user-friendly alternative to top is htop, which you can install with:

sudo apt install htop

Run it with:

htop

htop provides color-coding, visual meters for CPU/memory usage, and easier process management with function key shortcuts displayed at the bottom.

Disk Usage Monitoring

Monitor disk space with the df command:

df -h

Example output:

Filesystem      Size  Used Avail Use% Mounted on
/dev/sda1        30G   12G   17G  42% /
/dev/sda2       915G  253G  616G  30% /data
tmpfs           2.0G     0  2.0G   0% /dev/shm

For detailed disk usage by directory, use du:

du -sh /var/log

Output example:

126M    /var/log
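When a filesystem fills up, the next question is usually which directory is responsible. As a rough sketch, the helper below (the name `largest_dirs` is made up for this example) ranks the immediate subdirectories of a directory by size using `find`, `du`, and `sort`.

```shell
# Print the COUNT largest immediate subdirectories of DIR, sizes in KiB.
# Defaults (/var/log, 5 entries) are illustrative.
largest_dirs() {
  local dir=${1:-/var/log} count=${2:-5}
  find "$dir" -mindepth 1 -maxdepth 1 -type d -exec du -sk {} + 2>/dev/null |
    sort -rn | head -n "$count"
}
```

For example, `largest_dirs /var 10` lists the ten largest directories directly under /var. Directories the current user cannot read are skipped silently, so run it with sudo for a complete picture.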

System Resource Usage with `vmstat`

The vmstat command gives you information about system memory, processes, paging, block I/O, and CPU activity:

vmstat 1 5

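This samples the system once per second, five times. The first line of output reports averages since boot; each following line covers the preceding one-second interval.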

Example output:

procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
 0  0      0 194348  96168 2188608    0    0     0     5    1    1  2  0 98  0  0
 0  0      0 194348  96168 2188608    0    0     0     0  102  168  1  0 99  0  0
 0  0      0 194348  96168 2188608    0    0     0     0  107  162  1  0 99  0  0
 0  0      0 194348  96168 2188608    0    0     0     0  117  180  1  0 99  0  0
 0  0      0 194348  96168 2188608    0    0     0    24  106  177  1  0 99  0  0
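For scripting, a single number is often easier to act on than the raw table. The sketch below averages the idle CPU column (id, the 15th field) from vmstat output piped into it. The function name `vmstat_avg_idle` is an assumption for this example, and it skips the first sample because that line reports averages since boot.

```shell
# Average the "id" (idle CPU) column of vmstat output read from stdin.
# Skips the two header lines and the first sample (averages since boot).
vmstat_avg_idle() {
  awk 'NR > 3 { sum += $15; n++ } END { if (n) printf "%.1f\n", sum / n }'
}
```

Typical usage would be `vmstat 1 5 | vmstat_avg_idle`, which prints the mean idle percentage over the last four intervals.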

Network Statistics with `netstat` and `ss`

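Monitor listening sockets with ss, the modern replacement for netstat. The flags select TCP (-t) and UDP (-u) sockets, listening sockets only (-l), and numeric addresses and ports (-n):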

ss -tuln

Example output:

Netid  State   Recv-Q  Send-Q     Local Address:Port     Peer Address:Port  Process
udp    UNCONN  0       0            127.0.0.1:323           0.0.0.0:*
udp    UNCONN  0       0                [::1]:323              [::]:*
tcp    LISTEN  0       4096         127.0.0.1:3306          0.0.0.0:*
tcp    LISTEN  0       128            0.0.0.0:22            0.0.0.0:*
tcp    LISTEN  0       511            0.0.0.0:80            0.0.0.0:*
tcp    LISTEN  0       128               [::]:22               [::]:*
tcp    LISTEN  0       511               [::]:80               [::]:*
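A common use of this output is checking that an expected service is actually listening. The following sketch extracts the listening TCP ports from `ss -tuln` output on stdin. The function name `listening_tcp_ports` is an assumption, and the field positions assume the column layout shown above, where the Netid column is present.

```shell
# Read `ss -tuln` output on stdin and print the unique TCP ports in LISTEN state.
# Assumes the layout: Netid State Recv-Q Send-Q Local-Address:Port Peer-Address:Port
listening_tcp_ports() {
  awk '$1 == "tcp" && $2 == "LISTEN" {
         n = split($5, parts, ":")   # the port is the last colon-separated piece
         print parts[n]
       }' | sort -un
}
```

For instance, `ss -tuln | listening_tcp_ports | grep -qx 22 || echo "nothing is listening on port 22"` prints a warning when the SSH port is closed.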

Advanced Monitoring with Specialized Tools

For more comprehensive monitoring, several dedicated tools are available:

Prometheus and Grafana

These tools provide powerful monitoring and visualization capabilities:

Install Prometheus:

sudo apt install prometheus prometheus-node-exporter

Install Grafana:

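sudo apt-get install -y apt-transport-https software-properties-common wget
# Store the Grafana signing key in a dedicated keyring (apt-key is deprecated)
sudo mkdir -p /etc/apt/keyrings
wget -q -O - https://apt.grafana.com/gpg.key | gpg --dearmor | sudo tee /etc/apt/keyrings/grafana.gpg > /dev/null
echo "deb [signed-by=/etc/apt/keyrings/grafana.gpg] https://apt.grafana.com stable main" | sudo tee /etc/apt/sources.list.d/grafana.list
sudo apt-get update
sudo apt-get install grafana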

Start the services:

sudo systemctl enable prometheus prometheus-node-exporter grafana-server
sudo systemctl start prometheus prometheus-node-exporter grafana-server

With these tools, you can create customized dashboards to monitor virtually any aspect of your Ubuntu server.
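Before building dashboards, it helps to confirm that node exporter is actually serving metrics. Node exporter publishes plain text, one `name value` pair per line, on port 9100 by default. The sketch below pulls one unlabelled metric out of that text; the function name `metric_value` is an assumption, and metrics that carry labels in braces would need a more specific match.

```shell
# Print the value of an unlabelled metric from Prometheus text-format
# output read on stdin. Comment lines (# HELP, # TYPE) never match.
metric_value() {
  awk -v name="$1" '$1 == name { print $2; exit }'
}
```

With the services running, `curl -s http://localhost:9100/metrics | metric_value node_load1` should print the current 1-minute load average.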

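In this setup, node exporter exposes host metrics over HTTP (port 9100 by default), Prometheus scrapes and stores them (port 9090), and Grafana queries Prometheus to render dashboards (port 3000).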

Netdata

Netdata is a lightweight, real-time performance monitoring tool:

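# Install build dependencies (only needed if the installer falls back to building from source)
sudo apt install -y zlib1g-dev uuid-dev libuv1-dev liblz4-dev libjudy-dev libssl-dev libmnl-dev gcc make autoconf autoconf-archive autogen automake pkg-config curl

# Download and run the Netdata kickstart installer
wget -O /tmp/netdata-kickstart.sh https://get.netdata.cloud/kickstart.sh
sh /tmp/netdata-kickstart.sh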

After installation, access Netdata's web interface at http://your-server-ip:19999.

Nagios Core

Nagios is a powerful monitoring system for servers, networks, and infrastructure:

# Install dependencies
sudo apt install -y build-essential libgd-dev openssl libssl-dev unzip apache2 php libapache2-mod-php php-gd

# Download, build, and install Nagios
cd /tmp
wget https://github.com/NagiosEnterprises/nagioscore/archive/nagios-4.4.6.tar.gz
tar xzf nagios-4.4.6.tar.gz
cd nagioscore-nagios-4.4.6/
./configure --with-httpd-conf=/etc/apache2/sites-enabled
make all
# Create the nagios user and group before installing
sudo make install-groups-users
sudo make install

Configure Nagios according to your needs, set up monitoring targets, and access its web interface for a comprehensive view of your server environment.

Log Monitoring and Analysis

Logs provide critical information about system events. Here's how to effectively monitor them:

Basic Log Viewing

The most straightforward way to view logs is using the cat, less, or tail commands:

# View the entire log
less /var/log/syslog

# View the last 50 lines
tail -n 50 /var/log/syslog

# Follow new log entries in real-time
tail -f /var/log/syslog
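Reading logs by eye does not scale, so a quick count of suspicious lines is a useful first check. This is a minimal sketch, assuming a default pattern of `error|fail` and a window of the last 1000 lines. Both defaults and the function name `recent_matches` are illustrative choices.

```shell
# Count lines in the last LINES lines of FILE that match PATTERN
# (extended regex, case-insensitive).
recent_matches() {
  local file=$1 pattern="${2:-error|fail}" lines=${3:-1000}
  tail -n "$lines" "$file" | grep -ciE "$pattern"
}
```

For example, `recent_matches /var/log/syslog` prints how many of the last 1000 syslog lines mention an error or failure, and `recent_matches /var/log/auth.log 'invalid user' 5000` narrows the check to failed SSH logins.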

Using `journalctl` for Systemd Logs

Modern Ubuntu systems use systemd, which has its own logging system accessed through journalctl:

# View all logs
journalctl

# View logs for a specific service
journalctl -u apache2

# View logs from the current boot
journalctl -b

# View logs since a specific time
journalctl --since "2023-03-20 10:00:00"

Log Monitoring with `logwatch`

logwatch provides a daily report of log activities:

sudo apt install logwatch

# Generate a report for the past day
sudo logwatch --output stdout --format html --range yesterday --detail high > ~/log_report.html

Setting Up Automated Monitoring and Alerts

To be proactive about server issues, set up automated monitoring and alerting:

Monitoring Disk Space and Sending Email Alerts

Create a simple shell script to monitor disk usage:

#!/bin/bash
# disk_monitor.sh: email an alert for each filesystem at or above THRESHOLD percent used.
# Requires a working mail command (for example from the mailutils package).

THRESHOLD=90
EMAIL="admin@example.com"   # placeholder; replace with your own address

# -P keeps each filesystem on one line; skip the header and pseudo-filesystems.
df -P | grep -vE '^Filesystem|tmpfs|cdrom' | awk '{ print $5 " " $1 }' | while read -r usep partition
do
  usep=${usep%\%}   # strip the trailing percent sign
  if [ "$usep" -ge "$THRESHOLD" ]; then
    echo "Running out of space \"$partition ($usep%)\" on $(hostname) as of $(date)" |
      mail -s "Disk Space Alert: $partition on $(hostname) is $usep% full" "$EMAIL"
  fi
done

Make it executable and add it to your crontab:

chmod +x disk_monitor.sh
crontab -e

# Add this line to run it every hour
0 * * * * /path/to/disk_monitor.sh
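The pipeline in disk_monitor.sh is hard to test because it reads live df output and sends mail in one step. One possible refinement, sketched here under the assumption that mount points contain no spaces, is to move the threshold check into a function that reads `df -P` output on stdin. The name `over_threshold` is made up for this example.

```shell
# Read `df -P` output on stdin and print "mountpoint use%" for every
# filesystem at or above THRESHOLD percent. Pseudo-filesystems are skipped.
over_threshold() {
  local threshold=$1
  awk -v t="$threshold" '
    NR > 1 && $1 !~ /^(tmpfs|devtmpfs|udev)$/ {
      use = $5
      sub(/%/, "", use)
      if (use + 0 >= t + 0) print $6, use "%"
    }'
}
```

Running `df -P | over_threshold 90` lists the filesystems that need attention; the result can then be piped to mail or any other notifier.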

Setting Up Server Monitoring with Monit

Monit is a simple yet powerful tool for monitoring services:

sudo apt install monit

Configure Monit by editing /etc/monit/monitrc:

set daemon 120            # Check services every 2 minutes
set logfile /var/log/monit.log
set idfile /var/lib/monit/id
set statefile /var/lib/monit/state

set httpd port 2812 and
    use address localhost  # Only accept connection from localhost
    allow localhost        # Allow localhost to connect

check system $HOST
    if loadavg (5min) > 3 then alert
    if memory usage > 80% then alert
    if cpu usage (user) > 70% then alert

check process apache with pidfile /var/run/apache2/apache2.pid
    start program = "/etc/init.d/apache2 start"
    stop program  = "/etc/init.d/apache2 stop"
    if failed host 127.0.0.1 port 80 protocol http then restart
    if cpu > 60% for 2 cycles then alert
    if cpu > 80% for 5 cycles then restart

Start Monit:

sudo systemctl enable monit
sudo systemctl start monit
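The `memory usage > 80%` rule in the Monit configuration above can be approximated in plain shell, which is handy for a quick check on hosts where Monit is not installed. This sketch derives the percentage from the MemTotal and MemAvailable fields of /proc/meminfo. Monit's own calculation may differ slightly, and the function name `mem_used_percent` is an assumption.

```shell
# Read /proc/meminfo-style text on stdin and print used memory as a whole
# percentage, computed as (MemTotal - MemAvailable) / MemTotal.
mem_used_percent() {
  awk '/^MemTotal:/     { total = $2 }
       /^MemAvailable:/ { avail = $2 }
       END { if (total > 0) printf "%d\n", (total - avail) * 100 / total }'
}
```

A cron job could use it as `[ "$(mem_used_percent < /proc/meminfo)" -gt 80 ] && echo "memory usage above 80%"`.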

Best Practices for Server Monitoring

To establish an effective monitoring strategy:

Define critical metrics: Identify what's most important for your specific server use case.
Set appropriate thresholds: Configure alerts based on realistic thresholds to avoid alert fatigue.
Implement a centralized monitoring solution: For multiple servers, use a centralized tool like Prometheus or Nagios.
Create a response plan: Have clear procedures for when alerts are triggered.
Regular review: Periodically review your monitoring setup to ensure it meets your evolving needs.
Combine monitoring tools: Different tools have different strengths; use them in combination.
Implement log rotation: Prevent logs from consuming all disk space.
Monitor from outside your network: External monitoring can detect issues that internal monitoring might miss.

To review the system-wide log rotation settings, open the main logrotate configuration file:

sudo nano /etc/logrotate.conf
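For application logs that logrotate does not already cover, the log rotation practice above usually comes down to adding a small stanza under /etc/logrotate.d/. The sketch below prints such a stanza. The function name `logrotate_stanza`, the /var/log/myapp path, and the weekly schedule with four retained copies are all illustrative assumptions.

```shell
# Print a logrotate stanza for the log files matching LOG_GLOB.
# Weekly rotation with four compressed copies is an illustrative policy.
logrotate_stanza() {
  local log_glob=$1
  cat <<EOF
$log_glob {
    weekly
    rotate 4
    compress
    missingok
    notifempty
}
EOF
}

# /var/log/myapp is a hypothetical application log directory.
logrotate_stanza '/var/log/myapp/*.log'
```

The output can be saved with `logrotate_stanza '/var/log/myapp/*.log' | sudo tee /etc/logrotate.d/myapp` and checked without rotating anything by running `sudo logrotate -d /etc/logrotate.d/myapp`.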

Summary

Ubuntu server monitoring is essential for maintaining system health and preventing downtime. By utilizing built-in tools like top, htop, and df, along with specialized solutions like Prometheus, Grafana, and Monit, you can establish comprehensive visibility into your server's performance.

Remember to:

Monitor key resources: CPU, memory, disk, and network
Analyze logs for troubleshooting and security
Set up automated alerts for proactive management
Follow best practices for effective monitoring

Effective monitoring transforms server administration from a reactive to a proactive discipline, ensuring your Ubuntu server provides reliable service to your users.

Practice Exercises

Set up Netdata on your Ubuntu server and create a custom dashboard focused on web server metrics.
Write a bash script that monitors memory usage and sends an alert when it exceeds 80%.
Configure log rotation for a custom application log file.
Set up Prometheus and Grafana to monitor multiple Ubuntu servers simultaneously.
Create a comprehensive monitoring plan for a web server hosting a high-traffic website, detailing which metrics to track and appropriate thresholds.
