Linux System Monitoring

As a DevOps engineer, you need to know what’s happening on your servers. Is CPU usage high? Is memory running out? Is the disk slow?

1. Real-Time Monitoring

top / htop

The task manager of Linux. Shows CPU, Memory, and running processes.
  • htop is a more user-friendly, colorful version of top.
# Install htop (Debian/Ubuntu; top is usually preinstalled)
sudo apt install htop

# Run it
htop
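Key Metrics:
  • Load Average: Average number of processes running or waiting to run over the last 1, 5, and 15 minutes. On Linux this also counts processes blocked on disk I/O. (If load stays above the number of CPU cores, the system is overloaded.)
  • Mem: RAM usage. In the default color scheme, green is memory used by processes and yellow is page cache (good! The kernel frees it when applications need the memory).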
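The load-versus-cores rule above can be sketched as a small shell helper. The function name load_status is hypothetical, and the strict "load greater than cores" threshold is only the rule of thumb from this section, not a hard limit.

```shell
# load_status LOAD CORES
# Prints "overloaded" when LOAD is above CORES, otherwise "ok".
# (Hypothetical helper; the threshold is a rule of thumb.)
load_status() {
  awk -v load="$1" -v cores="$2" 'BEGIN {
    if (load + 0 > cores + 0) print "overloaded"; else print "ok"
  }'
}

# Typical use on a Linux host, reading the 1-minute load average:
#   load_status "$(cut -d " " -f1 /proc/loadavg)" "$(nproc)"
```

Comparing against the 5- or 15-minute value instead (fields 2 and 3 of /proc/loadavg) filters out short spikes.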

free

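Check memory usage. Read the available column rather than free: it estimates how much memory new applications can use without swapping, because most of buff/cache can be reclaimed.
free -h
#        total    used     free     shared   buff/cache   available
# Mem:   15Gi     4.2Gi    5.1Gi    200Mi    6.1Gi        10Gi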
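For scripting, the same output can be reduced to a single number. The sketch below assumes the column layout shown above (total in column 2, available in column 7 of the Mem: line) and plain numeric output such as free -m, not -h. The function name mem_available_pct is hypothetical.

```shell
# mem_available_pct: reads `free` output (numeric, e.g. `free -m`) on stdin
# and prints "available" as a whole-number percentage of "total".
# Assumes the Mem: line has total in field 2 and available in field 7.
mem_available_pct() {
  awk '/^Mem:/ { printf "%d\n", $7 * 100 / $2 }'
}

# Typical use:
#   free -m | mem_available_pct
```

A low percentage here is a more useful warning sign than a low free value, since free memory is normally small on a healthy, busy system.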

2. Resource Analysis

vmstat (Virtual Memory Statistics)

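Great for spotting bottlenecks. The first line of output reports averages since boot, so read the lines that follow it.
vmstat 1  # Update every 1 second
  • r: Processes running or waiting for CPU (consistently higher than the number of cores = CPU bottleneck).
  • si/so: Memory swapped in from and out to disk per second (sustained values > 0 = RAM bottleneck).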
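The two checks above can be applied automatically to captured vmstat output. This is a minimal sketch: the function name vmstat_flags is hypothetical, and the column positions (r in field 1, si in field 7, so in field 8) assume the default procps vmstat layout.

```shell
# vmstat_flags CORES: reads `vmstat` output on stdin and prints one line
# for each sample that suggests a CPU or RAM bottleneck.
# Assumes the default column order: r=$1, si=$7, so=$8.
vmstat_flags() {
  awk -v cores="$1" '
    $1 !~ /^[0-9]+$/ { next }   # skip header lines
    ++n == 1         { next }   # first sample is the average since boot
    $1 + 0 > cores + 0 { print "sample " n ": cpu queue r=" $1 }
    $7 > 0 || $8 > 0   { print "sample " n ": swapping si=" $7 " so=" $8 }
  '
}

# Typical use (5 samples, 1 second apart):
#   vmstat 1 5 | vmstat_flags "$(nproc)"
```

A single flagged sample is rarely a problem; look for the same flag repeating across consecutive samples.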

iostat (Input/Output Statistics)

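Check disk performance. iostat is part of the sysstat package.
iostat -xz 1  # Extended stats, skip idle devices, update every 1 second
  • %util: Percentage of time the disk was busy. If near 100%, the disk is likely the bottleneck. (Devices that serve requests in parallel, such as SSDs and RAID arrays, can still have headroom at 100%.)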
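To pick out busy devices from a longer listing, the %util check can be scripted. In this sketch the function name busy_disks is hypothetical; it locates the %util column by its header name because the column position differs between sysstat versions.

```shell
# busy_disks THRESHOLD: reads `iostat -x` output on stdin and prints the
# name and %util of each device at or above THRESHOLD percent.
busy_disks() {
  awk -v limit="$1" '
    $1 ~ /^Device/ {                 # header row: find the %util column
      col = 0
      for (i = 1; i <= NF; i++) if ($i == "%util") col = i
      next
    }
    NF == 0 { col = 0; next }        # blank line ends the device table
    col && $col + 0 >= limit + 0 { print $1, $col }
  '
}

# Typical use (3 reports, 1 second apart, flag devices at 90% or more):
#   iostat -xz 1 3 | busy_disks 90
```

As with vmstat, the first iostat report covers the time since boot, so a device that only appears in later reports is the one that is busy right now.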

3. Network Monitoring

netstat / ss

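View active network connections. ss is the modern replacement for the older netstat.
# List all listening ports (TCP/UDP) with numeric addresses and owning processes
# (run with sudo to see processes owned by other users)
ss -tulpn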
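When you only need to know which ports are open, the listing can be condensed. This sketch assumes the usual ss column order, with the local address and port in field 5; the function name listening_ports is hypothetical.

```shell
# listening_ports: reads `ss -tuln` output on stdin and prints each unique
# protocol/port pair, for example "tcp/22".
# Assumes the local address:port is in field 5.
listening_ports() {
  awk 'NR > 1 {
    n = split($5, parts, ":")   # the port is the text after the last colon
    print $1 "/" parts[n]
  }' | sort -u
}

# Typical use:
#   ss -tuln | listening_ports
```

Splitting on the last colon keeps the helper working for IPv6 addresses such as [::]:22, which contain colons of their own.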

iftop

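Bandwidth usage by connection (like top for network). It needs root privileges and is usually pointed at one interface with -i, for example sudo iftop -i eth0 (the interface name eth0 is only an example).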

4. Log Management

Logs are the source of truth for errors.

journalctl (Systemd Logs)

View logs for services managed by systemd.
# View all logs
journalctl

# View logs for a specific service
journalctl -u nginx

# Follow logs in real time (like tail -f)
journalctl -u nginx -f

# View logs from today
journalctl --since "today"
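These filters can be combined. The sketch below wraps a common combination: one unit, errors only, a recent time window. The function name recent_errors and the default window of "1 hour ago" are illustrative choices, not part of journalctl.

```shell
# recent_errors UNIT [SINCE]
# Shows messages of priority "err" or more severe for one systemd unit.
# SINCE defaults to "1 hour ago" (an illustrative default).
recent_errors() {
  unit="$1"
  since="${2:-1 hour ago}"
  journalctl -u "$unit" -p err --since "$since" --no-pager
}

# Typical use:
#   recent_errors nginx
#   recent_errors nginx "today"
```

The --no-pager flag prints straight to the terminal, which makes the output easy to pipe into grep or save to a file.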

/var/log

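Traditional log directory for plain-text log files.
  • /var/log/syslog: General system logs (Ubuntu/Debian; RHEL-based systems use /var/log/messages).
  • /var/log/auth.log: Authentication logs such as sudo and ssh logins (Ubuntu/Debian; RHEL-based systems use /var/log/secure).
  • /var/log/nginx/access.log: Web server access logs (present when nginx is installed).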
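Plain-text logs like these are easy to summarize with standard tools. As a sketch, the helper below counts responses per HTTP status code. It assumes nginx's default "combined" log format, where the status code is the ninth whitespace-separated field; the function name status_counts is hypothetical.

```shell
# status_counts: reads access-log lines on stdin and prints each HTTP status
# code with the number of requests that returned it.
# Assumes the default "combined" format (status code in field 9).
status_counts() {
  awk '{ count[$9]++ } END { for (s in count) print s, count[s] }' | sort
}

# Typical use:
#   status_counts < /var/log/nginx/access.log
```

A sudden rise in 5xx counts usually points at the application or upstream server, while a rise in 4xx counts more often points at clients or broken links.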

Key Takeaways

  • Use htop for a quick overview.
  • Use vmstat to distinguish CPU vs RAM issues.
  • Use iostat to find disk bottlenecks.
  • Use journalctl to debug service failures.
