Troubleshooting Common Issues on Ubuntu Server: A Practical Guide

Introduction

Ubuntu Server is a reliable and widely used Linux distribution, but like any operating system, it can run into problems that disrupt service. Knowing how to diagnose and fix the most common ones is essential for maintaining uptime.

In this guide, we’ll cover practical troubleshooting techniques for common Ubuntu Server issues, including boot problems, SSH errors, network connectivity issues, and more. We’ll also show how to use the system logs to find the root cause quickly.


Common Ubuntu Server Issues and How to Troubleshoot

1. Boot Problems

Boot issues can occur due to corrupted files, hardware failures, or configuration errors.

Symptoms:

  • Server fails to boot and displays a blank screen or error messages.
  • The GRUB menu is inaccessible or missing.
  • Kernel panic errors appear during startup.

Troubleshooting Steps:

Step 1: Access GRUB Menu

Restart the server and hold down the Shift key (or Esc key for UEFI systems) to access the GRUB menu.

Step 2: Boot into Recovery Mode

Select Advanced Options > Recovery Mode to boot into a minimal environment for troubleshooting.

Step 3: Check System Logs

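Inspect the system logs for errors related to the boot process. The -b flag shows the current boot; if you are now in recovery mode, the failed boot is usually the previous one, which -b -1 shows (this works only when the journal is stored persistently under /var/log/journal):

sudo journalctl -b
sudo journalctl -b -1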

Look for errors or warnings related to kernel modules, disk mounting, or services.

Step 4: Repair Broken Packages

Run the following commands to fix broken packages:

sudo apt update  
sudo apt install --fix-broken  

Step 5: Reinstall GRUB

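If GRUB is corrupted, reinstall it to the boot disk. Replace /dev/sda with your own disk; the target is the whole disk, not a partition: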

sudo grub-install /dev/sda  
sudo update-grub  

Step 6: Check Disk Health

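Use fsck to check and repair filesystem errors. Run it against a partition rather than the whole disk, and only while that filesystem is unmounted (for example, from recovery mode or a live USB). Replace /dev/sda1 with your own partition:

sudo fsck /dev/sda1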
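
Because fsck can damage a filesystem that is mounted read-write, it helps to confirm that the target is not mounted first. The sketch below is one way to do that. The name is_mounted is a hypothetical helper, the device in the example is a placeholder, and the optional second argument exists only so the check can be tried against a sample file. Device names in /proc/mounts do not always match the name you type (LVM and device-mapper volumes are common exceptions), so treat a “not mounted” result as a hint and confirm with lsblk or findmnt.

```shell
# is_mounted DEVICE [MOUNTS_FILE]
# Exit status 0 if DEVICE appears in the first column of the mounts table.
# MOUNTS_FILE defaults to /proc/mounts (hypothetical helper, not a system tool).
is_mounted() {
    awk -v dev="$1" '$1 == dev { found = 1 } END { exit !found }' "${2:-/proc/mounts}"
}

# Example (placeholder device name, adjust to your system):
#   if is_mounted /dev/sda1; then
#       echo "/dev/sda1 is mounted; unmount it or boot from rescue media first" >&2
#   else
#       sudo fsck -f /dev/sda1
#   fi
```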


2. SSH Errors

SSH errors can prevent remote access to the server, disrupting management and automation tasks.

Symptoms:

  • Unable to connect to the server via SSH.
  • “Connection refused” or “Permission denied” errors.
  • SSH hangs or times out.

Troubleshooting Steps:

Step 1: Verify SSH Service

Ensure the SSH service is running:

sudo systemctl status ssh  

If the service is inactive, start it:

sudo systemctl start ssh  

Step 2: Check SSH Logs

Inspect the SSH logs for detailed error messages:

sudo tail -n 50 /var/log/auth.log  

Common issues include:

  • Authentication failures.
  • IP address blocks due to fail2ban or firewall rules.
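
To see which addresses are behind repeated authentication failures, the log can be summarized per source IP. This is a minimal sketch, assuming the usual OpenSSH message format (“Failed password for USER from ADDRESS port PORT”); failed_logins_by_ip is a hypothetical helper name.

```shell
# failed_logins_by_ip
# Reads auth.log text on stdin and prints "COUNT ADDRESS", highest count first.
# Assumes the address is the word that follows "from" on each matching line.
failed_logins_by_ip() {
    awk '/Failed password/ {
             for (i = 1; i < NF; i++)
                 if ($i == "from") { count[$(i + 1)]++; break }
         }
         END { for (ip in count) print count[ip], ip }' | sort -rn
}

# Example:
#   sudo cat /var/log/auth.log | failed_logins_by_ip | head -n 10
```

A single address with a very high count often points to brute-force attempts rather than a misconfigured client.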

Step 3: Check Firewall Rules

Ensure the firewall allows SSH traffic:

sudo ufw allow OpenSSH  
sudo ufw status  

Step 4: Inspect SSH Configuration

Check for errors in the SSH configuration file:

sudo nano /etc/ssh/sshd_config  

Restart SSH after making changes:

sudo systemctl restart ssh  
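
When sshd_config sets the same keyword more than once, sshd uses the first value it reads, which can make an edit further down the file appear to have no effect. The sketch below prints that first uncommented value. The name sshd_option is a hypothetical helper, and it deliberately ignores Match blocks, Include files, and the rarer Keyword=value spelling. For an authoritative answer, sudo sshd -t checks the file for syntax errors and sudo sshd -T prints the effective configuration.

```shell
# sshd_option KEYWORD
# Reads sshd_config text on stdin and prints the value of the first
# uncommented line whose keyword matches KEYWORD (case-insensitive).
sshd_option() {
    awk -v key="$1" 'tolower($1) == tolower(key) {
        $1 = ""              # drop the keyword, keep the rest of the line
        sub(/^ +/, "")       # trim the separator left behind
        print
        exit                 # first match wins, as in sshd itself
    }'
}

# Example:
#   sshd_option PermitRootLogin < /etc/ssh/sshd_config
```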

Step 5: Verify Network Connectivity

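From your client machine, check that the server is reachable. Some networks block ICMP, so a failed ping on its own is not conclusive: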

ping <server-ip>  


3. Network Connectivity Issues

Network issues can prevent the server from accessing the internet or communicating with other devices.

Symptoms:

  • Unable to ping external websites or other servers.
  • DNS resolution errors.
  • Network interfaces are down or misconfigured.

Troubleshooting Steps:

Step 1: Check Network Interfaces

List all network interfaces with their state and addresses:

ip a  

If an interface is down, bring it up:

sudo ip link set <interface-name> up  
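
On servers with several interfaces, it can be quicker to list only the ones that are down. This is a minimal sketch, assuming the brief output format of ip -br link (interface name in the first column, state in the second); down_interfaces is a hypothetical helper name.

```shell
# down_interfaces
# Reads "ip -br link" output on stdin and prints the names of interfaces
# whose state column is DOWN.
down_interfaces() {
    awk '$2 == "DOWN" { print $1 }'
}

# Example:
#   ip -br link | down_interfaces
```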

Step 2: Check Logs for Network Errors

Inspect network-related logs for potential issues:

sudo journalctl -u systemd-networkd

Step 3: Verify IP Configuration

Check the server’s IP address, gateway, and DNS settings:

cat /etc/netplan/*.yaml  

If the configuration is incorrect, edit the file and apply the changes. Netplan files are indentation-sensitive YAML, so use spaces rather than tabs:

sudo nano /etc/netplan/*.yaml  
sudo netplan apply  

Example Netplan Configuration (eth0 is a placeholder; use the interface name shown by ip a):

network:  
  version: 2  
  ethernets:  
    eth0:  
      dhcp4: true  

Step 4: Test DNS Resolution

Verify DNS resolution:

nslookup google.com  

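If DNS resolution fails, add valid DNS servers to /etc/resolv.conf as a temporary test. On current Ubuntu releases this file is managed by systemd-resolved and manual edits are overwritten, so make the permanent change in the nameservers section of your Netplan configuration. The temporary entries look like this: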

nameserver 8.8.8.8  
nameserver 8.8.4.4  
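
Before changing anything, it is worth confirming which resolvers the server is actually configured to use. The sketch below extracts them from resolv.conf-style text; configured_nameservers is a hypothetical helper name.

```shell
# configured_nameservers
# Reads resolv.conf text on stdin and prints each nameserver address.
# Commented-out entries are skipped because their first word is not "nameserver".
configured_nameservers() {
    awk '$1 == "nameserver" { print $2 }'
}

# Example:
#   configured_nameservers < /etc/resolv.conf
```

On a default Ubuntu Server install this typically prints 127.0.0.53, the local systemd-resolved stub. The upstream servers it forwards to are listed in /run/systemd/resolve/resolv.conf.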


4. Disk Space Issues

Low disk space can cause applications to crash or prevent updates.

Symptoms:

  • “No space left on device” errors.
  • Server becomes unresponsive.

Troubleshooting Steps:

Step 1: Check Disk Usage

Use df to check disk usage:

df -h  
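
On a server with many mounts, filtering the df output down to the filesystems that are nearly full makes the problem easier to spot. This is a minimal sketch, assuming the POSIX output format of df -P and mount points without spaces; filesystems_over is a hypothetical helper name.

```shell
# filesystems_over PERCENT
# Reads "df -P" output on stdin and prints "MOUNTPOINT USE%" for every
# filesystem whose usage is at or above PERCENT.
filesystems_over() {
    awk -v limit="$1" 'NR > 1 {
        use = $5
        sub(/%/, "", use)                 # "95%" -> "95"
        if (use + 0 >= limit) print $6, $5
    }'
}

# Example:
#   df -P | filesystems_over 90
```

If df shows free space but writes still fail with “No space left on device”, the filesystem may have run out of inodes; df -i reports inode usage.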

Step 2: Identify Large Files

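Find the largest files and directories. The -x flag keeps du on the root filesystem, so /proc and other mounts are skipped, and permission errors are discarded:

sudo du -ahx / 2>/dev/null | sort -rh | head -n 10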
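
When only individual files matter rather than directory totals, find can list everything above a size threshold. This is a minimal sketch; large_files is a hypothetical helper name, and the threshold is given in KiB.

```shell
# large_files DIR MIN_KIB
# Prints regular files under DIR that are larger than MIN_KIB KiB.
# -xdev keeps find on DIR's own filesystem; permission errors are discarded.
large_files() {
    find "$1" -xdev -type f -size +"$2"k 2>/dev/null
}

# Example: files over 100 MiB under /var (run from a root shell to see
# files your own user cannot read):
#   large_files /var 102400
```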

Step 3: Check Log File Sizes

Log files can grow excessively large. Check /var/log for oversized logs:

sudo du -sh /var/log/*  

Remove old, already-rotated logs once you no longer need them (copy them elsewhere first if they may be needed later):

sudo rm -f /var/log/*.gz
sudo rm -f /var/log/*.log.old

Step 4: Extend Disk Space

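If the disk is still full after cleanup, extend the volume. Cloud platforms such as AWS, Azure, and GCP allow resizing disks; after resizing, the partition and filesystem may also need to be grown before the extra space is usable.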


5. Package Installation Errors

Package installation errors can occur due to broken dependencies or repository issues.

Symptoms:

  • “Unable to locate package” or “Dependency errors” during installation.

Troubleshooting Steps:

Step 1: Check Logs for Package Errors

Inspect apt logs for detailed errors:

sudo tail -n 50 /var/log/apt/history.log  
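
The last 50 lines may not include the run that failed, so it can help to pull out only the failed transactions. This is a minimal sketch, assuming that history.log entries are separated by blank lines and that a failed run contains a line starting with “Error:”; failed_apt_runs is a hypothetical helper name.

```shell
# failed_apt_runs
# Reads apt history.log text on stdin and prints only the entries that
# contain an "Error:" line, separated by blank lines.
failed_apt_runs() {
    awk 'BEGIN { RS = ""; ORS = "\n\n" }   # paragraph mode: one entry per record
         /(^|\n)Error:/ { print }'
}

# Example:
#   failed_apt_runs < /var/log/apt/history.log
```

The full terminal output of each run, including the dpkg messages that explain the error, is kept in /var/log/apt/term.log.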

Step 2: Update Package Index

Ensure the package index is up-to-date:

sudo apt update  

Step 3: Fix Broken Dependencies

Run the following command to resolve dependency issues:

sudo apt install --fix-broken  

Step 4: Add Missing Repositories

If a package is unavailable, add the required repository:

sudo add-apt-repository universe  
sudo apt update  


6. High CPU or Memory Usage

Excessive resource usage can cause performance degradation and slow response times.

Symptoms:

  • Server becomes unresponsive.
  • Applications crash or time out.

Troubleshooting Steps:

Step 1: Monitor Resource Usage

Use top or htop to monitor CPU and memory usage:

top  

Step 2: Check Logs for Application Errors

Inspect application-specific logs in /var/log for errors causing high resource usage.

Step 3: Identify Resource-Hungry Processes

Find processes consuming excessive resources:

ps aux --sort=-%cpu | head -n 10  
ps aux --sort=-%mem | head -n 10  
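
For scripts or alerts, a fixed threshold is often more useful than a top-ten list. This is a minimal sketch, assuming input from ps -eo pid,pcpu,comm; busy_processes is a hypothetical helper name, and only the first word of the command name is printed.

```shell
# busy_processes MIN_CPU
# Reads "ps -eo pid,pcpu,comm" output on stdin and prints "PID %CPU COMMAND"
# for every process at or above MIN_CPU percent.
busy_processes() {
    awk -v min="$1" 'NR > 1 && $2 + 0 >= min { print $1, $2, $3 }'
}

# Example:
#   ps -eo pid,pcpu,comm --sort=-pcpu | busy_processes 50
```

Keep in mind that the %CPU value reported by ps is averaged over the lifetime of the process, so a short spike is easier to see in top.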

Step 4: Kill Unnecessary Processes

Terminate resource-hungry processes:

sudo kill <pid>  
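
By default, kill sends SIGTERM, which lets the process shut down cleanly; SIGKILL (kill -9) cannot be caught and is best kept as a last resort. The sketch below tries the polite signal first and escalates only after a grace period. The name stop_process is a hypothetical helper and the 10-second default is an arbitrary assumption. Run it from a root shell when the target belongs to another user.

```shell
# stop_process PID [GRACE_SECONDS]
# Sends SIGTERM, waits up to GRACE_SECONDS (default 10) for the process to
# exit, and sends SIGKILL only if it is still running after that.
stop_process() {
    pid=$1
    grace=${2:-10}
    kill -TERM "$pid" 2>/dev/null || return 1   # no such process, or not permitted
    waited=0
    while [ "$waited" -lt "$grace" ]; do
        kill -0 "$pid" 2>/dev/null || return 0  # process has exited
        sleep 1
        waited=$((waited + 1))
    done
    kill -KILL "$pid" 2>/dev/null               # still alive after the grace period
    return 0
}

# Example (placeholder PID):
#   stop_process 1234 15
```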


Using Logs for Troubleshooting

Logs are one of the most valuable tools when troubleshooting server issues. Here’s how to check logs for common problems:

  1. System Logs: sudo journalctl -xe
     Provides detailed logs about system-level errors and warnings.
  2. Authentication Logs: sudo tail -n 50 /var/log/auth.log
     Useful for diagnosing SSH and login-related issues.
  3. Application-Specific Logs:
     Check /var/log for application-specific logs (e.g., Apache, MySQL).
  4. Kernel Logs: sudo dmesg
     Displays kernel-related messages, useful for hardware and boot issues.

Conclusion

Troubleshooting Ubuntu Server issues requires a systematic approach to diagnosing and resolving problems. In this guide, we covered:

  • Resolving boot problems, SSH errors, and network connectivity issues.
  • Fixing disk space issues, package installation errors, and high CPU or memory usage.
  • Using logs effectively to diagnose issues.

By following these steps and leveraging logs, you can identify and fix server problems efficiently, ensuring a stable and reliable Ubuntu Server environment.
