Auto Reboot Server in Linux

Sometimes a Linux server gets overwhelmed by sustained high CPU or memory load caused by runaway processes, DDoS attacks, or rogue scripts. Manually catching and fixing it in real time isn't always possible. This guide shows you how to automatically, and safely, reboot your server when CPU load or memory usage stays critically high for several minutes in a row.

We’ll create a lightweight bash script that tracks load across runs, then schedule it with cron to run every minute.

📜 Step 1: Create the Auto-Reboot Script
Let’s start by writing the script that checks CPU load and memory usage and initiates a reboot after 3 consecutive high readings.

✏️ Create or Edit the Script

sudo nano /usr/local/bin/reboot-on-high-load.sh

Paste this logic into the file:

#!/bin/bash

# Configuration
CPU_THRESHOLD=0.8                # 80% load per CPU core
MEM_THRESHOLD=90                 # 90% memory usage threshold
MAX_RETRIES=3                    # Reboot if high load/memory persists N times
CHECK_FILE="/tmp/highload.counter"
LOG_FILE="/var/log/reboot-load.log"
MAX_LOG_SIZE=$((10 * 1024 * 1024))  # 10MB

# Auto-truncate log if too large
if [ -f "$LOG_FILE" ] && [ "$(stat -c%s "$LOG_FILE")" -ge "$MAX_LOG_SIZE" ]; then
    # Truncate first, then record why; a notice written before truncating would be erased
    truncate -s 0 "$LOG_FILE"
    echo "[!] Log file exceeded 10MB and was truncated at $(date)." >> "$LOG_FILE"
fi

# Detect CPU cores and load threshold
CPU_CORES=$(nproc)
LOAD_THRESHOLD=$(echo "$CPU_CORES * $CPU_THRESHOLD" | bc)

# CPU Load Check
LOAD_AVG=$(awk '{print $1}' /proc/loadavg)
LOAD_OK=$(echo "$LOAD_AVG < $LOAD_THRESHOLD" | bc)

# Memory Check
MEM_USED_PERCENT=$(free | awk '/Mem:/ { printf("%.0f", $3/$2 * 100) }')
MEM_OK=$( [ "$MEM_USED_PERCENT" -lt "$MEM_THRESHOLD" ] && echo 1 || echo 0 )

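# Initialize counter file if needed
if [ ! -f "$CHECK_FILE" ]; then
    echo 0 > "$CHECK_FILE"
fi

COUNTER=$(cat "$CHECK_FILE")
# Fall back to 0 if the file is empty or holds anything other than digits
case "$COUNTER" in
    ''|*[!0-9]*) COUNTER=0 ;;
esac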

# Evaluate system status
if [ "$LOAD_OK" -eq 0 ] || [ "$MEM_OK" -eq 0 ]; then
    echo "[!] High resource usage detected at $(date):" >> "$LOG_FILE"
    [ "$LOAD_OK" -eq 0 ] && echo "    - CPU Load: $LOAD_AVG / Threshold: $LOAD_THRESHOLD" >> "$LOG_FILE"
    [ "$MEM_OK" -eq 0 ] && echo "    - Memory Usage: $MEM_USED_PERCENT% / Threshold: $MEM_THRESHOLD%" >> "$LOG_FILE"
    COUNTER=$((COUNTER + 1))
    echo "$COUNTER" > "$CHECK_FILE"
else
    if [ "$COUNTER" -ne 0 ]; then
        echo "[βœ“] Resources back to normal at $(date): Load = $LOAD_AVG, Mem = $MEM_USED_PERCENT%" >> "$LOG_FILE"
    fi
    echo 0 > "$CHECK_FILE"
fi

# Reboot if over threshold for too long
if [ "$COUNTER" -ge "$MAX_RETRIES" ]; then
    echo "[!!!] High resource usage sustained for $MAX_RETRIES checks. Rebooting at $(date)..." >> "$LOG_FILE"
    rm -f "$CHECK_FILE"
    /sbin/shutdown -r now
fi
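
The script relies on bc for its floating-point math, and bc isn't installed by default on every distribution. As a minimal sketch, the same load comparison can be done with awk alone. The function name load_is_high is hypothetical and not part of the script above; it prints 1 when the load is at or above cores × per-core threshold, which is the opposite sense of the script's LOAD_OK flag.

```shell
#!/bin/bash
# Hypothetical helper (not part of the script above).
# Prints 1 if load_avg >= cpu_cores * per_core_threshold, else 0.
# Usage: load_is_high <load_avg> <cpu_cores> <per_core_threshold>
load_is_high() {
    awk -v load="$1" -v cores="$2" -v thr="$3" \
        'BEGIN { if (load >= cores * thr) print 1; else print 0 }'
}

# Fixed example values: 8 cores at 0.8 per core gives a threshold of 6.4
load_is_high 7.23 8 0.8   # prints 1 (above threshold)
load_is_high 2.10 8 0.8   # prints 0 (below threshold)
```

With live values, the call would look like load_is_high "$(awk '{print $1}' /proc/loadavg)" "$(nproc)" 0.8.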

🔓 Step 2: Make the Script Executable

sudo chmod +x /usr/local/bin/reboot-on-high-load.sh
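
Since this script can reboot the machine, it may be worth confirming that it is executable and free of syntax errors before scheduling it. The sketch below uses bash -n, which parses a script without running it. The helper name check_script is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: verifies a script is executable and parses cleanly.
# bash -n reads the commands and checks syntax without executing anything.
# Usage: check_script <path>
check_script() {
    local path="$1"
    [ -x "$path" ] || { echo "not executable: $path"; return 1; }
    bash -n "$path" || { echo "syntax error in: $path"; return 1; }
    echo "OK: $path"
}
```

For this guide, the call would be check_script /usr/local/bin/reboot-on-high-load.sh.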

⏰ Step 3: Schedule It to Run Every Minute
Open root crontab:

sudo crontab -e

Add this line to the bottom:

* * * * * /usr/local/bin/reboot-on-high-load.sh >> /var/log/reboot-load.log 2>&1

This schedules the script to run every minute and logs its output to /var/log/reboot-load.log.
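
Before letting cron trigger real reboots, one option is to run in a dry-run mode for a while and watch the log. The sketch below assumes you replace the script's final /sbin/shutdown -r now call with a guard like this one. DRY_RUN and reboot_or_log are hypothetical names, not something the script above defines.

```shell
#!/bin/bash
# Hypothetical guard: DRY_RUN is an assumed variable name, not part of the original script.
# With DRY_RUN=1 the reboot is only announced; otherwise the real command runs.
reboot_or_log() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "[dry-run] would reboot at $(date)"
    else
        /sbin/shutdown -r now
    fi
}
```

With that guard in place, prefixing the cron command with DRY_RUN=1 would keep the checks and logging active without rebooting; removing the prefix arms it.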

🛡️ Why This Is Safe
✅ No reboots on single spikes: it only reboots if load or memory usage stays high for 3 consecutive checks.

✅ Self-resetting: if resources return to normal, the counter resets to 0.

✅ State tracking between runs: the counter is stored in /tmp/highload.counter.

✅ Simple logging: outputs to /var/log/reboot-load.log.

✅ Cron-scheduled: lightweight, runs every 60 seconds.
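
The counter behavior described above can be tried in isolation, without any risk of rebooting. This is a minimal sketch that mirrors the script's increment-or-reset logic. The function name update_counter is hypothetical, and it works on a temporary file instead of /tmp/highload.counter.

```shell
#!/bin/bash
# Hypothetical helper mirroring the script's counter handling.
# Usage: update_counter <counter_file> <is_high: 1 or 0>
# Prints the new counter value and stores it in the file.
update_counter() {
    local file="$1" is_high="$2" count=0
    [ -f "$file" ] && count=$(cat "$file")
    case "$count" in ''|*[!0-9]*) count=0 ;; esac   # treat empty or garbage as 0
    if [ "$is_high" -eq 1 ]; then
        count=$((count + 1))     # another consecutive high reading
    else
        count=0                  # back to normal: reset
    fi
    echo "$count" > "$file"
    echo "$count"
}

state=$(mktemp)
update_counter "$state" 1   # prints 1
update_counter "$state" 1   # prints 2
update_counter "$state" 0   # prints 0 (spike ended, counter reset)
rm -f "$state"
```

A reboot would only follow once the printed value reaches MAX_RETRIES, which is why a single spike never gets that far.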

🧪 Optional: Test the Script by Simulating High Load
🛠️ 1. Install a CPU Stress Tool
On Ubuntu/Debian:

sudo apt update && sudo apt install -y stress

On Alpine Linux:

apk add stress

If stress is not available, use yes as a simple CPU loader (see below).

🚀 2. Simulate High Load for Over 3 Minutes
Option A: With stress (preferred)

stress --cpu $(nproc) --timeout 200

--cpu $(nproc) starts one worker process per CPU core.

--timeout 200 stops the test after 200 seconds (about 3.3 minutes).

Option B: With the yes Command

for i in $(seq 1 $(nproc)); do yes > /dev/null & done

Let it run for at least 3 minutes, then stop it:

killall yes
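
Option B leaves the yes processes running until you kill them by hand. As a sketch of a self-limiting variant, the coreutils timeout command can stop each loader after a fixed number of seconds. The function name burn_cpu is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: runs <workers> copies of yes for <seconds>, then returns.
# timeout (coreutils) terminates each yes process when the time is up.
# Usage: burn_cpu <workers> <seconds>
burn_cpu() {
    local workers="$1" seconds="$2" i
    for i in $(seq 1 "$workers"); do
        timeout "$seconds" yes > /dev/null &
    done
    wait   # blocks until every background loader has been stopped
}
```

To match Option A, the call would be burn_cpu "$(nproc)" 200, and no killall is needed afterwards.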

📝 3. Monitor the Logs
In a second terminal, run:

tail -f /var/log/reboot-load.log

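You’ll see output like this on an 8-core machine, where <date> stands for the timestamp and the load values are illustrative:

[!] High resource usage detected at <date>:
    - CPU Load: 7.23 / Threshold: 6.4
[!] High resource usage detected at <date>:
    - CPU Load: 7.45 / Threshold: 6.4
[!] High resource usage detected at <date>:
    - CPU Load: 7.61 / Threshold: 6.4
[!!!] High resource usage sustained for 3 checks. Rebooting at <date>...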
Then the system will automatically reboot.
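
After a test run or a real incident, a quick summary of the log can be easier to read than scrolling through it. This sketch counts the script's two main message types using grep -F (fixed-string matching). The function name summarize_log is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: counts high-usage checks and reboot triggers in the log.
# grep -c prints the number of matching lines; -F treats the pattern as a literal string.
# Usage: summarize_log <log_file>
summarize_log() {
    local file="$1" high reboots
    # "|| true" keeps a zero-match result (grep exit status 1) from being treated as a failure
    high=$(grep -c -F -- '[!] High resource usage detected' "$file" || true)
    reboots=$(grep -c -F -- '[!!!]' "$file" || true)
    echo "high-usage checks: $high, reboots triggered: $reboots"
}
```

For this guide, the call would be summarize_log /var/log/reboot-load.log.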

✅ Recap
By following this guide, you’ve set up a safe and efficient way to auto-reboot your Linux server during persistent high-load events:

✅ Script with thresholds and state tracking

✅ Cronjob for regular checks

✅ No false positives from single spikes

✅ Full log output for auditing