Solving High CPU Usage Issues with Incus LXC Containers
Recently, after setting up several LXC sub-containers on CLAW JP, I noticed that the host CPU would occasionally hit 100% usage. Running top showed that the incusd daemon was consuming an abnormal amount of CPU:
incusd --group incus-admin --logfile /var/log/incus/incusd.log
I then checked each LXC instance individually. CPU usage inside the instances wasn’t high, and since the instances have CPU usage limits, they shouldn’t be able to push the host to 100% anyway. Memory was far from exhausted and disk space was plentiful, so the cause was quite puzzling.
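One way to cross-check what top reports is to sample the daemon's CPU time from /proc over a few seconds. The following is only a sketch, assuming a Linux host with /proc mounted. The function names (proc_cpu_ticks, cpu_percent, sample_pid_cpu) are made up for illustration and are not part of Incus.

```shell
# Sketch: measure one process's CPU usage over a short interval via /proc.
# All function names are hypothetical helpers, not Incus commands.

# Print utime + stime (in clock ticks) for the given PID.
proc_cpu_ticks() {
    local stat
    [ -r "/proc/$1/stat" ] || return 1
    stat=$(<"/proc/$1/stat")
    # The command name (field 2) may contain spaces, so drop everything
    # up to the last ") " before splitting the remaining fields.
    stat=${stat##*) }
    set -- $stat
    # After the cut, $1 is field 3 (state), so utime (14) is ${12}
    # and stime (15) is ${13}.
    echo $(( ${12} + ${13} ))
}

# Convert a tick delta over an interval into a percentage of one core.
# Arguments: <delta_ticks> <interval_seconds> <ticks_per_second>
cpu_percent() {
    local delta_ticks=$1 interval_s=$2 hz=$3
    echo $(( delta_ticks * 100 / (hz * interval_s) ))
}

# Sample a PID twice, <interval> seconds apart, and print its CPU usage.
sample_pid_cpu() {
    local pid=$1 interval=${2:-5} hz before after
    hz=$(getconf CLK_TCK)
    before=$(proc_cpu_ticks "$pid") || return 1
    sleep "$interval"
    after=$(proc_cpu_ticks "$pid") || return 1
    cpu_percent $(( after - before )) "$interval" "$hz"
}
```

For example, `sample_pid_cpu "$(pgrep -o -x incusd)" 5` would print the daemon's average usage over five seconds as a percentage of one core, so values above 100 are possible on a multi-core host.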
Solution
I asked Claude to write a monitoring script that checks the CPU usage of the incusd process every minute. If usage stays above 80% for three consecutive checks (about three minutes), the script automatically restarts the incus service, which runs incusd. After implementing this, I’ll monitor whether the issue recurs.
Here’s the script:
#!/bin/bash
# Script file: /root/monitor_incusd_top.sh
# Function: Check the CPU usage of the incusd process every minute (using top sampling), and restart the service if it exceeds 80% for 3 consecutive minutes
# Set log file and counter file paths
LOG_FILE="/root/incusd_monitor.log"
COUNTER_FILE="/tmp/incusd_high_cpu_counter"
# If the counter file doesn't exist, create and initialize it to 0
if [ ! -f "$COUNTER_FILE" ]; then
    echo "0" > "$COUNTER_FILE"
fi
# Get current time
CURRENT_TIME=$(date "+%Y-%m-%d %H:%M:%S")
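# Sample with top. The first batch iteration of procps top generally reflects CPU time
# accumulated since the process started, so take two iterations one second apart and
# read only the second. %CPU is usually the 9th column, but the layout can vary
# between systems, so adjust the field number if needed.
CPU_USAGE=$(LC_ALL=C top -b -n 2 -d 1 | awk '/^top -/ {iter++} iter == 2 && /incusd/ {print $9}' | sort -nr | head -n 1)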
# Check if the incusd process was found
if [ -z "$CPU_USAGE" ]; then
    echo "$CURRENT_TIME - incusd process not running" >> "$LOG_FILE"
    echo "0" > "$COUNTER_FILE"
    exit 0
fi
# Strip a percent sign from the sampled value, if present
CPU_USAGE_NUM=${CPU_USAGE//%/}
# Use bc to compare CPU_USAGE_NUM with 80.0 (bc prints 1 if true)
if [ "$(echo "$CPU_USAGE_NUM > 80.0" | bc -l)" -eq 1 ]; then
    # Increment counter
    COUNTER=$(<"$COUNTER_FILE")
    COUNTER=$((COUNTER + 1))
    echo "$COUNTER" > "$COUNTER_FILE"
    echo "$CURRENT_TIME - incusd process highest CPU usage: $CPU_USAGE_NUM%, counter: $COUNTER" >> "$LOG_FILE"
    # If high CPU usage is detected 3 consecutive times, restart the service
    if [ "$COUNTER" -ge 3 ]; then
        echo "$CURRENT_TIME - incusd process CPU usage exceeded 80% for 3 consecutive minutes, restarting service..." >> "$LOG_FILE"
        # Log the outcome based on systemctl's exit status
        if systemctl restart incus >> "$LOG_FILE" 2>&1; then
            echo "$CURRENT_TIME - incus service restart completed" >> "$LOG_FILE"
        else
            echo "$CURRENT_TIME - incus service restart failed" >> "$LOG_FILE"
        fi
        # Reset counter
        echo "0" > "$COUNTER_FILE"
    fi
else
    # CPU usage normal, reset counter
    echo "0" > "$COUNTER_FILE"
    echo "$CURRENT_TIME - incusd process highest CPU usage: $CPU_USAGE_NUM%, status normal" >> "$LOG_FILE"
fi
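The threshold-and-counter logic can be tried out without touching the service by factoring it into a small pure function. This is a rough sketch rather than part of the original script: next_counter and simulate are hypothetical helpers, and awk stands in for bc for the floating-point comparison.

```shell
# Sketch of the consecutive-high-sample logic as a pure function.
# next_counter and simulate are hypothetical names used for illustration.

# Usage: next_counter <previous_count> <cpu_percent> [threshold]
# Prints previous_count + 1 if cpu_percent is above the threshold, else 0.
next_counter() {
    local prev=$1 cpu=$2 threshold=${3:-80}
    # awk does the floating-point comparison (top reports values like 93.3)
    if awk -v c="$cpu" -v t="$threshold" 'BEGIN { exit !(c > t) }'; then
        echo $(( prev + 1 ))
    else
        echo 0
    fi
}

# Replay a series of samples and report where a restart would be triggered.
simulate() {
    local count=0 sample
    for sample in "$@"; do
        count=$(next_counter "$count" "$sample")
        if [ "$count" -ge 3 ]; then
            echo "restart after sample $sample"
            count=0
        fi
    done
    echo "final counter: $count"
}
```

Running `simulate 85 90 10 95` shows that a single normal sample resets the streak, so no restart is triggered. Note also that the monitoring script runs once per invocation, so something has to call it every minute. Assuming cron is the scheduler, an entry such as `* * * * * /root/monitor_incusd_top.sh` in root's crontab would do that.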