This document demonstrates practical troubleshooting of a slow web server using benchmarking tools, process monitoring, priority adjustment, and script optimization to identify and resolve CPU overload caused by parallel video transcoding processes.
This document walks through a real-world web server performance investigation, from initial user report to resolution. It demonstrates using Apache Benchmark for performance measurement, analyzing process loads with top, adjusting process priorities, investigating script automation, and implementing sequential processing to eliminate CPU overload.
A user has reported that one of the web servers is slow. The investigation begins by navigating to the website and loading the page. The page loads but seems slow, though that is hard to judge subjectively.
To quantify the slowness, the Apache Benchmark tool (ab) is used. This tool is super useful for checking if a website is behaving as expected or not. It makes a bunch of requests and summarizes the results once it’s done.
Command Syntax:
ab -n 500 http://site.example.com/
This command sends 500 requests so that an average timing can be calculated. Many more options are available, such as -c to control how many requests run concurrently or -s to set a timeout for each response.
Making 500 requests allows for calculating an average response time, which provides a reliable baseline for determining if performance is actually degraded.
Note
Apache Benchmark (ab) is included with Apache HTTP Server utilities and provides statistical analysis of web server performance including min, mean, and max response times.
After the tool finishes running the 500 requests, the data can be examined to determine if the server is actually slow.
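When the benchmark is repeated several times, it can help to pull just the mean out of the report. The following is a minimal sketch, assuming the usual ab report layout in which the first "Time per request" line is tagged "(mean)"; the helper name mean_ms is hypothetical.

```shell
# Print the mean time per request (in ms) from an ab report on stdin.
# ab prints two "Time per request" lines; only the first one ends in
# "(mean)", and its fourth whitespace-separated field is the number.
mean_ms() {
    awk '/^Time per request:/ && /\(mean\)/ { print $4; exit }'
}

# Example (commented out; needs network access to the server under test):
# ab -n 500 http://site.example.com/ | mean_ms
```

Keeping only this one number makes it easier to compare runs before and after each change.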
Initial Benchmark Results:
| Metric | Value | Assessment |
|---|---|---|
| Mean time per request | 155 milliseconds | Abnormally high for simple website |
| Total requests | 500 | Completed successfully |
| Expected response time | < 50 milliseconds | Based on site complexity |
While 155 milliseconds is not a super huge number, it’s definitely more than expected for such a simple website. It seems that something is going on with the web server and further investigation is needed.
The next step is to connect to the web server and check what’s happening. The investigation starts by examining the output of the top command to identify suspicious activity.
The top output reveals critical information:
Observed Issues:
| Observation | Details | Significance |
|---|---|---|
| ffmpeg processes | Multiple instances running | Using all available CPU |
| Load average | ~30 | Severely overloaded system |
| CPU count | 2 processors | Normal load should be ≤ 2 |
The load average on Linux shows the average number of processes that were running or waiting to run over the last 1, 5, and 15 minutes (Linux also counts processes blocked in uninterruptible I/O). A value of one means that, on average, one processor was busy for the whole period. This computer has two processors, so any number above two means that it's overloaded. During each minute, there were more processes waiting for processor time than the processors had to give.
The ffmpeg program is used for video transcoding, which means converting files from one video format to another. This is a CPU-intensive process and seems like the likely culprit for the server being overloaded.
Important
Load average values significantly exceeding the CPU count indicate severe resource contention. A load of 30 on a 2-CPU system means that, on average, about 28 processes are waiting for CPU time at any moment.
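The comparison between load and CPU count can be sketched as a small calculation. This is only an illustration: the helper name load_per_cpu is hypothetical, and the live reading assumes a Linux system with /proc/loadavg and the nproc command.

```shell
# Divide a load average by the number of CPUs. A result above 1.00
# means that, on average, more tasks wanted CPU time than there were
# processors to run them.
load_per_cpu() {
    awk -v load="$1" -v cpus="$2" 'BEGIN { printf "%.2f\n", load / cpus }'
}

# The figures from this investigation: load of about 30 on 2 CPUs.
load_per_cpu 30 2    # prints 15.00

# Live reading on Linux: the first field of /proc/loadavg is the
# 1-minute load average.
if [ -r /proc/loadavg ] && command -v nproc >/dev/null 2>&1; then
    read -r one_min _ < /proc/loadavg
    echo "load per CPU now: $(load_per_cpu "$one_min" "$(nproc)")"
fi
```

A ratio of 15 means the machine had roughly fifteen times more runnable work than it could execute.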
One approach to try is changing the process priorities so that the web server takes precedence. The process priorities in Linux are structured so that the lower the number, the higher the priority. The values range from -20 (highest priority) to 19 (lowest), although unprivileged users can normally only use 0 to 19. By default, processes start with a priority of zero.
| Command | Purpose | Usage |
|---|---|---|
| nice | Starting a process with different priority | nice -n 19 command |
| renice | Changing priority of running process | renice 19 PID |
| pidof | Getting process IDs by name | pidof process_name |
Rather than manually adjusting each process one by one (which would be manual, error-prone, and super boring), a shell script can automate this:
for pid in $(pidof ffmpeg); do renice 19 $pid; done
Script Breakdown:
- pidof ffmpeg: Returns all process IDs that have the name ffmpeg
- for pid in $(...): Iterates over each returned process ID
- renice 19 $pid: Sets priority to 19 (lowest possible priority)

The priorities for those processes are successfully updated.
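To confirm that a renice actually took effect, the nice value can be read back with ps. The sketch below tries this on a harmless background sleep standing in for an ffmpeg process; the helper name nice_of is hypothetical, and the ps options assume the procps version found on most Linux systems.

```shell
# Print the nice value (NI column) of a process, without padding.
nice_of() {
    ps -o ni= -p "$1" | tr -d ' '
}

# Stand-in for a CPU-heavy job: a harmless background sleep.
sleep 30 &
pid=$!

renice -n 19 -p "$pid" >/dev/null    # lower its priority as far as possible
echo "nice value of $pid: $(nice_of "$pid")"

kill "$pid"                          # clean up the stand-in
```

Raising the nice value never needs special privileges, so this can be tried as a regular user; lowering it again usually requires root.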
Running the benchmarking software again to check if priority adjustment made any difference:
Post-Renice Results:
| Metric | Before | After Priority Change | Change |
|---|---|---|---|
| Mean time per request | 155 ms | 153 ms | -2 ms (negligible) |
The renice didn’t help significantly. Apparently, the OS is still giving these ffmpeg processes way too much processor time. The website is still slow.
Warning
Process priority adjustment alone may not resolve severe CPU contention. When too many CPU-intensive processes run simultaneously, even low-priority processes can consume significant resources.
These transcoding processes are CPU intensive, and running them in parallel is overloading the computer. A better approach is to modify whatever’s triggering them to run one after the other instead of all at the same time.
To implement this change, the investigation needs to find out how these processes got started.
Examining process details with ps ax | less shows all running processes on the computer. Using less allows scrolling through the output.
Within less, use /ffmpeg to search. The results show multiple ffmpeg processes converting videos from webm format to mp4 format.
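For a non-interactive version of the same search, pgrep can list matching processes directly. This is a sketch rather than the method used in the investigation; it assumes the procps pgrep found on Linux, and the helper name list_matching is hypothetical.

```shell
# List the PID and full command line of every process whose command
# line matches the given pattern.
#   -a  print the full command line next to the PID
#   -f  match the pattern against the full command line
list_matching() {
    pgrep -af -- "$1"
}

# Example: list_matching ffmpeg
```

The full command line is what reveals the input and output file names of each conversion.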
Since the location of these videos on the hard drive is unknown, the locate command can help:
locate static/001.webm
Result: The static directory is located in /server/deploy/videos/
Changing into that directory and searching for the script that launches the conversions:
cd /server/deploy/videos/
grep -r ffmpeg *
Search Results:
The deploy.sh file contains multiple mentions of ffmpeg.
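When a search like this returns many matching lines, listing only the file names can be easier to read. The sketch below demonstrates grep -rl on a throwaway directory; the file contents are made up for illustration and are not the real deploy.sh.

```shell
# Build a throwaway tree with a hypothetical script that mentions ffmpeg.
workdir=$(mktemp -d)
mkdir -p "$workdir/static"
printf '#!/bin/sh\nffmpeg -i in.webm out.mp4\n' > "$workdir/deploy.sh"
printf 'no match here\n' > "$workdir/static/notes.txt"

# -r recurses into subdirectories; -l prints only the names of files
# that contain a match. Searching "." also covers hidden files, which
# the shell's "*" expansion skips.
matches=$(cd "$workdir" && grep -rl ffmpeg .)
echo "$matches"    # prints ./deploy.sh

rm -rf "$workdir"
```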
Using vim (a command-line editor, since the connection is remote) to examine the file:
vim deploy.sh
Script Analysis:
The script starts ffmpeg processes in parallel using a tool called daemonize that runs each program separately as if it were a daemon. This might be okay for converting a couple of videos, but launching one separate process for each video in the static directory is overloading the server.
| Approach | Behavior | Impact |
|---|---|---|
| Parallel (daemonize) | All videos convert simultaneously | CPU overload, server unresponsive |
| Sequential | One video converts at a time | Manageable CPU load, server responsive |
The fix is to change the script to run only one video conversion process at a time. This is accomplished by deleting the daemonize prefix and keeping the part that calls ffmpeg, so that each conversion runs in the foreground and the script waits for it to finish.
Before:
daemonize ffmpeg -i input.webm output.mp4
After:
ffmpeg -i input.webm output.mp4
The file is then saved and the editor closed.
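The loop inside deploy.sh is not shown here, so the following is only a sketch of what a sequential conversion loop might look like. The function name convert_all and the directory argument are hypothetical; the ffmpeg invocation is the one from the modified script.

```shell
# Convert every .webm file in a directory to .mp4, one at a time.
# Each ffmpeg call runs in the foreground, so the next conversion
# starts only after the previous one has exited.
convert_all() {
    dir=$1
    for src in "$dir"/*.webm; do
        [ -e "$src" ] || continue              # the glob matched nothing
        ffmpeg -i "$src" "${src%.webm}.mp4"    # same name, .mp4 extension
    done
}

# Example (hypothetical path): convert_all static
```

Without daemonize, the shell waits for each command before starting the next, which is what keeps the load down to a single transcoding process.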
Note
Modifying the script prevents future overload, but doesn’t change processes that are already running. Active process management is still required.
The script has been modified, but this won’t change the processes that are already running. These processes need to be stopped, but not canceled completely, as doing so would mean that the videos being converted right now will be incomplete.
The killall command with the -STOP flag sends a stop signal but doesn’t kill the processes completely:
killall -STOP ffmpeg
This suspends all running ffmpeg processes without terminating them.
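The effect of the STOP and CONT signals can be observed in the process state reported by ps. The sketch below tries this on a harmless background sleep instead of ffmpeg; the helper name state_of is hypothetical, and the ps options assume the procps version found on most Linux systems.

```shell
# Print the first character of the process state (STAT column):
# T = stopped by a signal, S = sleeping, R = running.
state_of() {
    ps -o stat= -p "$1" | tr -d ' ' | cut -c1
}

sleep 30 &                  # harmless stand-in for an ffmpeg process
pid=$!
sleep 1                     # give the stand-in time to start

kill -STOP "$pid"; sleep 1
echo "after STOP: $(state_of "$pid")"

kill -CONT "$pid"; sleep 1
echo "after CONT: $(state_of "$pid")"

kill "$pid"                 # clean up the stand-in
```

A stopped process should show state T and keeps its memory and open files, which is why the conversions can later pick up where they left off.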
The goal is to run these processes one at a time. This could be done by sending the CONT signal to one process, waiting until it’s done, and then sending it to the next one. But that’s a lot of manual work that can be automated.
Automated Sequential Processing Script:
for pid in $(pidof ffmpeg); do
  while kill -CONT $pid 2>/dev/null; do
    sleep 1
  done
done
Script Logic Breakdown:
| Component | Purpose |
|---|---|
| for pid in $(pidof ffmpeg) | Iterate through all ffmpeg process IDs |
| while kill -CONT $pid 2>/dev/null | Send CONT signal; succeeds while process exists |
| sleep 1 | Wait one second before next check |
| Loop exit | When process finishes, kill command fails, exits while loop |
How It Works:
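- The for loop iterates over the process IDs returned by the pidof command
- kill -CONT resumes the current process and keeps succeeding for as long as that process exists
- sleep 1 waits one second until the next check
- Once the process finishes, kill fails, the while loop exits, and the next process ID is resumed

The server is now running one ffmpeg process at a time.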
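The same logic can be wrapped in a function so that it can be tried on harmless stand-in processes before being pointed at real work. This is a sketch with a hypothetical function name, resume_one_at_a_time; it takes the process IDs as arguments instead of calling pidof itself.

```shell
# Resume stopped processes one at a time: keep sending CONT to the
# current PID until the signal can no longer be delivered, which
# means the process has exited; then move on to the next PID.
resume_one_at_a_time() {
    for pid in "$@"; do
        while kill -CONT "$pid" 2>/dev/null; do
            sleep 1
        done
    done
}

# Example: resume_one_at_a_time $(pidof ffmpeg)
```

Because the list of process IDs is captured once, before the loop starts, any ffmpeg process launched afterwards is not affected by it.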
Running the benchmark one more time to verify the fix:
ab -n 500 http://site.example.com/
Final Results:
| Metric | Initial | After Renice | After Sequential Processing | Improvement |
|---|---|---|---|---|
| Mean time per request | 155 ms | 153 ms | 33 ms | 78.7% faster |
The mean time is now 33 milliseconds. That’s much lower than before. The web server has been successfully restored to reply promptly to requests again.
Important
The solution reduced response time from 155ms to 33ms (a 78.7% improvement) by eliminating CPU contention rather than just adjusting priorities. This demonstrates that fixing the root cause is more effective than symptomatic treatment.
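The improvement figures follow directly from the measured means. As a small worked check, using a hypothetical helper name improvement_pct:

```shell
# Percentage reduction in response time between two measurements.
improvement_pct() {
    awk -v before="$1" -v after="$2" \
        'BEGIN { printf "%.1f\n", (before - after) / before * 100 }'
}

improvement_pct 155 33     # sequential processing: prints 78.7
improvement_pct 155 153    # renice only: prints 1.3
```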
Several different approaches were demonstrated for situations where the code can’t be fixed:
| Approach | Implementation | Result |
|---|---|---|
| Process priority adjustment (renice) | Change process priorities to favor web server | Minimal improvement (155ms → 153ms) |
| Sequential processing | Run CPU-intensive tasks one at a time | Significant improvement (155ms → 33ms) |
The key lesson is that when parallel CPU-intensive processes overload a system, sequential execution is often necessary. Process priority adjustment alone is insufficient when resource contention is severe.
Techniques Demonstrated:
Troubleshooting the slow web server required systematic investigation from initial performance measurement through root cause identification to solution implementation. The Apache Benchmark tool provided quantifiable metrics showing 155ms average response time. Investigation with top revealed massive CPU overload from parallel ffmpeg video transcoding processes, with load averages around 30 on a 2-CPU system. The first mitigation attempt using renice to lower ffmpeg process priorities yielded minimal improvement. Deeper investigation using ps, locate, and grep revealed a deployment script launching all video conversions in parallel using daemonize. The solution involved modifying the script to remove parallelization, managing already-running processes with killall -STOP, and implementing a shell script to resume processes sequentially. Final benchmarking confirmed success with response times dropping to 33ms, a 78.7% improvement. This case study demonstrates that addressing root causes (parallel resource contention) is far more effective than symptomatic treatments (priority adjustment), and showcases essential Linux troubleshooting tools and techniques for performance optimization.