If you’re running an Apache webserver with many customer websites, sooner or later your server will be flooded with page requests, causing high CPU load and memory usage, especially if PHP or other scripting is used behind it. Most of the time this is caused by a harmful script somewhere on the net. But how do you find out which of the sites is the affected one? Looking at top/ps doesn’t help much if PHP (for example) is running as an Apache module. You will only see a lot of “httpd” processes.
A good tool to get closer to the answer is apachetop. It takes the access log as an argument and shows you all accessed pages, hosts and more:
apachetop -f /var/log/httpd/access_log
Cool, but… What if you’re using Plesk? It stores the access_log of each website in a separate file within the corresponding vhost directory. The default access_log doesn’t help unless the problem is related to webmail, for example.
You can add multiple “-f” arguments to apachetop manually. But if you have 300+ vhosts? Not really. Luckily we’re on Linux and can do something like this:
apachetop $(find /var/www/vhosts/*/statistics/logs/ -name "access_log" -print | sed 's/^/-f /')
This adds all access-logs within our vhosts directory as arguments. Unfortunately, it fails:
Only 50 files are supported at the moment
Segmentation fault
OK, how can we limit the number of files passed to apachetop? Because we’re searching for a site with a lot of requests, we can assume its logfile already has some size. Most of our customer sites have a very low load anyway or are used for mail only. So, let us extend the find command a bit:
apachetop $(find /var/www/vhosts/*/statistics/logs/ -type f -size +10k -name "access_log" -print | sed 's/^/-f /')
Now only logs bigger than 10 kilobytes are passed to apachetop. You can adjust the threshold as needed: “c” => bytes, “k” => kilobytes, “M” => megabytes, “G” => gigabytes.
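If the size filter still matches more than 50 logs, another option is to pass only the largest ones. The following is just a sketch, not something from my original setup: the function name largest_log_args is made up for illustration, and it assumes GNU find (for -printf) and log paths without whitespace.

```shell
#!/bin/sh
# Hypothetical helper: print "-f <path>" arguments for the N largest
# access_log files below a vhosts directory.
# Assumptions: GNU find (-printf) and paths without whitespace.
largest_log_args() {
    base=${1:-/var/www/vhosts}   # vhosts root, same layout as above
    limit=${2:-50}               # apachetop refuses more than 50 files
    find "$base"/*/statistics/logs/ -type f -name "access_log" \
        -printf '%s %p\n' 2>/dev/null \
        | sort -rn \
        | head -n "$limit" \
        | awk '{ print "-f", $2 }'
}

# Usage (not executed here):
#   apachetop $(largest_log_args /var/www/vhosts 50)
```

Sorting by size is only a rough guess at "busy", of course. A log that was rotated a minute ago can be small even though the site is under heavy load.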
Now we see something like this:
last hit: 17:56:25         atop runtime:  0 days, 00:24:25             17:56:35
All:          747 reqs (   0.5/sec)         14.5M (   10.2K/sec)      19.9K/req
2xx:     657 (88.0%) 3xx:      42 ( 5.6%) 4xx:    44 ( 5.9%) 5xx:     4 ( 0.5%)
R ( 30s):      40 reqs (   1.3/sec)        464.6K (   15.5K/sec)      11.6K/req
2xx:      38 (95.0%) 3xx:       1 ( 2.5%) 4xx:     1 ( 2.5%) 5xx:     0 ( 0.0%)

 REQS REQ/S    KB KB/S URL
    2  0.09  21.0  1.0*/plugins/system/yoo_effects/yoo_effects.js.php
    1  0.04   0.5  0.0 /index.php
    1  0.05   6.7  0.3 /
    1  0.05  10.9  0.5 /templates/mobile_elegance/jqm/jquery.mobile-1.2.0.min.css
    1  0.05   1.4  0.1 /media/zoo/assets/css/reset.css
    1  0.05   0.5  0.0 /media/zoo/applications/product/templates/default/assets/css/zoo.css
    1  0.05   1.0  0.0 /plugins/system/yoo_effects/lightbox/shadowbox.css
    1  0.05   1.8  0.1 /components/com_rsform/assets/calendar/calendar.css
    1  0.05   0.7  0.0 /components/com_rsform/assets/css/front.css
    1  0.05   5.2  0.2 /components/com_rsform/assets/js/script.js
    1  0.05   0.5  0.0 /components/com_zoo/assets/js/default.js
Missing something? Yes, the domain…
The access_log doesn’t contain the domain of the vhost itself, just the path of the requested file. But that may be enough to find out which site is affected.
If you press the “d” key once, you can switch to the hosts view. Maybe there is one single IP that all the requests are coming from. If so, you can simply block this IP for some time.
Or, once you have identified the IP address, you can grep for it in all access_logs (not tried with 50+ files, but I think it should work):
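Since the vhost name is part of each log's file path, one way to get from a suspicious URL back to the domain is to count how often that URL shows up in every access_log. A rough sketch (the function name hits_per_vhost is my own invention, and it assumes no colons in the log paths):

```shell
#!/bin/sh
# Hypothetical helper: count how often a request path appears in each
# vhost's access_log and list the busiest logs first.
# Assumption: log paths contain no ":" (it is used as field separator).
hits_per_vhost() {
    pattern=$1                   # fixed string, e.g. a URL from apachetop
    base=${2:-/var/www/vhosts}   # vhosts root, same layout as above
    grep -c -H -F -- "$pattern" "$base"/*/statistics/logs/access_log 2>/dev/null \
        | awk -F: '$NF > 0' \
        | sort -t: -k2,2nr \
        | head -n 10
}

# Usage (not executed here):
#   hits_per_vhost "/plugins/system/yoo_effects/yoo_effects.js.php"
```

Each output line has the form path:count, so the domain can be read straight from the directory name in the path.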
grep -F "18.104.22.168" /var/www/vhosts/*/statistics/logs/access_log
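If apachetop's 50-file limit gets in the way of the hosts view, the busiest client addresses can also be counted directly from the logs. This is only a sketch: the function name top_clients is hypothetical, and it assumes the client address is the first field of each log line, as in Apache's common and combined log formats.

```shell
#!/bin/sh
# Hypothetical helper: list the client addresses with the most requests
# across all vhost access_logs.
# Assumption: the client address is the first whitespace-separated field.
top_clients() {
    base=${1:-/var/www/vhosts}   # vhosts root, same layout as above
    limit=${2:-10}               # how many addresses to show
    cat "$base"/*/statistics/logs/access_log 2>/dev/null \
        | awk '{ print $1 }' \
        | sort \
        | uniq -c \
        | sort -rn \
        | head -n "$limit"
}

# Usage (not executed here):
#   top_clients /var/www/vhosts 10
```

Each output line shows the request count followed by the address. Keep in mind this counts over the whole log, not just the last few minutes like apachetop does.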