Merging Multiple Apache Access Logs Into One Overall Access Log
Version 1.0
Author: Falko Timme
Let's assume you have a web application that runs on a cluster of Apache nodes. Each node generates its own Apache access log from which you can generate page view statistics with tools such as Webalizer or AWStats. Obviously you do not want to have page view statistics for each Apache node, but overall page view statistics. To achieve this, we must merge the access logs from each node into one overall access log that we can then feed into Webalizer or AWStats. There is a Perl script called logresolvemerge.pl (part of the AWStats package) that can do this for us.
I do not issue any guarantee that this will work for you!
1 Preliminary Note
I have tested this on a Debian system, but the procedure is the same on every other distribution except for the package installation. Use your distribution's package manager (e.g. apt, yum, yast, urpmi) to install the packages.
I'm assuming that you are using a single host (typically the host where you run Webalizer or AWStats to generate the statistics) to collect the access logs from the Apache nodes. I don't cover how to transfer the access logs from the Apache nodes to that host; you could do that with rsync, for example, as shown in this tutorial: Mirror Your Web Site With rsync. I'm using the directory /var/log/webcluster here to store the access logs of the Apache nodes.
2 Installing logresolvemerge.pl
As I said before, logresolvemerge.pl is a part of the AWStats package, so we install that now (even if you don't want to use it - you can still use Webalizer if you like):
apt-get install awstats
logresolvemerge.pl is now located in the /usr/share/doc/awstats/examples directory. Let's move it to /usr/local/bin so that it is in our PATH:
mv /usr/share/doc/awstats/examples/logresolvemerge.pl /usr/local/bin
(If you can't find your logresolvemerge.pl, you can search for it like this:
updatedb
locate logresolvemerge.pl
and then move it to /usr/local/bin.)
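If locate is not installed or its database is out of date, find can do the same job. The following is only a small sketch; the function name find_merge_script is my own, and /usr is just an assumed default starting point:

```shell
# Hypothetical helper: search a directory tree for logresolvemerge.pl.
# $1 is the directory to search; /usr is an assumed default.
find_merge_script() {
    # The trailing * would also match a compressed copy, should your
    # package ship one; errors about unreadable directories are hidden.
    find "${1:-/usr}" -type f -name 'logresolvemerge.pl*' 2>/dev/null
}
```

Calling find_merge_script without an argument searches /usr; pass / to search the whole system (this can take a while).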
3 Using logresolvemerge.pl
Let's assume we have two access logs in /var/log/webcluster, /var/log/webcluster/access_log_server1 and /var/log/webcluster/access_log_server2.
I'm using two sample access logs here to demonstrate how logresolvemerge.pl works. Both access logs contain the same requests for demonstration purposes, but the timestamps on server2 are one second later than on server1. After we have merged both logs with logresolvemerge.pl, the overall access log should contain the requests in chronological order so that Webalizer and AWStats can process them correctly.
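To make clear what "merging in chronological order" means, here is a rough stand-in built from awk, sort and cut. It is not how logresolvemerge.pl works internally and not a replacement for it. The function name merge_by_time is made up, and the sketch assumes that the timestamp is the fourth whitespace-separated field (as in the common and combined log formats) and that all nodes log in the same time zone:

```shell
# Illustration only: merge access logs by timestamp.
# Assumes field 4 looks like [17/Oct/2007:18:50:27 and that all
# nodes use the same time zone (the zone offset is ignored).
merge_by_time() {
    awk '
    BEGIN {
        n = split("Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec", names, " ")
        for (i = 1; i <= n; i++) month[names[i]] = sprintf("%02d", i)
    }
    {
        # Turn 17/Oct/2007:18:50:27 into the sortable key 20071017185027
        split(substr($4, 2), p, "[/:]")
        printf "%s%s%s%s%s%s\t%s\n", p[3], month[p[2]], p[1], p[4], p[5], p[6], $0
    }
    ' "$@" |
    sort -s -t "$(printf '\t')" -k1,1 |   # stable sort on the key only
    cut -f2-                              # drop the key again
}
```

With the two sample logs used below, merge_by_time /var/log/webcluster/access_log_server1 /var/log/webcluster/access_log_server2 would print the requests of both files interleaved by time.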
/var/log/webcluster/access_log_server1 looks like this:
cat /var/log/webcluster/access_log_server1
192.6.178.101 - - [17/Oct/2007:18:50:27 +0200] "GET /themes/htf_glass/images/header_tab6.png HTTP/1.0" 200 7434 "https://www.howtoforge.com/data_recovery_with_testdisk" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; InfoPath.1; .NET CLR 1.0.3705; .NET CLR 2.0.50727)"
72.149.148.248 - - [17/Oct/2007:18:50:29 +0200] "GET /misc/menu-leaf.png HTTP/1.1" 200 108 "https://www.howtoforge.com/the_perfect_desktop_mandriva_2008.0" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)"
24.8.231.74 - - [17/Oct/2007:18:50:31 +0200] "GET /forums/showthread.php?t=15338 HTTP/1.0" 200 59342 "https://www.howtoforge.com/forums/search.php?searchid=624841" "Wget/1.10.2"
24.127.251.14 - - [17/Oct/2007:18:50:33 +0200] "POST /mailgust/index.php HTTP/1.1" 200 1662 "https://www.howtoforge.com/mailgust/index.php?method=login_form&list=maillistuser" "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; Avant Browser; .NET CLR 1.1.4322; .NET CLR 2.0.50727; InfoPath.1; .NET CLR 3.0.04506.30)"
72.149.148.248 - - [17/Oct/2007:18:50:35 +0200] "GET /images/print.gif HTTP/1.1" 200 217 "https://www.howtoforge.com/the_perfect_desktop_mandriva_2008.0" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)"
192.6.178.101 - - [17/Oct/2007:18:50:37 +0200] "GET /themes/htf_glass/images/howtoforge_logo_trans.gif HTTP/1.0" 200 184 "https://www.howtoforge.com/data_recovery_with_testdisk" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; InfoPath.1; .NET CLR 1.0.3705; .NET CLR 2.0.50727)"
76.22.105.74 - - [17/Oct/2007:18:50:39 +0200] "GET /themes/htf_glass/images/search_small.gif HTTP/1.1" 200 1367 "https://www.howtoforge.com/the_perfect_desktop_mandriva_2008.0" "Mozilla/5.0 (compatible; Konqueror/3.5; Linux) KHTML/3.5.5 (like Gecko) (Debian)"
192.6.178.101 - - [17/Oct/2007:18:50:41 +0200] "GET /themes/htf_glass/images/join_small.gif HTTP/1.0" 200 1212 "https://www.howtoforge.com/data_recovery_with_testdisk" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; InfoPath.1; .NET CLR 1.0.3705; .NET CLR 2.0.50727)"
192.6.178.101 - - [17/Oct/2007:18:50:43 +0200] "GET /forums/clientscript/vbulletin_md5.js HTTP/1.0" 200 9661 "https://www.howtoforge.com/data_recovery_with_testdisk" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; InfoPath.1; .NET CLR 1.0.3705; .NET CLR 2.0.50727)"
24.127.251.14 - - [17/Oct/2007:18:50:45 +0200] "GET /mailgust/i/menushadow.gif HTTP/1.1" 200 59 "https://www.howtoforge.com/mailgust/index.php" "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; Avant Browser; .NET CLR 1.1.4322; .NET CLR 2.0.50727; InfoPath.1; .NET CLR 3.0.04506.30)"
/var/log/webcluster/access_log_server2 looks like this:
cat /var/log/webcluster/access_log_server2
192.6.178.101 - - [17/Oct/2007:18:50:28 +0200] "GET /themes/htf_glass/images/header_tab6.png HTTP/1.0" 200 7434 "https://www.howtoforge.com/data_recovery_with_testdisk" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; InfoPath.1; .NET CLR 1.0.3705; .NET CLR 2.0.50727)"
72.149.148.248 - - [17/Oct/2007:18:50:30 +0200] "GET /misc/menu-leaf.png HTTP/1.1" 200 108 "https://www.howtoforge.com/the_perfect_desktop_mandriva_2008.0" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)"
24.8.231.74 - - [17/Oct/2007:18:50:32 +0200] "GET /forums/showthread.php?t=15338 HTTP/1.0" 200 59342 "https://www.howtoforge.com/forums/search.php?searchid=624841" "Wget/1.10.2"
24.127.251.14 - - [17/Oct/2007:18:50:34 +0200] "POST /mailgust/index.php HTTP/1.1" 200 1662 "https://www.howtoforge.com/mailgust/index.php?method=login_form&list=maillistuser" "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; Avant Browser; .NET CLR 1.1.4322; .NET CLR 2.0.50727; InfoPath.1; .NET CLR 3.0.04506.30)"
72.149.148.248 - - [17/Oct/2007:18:50:36 +0200] "GET /images/print.gif HTTP/1.1" 200 217 "https://www.howtoforge.com/the_perfect_desktop_mandriva_2008.0" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)"
192.6.178.101 - - [17/Oct/2007:18:50:38 +0200] "GET /themes/htf_glass/images/howtoforge_logo_trans.gif HTTP/1.0" 200 184 "https://www.howtoforge.com/data_recovery_with_testdisk" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; InfoPath.1; .NET CLR 1.0.3705; .NET CLR 2.0.50727)"
76.22.105.74 - - [17/Oct/2007:18:50:40 +0200] "GET /themes/htf_glass/images/search_small.gif HTTP/1.1" 200 1367 "https://www.howtoforge.com/the_perfect_desktop_mandriva_2008.0" "Mozilla/5.0 (compatible; Konqueror/3.5; Linux) KHTML/3.5.5 (like Gecko) (Debian)"
192.6.178.101 - - [17/Oct/2007:18:50:42 +0200] "GET /themes/htf_glass/images/join_small.gif HTTP/1.0" 200 1212 "https://www.howtoforge.com/data_recovery_with_testdisk" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; InfoPath.1; .NET CLR 1.0.3705; .NET CLR 2.0.50727)"
192.6.178.101 - - [17/Oct/2007:18:50:44 +0200] "GET /forums/clientscript/vbulletin_md5.js HTTP/1.0" 200 9661 "https://www.howtoforge.com/data_recovery_with_testdisk" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; InfoPath.1; .NET CLR 1.0.3705; .NET CLR 2.0.50727)"
24.127.251.14 - - [17/Oct/2007:18:50:46 +0200] "GET /mailgust/i/menushadow.gif HTTP/1.1" 200 59 "https://www.howtoforge.com/mailgust/index.php" "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; Avant Browser; .NET CLR 1.1.4322; .NET CLR 2.0.50727; InfoPath.1; .NET CLR 3.0.04506.30)"
To merge both access logs into the overall access log /var/log/webcluster/access_log_overall, we run:
logresolvemerge.pl /var/log/webcluster/access_log_server* > /var/log/webcluster/access_log_overall
If you get an error message like this:
server1:~# logresolvemerge.pl /var/log/webcluster/access_log_server* > /var/log/webcluster/access_log_overall
-bash: /usr/local/bin/logresolvemerge.pl: /usr/bin/perl^M: bad interpreter: No such file or directory
server1:~#
this means that logresolvemerge.pl contains Windows line breaks. To fix this, we can convert them to Unix line breaks with the tool dos2unix, which is part of the sysutils package on the Debian system I tested (if your release has no sysutils package, look for a package named dos2unix instead). Therefore we install the sysutils package now:
apt-get install sysutils
Afterwards, we fix the logresolvemerge.pl script:
dos2unix /usr/local/bin/logresolvemerge.pl
Now you can run
logresolvemerge.pl /var/log/webcluster/access_log_server* > /var/log/webcluster/access_log_overall
again, and this time there should be no errors.
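If you want to check for Windows line breaks before (or after) running dos2unix, or if dos2unix is not available, the following sketch may help. The function names has_crlf and strip_cr are my own inventions, and strip_cr is just a plain tr-based substitute for dos2unix that removes every carriage return in the file, not only those at line ends:

```shell
# Hypothetical helpers, not part of AWStats or dos2unix.

# Succeeds if the first line of file $1 contains a carriage return,
# which is what produces the "perl^M: bad interpreter" error.
has_crlf() {
    head -n 1 "$1" | grep -q "$(printf '\r')"
}

# Removes all carriage returns from file $1 in place. The result is
# written back with cat so that the file keeps its permissions.
strip_cr() {
    tmp=$(mktemp) || return 1
    tr -d '\r' < "$1" > "$tmp" && cat "$tmp" > "$1"
    status=$?
    rm -f "$tmp"
    return $status
}
```

For example, has_crlf /usr/local/bin/logresolvemerge.pl && echo "Windows line breaks found" tells you whether a conversion is needed at all.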
(If you don't want to use wildcards, but would rather specify each access log individually, you can use logresolvemerge.pl like this with the same result:
logresolvemerge.pl /var/log/webcluster/access_log_server1 /var/log/webcluster/access_log_server2 > /var/log/webcluster/access_log_overall
)
Now you should have an overall access log, /var/log/webcluster/access_log_overall, which should contain the requests of /var/log/webcluster/access_log_server1 and /var/log/webcluster/access_log_server2 in chronological order:
cat /var/log/webcluster/access_log_overall
192.6.178.101 - - [17/Oct/2007:18:50:27 +0200] "GET /themes/htf_glass/images/header_tab6.png HTTP/1.0" 200 7434 "https://www.howtoforge.com/data_recovery_with_testdisk" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; InfoPath.1; .NET CLR 1.0.3705; .NET CLR 2.0.50727)"
192.6.178.101 - - [17/Oct/2007:18:50:28 +0200] "GET /themes/htf_glass/images/header_tab6.png HTTP/1.0" 200 7434 "https://www.howtoforge.com/data_recovery_with_testdisk" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; InfoPath.1; .NET CLR 1.0.3705; .NET CLR 2.0.50727)"
72.149.148.248 - - [17/Oct/2007:18:50:29 +0200] "GET /misc/menu-leaf.png HTTP/1.1" 200 108 "https://www.howtoforge.com/the_perfect_desktop_mandriva_2008.0" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)"
72.149.148.248 - - [17/Oct/2007:18:50:30 +0200] "GET /misc/menu-leaf.png HTTP/1.1" 200 108 "https://www.howtoforge.com/the_perfect_desktop_mandriva_2008.0" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)"
24.8.231.74 - - [17/Oct/2007:18:50:31 +0200] "GET /forums/showthread.php?t=15338 HTTP/1.0" 200 59342 "https://www.howtoforge.com/forums/search.php?searchid=624841" "Wget/1.10.2"
24.8.231.74 - - [17/Oct/2007:18:50:32 +0200] "GET /forums/showthread.php?t=15338 HTTP/1.0" 200 59342 "https://www.howtoforge.com/forums/search.php?searchid=624841" "Wget/1.10.2"
24.127.251.14 - - [17/Oct/2007:18:50:33 +0200] "POST /mailgust/index.php HTTP/1.1" 200 1662 "https://www.howtoforge.com/mailgust/index.php?method=login_form&list=maillistuser" "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; Avant Browser; .NET CLR 1.1.4322; .NET CLR 2.0.50727; InfoPath.1; .NET CLR 3.0.04506.30)"
24.127.251.14 - - [17/Oct/2007:18:50:34 +0200] "POST /mailgust/index.php HTTP/1.1" 200 1662 "https://www.howtoforge.com/mailgust/index.php?method=login_form&list=maillistuser" "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; Avant Browser; .NET CLR 1.1.4322; .NET CLR 2.0.50727; InfoPath.1; .NET CLR 3.0.04506.30)"
72.149.148.248 - - [17/Oct/2007:18:50:35 +0200] "GET /images/print.gif HTTP/1.1" 200 217 "https://www.howtoforge.com/the_perfect_desktop_mandriva_2008.0" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)"
72.149.148.248 - - [17/Oct/2007:18:50:36 +0200] "GET /images/print.gif HTTP/1.1" 200 217 "https://www.howtoforge.com/the_perfect_desktop_mandriva_2008.0" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)"
192.6.178.101 - - [17/Oct/2007:18:50:37 +0200] "GET /themes/htf_glass/images/howtoforge_logo_trans.gif HTTP/1.0" 200 184 "https://www.howtoforge.com/data_recovery_with_testdisk" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; InfoPath.1; .NET CLR 1.0.3705; .NET CLR 2.0.50727)"
192.6.178.101 - - [17/Oct/2007:18:50:38 +0200] "GET /themes/htf_glass/images/howtoforge_logo_trans.gif HTTP/1.0" 200 184 "https://www.howtoforge.com/data_recovery_with_testdisk" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; InfoPath.1; .NET CLR 1.0.3705; .NET CLR 2.0.50727)"
76.22.105.74 - - [17/Oct/2007:18:50:39 +0200] "GET /themes/htf_glass/images/search_small.gif HTTP/1.1" 200 1367 "https://www.howtoforge.com/the_perfect_desktop_mandriva_2008.0" "Mozilla/5.0 (compatible; Konqueror/3.5; Linux) KHTML/3.5.5 (like Gecko) (Debian)"
76.22.105.74 - - [17/Oct/2007:18:50:40 +0200] "GET /themes/htf_glass/images/search_small.gif HTTP/1.1" 200 1367 "https://www.howtoforge.com/the_perfect_desktop_mandriva_2008.0" "Mozilla/5.0 (compatible; Konqueror/3.5; Linux) KHTML/3.5.5 (like Gecko) (Debian)"
192.6.178.101 - - [17/Oct/2007:18:50:41 +0200] "GET /themes/htf_glass/images/join_small.gif HTTP/1.0" 200 1212 "https://www.howtoforge.com/data_recovery_with_testdisk" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; InfoPath.1; .NET CLR 1.0.3705; .NET CLR 2.0.50727)"
192.6.178.101 - - [17/Oct/2007:18:50:42 +0200] "GET /themes/htf_glass/images/join_small.gif HTTP/1.0" 200 1212 "https://www.howtoforge.com/data_recovery_with_testdisk" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; InfoPath.1; .NET CLR 1.0.3705; .NET CLR 2.0.50727)"
192.6.178.101 - - [17/Oct/2007:18:50:43 +0200] "GET /forums/clientscript/vbulletin_md5.js HTTP/1.0" 200 9661 "https://www.howtoforge.com/data_recovery_with_testdisk" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; InfoPath.1; .NET CLR 1.0.3705; .NET CLR 2.0.50727)"
192.6.178.101 - - [17/Oct/2007:18:50:44 +0200] "GET /forums/clientscript/vbulletin_md5.js HTTP/1.0" 200 9661 "https://www.howtoforge.com/data_recovery_with_testdisk" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; InfoPath.1; .NET CLR 1.0.3705; .NET CLR 2.0.50727)"
24.127.251.14 - - [17/Oct/2007:18:50:45 +0200] "GET /mailgust/i/menushadow.gif HTTP/1.1" 200 59 "https://www.howtoforge.com/mailgust/index.php" "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; Avant Browser; .NET CLR 1.1.4322; .NET CLR 2.0.50727; InfoPath.1; .NET CLR 3.0.04506.30)"
24.127.251.14 - - [17/Oct/2007:18:50:46 +0200] "GET /mailgust/i/menushadow.gif HTTP/1.1" 200 59 "https://www.howtoforge.com/mailgust/index.php" "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; Avant Browser; .NET CLR 1.1.4322; .NET CLR 2.0.50727; InfoPath.1; .NET CLR 3.0.04506.30)"
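With real logs you cannot check the order by eye, so here is a small sketch that tests whether a log is in chronological order. The function name is_chronological is my own; it assumes that the timestamp is the fourth whitespace-separated field (common or combined log format) and that all entries use the same time zone, because the zone offset is ignored:

```shell
# Hypothetical check: exit status 0 if the timestamps in file $1
# never go backwards, 1 otherwise.
# Assumes field 4 looks like [17/Oct/2007:18:50:27 and one time zone.
is_chronological() {
    awk '
    BEGIN {
        n = split("Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec", names, " ")
        for (i = 1; i <= n; i++) month[names[i]] = sprintf("%02d", i)
    }
    {
        # Build a sortable key such as 20071017185027
        split(substr($4, 2), p, "[/:]")
        key = p[3] month[p[2]] p[1] p[4] p[5] p[6]
        if (key < prev) { bad = 1; exit }   # went back in time
        prev = key
    }
    END { exit bad + 0 }
    ' "$1"
}
```

You could run it like this: is_chronological /var/log/webcluster/access_log_overall && echo "order is fine".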
4 Creating A Cron Job For logresolvemerge.pl
Of course, you don't want to run logresolvemerge.pl manually each day; therefore we create a cron job right now:
crontab -e
If you'd like to run logresolvemerge.pl each night at 4:00h, your cron job could look like this (adjust the time to your needs):
0 4 * * * /usr/local/bin/logresolvemerge.pl /var/log/webcluster/access_log_server* > /var/log/webcluster/access_log_overall
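Please note that the > redirection empties access_log_overall as soon as the cron job starts, so a statistics run at the same time could see a half-written file. If that worries you, a wrapper like the following sketch writes to a temporary file first and only replaces the overall log when the merge succeeded. The function name merge_logs and the idea of calling it from a script such as /usr/local/bin/merge_logs.sh are assumptions of mine, not part of AWStats:

```shell
# Hypothetical wrapper around logresolvemerge.pl.
# $1: merge command (e.g. /usr/local/bin/logresolvemerge.pl)
# $2: directory that holds the access_log_server* files
merge_logs() {
    merge_cmd=$1
    log_dir=$2
    # The temporary name does not match access_log_server*, so it is
    # never picked up as an input file. mktemp creates it with mode
    # 600; chmod it before the mv if other users must read the log.
    tmp=$(mktemp "$log_dir/access_log_overall.XXXXXX") || return 1
    if "$merge_cmd" "$log_dir"/access_log_server* > "$tmp"; then
        # mv within one directory replaces the old log in one step
        mv "$tmp" "$log_dir/access_log_overall"
    else
        # Keep the previous overall log if the merge failed
        rm -f "$tmp"
        return 1
    fi
}
```

If you put this function into a script that ends with the line merge_logs /usr/local/bin/logresolvemerge.pl /var/log/webcluster, the cron job only needs to call that script.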
5 Links
- AWStats: http://awstats.sourceforge.net
- Webalizer: http://www.mrunix.net/webalizer
- Debian: http://www.debian.org