Generating Web Site Statistics With AWStats & JAWStats On Debian Lenny
Version 1.0
Author: Falko Timme
This tutorial explains how you can generate statistics for your web site with AWStats and JAWStats on a Debian Lenny web server. AWStats is a free, powerful, feature-rich tool that generates advanced web server statistics. JAWStats runs in conjunction with AWStats and produces clear and informative charts, graphs and tables about your web site visitors. AWStats can create graphical web pages for the statistics itself, but JAWStats presents the same data in a much nicer way: it is better organized and makes use of Ajax and Flash.
I do not issue any guarantee that this will work for you!
1 Preliminary Note
In this tutorial I have a web site www.example.com (with the aliases example.com, www.example.net, and example.net) with the document root /var/www/www.example.com/web.
2 Installing And Configuring AWStats
AWStats can be installed as follows:
aptitude install awstats
Its configuration is located in the /etc/awstats/ directory. For each virtual host we need a configuration file named awstats.<sitename>.conf in that directory (i.e., for our web site www.example.com we need the configuration file awstats.www.example.com.conf). We can use the /etc/awstats/awstats.conf file as a template:
cd /etc/awstats/
cp awstats.conf awstats.www.example.com.conf
vi awstats.www.example.com.conf
Modify the following settings:
[...]
LogFile="/var/log/apache2/access.log"
[...]
LogFormat=1
[...]
SiteDomain="www.example.com"
[...]
HostAliases="example.com www.example.net example.net"
[...]
LogFile must contain the path to the Apache access log of your virtual host or to the overall Apache access log (the one for all sites; AWStats is able to filter out the records that don't belong to your web site). If the file name is dynamic (for example because it contains a date, as with access logs created by cronolog or vlogger), you can use placeholders like this:
LogFile="/var/log/httpd/access.log_%YYYY-0_%MM-0_%DD-0"
This is explained in the comments in the AWStats configuration file as follows:
"LogFile" contains the web, ftp or mail server log file to analyze.
Possible values: A full path, or a relative path from awstats.pl directory.
Example: "/var/log/apache/access.log"
Example: "../logs/mycombinedlog.log"
You can also use tags in this filename if you need a dynamic file name
depending on date or time (Replacement is made by AWStats at the beginning
of its execution). This is available tags :
%YYYY-n is replaced with 4 digits year we were n hours ago
%YY-n is replaced with 2 digits year we were n hours ago
%MM-n is replaced with 2 digits month we were n hours ago
%MO-n is replaced with 3 letters month we were n hours ago
%DD-n is replaced with day we were n hours ago
%HH-n is replaced with hour we were n hours ago
%NS-n is replaced with number of seconds at 00:00 since 1970
%WM-n is replaced with the week number in month (1-5)
%Wm-n is replaced with the week number in month (0-4)
%WY-n is replaced with the week number in year (01-52)
%Wy-n is replaced with the week number in year (00-51)
%DW-n is replaced with the day number in week (1-7, 1=sunday)
use n=24 if you need (1-7, 1=monday)
%Dw-n is replaced with the day number in week (0-6, 0=sunday)
use n=24 if you need (0-6, 0=monday)
Use 0 for n if you need current year, month, day, hour...
Example: "/var/log/access_log.%YYYY-0%MM-0%DD-0.log"
Example: "C:/WINNT/system32/LogFiles/W3SVC1/ex%YY-24%MM-24%DD-24.log"
You can also use a pipe if log file come from a pipe :
Example: "gzip -d </var/log/apache/access.log.gz |"
If there are several log files from load balancing servers :
Example: "/pathtotools/logresolvemerge.pl *.log |"
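To see which file a date-based LogFile pattern refers to on a given day, the tags can be imitated with date(1). This is only a sketch: the function name expected_logfile is made up for illustration, it assumes GNU date (the default on Debian), and it covers only the %YYYY-0, %MM-0 and %DD-0 tags from the cronolog/vlogger example above. AWStats performs the real replacement itself.

```shell
#!/bin/sh
# Hypothetical helper (not part of AWStats): print the file name that the pattern
#   /var/log/httpd/access.log_%YYYY-0_%MM-0_%DD-0
# corresponds to on a given day. Assumes GNU date.
expected_logfile() {
    # $1: optional date string understood by "date -d" (defaults to now)
    date -d "${1:-now}" +"/var/log/httpd/access.log_%Y_%m_%d"
}

# File name for today
expected_logfile
# File name for a fixed date; prints /var/log/httpd/access.log_2009_03_07
expected_logfile "2009-03-07"
```

If the printed file does not exist on your server, the pattern in LogFile probably does not match the naming scheme of your log rotation tool.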
You're probably using Apache's combined log format, in which case you should use LogFormat=1 (if you are not sure, take a look at the comments in the file to find out the right format).
SiteDomain: Specify your web site's main domain (www.example.com in this case).
HostAliases: Specify all other domains/subdomains used to access your web site (example.com, www.example.net, example.net in this example).
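After editing, it can help to print just the four directives discussed here and check them at a glance. The following is a minimal sketch; the function name show_awstats_settings is hypothetical, and it assumes that each directive starts at the beginning of a line with no spaces around the equals sign, as in the settings shown above.

```shell
#!/bin/sh
# Hypothetical helper: print the four directives discussed above from an
# AWStats configuration file. Commented-out lines are not matched.
show_awstats_settings() {
    # $1: path to the configuration file
    grep -E '^(LogFile|LogFormat|SiteDomain|HostAliases)=' "$1"
}

# Path taken from this tutorial; the check is skipped if the file is missing.
conf=/etc/awstats/awstats.www.example.com.conf
if [ -r "$conf" ]; then
    show_awstats_settings "$conf"
fi
```

Each of the four directives should appear exactly once in the output; a missing line usually means the directive is still commented out or was mistyped.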
Next we create a cron job to run AWStats every ten minutes (at minutes 9, 19, 29, 39, 49, and 59 of each hour):
crontab -e
9,19,29,39,49,59 * * * * /usr/lib/cgi-bin/awstats.pl -config=www.example.com -update >/dev/null
(If you have a dynamic access log, as created by cronolog or vlogger, it's a good idea to include minute 59 in the cron job so that AWStats can process the current access log at 23:59h before a new access log is created at 0:00h - that way, you just lose the minute between 23:59h and 0:00h in your statistics.)
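If you run several virtual hosts, each one needs its own cron line. As a rough sketch, the line can be generated per site; the function name awstats_cron_line is hypothetical, while the awstats.pl path and options are the ones used in the cron job above.

```shell
#!/bin/sh
# Hypothetical helper: build the crontab line used above for any site name.
awstats_cron_line() {
    # $1: AWStats config name, e.g. www.example.com
    # Minutes 9 to 59 in steps of 10, joined with commas
    minutes=$(seq 9 10 59 | paste -sd, -)
    printf '%s * * * * /usr/lib/cgi-bin/awstats.pl -config=%s -update >/dev/null\n' "$minutes" "$1"
}

awstats_cron_line www.example.com
```

The output can be pasted into the editor opened by crontab -e. Starting the minute list at 9 is what makes it end at 59.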
3 Installing And Configuring JAWStats
Go to http://www.jawstats.com/download, download the latest JAWStats version, unpack it on your PC, and upload it to a directory within your www.example.com web site, e.g. with FTP. In this tutorial I upload it to the /var/www/www.example.com/web/jawstats directory.
Afterwards, we must rename config.dist.php to config.php and modify it:
mv /var/www/www.example.com/web/jawstats/config.dist.php /var/www/www.example.com/web/jawstats/config.php
vi /var/www/www.example.com/web/jawstats/config.php
<?php
// core config parameters
$sDefaultLanguage = "en-gb";
$sConfigDefaultView = "thismonth.all";
$bConfigChangeSites = false;
$bConfigUpdateSites = false;
$sUpdateSiteFilename = "xml_update.php";

// individual site configuration
$aConfig["www.example.com"] = array(
  "statspath" => "/var/lib/awstats/",
  "updatepath" => "/usr/lib/cgi-bin/",
  "siteurl" => "http://www.example.com",
  "sitename" => "My Example.com Web Site",
  "theme" => "default",
  "fadespeed" => 250,
  "password" => "secret",
  "includes" => "",
  "language" => "en-gb"
);
?>
If you want to remove the "change site" link, change $bConfigChangeSites to false.
If you don't want your users to be able to update statistics themselves, set $bConfigUpdateSites to false.
After that, we have the array $aConfig["site1"] - rename it so that it is named after your site ($aConfig["www.example.com"]). Set statspath to /var/lib/awstats/ (don't forget the trailing slash!), updatepath to /usr/lib/cgi-bin/, siteurl to http://www.example.com, and specify the name of your web site under sitename. A password is needed only if you have set $bConfigUpdateSites to true (if you allow your users to update statistics over the browser, they will have to type in this password).
That's it - after the AWStats cron job has run for the first time (which can take a long time for web sites with lots of traffic, so be patient), you can access your statistics under http://www.example.com/jawstats.
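If JAWStats shows no data, a first thing to check is whether AWStats has written any data files into the statspath directory. The sketch below assumes the usual AWStats naming scheme awstatsMMYYYY.<config name>.txt; the function name list_awstats_data is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: list the AWStats data files for one site.
# Assumes the naming scheme awstatsMMYYYY.<config name>.txt.
list_awstats_data() {
    # $1: data directory, $2: AWStats config name
    for f in "$1"/awstats??????."$2".txt; do
        # An unmatched pattern stays literal, so test for existence
        [ -e "$f" ] && printf '%s\n' "$f"
    done
    return 0
}

list_awstats_data /var/lib/awstats www.example.com
```

An empty result suggests that the cron job has not run yet, or that it could not read the access log named in LogFile.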
4 Password-Protect The JAWStats Output Directory (Optional)
Now it is a good idea to password-protect the directory /var/www/www.example.com/web/jawstats unless you want everybody to be able to access your web site statistics.
To do this, we create an .htaccess file in /var/www/www.example.com/web/jawstats:
vi /var/www/www.example.com/web/jawstats/.htaccess
AuthType Basic
AuthName "Members Only"
AuthUserFile /var/www/www.example.com/.htpasswd
<limit GET PUT POST>
require valid-user
</limit>
Then we must create the password file /var/www/www.example.com/.htpasswd. We want to log in with the username admin, so we do this:
htpasswd -c /var/www/www.example.com/.htpasswd admin
Enter a password for admin, and you're done!
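To confirm that the password file really contains an entry for the user, a small check like the following can be used. It is a sketch only: the function name htpasswd_has_user is hypothetical, and it relies on the "user:hash" line format that htpasswd writes.

```shell
#!/bin/sh
# Hypothetical helper: succeed if a password file has an entry for a user.
htpasswd_has_user() {
    # $1: password file, $2: user name
    # Compare the first colon-separated field exactly against the user name
    awk -F: -v u="$2" '$1 == u { found = 1 } END { exit !found }' "$1" 2>/dev/null
}

if htpasswd_has_user /var/www/www.example.com/.htpasswd admin; then
    echo "admin entry found"
else
    echo "no admin entry found"
fi
```

To check the protection itself, request http://www.example.com/jawstats/ without credentials, for example with curl -I; the server should answer with status 401. If it does not, the virtual host may not allow .htaccess files to set authentication directives (AllowOverride AuthConfig).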
5 Links
- AWStats: http://www.awstats.org/
- JAWStats: http://www.jawstats.com/
- Apache: http://httpd.apache.org/
- Debian: http://www.debian.org/