Caching With Apache's mod_cache On Debian Lenny

Submitted by falko on Fri, 2010-08-06 14:39. :: Debian | Web Server | Apache

Version 1.0
Author: Falko Timme <ft [at] falkotimme [dot] com>
Last edited 04/21/2010

This article explains how you can cache your web site's content with Apache's mod_cache on Debian Lenny. If you have a high-traffic dynamic web site that generates lots of database queries on each request, you can decrease the server load dramatically by caching your content for a few minutes or longer (how long depends on how often you update your content).

I do not issue any guarantee that this will work for you!

 

1 Preliminary Note

I'm assuming that you have a working Apache2 setup (Apache 2.2.x - prior to that version, mod_cache is considered experimental) from the Debian repositories - the Apache version in the Debian Lenny repositories is 2.2.9 so you should be good to go.

I'm using the document root /var/www here for my test vhost - you must adjust this if your document root differs.

 

2 Enabling mod_cache

mod_cache has two submodules that manage the cache storage: mod_disk_cache (for storing contents on the hard drive) and mod_mem_cache (for storing contents in memory, which is faster than disk caching). Decide which one you want to use and continue with either chapter 2.1 (mod_disk_cache) or chapter 2.2 (mod_mem_cache).

 

2.1 mod_disk_cache

The mod_disk_cache configuration is stored in /etc/apache2/mods-available/disk_cache.conf, so let's edit that one:

vi /etc/apache2/mods-available/disk_cache.conf

Make sure you uncomment the CacheEnable disk / line, so that the minimal configuration looks as follows:

<IfModule mod_disk_cache.c>
# cache cleaning is done by htcacheclean, which can be configured in
# /etc/default/apache2
#
# For further information, see the comments in that file,
# /usr/share/doc/apache2.2-common/README.Debian, and the htcacheclean(8)
# man page.

        # This path must be the same as the one in /etc/default/apache2
        CacheRoot /var/cache/apache2/mod_disk_cache

        # This will also cache local documents. It usually makes more sense to
        # put this into the configuration for just one virtual host.

        CacheEnable disk /

        CacheDirLevels 5
        CacheDirLength 3
</IfModule>

You can find explanations for these configuration options and further configuration options on http://httpd.apache.org/docs/2.2/mod/mod_disk_cache.html.
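
If you change CacheDirLevels or CacheDirLength, keep in mind that the Apache documentation says the product of the two values must not be higher than 20. The following is just a small sketch to check a pair of values before you put them into the configuration; check_cache_dirs is a helper name I made up for this example, it is not part of Apache:

```shell
# Hypothetical helper (not part of Apache): check a CacheDirLevels /
# CacheDirLength pair against the documented limit (levels * length <= 20).
check_cache_dirs() {
    levels=$1
    length=$2
    if [ $((levels * length)) -le 20 ]; then
        echo "ok"
    else
        echo "too deep"
    fi
}

# The values from the configuration above: 5 * 3 = 15
check_cache_dirs 5 3
```

With the values from the configuration above (5 and 3), the check passes.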

Now we can enable mod_cache and mod_disk_cache:

a2enmod cache
a2enmod disk_cache

/etc/init.d/apache2 restart

To make sure that our cache directory /var/cache/apache2/mod_disk_cache doesn't fill up over time, we have to clean it with the htcacheclean command. That command is part of the apache2-utils package which we install as follows:

aptitude install apache2-utils

Afterwards, we can start htcacheclean as a daemon like this:

htcacheclean -d30 -n -t -p /var/cache/apache2/mod_disk_cache -l 100M -i

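This will clean our cache directory every 30 minutes (-d30 runs htcacheclean as a daemon with a 30-minute interval) and make sure that it will not get bigger than 100MB (-l 100M). The other options make htcacheclean run with reduced priority (-n), delete empty directories (-t), and only do work when the cache has been modified (-i); -p specifies the cache directory. To learn more about htcacheclean, take a look at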

man htcacheclean
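
If you want to see whether htcacheclean keeps the directory below the limit, you can compare the directory size with the limit from time to time. This is only a rough sketch: cache_size_kb is a helper name I made up, and du reports disk usage, which may differ somewhat from the way htcacheclean counts the cache size.

```shell
# Hypothetical helper: print the disk usage of a directory in KB.
cache_size_kb() {
    # du -s prints one summary line, -k forces 1 KB units; keep the first field.
    du -sk "$1" | awk '{ print $1 }'
}

# Path from the article; the limit mirrors the -l 100M option used above.
cache_dir=/var/cache/apache2/mod_disk_cache
limit_kb=$((100 * 1024))

# Only report if the cache directory actually exists on this machine.
if [ -d "$cache_dir" ]; then
    size_kb=$(cache_size_kb "$cache_dir")
    if [ "$size_kb" -gt "$limit_kb" ]; then
        echo "cache is over the limit: ${size_kb} KB"
    else
        echo "cache is within the limit: ${size_kb} KB"
    fi
fi
```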

Of course, you don't want to start htcacheclean manually each time you reboot the server - therefore we edit /etc/rc.local...

vi /etc/rc.local

... and add the following line to it, right before the exit 0 line:

[...]
/usr/sbin/htcacheclean -d30 -n -t -p /var/cache/apache2/mod_disk_cache -l 100M -i
[...]

This will start htcacheclean automatically each time the server boots.
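
If you set up several servers and don't want to edit the file by hand, the same change can be scripted. The following is a minimal sketch, assuming that the file contains a line that reads exactly exit 0; add_before_exit is a helper name I made up. It does nothing if the line is already present, and it adds nothing if there is no exit 0 line.

```shell
# Hypothetical helper: insert a line before the first "exit 0" line of a file,
# unless that exact line is already in the file.
add_before_exit() {
    file=$1
    line=$2
    # Skip if the exact line is already present.
    if grep -qxF -- "$line" "$file"; then
        return 0
    fi
    tmp=$(mktemp) || return 1
    # Print the new line once, right before the first line that is "exit 0".
    awk -v new="$line" '!done && $0 == "exit 0" { print new; done = 1 } { print }' "$file" > "$tmp" &&
        cat "$tmp" > "$file"   # overwrite in place to keep the file permissions
    rm -f "$tmp"
}

# Usage (as root), with the command line from above:
# add_before_exit /etc/rc.local "/usr/sbin/htcacheclean -d30 -n -t -p /var/cache/apache2/mod_disk_cache -l 100M -i"
```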

 

2.2 mod_mem_cache

The mod_mem_cache configuration is located in /etc/apache2/mods-available/mem_cache.conf:

vi /etc/apache2/mods-available/mem_cache.conf

<IfModule mod_mem_cache.c>
        CacheEnable mem /
        MCacheSize 4096
        MCacheMaxObjectCount 100
        MCacheMinObjectSize 1
        MCacheMaxObjectSize 2048
</IfModule>

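This is the default configuration - if you like you can modify it. Please note that MCacheSize is specified in KBytes while the object sizes are specified in bytes, so with these values the cache can use up to 4 MB of memory and holds at most 100 objects of up to 2048 bytes each. A list of configuration directives for mod_mem_cache is available here: http://httpd.apache.org/docs/2.2/mod/mod_mem_cache.html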

Now let's enable mod_cache and mod_mem_cache as follows:

a2enmod cache
a2enmod mem_cache

/etc/init.d/apache2 restart

That's it already! With mod_mem_cache, you don't have to clean up any cache directories.

 

3 Testing

Unfortunately mod_cache doesn't provide any logging functionality, which is bad if you want to know if caching is working. Therefore I create a small PHP test file, /var/www/cachetest.php, that sends out HTTP headers that tell mod_cache to cache the file for 300 seconds, and that simply prints the current timestamp:

vi /var/www/cachetest.php

<?php
header("Cache-Control: must-revalidate, max-age=300");
header("Vary: Accept-Encoding");
echo time()."<br>";
?>

Now call that file in a browser - it should display the current timestamp. Then click in the browser's address bar and press ENTER so that the page gets loaded again (don't press F5 or the reload button - that makes the browser request a fresh copy from the server instead of the cache!) - if all goes well, you should still see the old, cached timestamp. After 300 seconds, you should get a fresh copy from the server instead of the cache.
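
You can do the same test from the command line, which takes the browser's own cache out of the picture. This is just a sketch: compare_responses is a helper name I made up, and the URL http://localhost/cachetest.php assumes that you run the test on the server itself and that the test file sits in the default vhost.

```shell
# Hypothetical helper: compare two response bodies fetched a few seconds apart.
# Identical timestamps mean the second request was answered from a cache.
compare_responses() {
    if [ "$1" = "$2" ]; then
        echo "cached"
    else
        echo "not cached"
    fi
}

# Usage (the URL is an assumption, adjust it to your vhost):
# first=$(curl -s http://localhost/cachetest.php)
# sleep 2
# second=$(curl -s http://localhost/cachetest.php)
# compare_responses "$first" "$second"
```

If the second request prints a newer timestamp, the page was generated again by PHP instead of being delivered from the cache.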

 

4 HTTP Headers

Caching doesn't work out-of-the-box - you must modify your web application so that caching can work (it is possible that your web application already supports caching - please consult the documentation of your application to find out). mod_cache will cache web pages only if the HTTP headers sent out by your web application tell it to do so.

Here are some examples of headers that tell mod_cache not to cache:

  • Expires headers with a date in the past: "Expires: Sun, 19 Nov 1978 05:00:00 GMT"
  • Certain Cache-Control headers: "Cache-Control: no-store, no-cache, must-revalidate" or "Cache-Control: must-revalidate, max-age=0"
  • Set-Cookie headers: a page will not be cached if a cookie is set.

So if you want mod_cache to cache your pages, modify your application to not send out such headers.
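
To find out quickly whether your application sends such headers, you can feed its response headers into a small check like the one below. Please treat it as a rough sketch: looks_cacheable is a helper name I made up, it only looks for the Set-Cookie and Cache-Control cases from the list above (it does not parse Expires dates), and it is not the logic that mod_cache itself uses.

```shell
# Hypothetical helper: read HTTP response headers on stdin and report whether
# they contain one of the cache-blocking headers listed above.
looks_cacheable() {
    # Strip carriage returns, because HTTP header lines end with CRLF.
    headers=$(tr -d '\r')
    if printf '%s\n' "$headers" | grep -qi '^set-cookie:'; then
        echo "no: Set-Cookie present"
    elif printf '%s\n' "$headers" | grep -i '^cache-control:' |
            grep -qiE 'no-store|no-cache|(^|[ ,:])max-age=0([ ,]|$)'; then
        echo "no: Cache-Control forbids caching"
    else
        echo "maybe"
    fi
}

# Usage (the URL is an assumption, adjust it to your site):
# curl -s -I http://localhost/index.php | looks_cacheable
```

"maybe" only means that none of the checked headers was found; the page still needs a max-age or an Expires date in the future to be cached.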

To make a page cacheable, you can set an Expires header with a date in the future, but the recommended way is to use max-age:

"Cache-Control: must-revalidate, max-age=300"

This tells mod_cache to cache the page for 300 seconds (max-age). Unfortunately mod_cache doesn't know the s-maxage option (see http://www.mnot.net/cache_docs/#CACHE-CONTROL), which is why we must use the max-age option (it also tells your browser to cache - please keep this in mind if you get unexpected results!). If mod_cache knew the s-maxage option, we could use "Cache-Control: must-revalidate, max-age=0, s-maxage=300", which would tell mod_cache, but not the browser, to cache the page.

Of course, this header is useless if you send out one of the non-caching headers (Expires in the past, Set-Cookie, etc.) from above at the same time!
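
To see which lifetime such a header announces, you can extract the max-age value from it. The following is a small sketch; max_age_of is a helper name I made up, and it assumes that the directive names are written in lowercase. Note that s-maxage is a separate directive and is ignored here.

```shell
# Hypothetical helper: print the max-age value (in seconds) from the value of
# a Cache-Control header; prints nothing if there is no max-age directive.
max_age_of() {
    # Put each directive on its own line, then keep only the max-age value.
    printf '%s\n' "$1" | tr ',' '\n' |
        sed -n 's/^[[:space:]]*max-age=\([0-9][0-9]*\)[[:space:]]*$/\1/p' |
        head -n 1
}

max_age_of "must-revalidate, max-age=300"
```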

Another very important header for caching is this one:

"Vary: Accept-Encoding"

This makes mod_cache keep separate copies of each cached page depending on the Accept-Encoding header of the request, typically one compressed (gzip) and one uncompressed, so that it can deliver the right version depending on the capabilities of the user-agent/browser. Some user-agents don't understand gzip compression, so they should get the uncompressed version.

So here's the summary: use the following two headers if you want mod_cache to cache:

"Cache-Control: must-revalidate, max-age=300"
"Vary: Accept-Encoding"

and make sure that no Expires with a date in the past, cookies, etc. are sent.

If your application is written in PHP, you can use PHP's header() function to send out HTTP headers, e.g. like this:

header("Cache-Control: must-revalidate, max-age=300");
header("Vary: Accept-Encoding");

This page is a must-read if you want to learn more about HTTP headers and caching: http://www.mnot.net/cache_docs/

 

Submitted by Anonymous (not registered) on Sat, 2010-11-27 15:37.
You don't say where to get mod_cache - the only pointer is to a documentation page at apache.org, which also doesn't say where to get the software.