How to install mod_gzip
Author: Michael Schröpl
How to install mod_gzip
Actually performing the installation isn't difficult, but finding the method that suits best to the needs of your Apache installation may take some time.
Therefore it is highly recommended that you read this chapter completely and become aware of the pros and cons of the different options before you select the operation method and perform the installation.
The document in hand especially covers the internal processing model of mod_gzip as an Apache module and may thus provide informations that can be helpful for understanding mod_gzip's evaluation method for configuration directives.
The Apache web server supports two different methods of integrating a module into its program code:
Depending on the operation concept used for your Apache server and the given requirements
Static integration means that the module becomes a permanent part of the linked program binary httpd which implements the Apache server.
For this the Apache server source code has to be
Normally each administrator may want to use a different set of features for his Apache server adapted to his own requirements; therefore it doesn't seem feasible to provide program files ready to run on a multitude of platforms for download.
Dynamic integration means that the module can be loaded from a separate module file as shared object when starting the Apache process.
Most Apache modules consist of one single source code file only. This file can be compiled by invoking the apxs program (with corresponding parameter values); for the installation of the shared object created this way one more invocation of apxs (with different parameter values) would be required.
From version 22.214.171.124a on, mod_gzip's source code is divided into three separate files:
This structure of the source code makes the mod_gzip maintenance a little easier - and the installation a little more complicated (as now several source files have to be compiled instead of just one). Therefore mod_gzip now provides Makefiles to simplify this installation process.
Depending on which of the operation concepts named above is to be run, different files (suitable for the respective purpose of use) are to be used.
As of this writing the following files are available for each mod_gzip version at the download page for the mod_gzip project:
If mod_gzip is to be run with dynamic integration on some other platform then the shared module file for this platform has to be created by the administrator.
The procedure for installing the Apache webserver on a UNIX machine documented by the Apache Group reads like this (in the short version):
If the Apache webserver has been created this way then the shipped modules normally become static parts of the created program file httpd in the Apache program directory - unless you specified something different in the parameters of the configure call.
(The actual call of configure may become very extensive, depending on the degree of deviation from the standard parameter values. I recommend storing this call itself in a small shell script to document the type of processed installation by the way.)
The source code of the official Apache modules is contained in the src/modules subdirectory of the unpacked tar archive of the Apache software.
To let mod_gzip be treated like a standard Apache module by this mechanism, the following preparations are necessary:
As next step, you extend the configure call by the parameter --activate-module=src/modules/gzip/mod_gzip.c. Now the configure script will find the mod_gzip source code and create a suitable Makefile from the shipped file Makefile.tmpl - logged by the messages
+ activated gzip module (modules/gzip/mod_gzip.c)
Creating Makefile in src/modules/gzip
- the latter one just like for Apache's own modules. (The Makefile shipped with mod_gzip is not suitable for this type of installation - this one is only for the creation of a shared object file.)
Now the Apache installation will work as usual - and mod_gzip will be treated like a normal Apache module.
But configure knows that a module integrated via the --activate-module parameter is a 3rd-party module that may probably have specific requirements, and thus will load mod_gzip automatically on top of the module stack so that it will have access to the incoming HTTP request prior to all other modules - which is exactly what mod_gzip urgently needs.
On some platforms Apache's configure doesn't seem to automatically set the value of the $(LIBEXT) environment variable to the proper value of .a. In this case the compilation of mod_gzip will fail. The exact reason for this behaviour is unknown as of now; as workaround you may replace the line
within the shipped file Makefile.tmpl, i. e. insert the proper value manually.
Be sure to use an editor that doesn't expand tab characters to whitespaces for this task!
(To be tested: What happens when integrating more than one 3rd-party module with --activate-module? Is the order of the parameter values relevant in this case?)
To check whether mod_gzip actually has been integrated into the Apache program code as requested, the Apache server provides the httpd -l command. This will display a list of all integrated modules (in the order in which they will be loaded); mod_gzip.c should be the last entry displayed there.
The Apache webserver supports the concept of loadable modules.
Nearly each Apache module can be
The handling of loadable modules requires additional knowledge about the Apache configuration (because the order in which these modules are loaded may be significant for their functioning) but allows for changes of the Apache server's code range without having to recompile its source code.
On platforms like Windows (where not many Apache administrators have a C development environment at hand to compile and link the Apache code) the use of loadable modules may often be the only possibility to enlarge the functional scope of the Apache server.
The Apache 1.3 documentation provides the following articles about this topic:
To dynamically add the mod_gzip shared object to the Apache code, one of the following configuration directives is required:
The actual file name can be freely selected - it only has to match the name of the file effectively used. On the other hand, this name can depend on the operating system platform and even on the compilation method used for this module - in this case either the directive shown above has to be adapted or the file has to be renamed accordingly.
The Apache server can dynamically load any number of modules. While doing so, the corresponding LoadModule directives are processed in the order of their occurrence within the configuration file.
But the modules are loaded on a stack within the working memory: The module that has been loaded last will be the first to get access to handling the corresponding HTTP request to the Apache webserver - and may then decide whether to consider itself responsible for handling this request or not.
Only one of all modules in question can be responsible for handling a request in Apache 1.3 - subsequent modules will not even be asked.
Therefore, to be able to process the output of arbitrary modules, mod_gzip has to do something that actually contradicts the Apache 1.3 architecture: It has to 'handle' a request but subsequently revoke the responsibility for handling this request. Only by this procedure the module which is effectively responsible for handling the request can still be activated by the Apache server at all.
In this first phase the 'handling' of this request by mod_gzip does not mean to compress the page's content to be served - because this content doesn't even exist yet, it still has to be generated by another module! Instead, at this point in time mod_gzip just prepares to be asked again whether it wants to do anything after the page content's creation. Only in this second phase of its activation (where the content of the HTTP response is already available then) mod_gzip can perform its essential task, which is compressing the content of a HTTP response packet (and the modification of certain HTTP headers).
This 'registration' for later postprocessing the HTTP response performed by mod_gzip is necessary only if mod_gzip cannot already determine at this stage that it definitely won't be interested in processing the response content anyway.
Thus in this first phase mod_gzip already performs a part of the evaluation of the filter directives specified in the Apache configuration: It checks those rules where it can do this based upon the request description alone (i. e. the content of the corresponding HTTP headers). This applies to the mod_gzip_item_include/mod_gzip_item_exclude rules of the type
If the evaluation of these filter rules already proves that this request's result must not be compressed, i. e. if
then it is not necessary for mod_gzip to check the remaining rules after the creation of the response content - so this won't happen then, because mod_gzip remembers the result of the first evaluation phase for each request and terminates the second phase immediately in this case.
Otherwise in the second phase of its operation mod_gzip checks the remaining filter rules that can be evaluated only based on the actual content of the generated response packet:
Furthermore some other conditions are tested now, such as the size of the response packet (directives mod_gzip_minimum_file_size rsp. mod_gzip_maximum_file_size).
And only if all of these tests led to a positive result the compression of the response packed will actually be performed.
As to be able to perform all tasks described above, the mod_gzip module must have access to handling the HTTP request prior to each other Apache module whose output it is meant to handle. Because of the reversed order of access to handling a request for all Apache modules, mod_gzip should be loaded as the last one of all Apache modules.
For the static integration this module order is defined by the 'blueprint' of the httpd program during the compilation of the Apache source code. The procedure for compiling the Apache source code shipped by the Apache Group, activated by the configure shell script, knows all dependencies between the shipped modules (and ensures a corresponding order of these modules) but not the requirements of 3rd party modules like mod_gzip which are integrated into the compilation process by the configure parameter --add-module=file. To allow for a maximum of influence to these 3rd party modules such modules are loaded as last modules on the module stack.
So if mod_gzip is to be integrated into an Apache server as the only 3rd party module then configure automatically does the right thing. In case of using more than one 3rd party module the administrator is responsible for ordering these modules (maybe by the order of his --add-module= values? I didn't test this yet).
If an Apache server is operated to support the dynamic integration of modules(i. e. uses the mod_so module) then a utility program named apxs will be generated in Apache's bin directory during the Apache installation.
This programm allows its user to compile the program source code of an Apache module (using a C compiler) and to create a corresponding shared object file without requiring the complete source code of the Apache servers to be available: apxs knows all required Apache program interfaces and supplies the C compiler with the necessary information.
To save the user from finding out how exactly mod_gzip has to be compiled and installed completely when using apxs, a file named Makefile is provided within the source code archive.
Using this Makefile reduces the installation procedure to the following steps:
Besides these necessary steps the Makefile supports the following commands (which might rather be of interest to developers):