Apache provides several ways to deliver active contents. Active contents are HTML pages that are generated from variable input data supplied by the client. A typical example is a search engine that responds to one or more search strings (possibly combined with logical operators such as AND or OR) by returning a list of pages that contain these strings.
Apache offers three ways of generating active contents:
Server-side includes (SSI) are directives that are embedded in an HTML page by means of special comments. Apache interprets the content of the comments and delivers the result as part of the HTML page.
Common gateway interface (CGI) programs are programs that are located in certain directories. Apache forwards the parameters transmitted by the client to these programs and returns their output. This kind of programming is quite easy, especially since existing command-line programs can be adapted to accept input from Apache and return their output to Apache.
Modules are the third option. Apache offers interfaces for executing modules as part of request processing and gives these programs access to important information, such as the request or the HTTP headers. Modules can take part in the generation of active contents as well as in other functions, such as authentication. Programming these modules requires some expertise. The advantages of this approach are high performance and possibilities that exceed those of SSI and CGI.
While CGI scripts are executed directly by Apache under the user ID of their owner, modules are controlled by a persistent interpreter that is embedded in Apache. In this way, separate processes do not need to be started and terminated for every request (this would result in a considerable overhead for the process management, memory management, etc.). Instead, the script is handled by the interpreter running under the ID of the Web server.
However, this approach has a catch. Compared to modules, CGI scripts are relatively tolerant of careless programming. With CGI scripts, errors such as a failure to release resources and memory do not have a lasting effect, because the program is terminated after the request has been processed. When the process ends, any memory that the program failed to release due to a programming error is reclaimed. With modules, the effects of programming errors accumulate, because the interpreter is persistent. If the server is not restarted and the interpreter runs for several months, the failure to release resources, such as database connections, can cause serious problems.
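The accumulation effect can be illustrated with a small simulation. The following Python sketch is not Apache code. The names ResourceTracker, leaky_handler, careful_handler, and serve are hypothetical and exist only for this illustration. A fresh tracker per request stands in for a CGI process that is terminated after every request, while a shared tracker stands in for a persistent interpreter.

```python
class ResourceTracker:
    """Hypothetical stand-in for scarce resources such as database connections."""

    def __init__(self):
        self.open_connections = 0

    def acquire(self):
        self.open_connections += 1

    def release(self):
        self.open_connections -= 1


def leaky_handler(tracker):
    """Handles a request but forgets to release its connection (the bug)."""
    tracker.acquire()
    return "ok"


def careful_handler(tracker):
    """Handles a request and always releases its connection."""
    tracker.acquire()
    try:
        return "ok"
    finally:
        tracker.release()


def serve(handler, requests, persistent):
    """Return the number of connections still open after the last request."""
    tracker = ResourceTracker()
    for _ in range(requests):
        if not persistent:
            # CGI model: every request starts in a fresh process, so
            # whatever the previous request leaked is already gone.
            tracker = ResourceTracker()
        handler(tracker)
    return tracker.open_connections


if __name__ == "__main__":
    print("persistent, leaky:  ", serve(leaky_handler, 1000, persistent=True))
    print("per request, leaky: ", serve(leaky_handler, 1000, persistent=False))
    print("persistent, careful:", serve(careful_handler, 1000, persistent=True))
```

In this model, the leaky handler leaves 1000 connections open after 1000 requests in the persistent case, but never more than one in the per-request case, because each simulated process takes its leak with it when it ends.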
Server-side includes (SSIs) are directives that are embedded in special comments and executed by Apache. The result is embedded in the output. For example, the current date can be printed with <!--#echo var="DATE_LOCAL" -->. The # at the end of the opening comment mark (<!--) shows Apache that this is an SSI directive and not an ordinary comment.
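To make the substitution step concrete, the following Python sketch expands echo directives in a page. It is a toy model of the idea and not Apache's implementation. The function name expand_echo, the variables dictionary, and the "(none)" placeholder for unknown variables are assumptions of this example.

```python
import re
from datetime import datetime

# Matches directives of the form <!--#echo var="NAME" -->.
# Ordinary comments such as <!-- note --> lack the '#' and do not match.
ECHO_DIRECTIVE = re.compile(r'<!--#echo\s+var="([^"]+)"\s*-->')


def expand_echo(page, variables):
    """Replace every echo directive in `page` with the value of its variable."""

    def substitute(match):
        # Placeholder text for unknown variables is a choice of this sketch.
        return variables.get(match.group(1), "(none)")

    return ECHO_DIRECTIVE.sub(substitute, page)


if __name__ == "__main__":
    page = '<p>Today is <!--#echo var="DATE_LOCAL" -->.</p> <!-- ordinary comment -->'
    variables = {"DATE_LOCAL": datetime.now().strftime("%A, %d-%b-%Y %H:%M:%S")}
    print(expand_echo(page, variables))
```

The ordinary comment passes through unchanged, while the directive is replaced by the formatted date.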
SSIs can be activated in several ways. The easiest approach is to have Apache search all executable files for SSIs. Another approach is to specify certain file types to be searched for SSIs. Both settings are explained in the section “Server-Side Includes”.
CGI is the abbreviation for common gateway interface. With CGI, the server does not simply deliver a static HTML page, but executes a program that generates the page. This enables the generation of pages representing the result of a calculation, such as the result of the search in a database. By means of arguments passed to the executed program, the program can return an individual response page for every request.
The main advantage of CGI is that this technology is quite simple. The program merely must exist in a specific directory to be executed by the Web server just like a command-line program. The server sends whatever the program writes to the standard output channel (stdout) to the client.
Theoretically, CGI programs can be written in any programming language. Usually, scripting languages (interpreted languages), such as Perl or PHP, are used for this purpose. If speed is critical, C or C++ may be more suitable.
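As an illustration of how little a CGI program needs, the following Python sketch writes a complete response to stdout: a header block, a blank line, and the page body. The function name build_response and the page text are assumptions of this example, and Python is used here only as one possible scripting language.

```python
import sys


def build_response(body_html):
    """Return a complete CGI response: header block, blank line, then the body."""
    headers = "Content-Type: text/html; charset=utf-8\n"
    # The empty line separates the headers from the document body.
    return headers + "\n" + body_html


if __name__ == "__main__":
    page = "<html><body><p>Hello from CGI</p></body></html>\n"
    # Everything written to stdout is passed on by the server to the client.
    sys.stdout.write(build_response(page))
```

Placed in the CGI directory and marked executable (with a suitable interpreter line at the top), a script of this kind can also be started from a command line, where it prints the same text.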
In the simplest case, Apache looks for these programs in a specific directory (cgi-bin). This directory can be set in the configuration file, described in Section 30.6, “Configuration”.
If necessary, additional directories can be specified. In this case, Apache searches these directories for executable programs. However, this represents a security risk, because any user can then let Apache execute programs, some of which may be malicious. If executable programs are restricted to cgi-bin, the administrator can easily see who places which scripts and programs in this directory and check them for malicious code.
Input parameters can be passed to the server with GET or POST. Depending on which method is used, the server passes the parameters to the script in different ways. With POST, the server passes the parameters to the program on the standard input (stdin). The program would receive its input in the same way when started from a command line. With GET, the server uses the environment variable QUERY_STRING to pass the parameters to the program.
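The two ways of receiving parameters can be sketched as follows. This Python example assumes that the parameters are URL-encoded form data and that the server sets the standard CGI environment variables REQUEST_METHOD and CONTENT_LENGTH in addition to QUERY_STRING. The function name read_parameters is hypothetical.

```python
import os
import sys
from urllib.parse import parse_qs


def read_parameters(environ, stdin):
    """Return the request parameters as a dict mapping names to lists of values."""
    method = environ.get("REQUEST_METHOD", "GET").upper()
    if method == "POST":
        # POST: the server supplies the parameters on standard input.
        # CONTENT_LENGTH tells the program how much to read.
        length = int(environ.get("CONTENT_LENGTH") or 0)
        raw = stdin.read(length)
    else:
        # GET: the parameters arrive in the QUERY_STRING environment variable.
        raw = environ.get("QUERY_STRING", "")
    return parse_qs(raw)


if __name__ == "__main__":
    params = read_parameters(os.environ, sys.stdin)
    print("Content-Type: text/plain")
    print()
    for name, values in sorted(params.items()):
        print(f"{name} = {', '.join(values)}")
```

Passing the environment and the input stream as arguments keeps the parsing logic testable without a Web server, for example by supplying a dictionary and an in-memory stream.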
Many modules are available for use with Apache. The term “module” is used in two different senses. First, there are modules that can be integrated in Apache to handle specific functions, such as the described modules for embedding programming languages.
Second, in connection with programming languages, modules refer to an independent group of functions, classes, and variables. These modules are integrated in a program to provide a certain functionality, such as the CGI modules available for all scripting languages. These modules facilitate the programming of CGI applications by providing various functions, such as methods for reading the request parameters and for the HTML output.
Perl is a popular, proven scripting language. There are numerous modules and libraries for Perl, including a library for expanding the Apache configuration file. A range of libraries for Perl is available in the Comprehensive Perl Archive Network (CPAN) at http://www.cpan.org/.
To set up mod_perl in SUSE LINUX, simply install
the respective package (see Section 30.5, “Installation”). Following the
installation, the Apache configuration file
includes the necessary entries (see
/etc/apache2/mod_perl-startup.pl). Information about
mod_perl is available at http://perl.apache.org/.
In the simplest case, run an existing CGI script as a mod_perl script by requesting it with a different URL. The configuration file contains aliases that point to the same directory and execute any scripts it contains either via CGI or via mod_perl. All these entries already exist in the configuration file. The alias entry for CGI is:
ScriptAlias /cgi-bin/ "/srv/www/cgi-bin/"
The entries for mod_perl are:
<IfModule mod_perl.c>
    # Provide two aliases to the same cgi-bin directory,
    # to see the effects of the 2 different mod_perl modes.
    # for Apache::Registry Mode
    ScriptAlias /perl/ "/srv/www/cgi-bin/"
    # for Apache::Perlrun Mode
    ScriptAlias /cgi-perl/ "/srv/www/cgi-bin/"
</IfModule>
The following entries are also needed for mod_perl. These entries already exist in the configuration file.
#
# If mod_perl is activated, load configuration information
#
<IfModule mod_perl.c>
    Perlrequire /usr/include/apache/modules/perl/startup.perl
    PerlModule Apache::Registry
    #
    # set Apache::Registry Mode for /perl Alias
    #
    <Location /perl>
        SetHandler perl-script
        PerlHandler Apache::Registry
        Options ExecCGI
        PerlSendHeader On
    </Location>
    #
    # set Apache::PerlRun Mode for /cgi-perl Alias
    #
    <Location /cgi-perl>
        SetHandler perl-script
        PerlHandler Apache::PerlRun
        Options ExecCGI
        PerlSendHeader On
    </Location>
</IfModule>
These entries create aliases for the Apache::Registry and Apache::PerlRun modes. The difference between these two modes is:
In Apache::Registry mode, all scripts are compiled and kept in a cache. Every script is applied as the content of a subroutine. Although this is good for performance, there is a disadvantage: the scripts must be programmed extremely carefully, because the variables and subroutines persist between requests. This means that you must reset the variables to enable their use for the next request. If, for example, the credit card number of a customer is stored in a variable in an online banking script, this number could appear again when the next customer uses the application and requests the same script.
In Apache::PerlRun mode, the scripts are recompiled for every request. Variables and subroutines disappear from the namespace between requests (the namespace is the entirety of all variable names and routine names that are defined at a given time during the existence of a script). Apache::PerlRun therefore does not require painstaking programming, because all variables are reinitialized when the script is started and no values are kept from previous requests. For this reason, Apache::PerlRun is slower than Apache::Registry but still a lot faster than CGI (despite some similarities to CGI), because no separate process is started for the interpreter.
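The difference between the two modes can be mimicked in a few lines. The following sketch is a Python analogy and not mod_perl code. The classes RegistryLikeRunner and PerlRunLikeRunner, the script text, and the card number are all invented for this illustration. The first runner compiles the script once and reuses one namespace, the second recompiles the script and starts with an empty namespace for every request.

```python
# A carelessly written script: card_number is only assigned when the
# request carries a "card" parameter, but it is read in every case.
SCRIPT = '''
if "card" in request:
    card_number = request["card"]
output = "card on file: " + str(globals().get("card_number"))
'''


class RegistryLikeRunner:
    """Compile once and reuse the same namespace for every request."""

    def __init__(self, source):
        self.code = compile(source, "<script>", "exec")
        self.namespace = {}

    def handle(self, request):
        self.namespace["request"] = request
        exec(self.code, self.namespace)
        return self.namespace["output"]


class PerlRunLikeRunner:
    """Recompile and start from an empty namespace on every request."""

    def __init__(self, source):
        self.source = source

    def handle(self, request):
        namespace = {"request": request}
        exec(compile(self.source, "<script>", "exec"), namespace)
        return namespace["output"]


if __name__ == "__main__":
    for runner in (RegistryLikeRunner(SCRIPT), PerlRunLikeRunner(SCRIPT)):
        print(type(runner).__name__)
        print("  first customer: ", runner.handle({"card": "0000-EXAMPLE"}))
        print("  second customer:", runner.handle({}))
```

With the cached runner, the second customer sees the first customer's number, because the variable survived in the shared namespace. With the recompiling runner, nothing is carried over, at the cost of compiling the script again for each request.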
PHP is a programming language that was especially developed for use with Web servers. In contrast to other languages whose commands are stored in separate files (scripts), the PHP commands are embedded in an HTML page (similar to SSI). The PHP interpreter processes the PHP commands and embeds the processing result in the HTML page.
The home page for PHP is http://www.php.net/. For PHP
to work, install
mod_php4-core and, in addition,
Python is an object-oriented programming language with a very clear
and legible syntax. An unusual but convenient feature is that the program
structure depends on the indentation. Blocks are not defined with braces
(as in C and Perl) or other demarcation elements, such as
begin and end, but by their level of
indentation. The package to install is
Ruby is a relatively new, object-oriented high-level programming language that resembles Perl and Python in certain aspects and is ideal for scripts. Like Python, it has a clean, transparent syntax. On the other hand, Ruby has adopted abbreviations, such as $. for the number of the last line read in the input file, a feature that is welcomed by some programmers and
abhorred by others. The basic concept of Ruby closely resembles that of