The Apache Web Server

Basics

Web Server

A web server issues HTML pages requested by a client. These pages can be stored in a directory (passive or static pages) or be generated in response to a query (active contents).

HTTP

The clients are usually web browsers, like Konqueror or Mozilla. Communication between the browser and the web server takes place by way of the HyperText Transfer Protocol (HTTP). The current version HTTP 1.1 is documented in RFC 2068 and in the update RFC 2616. These RFCs are available at http://www.w3.org.

URLs

Clients use URLs to request pages from the server, for example, http://www.suse.com/index_us.html. A URL consists of:

  • A protocol. Frequently-used protocols:

    • http:// HTTP protocol
    • https:// Secure, encrypted version of HTTP
    • ftp:// File Transfer Protocol for uploading and downloading files

  • A domain, in this example, www.suse.com. The domain can be subdivided into two parts. The first part (www) points to a computer. The second part (suse.com) is the actual domain. Together, they are referred to as the FQDN (Fully Qualified Domain Name).

  • A resource, in this example, index_us.html. This part specifies the full path to the resource. The resource can be a file, as in this example. However, it can also be a CGI script, a Java server page, or some other resource.
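The decomposition described above can be reproduced with a short Python sketch. It uses the standard library function urlsplit on the example URL from the text. Splitting the FQDN at the first dot is a simplification that only suits this example; the variable names are chosen for illustration.

```python
from urllib.parse import urlsplit

# Example URL taken from the text above.
url = "http://www.suse.com/index_us.html"
parts = urlsplit(url)

protocol = parts.scheme    # the protocol part, without "://"
fqdn = parts.hostname      # the fully qualified domain name
resource = parts.path      # the path to the requested resource

# Simplification for this example: the first label is the computer,
# the remainder is the actual domain.
host, _, domain = fqdn.partition(".")

print(protocol, fqdn, host, domain, resource)
```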

Various Internet mechanisms (such as the Domain Name System, DNS) convey the query to the domain, directing the access to one or several responsible computers. Apache delivers the actual resource (in this example, the page index_us.html) from its file directory. In this case, the file is located in the top level of the directory. However, resources can also be located in subdirectories as in the following:

http://www.suse.com/us/business/services/support/index.html

The file path is relative to the DocumentRoot, which can be changed in the configuration file. The section DocumentRoot describes how this is done.
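As a rough illustration of how a URL path relates to the DocumentRoot, the following Python sketch maps the path part of a URL to a file name. It assumes the default DocumentRoot /srv/www/htdocs mentioned later in this chapter. The function name resolve is hypothetical, and the sketch does not reproduce the actual lookup performed by Apache.

```python
import posixpath

# Default DocumentRoot in SuSE Linux according to this chapter.
DOCUMENT_ROOT = "/srv/www/htdocs"

def resolve(url_path, document_root=DOCUMENT_ROOT):
    """Map the path part of a URL to a file below the DocumentRoot."""
    # Normalize the path so that ".." segments cannot leave the root.
    normalized = posixpath.normpath("/" + url_path.lstrip("/"))
    return posixpath.join(document_root, normalized.lstrip("/"))

print(resolve("/us/business/services/support/index.html"))
```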

Automatic Output of a Default Page

If no page is specified, Apache automatically appends one of the common names for such pages to the URL. The most frequently-used name for such pages is index.html. The activation of this functionality and the page names taken into consideration can be configured as described in the section DirectoryIndex. In this example, http://www.suse.com is sufficient to prompt the server to display the page http://www.suse.com/index_us.html.

What Is Apache?

The Most Popular Web Server

With a share of more than sixty percent (source: http://www.netcraft.com), Apache is the world's most widely-used web server. For web applications, Apache is often combined with Linux, the database MySQL, and the programming languages PHP and Perl. This combination is commonly referred to as LAMP.

Some of the strengths of Apache are as follows:


Expandability

By means of modules, Apache can be expanded with a wide range of functions. For example, Apache can execute CGI scripts in diverse programming languages by means of modules.

Apart from Perl and PHP, additional scripting languages, such as Python or Ruby, are also available. Furthermore, there are modules for secure data transmission (Secure Sockets Layer, SSL), user authentication, expanded logging, and other functions.

Customizability

By means of custom modules, Apache can be adapted to all kinds of requirements and preferences. Of course, this requires a certain amount of know-how.

Stability

As Apache is Open Source software, its code has been screened for bugs and optimized by a large number of programmers. This approach ensures that Apache is largely free from errors (as far as this is possible for software). Nevertheless, the possibility that new security bugs may be detected in the future can never be fully excluded. 14 shows where to find information about security issues and how these can be eliminated.

Features

Apache supports a number of useful features, some of which are described below.


Virtual Hosts

Support for virtual hosts means that a single instance of Apache and a single machine can be used for several web sites. For the users, the web server appears as several independent web servers. The virtual hosts can be configured on different IP addresses or on the basis of names. Thus, you can save the acquisition costs and administration workload for additional machines.

Flexible URL Rewriting

Apache offers a number of possibilities for manipulating and rewriting URLs. Check the Apache documentation for details.


Content Negotiation

Apache can deliver a page that is adapted to the capabilities of the client (browser). For example, simple versions without frames can be delivered for older browsers or browsers that only operate in text mode (such as Lynx). In this way, the notorious JavaScript incompatibility of various browsers can be circumvented by delivering a special page version for every browser (provided you are prepared to adapt the JavaScript code for each individual browser).

Flexible Error Handling

You can react in a flexible way and provide a suitable response in the event of an error (such as nonexistent pages). The response can even be generated actively, for example, with CGI.

Basics

When Apache processes a query, several handlers can be specified for handling the query (by means of directives in the configuration file). These handlers can be part of Apache or a module invoked for processing the query. Thus, this procedure can be arranged in a very flexible way. Furthermore, special modules can be coupled with Apache for the purpose of processing requests.

The modularization has reached an advanced level especially in Apache 2, where almost everything except some minor tasks is handled by means of modules. In Apache 2, even HTTP is processed by way of modules. Accordingly, Apache 2 does not necessarily have to be a web server. It can also be used for completely different purposes with other modules. For example, there is a proof-of-concept mail server (POP3) module based on Apache.


Differences between Apache 1.3 and Apache 2

Overview

The main differences between Apache 1.3 and Apache 2 are as follows:

  • The way in which multiple queries are processed concurrently. In Apache 2, you can choose between threads and processes. The process management has been relocated to a separate module called the multiprocessing module (MPM). Depending on the MPM, Apache 2 responds to queries in different ways, with different effects on the performance and the use of modules. Details are provided below.

  • The innards of Apache have been thoroughly revised. Apache now uses a new, special base library (Apache Portable Runtime, APR) as the interface to system functions and for the memory management. Moreover, important and widespread modules such as mod_gzip (successor: mod_deflate) and mod_ssl, which have a profound impact on the processing of requests, are now integrated more fully in Apache.

  • Apache 2 now supports IPv6, the Internet protocol of the future.

  • A new mechanism now enables manufacturers of modules to specify the desired loading sequence for modules. Thus, users no longer have to do this themselves. The order in which modules are executed is often significant. Previously, it was determined by means of the loading sequence. For example, a module that only gives authenticated users access to certain resources must be run first to prevent the pages from being displayed to users who do not have any access permissions.

  • Queries to and replies from Apache can be processed with filters.

  • Support for files that are larger than 2 GB (large file support, LFS) on 32-bit systems.

  • Some of the newer modules are only available for Apache 2.

  • Multilanguage error responses.

See also http://httpd.apache.org/docs-2.0/en/.

What is a Thread?

A thread is a "light process." The advantage of a thread over a process is its lower resource consumption. Therefore, the use of threads instead of processes increases the performance. The disadvantage is that applications executed in a thread environment must be thread-safe. This means that:

  • Functions (or the methods in object-oriented applications) must be reentrant: a function with the same input always delivers the same result, even if other threads concurrently execute the same function. Accordingly, functions must be programmed in such a way that they can be executed simultaneously by several threads.

  • The access to resources (usually variables) must be arranged in such a way that concurrent threads do not conflict.
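The second requirement can be illustrated with a minimal Python sketch in which several threads update a shared variable. The lock ensures that only one thread at a time modifies the variable. The names handle_request and run are hypothetical and have no relation to the Apache code.

```python
import threading

counter = 0                       # shared resource used by all threads
counter_lock = threading.Lock()   # guards every access to counter

def handle_request():
    """Count one request; the lock serializes access to the shared variable."""
    global counter
    with counter_lock:
        counter += 1

def run(workers=8, requests_per_worker=1000):
    """Start several threads that call handle_request concurrently."""
    global counter
    counter = 0

    def worker():
        for _ in range(requests_per_worker):
            handle_request()

    threads = [threading.Thread(target=worker) for _ in range(workers)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return counter

print(run())
```

Without the lock, two threads could read the same old value and one of the increments could be lost.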

Threads and Processes

In contrast to Apache 1.3, which starts a separate process for every query, Apache 2 can handle queries as separate processes or in a mixed mode combining processes and threads. The MPM "prefork" is responsible for the execution as process. The MPM "worker" prompts the execution as thread. Select the MPM to use during the installation (see Installation). The third mode, "perchild", is not yet fully mature and is therefore not available for installation in SuSE Linux.

The difficulty with Apache 2 is that not all modules have been revised to be thread-safe. If you need a module that has not yet been adapted to threads, continue with Apache 1.3 or use Apache 2 with the MPM "prefork".

Conclusion

If you are satisfied with Apache 1.3 and the permanent availability of the web pages is vital, you can postpone the migration. Likewise, if you need modules that have not yet been adapted to Apache 2, you should also postpone the migration.

If increased performance is important or if you need one of the new features (such as filtering), you should consider a migration to Apache 2. Another argument in favor of Apache 2 is the availability of a YaST module that facilitates the configuration.

Whatever your decision may be, you should test your web site with Apache 2 in a test installation prior to the final launch.


Installation

Package Selection in YaST

For simple requirements, all you need is the Apache package. You can install the package apache (Apache 1.3) or the package apache2 (Apache 2). The pros and cons of these versions are covered in Differences between Apache 1.3 and Apache 2.

If you do not want or need the new features of Apache 2, it is recommended to install Apache 1.3 (package apache).

If you decide to install the package apache2, you need one of the MPM (multiprocessing module) packages, such as the package apache2-prefork or the package apache2-worker. When selecting an MPM, remember that the thread-based worker MPM cannot be used with the package mod_php4, as some of the libraries of the package mod_php4 are not yet thread-safe.

Activating Apache

Even if Apache is installed, it will not be started automatically. To start Apache, activate it in the runlevel editor. To start it permanently when the system is booted, check the runlevels 3 and 5 in the runlevel editor. To check if Apache is active, enter http://localhost/ in a browser. If Apache is active, you will see an example page, provided the package apache-example-pages or the package apache2-example-pages is installed.

Modules for Active Contents

To use active contents with the help of modules, install the modules for the respective programming languages. These are the package mod_perl for Perl, the package mod_php4 for PHP, and the package mod_python for Python or the corresponding modules for Apache 2. The use of these modules is covered in Generating Active Contents with Modules.

Other Recommended Packages

Additionally, you should install the extensive documentation provided in the package apache-doc or the package apache2-doc. An alias (the section Alias explains what an alias is) is available for the documentation, enabling you to access it with the URL http://localhost/manual following the installation.

To develop modules for Apache or compile third-party modules, install the package apache-devel or the package apache2-devel and the needed development tools. These include the apxs tools, which are described in Installation of Modules with apxs.


Installation of Modules with apxs

apxs (or its Apache 2 equivalent apxs2) is an important tool for module developers. This program enables the compilation and installation of modules in source code with a single command (including the required changes to the configuration files). Furthermore, you can also install modules available as object files (extension .o) or static libraries (extension .a). From the sources, apxs creates a Dynamic Shared Object (DSO), which is directly used by Apache as module.

The installation of a module from source code can be performed with a command such as the following:

apxs -c -i -a mod_foo.c

Other options of apxs are described in the man page.

Other Recommended Packages describes which packages must be installed to obtain the different versions of apxs.

apxs2 is available in several versions: apxs2, apxs2-prefork, and apxs2-worker. apxs2 installs modules in such a way that they can be used for all MPMs. The other two programs install modules in a way that they can only be used for the respective MPMs ("prefork" or "worker"). Thus, apxs2 installs modules in /usr/lib/apache2, and apxs2-prefork installs modules in /usr/lib/apache2-prefork.

The option -a should not be used with Apache 2, as this would cause the changes to be written directly to /etc/httpd/httpd.conf. Rather, modules should be activated by means of the entry APACHE_MODULES in /etc/sysconfig/apache2 as described in the section Modules.


Configuration

Following the installation of Apache, additional changes are only necessary if you have special needs or preferences. In many cases, Apache can be used as it is.

Apache can be configured with SuSEconfig or by directly editing the file /etc/httpd/httpd.conf. If you want to edit /etc/httpd/httpd.conf directly, set the entry

ENABLE_SUSECONFIG_APACHE="yes"
in /etc/sysconfig/apache2 to no to prevent SuSEconfig from overwriting your changes in /etc/httpd/httpd.conf.


Configuration with SuSEconfig

The settings made in /etc/sysconfig/apache (and /etc/sysconfig/apache2) are applied to the Apache configuration files with SuSEconfig. The offered configuration options should be sufficient for most scenarios. The file provides explanatory comments for all variables.

Custom Configuration Files

Instead of performing changes directly in the configuration file /etc/httpd/httpd.conf, you can designate your own configuration file (such as httpd.conf.local) with the help of the variable APACHE_CONF_INCLUDE_FILES. Consequently, the file will be interpreted by the main configuration file. In this way, changes to the configuration will be retained even if the file /etc/httpd/httpd.conf is overwritten during a new installation.


Modules

Modules installed with YaST can be activated by setting the respective variable in /etc/sysconfig/apache to "yes" (Apache 1.3) or by including the name of the module in the list specified for the variable APACHE_MODULES (Apache 2). This variable is located in the file /etc/sysconfig/apache2.


Flags

APACHE_SERVER_FLAGS can be used to specify flags that activate or deactivate certain sections of the configuration file. Thus, if a section in the configuration file is enclosed in

<IfDefine someflag>
.
.
.
</IfDefine>

it will only be activated if the respective flag is set in APACHE_SERVER_FLAGS:

APACHE_SERVER_FLAGS = ... someflag ...

In this way, extensive sections of the configuration file can easily be activated or deactivated for test purposes.

Manual Configuration

The Configuration File

The configuration file /etc/httpd/httpd.conf (or /etc/apache2/httpd.conf) enables changes that are not possible by editing /etc/sysconfig/apache or /etc/sysconfig/apache2. The following sections describe some of the parameters that can be set in this file. The parameters are listed in the order in which they appear in this file.


DocumentRoot

One basic setting is the DocumentRoot, the directory under which Apache looks for web pages to be delivered by the server. For the default virtual host, it is set to /srv/www/htdocs. Normally, this setting does not need to be changed.

Timeout

Specifies the waiting period after which the server reports a time-out for a request.

MaxClients

The maximum number of clients Apache can handle concurrently. The default setting is 150, but this value may be too small for a heavily frequented web site. In Apache 1, this value is modified by SuSEconfig depending on the setting of the variable HTTPD_PERFORMANCE.

LoadModule

The LoadModule directives specify the modules to load. In Apache 1.3, the modules are loaded in the order specified in the LoadModule directives. In Apache 2, the loading sequence is determined by the modules themselves (see Differences between Apache 1.3 and Apache 2). These directives also specify the file containing the module.

Port

Specifies the port on which Apache waits for queries. Usually, this is port 80, the default port for HTTP. Normally, this setting should not be changed.

One reason for letting Apache listen to another port may be the test of a new version of a web site. In this way, the operational version of the web site continues to be accessible via default port 80.

Another reason may be that you merely want to make pages available on the intranet, as they contain information that is not intended for the public. For this purpose, you can set the port to a value like 8080 and block external access to this port by means of the firewall. In this way, the server can be protected against external access.


Directory

This directive can be used to set the access permissions and other permissions for a directory. A directive of this kind also exists for the DocumentRoot. The directory name specified here must be changed whenever the DocumentRoot is changed.


DirectoryIndex

Specifies the files for which Apache searches to complete a URL lacking a file specification. The default setting is index.html. For example, if the client requests the URL http://www.xyz.com/foo/bar and the directory foo/bar containing a file called index.html exists under the DocumentRoot, Apache returns this page to the client.
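The lookup described for DirectoryIndex can be sketched in Python as follows. The function find_index is hypothetical. It merely checks the listed names in order and returns the first file that exists, which is a simplified model of the behavior described above.

```python
import os

def find_index(directory, index_names=("index.html",)):
    """Return the first existing index file in directory, or None."""
    for name in index_names:
        candidate = os.path.join(directory, name)
        if os.path.isfile(candidate):
            return candidate
    return None

# Example: look for a default page in the directory foo/bar below
# the DocumentRoot used in SuSE Linux.
print(find_index("/srv/www/htdocs/foo/bar"))
```

If several names are passed in index_names, the order of the names decides which file is delivered when more than one of them exists.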


AllowOverride

Every directory from which Apache delivers documents can contain a file that can override the global access permissions and other settings for this directory. These settings apply recursively to the current directory and its subdirectories, until they are overridden by another such file in a subdirectory. Accordingly, settings specified in a file in the DocumentRoot are applied globally. Normally, these files are called .htaccess. However, this can be changed (see AccessFileName).

Use AllowOverride to determine if the settings specified in the local files are allowed to override the global settings. Possible values are None, All, and any combination of Options, FileInfo, AuthConfig, and Limit. The meanings of these values are described in detail in the Apache documentation. The (safe) default setting is None.

Order

This option influences the sequence for the application of the settings for the access permissions Allow and Deny. The default setting is:

Order allow,deny

Accordingly, the access permissions for allowed accesses are applied first, followed by the access permissions for denied accesses.

The underlying approach is based on one of the following:

  • "allow all" (allow every access) with exceptions
  • "deny all" (deny every access) with exceptions

Example for the latter:

Order deny,allow
Deny from all
Allow from example.com
Allow from 10.1.0.0/255.255.0.0
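The effect of the two orders can be modeled with a small Python sketch. It assumes the evaluation rules documented for the Apache access module: with deny,allow a matching Allow rule overrides a Deny rule and unmatched clients are admitted, whereas with allow,deny a matching Deny rule overrides and unmatched clients are rejected. The functions matches and is_allowed are hypothetical, and host names are compared as plain suffixes without any DNS lookup.

```python
import ipaddress

def matches(rule, client_ip, client_host):
    """Check whether one Allow or Deny rule applies to the client."""
    if rule == "all":
        return True
    try:
        network = ipaddress.ip_network(rule, strict=False)
    except ValueError:
        # Not an address specification: treat the rule as a domain name.
        return client_host == rule or client_host.endswith("." + rule)
    return ipaddress.ip_address(client_ip) in network

def is_allowed(order, allow, deny, client_ip, client_host=""):
    """Evaluate the Allow and Deny rules in the given order."""
    allowed = any(matches(rule, client_ip, client_host) for rule in allow)
    denied = any(matches(rule, client_ip, client_host) for rule in deny)
    if order == "deny,allow":
        # Deny is evaluated first; a matching Allow rule overrides it.
        return allowed or not denied
    if order == "allow,deny":
        # Allow is evaluated first; a matching Deny rule overrides it.
        return allowed and not denied
    raise ValueError("unknown order: " + order)

# The rules from the example above.
allow_rules = ["example.com", "10.1.0.0/255.255.0.0"]
deny_rules = ["all"]
print(is_allowed("deny,allow", allow_rules, deny_rules, "10.1.2.3"))
```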


AccessFileName

Sets the name for the files that can override the global access permissions and other settings for directories delivered by Apache (see AllowOverride). The default setting is .htaccess.


ErrorLog

Specifies the name of the file in which Apache logs error messages. The default setting is /var/log/httpd/error_log. Error messages for virtual hosts (see 14) are also logged in this file, unless a special log file was specified in the VirtualHost section of the configuration file.

LogLevel

Error messages are classified in various severity levels. This setting specifies the severity level from which error messages are logged. The specification of a level causes error messages of this and higher severity levels to be logged. The default setting is warn.
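The threshold behavior can be expressed in a few lines of Python. The list contains the severity levels documented for the LogLevel directive, ordered from the most severe to the least severe. The function is_logged is a hypothetical helper and ignores any special cases of the real server.

```python
# Severity levels of the LogLevel directive, most severe first.
LEVELS = ["emerg", "alert", "crit", "error", "warn", "notice", "info", "debug"]

def is_logged(message_level, configured_level="warn"):
    """Return True if a message of message_level passes the threshold."""
    return LEVELS.index(message_level) <= LEVELS.index(configured_level)

for level in LEVELS:
    print(level, is_logged(level))
```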

Alias

Using an alias, you can specify a shortcut for a directory that enables direct access to this directory. For example, the alias /manual/ enables access to the directory /srv/www/htdocs/manual even if the DocumentRoot is set to a directory other than /srv/www/htdocs (it does not make any difference if the DocumentRoot has this value). With this alias, http://localhost/manual enables direct access to the respective directory.

You may need to specify a Directory directive (see Directory) determining the permissions for the target directory specified in an Alias instruction.

ScriptAlias

This directive is similar to Alias. In addition, it indicates that the files in the target directory should be treated as CGI scripts.


Server-Side Includes

Server-side includes can be activated by searching all executable files for SSIs. This can be done with the following instruction:

<IfModule mod_include.c>
XBitHack on
</IfModule>

To search a file for SSIs, use the following command to make the file executable:

chmod +x <filename>

Alternatively, explicitly specify the file type to search for SSIs. This can be done with the following instruction:

AddType text/html .shtml
AddHandler server-parsed .shtml

It is not advisable to simply state .html, as this will cause Apache to search all pages for SSIs (even those that definitely do not contain any), which greatly impedes the performance.

In SuSE Linux, these two directives are already included in the configuration files, so normally no changes are necessary.


UserDir

With the help of the module mod_userdir and the directive UserDir, you can specify a directory in the home directory of the user in which the user can publish his files by way of Apache. This can be configured in SuSEconfig by means of the variable HTTPD_SEC_PUBLIC_HTML. To enable the publishing of files, this variable must be set to yes. This results in the following entry in the file /etc/httpd/suse_public_html.conf (which is interpreted by /etc/httpd/httpd.conf).

<IfModule mod_userdir.c>
    UserDir public_html
</IfModule>

Using Apache

Where Can I Place My Pages and Scripts?

To display your (static) web pages with Apache, simply place your files in the correct directory. In SuSE Linux, the correct directory is /srv/www/htdocs. A few small example pages may already be installed there. By means of these pages, check if Apache was installed correctly and is currently active. Subsequently, you can simply overwrite or uninstall these pages. Custom CGI scripts are installed in /srv/www/cgi-bin.

Apache Operating Status

During operation, Apache writes log messages to the file /var/log/httpd/access_log or /var/log/apache2/access_log. These messages show which resources were requested and delivered at what time and with which method (GET, POST...). Error messages are logged to the file /var/log/httpd/error_log (or to /var/log/apache2 in Apache 2).
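Entries in the access log can be evaluated with a short script. The following Python sketch assumes that the log uses the Common Log Format; the actual layout depends on the log format configured for the server. The sample line and the function parse_line are made up for illustration.

```python
import re

# Pattern for one line in Common Log Format (an assumption, see above).
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) \S+ (?P<user>\S+) \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<resource>\S+) (?P<protocol>[^"]+)" '
    r'(?P<status>\d{3}) (?P<size>\S+)'
)

def parse_line(line):
    """Return the fields of a log line as a dictionary, or None."""
    match = LOG_PATTERN.match(line)
    return match.groupdict() if match else None

# Hypothetical log entry.
sample = '192.168.0.10 - - [12/Mar/2004:10:15:32 +0100] "GET /index.html HTTP/1.1" 200 1043'
entry = parse_line(sample)
print(entry["time"], entry["method"], entry["resource"], entry["status"])
```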

Active Contents

Overview

Apache provides several possibilities for delivering active contents to clients. Active contents are HTML pages that are generated on the basis of variable input data of the client. An example is a search engine that responds to the input of one or several search strings (possibly interlinked with logical operators like "and" or "or") by returning a list of pages containing these search strings.

Apache offers three ways of generating active contents:

  • Server Side Includes (SSI). These are directives that are embedded in an HTML page by means of special comments. Apache interprets the content of the comments and delivers the result as part of the HTML page.

  • Common Gateway Interface (CGI). These are programs that are located in certain directories. Apache forwards the parameters transmitted by the client to these programs and returns the output of the programs. This kind of programming is quite easy, especially since existing command-line programs can be designed in such a way that they accept input from Apache and return their output to Apache.

  • Modules. Apache offers interfaces for executing any modules within the scope of the request processing. Apache gives these programs access to important information such as the request or the HTTP headers. Thus, programs can be integrated for the generation of active contents as well as for other functions (such as authentication).

    The programming of such modules requires some expertise. The advantages of this approach are high performance and possibilities by far exceeding those of SSI and CGI.

Script Interpreter as Module versus CGI

Normally, CGI scripts are executed directly by Apache (similar to a command on the command line). In contrast, modules are controlled by a persistent interpreter that is embedded in Apache.

In this way, separate processes do not have to be started and terminated for every request (this would result in a considerable overhead for the process management, memory management, and so on). Rather, the script is handled by the running interpreter.

However, this approach has a catch. Compared to modules, CGI scripts are relatively tolerant towards careless programming. With CGI scripts, errors such as a failure to release resources and memory do not have a lasting effect, since the programs are terminated after the request has been processed. This results in the clearance of memory that was not released by the program due to a programming error.

With modules, the effects of programming errors accumulate, as the interpreter is persistent. If the server is not restarted and the interpreter runs for several months, the failure to release resources, such as database connections, can be quite disturbing.


SSI

Server-side includes are directives that are embedded in special comments and executed by Apache. The result is embedded in the output. For example, the current date can be printed with

<!--#echo var="DATE_LOCAL" -->

The # character directly after the opening comment mark <!-- shows Apache that this is an SSI directive and not a simple comment.
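The principle of replacing a directive by its result can be imitated in Python. The following sketch only handles the echo directive shown above and is not the implementation used by Apache. The function expand_echo and the placeholder for unknown variables are assumptions made for this example.

```python
import re
from datetime import datetime

# Matches directives such as <!--#echo var="DATE_LOCAL" -->
SSI_ECHO = re.compile(r'<!--#echo\s+var="([^"]+)"\s*-->')

def expand_echo(page, variables):
    """Replace every echo directive by the value of the named variable."""
    def substitute(match):
        # Placeholder for unknown variables, chosen for this sketch.
        return variables.get(match.group(1), "(none)")
    return SSI_ECHO.sub(substitute, page)

page = '<p>Today is <!--#echo var="DATE_LOCAL" -->.</p>'
print(expand_echo(page, {"DATE_LOCAL": datetime.now().strftime("%A, %d %B %Y")}))
```

An ordinary comment without the # character does not match the pattern and remains unchanged in the output.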

SSIs can be activated in several ways. The easiest approach is to search all executable files for SSIs. Another approach is to specify certain file types to be searched for SSIs. Both settings are explained in the section Server-Side Includes.


CGI

What Is CGI?

CGI is the abbreviation for "Common Gateway Interface". With CGI, the server does not simply deliver a static HTML page, but executes a program that generates the page. This enables the generation of pages representing the result of a calculation, such as the result of the search in a database. By means of arguments passed to the executed program, the program can return an individual response page for every request.

Advantages of CGI

The main advantage of CGI is that this technology is quite simple. The program merely has to exist in a specific directory and is executed by the web server just like a command-line program. The server sends the program output on the standard output channel (stdout) to the client.
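A minimal CGI program could look like the following Python sketch. It writes a header line, an empty line, and the HTML page to the standard output. The function build_response and the page content are hypothetical; the script would have to be made executable and placed in the CGI directory of the server.

```python
#!/usr/bin/env python3
import sys

def build_response(title, body):
    """Return a complete CGI response: header, empty line, HTML page."""
    html = (
        "<html><head><title>" + title + "</title></head>"
        "<body><p>" + body + "</p></body></html>"
    )
    # The empty line separates the header from the page content.
    return "Content-Type: text/html\n\n" + html

if __name__ == "__main__":
    sys.stdout.write(build_response("CGI test", "Hello from a CGI program"))
```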

GET and POST

Input parameters can be passed to the server with GET or POST. Depending on which method is used, the server passes the parameters to the script in various ways. With POST, the server passes the parameters to the program on the standard input channel (stdin). (The program would receive its input in the same way when started from a console.)

With GET, the server uses the environment variable QUERY_STRING to pass the parameters to the program. An environment variable is a variable globally made available by the system (such as the variable PATH, which contains a list of paths the system searches for executable commands when the user enters a command).
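The two ways of passing parameters can be handled by a CGI program as in the following Python sketch. Apart from QUERY_STRING, it relies on the environment variables REQUEST_METHOD and CONTENT_LENGTH, which are part of the CGI interface but are not discussed in this chapter. The function read_parameters is hypothetical and assumes URL-encoded form data.

```python
import io
from urllib.parse import parse_qs

def read_parameters(environ, stdin):
    """Return the request parameters as a dictionary of value lists."""
    if environ.get("REQUEST_METHOD", "GET") == "POST":
        # POST: the parameters arrive on the standard input channel.
        length = int(environ.get("CONTENT_LENGTH") or 0)
        raw = stdin.read(length)
    else:
        # GET: the parameters are stored in an environment variable.
        raw = environ.get("QUERY_STRING", "")
    return parse_qs(raw)

# Simulated requests; a real CGI program would pass os.environ and sys.stdin.
get_env = {"REQUEST_METHOD": "GET", "QUERY_STRING": "q=apache&lang=en"}
post_env = {"REQUEST_METHOD": "POST", "CONTENT_LENGTH": "7"}
print(read_parameters(get_env, io.StringIO("")))
print(read_parameters(post_env, io.StringIO("q=linux")))
```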

Languages for CGI

Theoretically, CGI programs can be written in any programming language. Usually, scripting languages (interpreted languages), such as Perl or PHP, are used for this purpose. If speed is critical, C or C++ may be more suitable.

Where Are the Scripts Placed?

In the simplest case, Apache looks for these programs in a specific directory (cgi-bin). This directory can be set in the configuration file (see ScriptAlias). If necessary, additional directories can be specified. In this case, Apache will search these directories for executable programs. However, this represents a security risk, as any user will be able to let Apache execute programs (some of which may be malicious). If executable programs are restricted to cgi-bin, the administrator can easily see who places which scripts and programs in this directory and check them for any malicious intent.


Generating Active Contents with Modules

Modules for Scripting Languages

A variety of modules is available for use with Apache.


Note: Modules

The term "module" is used in two different senses.

First, there are modules that can be integrated in Apache for the purpose of handling specific functions, such as modules for embedding programming languages. These modules are introduced below.

Second, in connection with programming languages, modules refer to an independent group of functions, classes, and variables. These modules are integrated in a program to provide a certain functionality, such as the CGI modules available for all scripting languages. These modules facilitate the programming of CGI applications by providing various functions, such as methods for reading the request parameters and for the HTML output.

The following modules are available as packages in SuSE Linux.


mod_perl

General Information about Perl

Perl is a popular, proven scripting language. There are numerous modules and libraries for Perl, including a library for expanding the Apache configuration file. The home page for Perl is http://www.perl.com/. A range of libraries for Perl is available in the Comprehensive Perl Archive Network (CPAN) at http://www.cpan.org/.

Setting up mod_perl

To set up mod_perl in SuSE Linux, simply install the respective package (see Modules for Active Contents). Following the installation, the needed entries will exist in the Apache configuration file (see /usr/include/apache/modules/perl/startup.perl for Apache 1 or /etc/apache2/mod_perl-startup.pl for Apache 2). Information about mod_perl is available at http://perl.apache.org/.

mod_perl versus CGI

In the simplest case, you can run a previous CGI script as a mod_perl script by requesting it with a different URL. The configuration file contains aliases that point to the same directory and execute any scripts it contains either via CGI or via mod_perl. All these entries already exist in the configuration file.

The alias entry for CGI is as follows:

ScriptAlias /cgi-bin/ "/srv/www/cgi-bin/"

The entries for mod_perl are as follows:

<IfModule mod_perl.c>
    # Provide two aliases to the same cgi-bin directory,
    # to see the effects of the 2 different mod_perl modes.
    # for Apache::Registry Mode
    ScriptAlias /perl/          "/srv/www/cgi-bin/"
    # for Apache::PerlRun Mode
    ScriptAlias /cgi-perl/      "/srv/www/cgi-bin/"
</IfModule>

The following entries are also needed for mod_perl. These entries already exist in the configuration file.

#
# If mod_perl is activated, load configuration information
#
<IfModule mod_perl.c>
Perlrequire /usr/include/apache/modules/perl/startup.perl
PerlModule Apache::Registry

#
# set Apache::Registry Mode for /perl Alias
#
<Location /perl>
SetHandler  perl-script
PerlHandler Apache::Registry
Options ExecCGI
PerlSendHeader On
</Location>

#
# set Apache::PerlRun Mode for /cgi-perl Alias
#
<Location /cgi-perl>
SetHandler  perl-script
PerlHandler Apache::PerlRun
Options ExecCGI
PerlSendHeader On
</Location>

</IfModule>

These entries create aliases for the Apache::Registry and Apache::PerlRun modes. The difference between these two modes is as follows:

  • With Apache::Registry, all scripts are compiled and kept in a cache. Each script is wrapped as the body of a subroutine.

    Although this is good for the performance, there is a disadvantage: the scripts must be programmed extremely carefully, as the variables and subroutines persist between the requests.

    This means that you have to reset the variables to enable their use for the next request. If, for example, the credit card number of a customer is stored in a variable in an online banking script, this number could appear again when the next customer uses the application and requests the same script.

  • Apache::PerlRun is more like CGI. The scripts are recompiled for every request. Thus, variables and subroutines disappear from the namespace between the requests. (The namespace is the entirety of all variable names and routine names that are defined at a given time during the existence of a script.)

    Therefore, Apache::PerlRun does not necessitate painstaking programming, as all variables are reinitialized when the script is started and no values are kept from previous requests.

    For this reason, Apache::PerlRun is slower than Apache::Registry but is still a lot faster than CGI, as no separate process is started for the interpreter.
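The difference between the two modes can be sketched outside Perl. The following is only a Python analogy, not mod_perl code: run_cached and run_fresh are hypothetical names, and the "script" is a string that remembers a value in a global variable. It shows how a reused namespace lets a value from one request leak into the next, while a fresh namespace per request does not.

```python
# Python analogy only; none of these names exist in mod_perl.
SCRIPT = """
if 'card_number' not in globals():
    card_number = None            # initialized only on the very first run
if request.get('card'):
    card_number = request['card']
result = card_number
"""

# Compiled once and kept, as a script cache would do.
COMPILED = compile(SCRIPT, "<script>", "exec")


def run_cached(namespace, request):
    """Registry-like: the same namespace is reused, so globals persist."""
    namespace["request"] = request
    exec(COMPILED, namespace)
    return namespace["result"]


def run_fresh(request):
    """PerlRun-like: recompile and start from an empty namespace each time."""
    namespace = {"request": request}
    exec(compile(SCRIPT, "<script>", "exec"), namespace)
    return namespace["result"]


if __name__ == "__main__":
    shared = {}
    run_cached(shared, {"card": "1234"})   # first customer
    print(run_cached(shared, {}))          # second customer sees the old value
    run_fresh({"card": "1234"})
    print(run_fresh({}))                   # nothing carried over
```

In the cached variant, the script itself would have to reset card_number at the start of every run to be safe, which is the careful programming the text refers to.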


mod_php4

PHP is a programming language that was especially developed for use with web servers. In contrast to other languages whose commands are stored in separate files (scripts), the PHP commands are embedded in an HTML page (similar to SSI). The PHP interpreter processes the PHP commands and embeds the processing result in the HTML page.

The home page for PHP is http://www.php.net/.

Packages: The package mod_php4-core must be installed. Additionally, the package mod_php4 is required for Apache 1 and the package apache2-mod_php4 for Apache 2.


mod_python

Python is an object-oriented programming language with a very clear and legible syntax. An unusual but convenient feature is that the program structure depends on the indentation. Blocks are not defined with braces (as in C and Perl) or other demarcation elements (such as begin and end), but by their level of indentation.
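A short illustration of block structure by indentation follows. The function name count_even is made up for this example; the point is only that the loop body and the if body are delimited by their indentation level, not by braces.

```python
def count_even(numbers):
    total = 0
    for n in numbers:        # the loop body is everything indented below
        if n % 2 == 0:       # the if body is indented one level further
            total += 1
    return total             # back at function level: runs after the loop


if __name__ == "__main__":
    print(count_even([1, 2, 3, 4]))
```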

More information about this language is available at http://www.python.org/. For more information about mod_python, visit the URL http://www.modpython.org/.

The package to install is mod_python for Apache 1 or apache2-mod_python for Apache 2.


mod_ruby

Ruby

Ruby is a relatively new, object-oriented high-level programming language that resembles Perl and Python in certain aspects and is ideal for scripts. Like Python, it has a clean, transparent syntax. On the other hand, Ruby has adopted Perl-style abbreviations such as $. for the number of the last line read from the input file, a feature that is welcomed by some programmers and abhorred by others. The basic concept of Ruby closely resembles Smalltalk.

The home page of Ruby is http://www.ruby-lang.org/.

An Apache module is available for Ruby. The home page is http://www.modruby.net/.


Virtual Hosts


Overview: Virtual Hosts

Using virtual hosts, you can host several domains with a single web server. In this way, you can save the costs and administration workload for separate servers for each domain. Being one of the first web servers that offered this feature, Apache offers several possibilities for virtual hosts:

  • Name-based virtual hosts

  • IP-based virtual hosts

  • Operation of multiple instances of Apache on one machine

All three alternatives are introduced in the following paragraphs.

Name-Based Virtual Hosts

With name-based virtual hosts, one instance of Apache hosts several domains. You do not need to set up multiple IPs for a machine. This is the easiest, preferred alternative. Reasons against the use of name-based virtual hosts are covered in the Apache documentation.

The configuration takes place directly by way of the configuration file (/etc/httpd/httpd.conf). To activate name-based virtual hosts, a suitable directive must be specified:

NameVirtualHost *

The specification of * is sufficient to prompt Apache to accept all incoming requests.

Subsequently, the individual hosts must be configured:

<VirtualHost *>
    ServerName www.mycompany.com
    DocumentRoot /srv/www/htdocs/mycompany.com
    ServerAdmin webmaster@mycompany.com
    ErrorLog /var/log/httpd/www.mycompany.com-error_log
    CustomLog /var/log/httpd/www.mycompany.com-access_log common
</VirtualHost>

<VirtualHost *>
    ServerName www.myothercompany.com
    DocumentRoot /srv/www/htdocs/myothercompany.com
    ServerAdmin webmaster@myothercompany.com
    ErrorLog /var/log/httpd/www.myothercompany.com-error_log
    CustomLog /var/log/httpd/www.myothercompany.com-access_log common
</VirtualHost>

For Apache 2, change the path to the log files in this and the following examples from /var/log/httpd to /var/log/apache2.

A VirtualHost entry also must be configured for the domain the server originally hosted (www.mycompany.com). In this example, the original domain and one additional domain (www.myothercompany.com) are hosted on the same server.

Just as in NameVirtualHost, a * is also specified in the VirtualHost directives. Apache uses the Host field in the HTTP request header to match the request to a virtual host. The request is forwarded to the virtual host whose ServerName matches the host name specified in this field.

For the directives ErrorLog and CustomLog, the log files do not have to contain the domain name. Here, you can use a name of your choice.

ServerAdmin designates the e-mail address of the responsible person who can be contacted if problems arise. In the event of errors, Apache will state this address in the error messages it sends to the client.
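The matching of the Host field against ServerName can be sketched as follows. This is a simplified model, not Apache's implementation: select_vhost and the VHOSTS table are hypothetical, other matching directives such as ServerAlias are ignored, and the fallback to the first listed virtual host follows the behavior described in the Apache documentation for requests that match no ServerName.

```python
# Simplified model of name-based virtual host selection (illustrative only).
VHOSTS = [
    {"ServerName": "www.mycompany.com",
     "DocumentRoot": "/srv/www/htdocs/mycompany.com"},
    {"ServerName": "www.myothercompany.com",
     "DocumentRoot": "/srv/www/htdocs/myothercompany.com"},
]


def select_vhost(host_header, vhosts=VHOSTS):
    """Return the vhost whose ServerName equals the Host header value."""
    # Drop an optional ":port" suffix and compare case-insensitively.
    name = host_header.split(":")[0].strip().lower()
    for vhost in vhosts:
        if vhost["ServerName"].lower() == name:
            return vhost
    # No match: fall back to the first listed virtual host.
    return vhosts[0]


if __name__ == "__main__":
    print(select_vhost("www.myothercompany.com:80")["DocumentRoot"])
```

Because the first block acts as the default in this model, the order of the VirtualHost blocks matters for requests with an unknown host name.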

IP-Based Virtual Hosts

Overview

This alternative requires the setup of multiple IPs for a machine. In this case, one instance of Apache hosts several domains, each of them assigned a different IP. The following example shows how Apache can be configured to host the original IP (192.168.1.10) plus two additional domains on additional IPs (192.168.1.20 and 192.168.1.21).

Of course, this particular example will only work on an intranet, as IPs ranging from 192.168.0.0 to 192.168.255.255 are not routed on the Internet.
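Whether an address falls into this private block (192.168.0.0/16, reserved by RFC 1918) can be checked with Python's standard ipaddress module. The helper name in_private_192 is made up for this sketch.

```python
import ipaddress

# The private block used in the examples (RFC 1918).
PRIVATE_192 = ipaddress.ip_network("192.168.0.0/16")


def in_private_192(address):
    """Return True if the address lies inside 192.168.0.0/16."""
    return ipaddress.ip_address(address) in PRIVATE_192


if __name__ == "__main__":
    for ip in ("192.168.1.10", "192.168.1.20", "192.168.1.21"):
        print(ip, in_private_192(ip))
```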

Configuring IP Aliasing

For Apache to host multiple IPs, the underlying machine must accept requests for multiple IPs. This is called multi-IP hosting. For this purpose, IP aliasing must be activated in the kernel. This is the default setting in SuSE Linux.

Once the kernel has been configured for IP aliasing, the commands ifconfig and route can be used to set up additional IPs on the host. These commands must be executed as root. For the following example, we assume that the host already has its own IP (such as 192.168.1.10), which is assigned to the network device eth0.

Enter the command ifconfig to find out the IP of the host. Further IPs can be added with commands such as the following:

/sbin/ifconfig eth0:0 192.168.1.20
/sbin/ifconfig eth0:1 192.168.1.21

All these IPs will be assigned to the same physical network device (eth0).

Virtual Hosts with IPs

Once IP aliasing has been set up on the system or the host has been configured with several network cards, Apache can be configured. Specify a separate VirtualHost block for every virtual server:

<VirtualHost 192.168.1.20>
    ServerName www.myothercompany.com
    DocumentRoot /srv/www/htdocs/myothercompany.com
    ServerAdmin webmaster@myothercompany.com
    ErrorLog /var/log/httpd/www.myothercompany.com-error_log
    CustomLog /var/log/httpd/www.myothercompany.com-access_log common
</VirtualHost>

<VirtualHost 192.168.1.21>
    ServerName www.anothercompany.com
    DocumentRoot /srv/www/htdocs/anothercompany.com
    ServerAdmin webmaster@anothercompany.com
    ErrorLog /var/log/httpd/www.anothercompany.com-error_log
    CustomLog /var/log/httpd/www.anothercompany.com-access_log common
</VirtualHost>

VirtualHost directives are only specified for the additional domains. The original domain (www.mycompany.com) is configured with the respective settings (DocumentRoot, etc.) outside the VirtualHost blocks.

Multiple Instances of Apache

With the virtual host methods described above, the administrators of one domain can read the data of the other domains. To segregate the individual domains, start several instances of Apache that use separate settings for User, Group, and other variables in the configuration file.

In the configuration file, use the Listen directive to specify the IP handled by the respective Apache instance. For the above example, the directive for the first Apache instance would be as follows:

Listen 192.168.1.10:80

For the other two instances:

Listen 192.168.1.20:80

and

Listen 192.168.1.21:80


Security

Minimizing the Risk

If you do not need a web server on a machine, you should deactivate Apache in the runlevel editor, uninstall it, or refrain from installing it in the first place. To minimize the risk, deactivate all servers you do not need.

This especially applies to hosts used as firewalls. If possible, do not run any servers on these hosts.

Access Permissions

DocumentRoot Should Belong to root

By default, the DocumentRoot directory (/srv/www/htdocs) and the CGI directory belong to the user root. You should not change this setting. If the directories are writable for all, any user can place files into these directories. These files will be executed by Apache as the user wwwrun. Apache should not have any write permissions for the data and scripts it delivers. Therefore, these should not belong to the user wwwrun, but to another user (such as root).

To enable users to place files in the document directory of Apache, you should not make it writable for all, but rather create a subdirectory that is writable for all (such as /srv/www/htdocs/miscellaneous).
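A quick way to audit these permissions is to inspect the mode bits and the owner of a directory. The following sketch uses only os.stat from the Python standard library; the function names are hypothetical, and the demonstration runs on a temporary directory rather than on /srv/www/htdocs.

```python
import os
import stat
import tempfile


def is_world_writable(path):
    """Return True if the 'others' write bit is set on path."""
    return bool(os.stat(path).st_mode & stat.S_IWOTH)


def owner_uid(path):
    """Return the numeric user ID that owns path (0 is root)."""
    return os.stat(path).st_uid


if __name__ == "__main__":
    # Demonstration on a throwaway directory.
    with tempfile.TemporaryDirectory() as demo:
        os.chmod(demo, 0o755)
        print(is_world_writable(demo))
        os.chmod(demo, 0o777)
        print(is_world_writable(demo))
```

On a real system, one would call is_world_writable("/srv/www/htdocs") and expect False, and compare owner_uid against 0 to confirm that the directory belongs to root.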

Publishing Documents from Home Directories

Another way to let users publish their files in the network is to specify a subdirectory of the user's home directory in the configuration file. Users can place their files for web presentation in this directory (e.g., ~/public_html). By default, this is activated in SuSE Linux. See 14 for details.

These web pages can be accessed by specifying the user in the URL. The URL contains the element ~username as a shortcut for the respective directory in the user's home directory. Example: Enter http://localhost/~tux in a browser to list the files in the directory public_html in the home directory of the user tux.
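The translation from a ~username URL to a directory can be pictured as follows. This is a sketch under assumptions, not Apache's code: userdir_path is a hypothetical helper, home directories are assumed to live under /home, and the subdirectory is assumed to be public_html. It performs no sanitizing of ".." components.

```python
import posixpath


def userdir_path(url_path, home_base="/home", subdir="public_html"):
    """Map '/~user/rest' to '<home_base>/user/<subdir>/rest', else None."""
    if not url_path.startswith("/~"):
        return None
    user, _, rest = url_path[2:].partition("/")
    if not user:
        return None
    # Strip leading slashes so 'rest' cannot restart the path at the root.
    return posixpath.join(home_base, user, subdir, rest.lstrip("/"))


if __name__ == "__main__":
    print(userdir_path("/~tux/index.html"))
```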

Stay Updated

If you operate a web server, and especially if this web server is publicly accessible, you should stay informed about bugs and potential vulnerabilities. Sources for exploits and fixes are listed in 14.


Troubleshooting

What if Apache does not display a page correctly or not at all?

  • First, take a look at the error log and check if the messages it contains reveal the error. The general error log is located in /var/log/httpd/error_log or /var/log/apache2/error_log.

    A proven approach is to track the log files in a console to see how the server reacts to an access. This can be done by entering the following in a root console:

    tail -f /var/log/apache2/*_log
    

    This can also be quite informative and helpful when starting the server.

  • Check the online bug database at http://bugs.apache.org/.

  • Read the relevant mailing lists and newsgroups. The mailing list for users is available at http://httpd.apache.org/userslist.html.

    Recommended newsgroups: comp.infosystems.www.servers.unix and related groups.

  • If none of these steps provides a solution and you are sure that you have detected a bug in Apache, report it at http://www.suse.de/feedback/.

Further Documentation

Apache

Apache is shipped with detailed documentation. The installation of this documentation is described in 14. Following the installation, you can access the documentation at http://localhost/manual.

The latest documentation is available at the Apache home page at http://httpd.apache.org.

CGI

More information about CGI is available at the following pages:


Security

The latest patches for the SuSE packages are made available at http://www.suse.com/us/security/. Visit this URL at regular intervals. Here, you can also sign up for the SuSE mailing list for security announcements.

The Apache team promotes an open information policy with regard to bugs in Apache. The latest bug reports and possible vulnerable spots are published at http://httpd.apache.org/security_report.html.

If you detect a security bug (check the pages above to make sure it has not already been discovered), report it to the following e-mail address:

security@suse.de

Other sources for information about security issues of Apache (and other Internet programs):

Additional Sources

If you experience difficulties, take a look at the SUSE Support Database at http://sdb.suse.de/en/. An online newspaper focusing on Apache is available at http://www.apacheweek.com/.

The history of Apache is recounted at http://httpd.apache.org/ABOUT_APACHE.html. This page also explains why the server is called ``Apache''.


root 2003-11-05