Installing Apache Tutorial

Configuring Apache


Editing a configuration file has a reputation for being scary, but with Apache it doesn't have to be. The configuration files live in the apache_1.3.9/conf/ directory, and although there are a bunch of files there, you only really need three of them.

First, there is httpd.conf, which contains directives relating to the operation of the server as a whole, such as server logs and server management.

Next on the list, there is srm.conf. This file contains the configurations for the management of resources in the filesystem, such as aliases, directory indexes, etc.

Lastly, there is access.conf. This file contains the access control settings for whatever directories you please. All the other files you can just leave be.

When you first install Apache, though, the files are not named exactly like this; they are named name.conf-dist, which marks each one as the distribution copy of the file. Since we like to have backup copies of anything and everything, leave the originals alone and make working copies with a command similar to this (on Unix):

cp httpd.conf-dist httpd.conf

Or, on Windows, just copy the file and rename the copy.

One more note before we begin: you must restart Apache before any changes to the configuration files take effect. The files are read only when the server starts, so changes made afterwards are not picked up until the next restart.

Know your Directives


The Apache documentation can be extremely helpful for users just starting to configure their server.

At http://www.apache.org/docs-1.2/mod/directives.html, you can find a listing of available directives (configurations) for your server, with explanations as well. It should be noted though that the directives listed are for version 1.2, not the new 1.3 version. Most of them should be the same, but some may be gone, and new ones may exist.

Doesn't sound like a tip? Well, it's hard to configure something exactly for your needs without knowing exactly what it can do. So, with that said, this is definitely an important tip.

There is also a book available containing the directives. The Apache Web Server Installation and Administration Guide, which details the process of installing and administering the Apache Web Server, is a handy desktop companion. It also includes a printed version of the installation and general administration portions of the Apache documentation.

Knowing your directives can also make your life more convenient. If you don't like where Apache has set your DocumentRoot (where you put your HTML and other files), you can easily change it in httpd.conf with this directive:

DocumentRoot /where/you/want/to/put/your/docs

Or if you want to set where e-mails are to be sent if there are server problems, just use this directive:

ServerAdmin you@your.address

Now that should be an excuse to get out and learn more... it will make your life easier.

Use Server Side Includes


When you don't just want to serve your users plain old HTML, but don't have the expertise yet to write up some CGI scripts (or are just lazy), try out Apache's Server Side Includes (SSI).

So what is SSI?

Basically, an SSI page is an HTML document with special commands embedded in it. If the file is named appropriately, Apache will parse the document when it is requested, carry out the commands, and send the resulting document. SSI is handled by the module named mod_include.

With SSI, you may do a variety of things, such as displaying some environment variables, displaying the date, or even including external files within your document.

Before you can use SSI, you must make sure the mod_include module was compiled with the rest of Apache. It is included by default, but you may have commented it out when editing Configuration.tmpl earlier. If that is the case, uncomment the line, and re-compile Apache. Once that is done, open up your httpd.conf file.

To use server-parsed HTML files, look for the section found below:

AddType text/html .shtml
AddHandler server-parsed .shtml

Make sure you get rid of the comments in front of the lines (the #'s), as they are commented out by default. Notice how they use .shtml? You can change this to any extension you want; you just have to remember to name your SSI files with that extension later.
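
One thing the two lines above do not cover: mod_include only parses files in directories where the Includes option is switched on. Here is a minimal sketch of the whole setup, assuming your documents live in /usr/local/apache/htdocs (substitute your own path) and that your existing Options line looks like the one shown:

```apache
# Assumed document root; change the path to match your server.
<Directory "/usr/local/apache/htdocs">
# Includes switches on SSI parsing for this directory tree.
# IncludesNOEXEC is the stricter choice: SSI works, but the exec command is disabled.
Options Indexes FollowSymLinks Includes
</Directory>

# Treat .shtml files as HTML that the server parses before sending.
AddType text/html .shtml
AddHandler server-parsed .shtml
```

If your Directory section already carries an Options line, just add Includes to it rather than writing a second section.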

After you change these directives in the configuration files, restart Apache, and let's make a test SSI file. Since we are using .shtml as our SSI extension, we will name our file test.shtml:

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2 Final//EN">
<HTML>
<HEAD><TITLE> Test Server Side Includes </TITLE></HEAD>
<BODY BGCOLOR="#FFFFFF">
Current Date: <!--#echo var="DATE_LOCAL" -->
<P>Document Name: <!--#echo var="DOCUMENT_NAME" -->
<P>Environmental Variables:<BR>
<PRE><!--#printenv --></PRE>
</BODY>
</HTML>

Save the file in your document directory and request it in your browser. If it looks right, and all your blanks are filled in, SSI is properly enabled on your server. Congratulations!

For more information on SSI, check out TheScripts.com's SSI section at http://ssi.thescripts.com/.

Make those CGI's Work


Once you have graduated from SSI, it's time to set up your server to use CGIs. CGI stands for Common Gateway Interface, but for some reason people always confuse it with Perl, the most popular language used for CGI. In fact, a CGI program can be written in just about any language, such as C/C++, Java, TCL, or Perl, plus many others.

The module controlling CGI in Apache is called mod_cgi and is compiled by default. If you removed it from Configuration.tmpl though, you must add it back in, and re-compile Apache before you can proceed.

The first method of enabling CGI is to set aside a particular directory for scripts with the ScriptAlias directive. A ScriptAlias is just like a normal Alias, except that the files in the directory are treated as applications to be run, not as documents to be sent, when requested by the client.

A typical ScriptAlias setup looks like so:

ScriptAlias /cgi-bin/ "@@ServerRoot@@/cgi-bin/"

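This makes /cgi-bin/ an alias to the cgi-bin/ directory under your server root. @@ServerRoot@@ is a placeholder in the distribution file that stands for the ServerRoot set near the top of the configuration file; it is normally filled in with the real path when Apache is installed. In a working configuration you write the path out yourself, such as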

ScriptAlias /cgi-bin/ "/usr/local/apache/cgi-bin/"

or wherever you desire to put the cgi-bin directory. You would follow the above configuration with the following:

<Directory "/usr/local/apache/cgi-bin/">
Options ExecCGI
AddHandler cgi-script .cgi .pl
</Directory>

The AddHandler part allows you to have CGI scripts with .cgi and .pl extensions in this case.

Here is the other method, which allows you to use CGI outside of ScriptAliased directories. You would use the following configuration:

AddHandler cgi-script .cgi .pl

Notice it isn't found inside a Directory container? This way, the AddHandler directive applies to all of the documents, not just the documents found within a certain directory. Here, we allow .cgi and .pl extensions once again for our CGI scripts, but you can use whatever extensions you like. Keep in mind that a script will only run in a directory where the ExecCGI option is enabled.
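
As a rough sketch of this second method, the server-wide AddHandler line is usually paired with a Directory section that permits script execution. The directory name below is made up for the example; use whichever directory should be allowed to run scripts:

```apache
# Server-wide: files ending in .cgi or .pl are handled as CGI scripts.
AddHandler cgi-script .cgi .pl

# Hypothetical directory in which scripts are allowed to execute.
<Directory "/usr/local/apache/htdocs/apps">
# The leading + adds ExecCGI to whatever options are already in effect.
Options +ExecCGI
</Directory>
```

Limiting ExecCGI to the directories that need it keeps a stray .pl file elsewhere on the site from being executed.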

CGI applications, when executed, run as the same user that the Apache server process runs as. By default, that user is called 'nobody', as specified by the User and Group directives. These directives only work if the Apache server is started by the user root.

User nobody
Group #-1

In most cases, you won't have to change a thing here. This is not always suitable though: if you have multiple users on your system, you might want CGIs to run under the name of the user who owns them. This is excellent for virtual hosting systems, as you could have each CGI run under the customer's name and access that customer's files.

To solve this problem, the Apache Group created the suEXEC program, which is included with the Apache server. To learn more about how to use it for your system, check out http://www.apache.org/docs-1.2/suexec.html.
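
To give a feel for what this looks like once suEXEC is built and installed, here is a sketch of a virtual host whose CGIs run as the customer. The address, host name, path, and account name are all invented for the example, and suEXEC has its own requirements (covered in the document linked above) that must be met before this works:

```apache
# Hypothetical virtual host for one customer.
<VirtualHost 192.0.2.10>
ServerName www.customer.example
DocumentRoot /home/customer/public_html
# Inside a VirtualHost, User and Group name the account that
# this host's CGI programs run as (this relies on the suEXEC wrapper).
User customer
Group customer
</VirtualHost>
```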

Error Documents


Ever notice how some sites have their own error documents? For instance, try going to http://www.thescripts.com/nopage.html. You get a 404 error, but not just any 404 error: we have a customized 404 error page. You can do this for any other error as well (we also have a 403 error page). FYI, a 404 error means the page was not found, and a 403 means you do not have permission to access a particular document.

So how do we do it? It's actually quite simple. Instead of using Apache's default/kinda ugly error pages, you can specify certain files to be returned for certain errors. The directive being used here is called ErrorDocument, and below we find an example:

ErrorDocument 404 /fourohfour.html

This is, as you might have figured out, the 404 error page. It is located in this case in the htdocs directory, and called fourohfour.html. If you accessed it from the browser, it would be at http://www.yoursite.com/fourohfour.html

To add another type of error document, for instance, 403, you would just do the following:

ErrorDocument 403 /fourohthree.html

So as you can see, the first argument in the directive ErrorDocument is the error number. The second is the location of the page.
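
The location does not have to be a static page. As a sketch, here are three forms the second argument can take; the script name is made up for the example:

```apache
# A static page, given as a path relative to the document root.
ErrorDocument 404 /fourohfour.html

# A CGI script (hypothetical name) that builds the error page on the fly.
ErrorDocument 500 /cgi-bin/error500.cgi

# A full URL. In this case the server sends the browser a redirect
# to that address instead of serving the page itself.
ErrorDocument 403 http://www.yoursite.com/fourohthree.html
```

The local forms are usually the better choice, since with a full URL the browser sees a redirect rather than the original error status.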

Just a note: it is best to use full paths for images and links in an error document, because relative ones will break. For instance, if you went to http://www.thescripts.com/nothere/nothere.html, the fourohfour.html page would be served as if it were located in the nothere directory, even though it really lives in the root directory of your HTML documents. For more information on the ErrorDocument directive, visit http://www.apache.org/docs/mod/core.html.

Content Negotiation


I know what you are thinking: what is content negotiation? Well, first of all, it's an often overlooked feature of Apache. It is more accurately known as content selection: out of several available versions of a document, the server selects the one that best matches the capabilities and preferences reported by the client's browser.

Why do you need it?

Well, you don't... but if you are aiming your Website at a multi-lingual region, having it would be greatly to your benefit. Not only will your documents be served to visitors in the right language, but it will also improve their stay at your site.

By default, content negotiation is compiled in with your server, as it is powered by mod_negotiation. If for some reason you didn't compile this feature in with Apache, you must go back to your Configuration.tmpl file and put it back in.

Ok, what do you need now?

Well, you have to choose what languages you are going to be using, and have content available in each of them. There are actually two methods we can use: a variants file, which is discussed at http://www.apacheweek.com/features/negotiation, or file extensions.

For this example, we will be using three different languages, English, French and German.

Now open up the access.conf file and let's get started. Firstly, we will need to specify a directory, which we will call international, and have it so that it is set for content negotiation. The international directory will be located just off of the root document directory, which, in our case, is /usr/local/apache/htdocs/.

<Directory /usr/local/apache/htdocs/international>
Options MultiViews
</Directory>

Options MultiViews tells the server to do filename pattern matching in this directory and choose from among the matching files. Now, open up httpd.conf so we can add the languages to look for, and how the server will identify them. We will be using the directive AddLanguage:

AddLanguage en .en
AddLanguage fr .fr
AddLanguage de .de

Here we have added English, French, and German, all identified with the extension found to their right.

By default, quite a few languages are already set. You can comment out the ones you don't want and add others as needed.

The LanguagePriority directive allows you to give precedence to some languages in case of a tie during content negotiation. The languages are listed in decreasing order of preference.

LanguagePriority en fr de

So how do you name your files now? The best method is to do something like test.html.en for an English document, test.html.fr for a French document, and test.html.de for a German document. When you would like to link to this document, you just use test.html.

For more information on Content Negotiation, see http://www.apache.org/docs/content-negotiation.html.

A Single Config File


Geez, why would this help?

Well, it would put all your options into one single file, which makes life easy when you want to edit anything. In this case, we will use httpd.conf to store all of our directives. For some commentary on the three configuration files, see http://www.apache.org/info/three-config-files.html.

Anyways, back to the single config file issue. Right now you are using three configuration files: httpd.conf, access.conf, and srm.conf. You don't need to. Simply move the important stuff from the latter two files into httpd.conf, and then also add these two lines:

AccessConfig /dev/null
ResourceConfig /dev/null

This will inform Apache that httpd.conf is the only configuration file. With this in place, you can just simply remove the srm.conf and access.conf files, and proceed with a smile on your face.
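
As a rough sketch of what the merged file might look like, here is a skeleton with each group of directives labelled by where it came from. The directives and paths shown are only examples based on the ones used elsewhere in this tutorial; your own files will contain more:

```apache
# httpd.conf, single-file layout (illustrative skeleton only)

# Server-wide settings that were already in httpd.conf
ServerRoot /usr/local/apache
ServerAdmin you@your.address
DocumentRoot /usr/local/apache/htdocs
ErrorLog /usr/local/apache/logs/error_log

# Resource settings moved over from srm.conf
DirectoryIndex index.html
ScriptAlias /cgi-bin/ "/usr/local/apache/cgi-bin/"

# Access control moved over from access.conf
<Directory "/usr/local/apache/htdocs">
Options Indexes FollowSymLinks
AllowOverride None
Order allow,deny
Allow from all
</Directory>

# Tell Apache not to look for the other two files.
AccessConfig /dev/null
ResourceConfig /dev/null
```

Keeping a comment above each moved block makes it easier to compare against the old files if something stops working.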

Rotating Logs


So what are these log files you keep hearing about?

Every time a user requests a document from your site, the server 'logs' a record of this request into a log file. Log files contain vital statistics about users, such as their host, the date/time, and the request line. Depending on the log format, they can also record the referring page and browser information.

If your site is even remotely busy, or it's just been a while since you've done anything to the logs, they will probably be rather large. Typically, they can reach many megabytes in a short period of time. Since the best use of log files is analysis with other software, large log files can slow this process down to a crawl.

So, what can you do about it? Use log rotation.

Log rotation is a procedure that takes a log file, moves its contents into a new file, and then clears the original so logging can start fresh. Since it's a rather boring procedure, there are a few scripts to help you do it, including one from the Apache Group itself.

One script, for example, is a small utility written to allow log files to be quickly split and processed. This utility will create a new log file for each day, and with the date as the extension. Once the new log file is created, the old one can be compressed or moved to a new location. The utility is called logbox, and is available from ftp://ftp.lemuria.org/pub/Code/logbox.tar.gz. You must compile it on your server before you can use it.

Another tool for splitting log files comes from the Apache Group. You compiled it earlier, at the same time you compiled apachectl and htpasswd, and if you used the default settings, you also copied it to /usr/local/apache/bin/ afterwards. This program is called rotatelogs (how ingenious!), and it is used with the TransferLog directive in Apache.

In httpd.conf, add something like this:

LogFormat "%h %l %u %t \"%r\" %s %b \"%{Referer}i\" \"%{User-agent}i\""
ErrorLog /usr/local/apache/logs/error_log
TransferLog "|/usr/local/apache/bin/rotatelogs /usr/local/apache/logs/access_log 86400"

Let me explain this...

The directive LogFormat really isn't anything to worry about. It just specifies the template for how each line of the access log will look (the error log has its own format). For more information on the options LogFormat provides, see http://www.apache.org/docs-1.2/mod/mod_log_config.html.
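
If you are curious what the pieces of that format string mean, here is the same LogFormat line again with each field described in a comment:

```apache
# %h              remote host name or IP address
# %l              remote logname from identd ("-" when not available)
# %u              user name, if the request was authenticated ("-" otherwise)
# %t              date and time of the request
# \"%r\"          first line of the request (method, URL, protocol), in quotes
# %s              status code returned to the client
# %b              number of bytes sent, not counting headers
# %{Referer}i     contents of the Referer request header
# %{User-agent}i  contents of the User-Agent request header
LogFormat "%h %l %u %t \"%r\" %s %b \"%{Referer}i\" \"%{User-agent}i\""
```

The last two fields are what give log analysis software its referrer and browser statistics.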

ErrorLog sets the name of the file to which the server will log any errors it encounters. This is important to look over from time to time to see what might be going wrong with your site/server.

Now, to the part you've been waiting for.

That would be the TransferLog directive. Here, we are mixing it up with the rotatelogs program at the same time. Technically, there is one argument to TransferLog here, but it can be broken down. The leading | tells Apache to send the log entries to a program instead of writing them to a file. The first part after it is the location of our rotatelogs program (in this case, it is at /usr/local/apache/bin/rotatelogs).

The second part is the location of our access_log file (The file in which we log our server requests, located at /usr/local/apache/logs/access_log). The last part sets the amount of time before rotating the log files. It is in seconds, and since 86400 seconds is equal to 1 day (24 hours), we rotate the file every day. You can set this to whatever time period you would like. The generated name of the old log file (after each rotation) will be /usr/local/apache/logs/access_log.nnnn in this case, where nnnn is the system time the log started at. After each rotation time, a new, empty log is started.
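
To pick a different period, just work out the number of seconds. A small sketch with three common choices, using the same paths as above (only one TransferLog line should be active, so the alternatives are commented out):

```apache
# Hourly: 60 * 60 = 3600 seconds
# TransferLog "|/usr/local/apache/bin/rotatelogs /usr/local/apache/logs/access_log 3600"

# Daily: 24 * 3600 = 86400 seconds
TransferLog "|/usr/local/apache/bin/rotatelogs /usr/local/apache/logs/access_log 86400"

# Weekly: 7 * 86400 = 604800 seconds
# TransferLog "|/usr/local/apache/bin/rotatelogs /usr/local/apache/logs/access_log 604800"
```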

Get some Extras


Since Apache is open source, there are a ton of extras built for it. These extras can greatly enhance the performance of your server, and even give you more possibilities with your Website. Two of the most notable add-ons are mod_perl and mod_php.

mod_perl is an excellent binding of the Apache Web server and Perl (http://www.perl.com/), the popular CGI scripting language. The module takes a copy of the Perl interpreter, and embeds it within Apache itself. This not only speeds up existing CGI's, but it also allows you to write more modules in Perl itself. The Perl scripts are compiled once this way, unlike the usual method. Usually, the scripts are compiled on run-time by the interpreter, which makes them run a little slower from the start. If they are already compiled though, they start instantaneously, making this module an excellent addition to a Website with high levels of traffic.

mod_php is another great addition to Apache. The powerful server-side scripting language, PHP (http://www.php.net/), which is also open-source/free, can easily be linked with the server upon compilation. PHP scripts can be run like normal CGI's, but this method allows them to be run at a much greater speed.

Both of these add-ons must be compiled in with Apache from the start though, as they are modules. If you would like to use them, just download them, edit your Configuration.tmpl file, run configure, and compile as usual.

These modules are a little more complicated than most, so follow their own instructions. You can find the documentation at http://perl.apache.org/ and http://www.php.net/ respectively.

Buy a Book


I must say books are the easiest place to find excellent tips. Not only is a book always available to you (as it should be sitting next to you), but if you don't have a laptop computer in the washroom, you can just read it there as well.

There are a few good books, and, as usual, my favourites are published by O'Reilly.

Apache: The Definitive Guide

This book boasts one of the members of the Apache development team among its authors. It begins with an academic discussion of what Web servers do before walking the reader through the process of installing Apache. The installation of Apache gets a lot of attention, and you are taught about Website security and other preferences.

Apache Server Bible

The Apache Server Bible is an excellent guide to administering your Apache Web server, and is aimed at those with no previous Web administration experience. Topics include compiling source code, installation, and configuration issues.

Apache Server for Windows Little Black Book

This book will show you how to put Apache's capabilities to work. The book even moves on to things like CGI, database management, encryption and more.
