Web Site Optimization: 13 Simple Steps

Place Assets on a Cookie-free Domain

If you set a lot of cookies, the request headers for your pages will increase in size, since those cookies are sent with each request. Additionally, your assets probably don't use the cookies, so all of this information is repeatedly sent to the server for no reason. Sometimes those headers may even be bigger than the asset requested. These are extreme cases, of course, but they do happen: consider downloading small icons or smilies that are less than half a kB each, and requesting each one with 1kB worth of HTTP headers.

If you use subdomains to host your assets, you need to make sure that the cookies you set are for your canonical domain name (e.g. www.example.org) and not for the bare domain name (e.g. example.org), because a cookie set for the bare domain is also sent to every subdomain. This way, your asset subdomains will be cookie-free. If you're attempting to improve the performance of an existing site, and you've already set your cookies on the bare domain, you could consider the option of hosting assets on new domains, rather than subdomains.
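
To make the scoping rule concrete, here's a minimal PHP sketch of the domain check a browser performs before it attaches a cookie to a request. The function name cookieWouldBeSent() is hypothetical, and the sketch models only the domain rule; it ignores path, expiry, and the secure flag.

```php
<?php
// Hypothetical helper: models only the cookie domain-matching rule.
// Path, expiry and secure-flag checks are deliberately left out.
function cookieWouldBeSent($cookieDomain, $requestHost)
{
    $cookieDomain = strtolower(ltrim($cookieDomain, '.'));
    $requestHost  = strtolower($requestHost);

    // Exact match: the host the cookie was set for.
    if ($requestHost === $cookieDomain) {
        return true;
    }

    // Otherwise the request host must be a subdomain of the cookie domain.
    $suffix = '.' . $cookieDomain;
    return substr($requestHost, -strlen($suffix)) === $suffix;
}

// Cookie set for example.org: requests to the asset subdomain carry it too.
var_dump(cookieWouldBeSent('example.org', 'i1.example.org'));     // bool(true)
// Cookie set for www.example.org: the asset subdomain stays cookie-free.
var_dump(cookieWouldBeSent('www.example.org', 'i1.example.org')); // bool(false)
```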

Split the Assets Among Domains

It's completely up to you which assets you decide to host on i1.example.org and which you decide to host on i2.example.org -- there's no clear directive on this point. Just make sure you don't randomize the domain on each request, as this will cause the same assets to be downloaded twice -- once from i1 and once from i2.
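
One way to avoid randomizing the domain is to derive it from the asset's path, so the same file always maps to the same host. The sketch below is only an illustration: assetUrl() is a hypothetical helper, and hashing the path with md5() is one arbitrary choice among many.

```php
<?php
// Hypothetical helper: pick an asset host from a hash of the path, so the
// same file always maps to the same host on every request.
function assetUrl($path, array $hosts = array('i1.example.org', 'i2.example.org'))
{
    // The first 6 hex digits of the MD5 fit comfortably in a PHP integer.
    $bucket = hexdec(substr(md5($path), 0, 6)) % count($hosts);

    return 'http://' . $hosts[$bucket] . '/' . ltrim($path, '/');
}

echo assetUrl('/img/logo.png'), "\n";
echo assetUrl('/img/logo.png'), "\n"; // same URL as the line above
```

Because the host depends only on the path, a repeat visitor's cached copy stays valid no matter which page references the file.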

You could aim to split your assets evenly by file size, or by some other criterion that makes sense for your pages. You may also choose to put all content images (those that are included in your HTML with <img /> tags) on i1 and all layout images (those referenced by CSS's background-image:url()) on i2, although this split may not always be optimal. The browser downloads and processes the CSS files first and then, depending on which rules need to be applied, downloads only the images that the style sheet actually needs. The result is that the images referenced by CSS may not download immediately, so the load on your asset servers may not be balanced.

The best way to decide on splitting assets is by experimentation; you can use Firebug's Net panel to monitor the sequence in which assets download, then decide how you should spread components across domains in order to speed up the download process.

Configure DNS Lookups on Forums and Blogs

Since you should aim to have no more than four DNS lookups per page, it may be tricky to integrate third-party content such as Flickr images or ads that are hosted on a third-party server. Also, hotlinking images (by placing on your page an <img /> tag whose src attribute points to a file on another person's server) not only steals bandwidth from the other site, but also harms your own page's performance, causing an extra DNS lookup.
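
If you want a rough idea of how many hostnames a page pulls assets from, a small script can list them. The following is a sketch under some assumptions: distinctAssetHosts() is a hypothetical helper, the sample URLs are made up, and only src attributes are inspected (images referenced from CSS and <link /> tags are ignored). Relative URLs are served from the page's own host and aren't listed.

```php
<?php
// Hypothetical helper: list the distinct hostnames used in src attributes.
// Simplification: CSS background images and <link href> are not inspected.
function distinctAssetHosts($html)
{
    preg_match_all('/\bsrc\s*=\s*["\']([^"\']+)["\']/i', $html, $matches);

    $hosts = array();
    foreach ($matches[1] as $url) {
        $host = parse_url($url, PHP_URL_HOST); // null for relative URLs
        if (is_string($host) && $host !== '') {
            $hosts[strtolower($host)] = true;
        }
    }
    return array_keys($hosts);
}

// Made-up sample markup
$html = '<img src="http://farm1.static.flickr.com/a.jpg" />'
      . '<img src="/images/local.png" />'
      . '<script src="http://i1.example.org/site.js"></script>'
      . '<img src="http://i1.example.org/banner.png" />';

print_r(distinctAssetHosts($html)); // two hosts: the Flickr one and i1
```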

If your site contains user-generated content (as do forums, for example), you can't easily prevent multiple DNS lookups, since users could potentially post images located anywhere on the Web. You could write a script that copies each image from a user's post to your server, but that approach can get fairly complicated.

Aim for the low-hanging fruit. For example, in the phpBB forum software, you can configure whether users need to hotlink their avatar images or upload them to your server. In this case, uploaded avatars will result in better performance for your site.

Use the Expires Header

For best performance, your static assets should be exactly that: static. This means that there should be no dynamically generated scripts or styles, or <img> tags pointing to scripts that generate dynamic images. If you had such a need -- for example, you wanted to generate a graphic containing your visitor's username -- the dynamic generation could be taken "offline" and the result cached as a static image. In this example, you could generate the image once, when the member signs up. You could then store the image on the file system, and write the path to the image in your database. An alternative approach might involve scheduling an automated process (a cron job, in UNIX) that generates dynamic components and saves them as static files.
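
The "generate once, then serve as a static file" idea might look something like the sketch below. Everything in it is an assumption made for illustration: staticAssetFor() is a hypothetical helper, the renderer is a stand-in that returns plain text (a real one might use the GD library), and the files are written to a temporary directory.

```php
<?php
// Hypothetical helper: build a per-user asset once and reuse the file after
// that. $render is any callable that returns the file's contents.
function staticAssetFor($username, $dir, $render)
{
    // Hash the username so it is always safe to use as a file name.
    $path = rtrim($dir, '/') . '/badge_' . md5($username) . '.png';

    if (!file_exists($path)) {
        file_put_contents($path, call_user_func($render, $username));
    }
    return $path; // store this path in the database at sign-up time
}

// Example use with a stand-in renderer that counts how often it runs.
$calls  = 0;
$render = function ($name) use (&$calls) {
    $calls++;
    return 'image bytes for ' . $name;
};

$dir = sys_get_temp_dir() . '/badges_' . uniqid();
mkdir($dir);
staticAssetFor('alice', $dir, $render);
staticAssetFor('alice', $dir, $render);
echo $calls, "\n"; // 1: the second call reused the stored file
```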

Having assets that are entirely static allows you to set the Expires header for those files to a date that is far in the future, so that when an asset is downloaded once, it's cached by the browser and never requested again (or at least not for a very long time, as we'll see in a moment).

Setting the Expires header in Apache is easy: add an .htaccess file that contains the following directives to the root folder of your i1 and i2 subdomains:

ExpiresActive On  
ExpiresDefault "modification plus 10 years"

The first of these directives enables the generation of the Expires header. The second sets the expiration date to 10 years after the file's modification date, which translates to 10 years after you copied the file to the server. You could also use the setting "access plus 10 years", which will expire the file 10 years after the user requests the file for the first time.
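
To get a feel for the header value that results, here's a small PHP sketch that formats an expiration date a number of years after a given timestamp. farFutureExpires() is a hypothetical helper, and it assumes a year is counted as 365 days, so the result lands a few days short of the calendar anniversary.

```php
<?php
// Hypothetical helper: format an Expires value N years after a timestamp.
// Assumption: a year is counted as 365 days.
function farFutureExpires($baseTimestamp, $years)
{
    $expires = $baseTimestamp + $years * 365 * 24 * 60 * 60;

    // HTTP dates are always expressed in GMT.
    return gmdate('D, d M Y H:i:s', $expires) . ' GMT';
}

// "modification plus 10 years" for a file last modified on 30 Jul 2007
$modified = gmmktime(21, 18, 16, 7, 30, 2007);
echo 'Expires: ', farFutureExpires($modified, 10), "\n";
```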

If you want, you can even set an expiration date per file type:

ExpiresActive On  
ExpiresByType application/x-javascript "modification plus 2 years"  
ExpiresByType text/css "modification plus 5 years"

For more information, check the Apache documentation on mod_expires.

Name Assets

The problem with the technique that we just looked at (setting the Expires header to a date that's far into the future) occurs when you want to modify an asset on that page, such as an image. If you just upload the changed image to your web server, new visitors will receive the updated image, but repeat visitors won't. They'll see the old cached version, since you've already instructed their browser never to ask for this image again.

The solution is to modify the asset's name -- but it comes with some maintenance hurdles. For example, if you have a few CSS definitions pointing to img.png, and you modify the image and rename it to img2.png, you'll have to locate all the points in your style sheets at which the file has been referenced, and update those as well. For bigger projects, you might consider writing a tool to do this for you automatically.

You'll need to come up with a naming convention to use when naming your assets. For example, you might:

  • Append an epoch timestamp to the file name, e.g. img_1185403733.png.
  • Use the version number from your source control system (CVS or SVN, for example), e.g. img_1.1.png.
  • Manually increment a number in the file name (e.g. when you see a file named img1.png, simply save the modified image as img2.png).

There's no one right answer here -- your decision will depend on your personal preference, the specifics of your pages, the size of the project and your team, and so on.
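
As an illustration of the timestamp convention, the following sketch inserts a timestamp before the file extension. versionedName() is a hypothetical helper; in practice you might pass it the result of filemtime() for the file in question.

```php
<?php
// Hypothetical helper: insert "_<timestamp>" before the file extension.
function versionedName($file, $timestamp)
{
    $ext = pathinfo($file, PATHINFO_EXTENSION);
    if ($ext === '') {
        return $file . '_' . $timestamp; // no extension to preserve
    }

    // Strip ".ext" from the end, then rebuild the name around the timestamp.
    $base = substr($file, 0, -strlen($ext) - 1);
    return $base . '_' . $timestamp . '.' . $ext;
}

echo versionedName('img.png', 1185403733), "\n"; // img_1185403733.png
```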

If you use CVS, here's a little PHP function that can help you extract the version from a file stored in CVS:

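function getVersion($file) {

   // escape the file name so that it can't inject shell commands
   $cmd = sprintf('cvs log -h %s', escapeshellarg($file));

   exec($cmd, $res);

   // look for the "head: x.y" line rather than relying on a fixed line index
   foreach ($res as $line) {
      if (strpos($line, 'head:') === 0) {
         return trim(substr($line, strlen('head:')));
      }
   }

   return false; // no head revision found (e.g. the file isn't in CVS)
}

// example use
$file = 'img.png';
$new_file = 'img_' . getVersion($file) . '.png';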

Serve gzipped Content

Most modern browsers understand gzipped (compressed) content, so a well-performing page should aim to serve all of its content compressed. Since most images, swf files and other media files are already compressed, you don't need to worry about compressing them.

You do, however, need to take care of serving compressed HTML, CSS, client-side scripts, and any other type of text content. If you make XMLHttpRequests to services that return XML (or JSON, or plain text), make sure your server gzips this content as well.

If you open the Net panel in Firebug (or use LiveHTTPHeaders or another tool that shows HTTP headers), you can verify that the content is compressed by looking for a Content-Encoding header in the response, as shown in the following example:

Example request:

GET /2.2.2/build/utilities/utilities.js HTTP/1.1  
Host: yui.yahooapis.com  
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.5) Gecko/20070713 Firefox/2.0.0.5  
Accept-Encoding: gzip,deflate

Example response:

HTTP/1.x 200 OK  
Last-Modified: Wed, 18 Apr 2007 17:36:33 GMT  
Vary: Accept-Encoding  
Content-Type: application/x-javascript  
Content-Encoding: gzip  
Cache-Control: max-age=306470616  
Expires: Sun, 16 Apr 2017 00:01:52 GMT  
Date: Mon, 30 Jul 2007 21:18:16 GMT  
Content-Length: 22657  
Connection: keep-alive

In this request, the browser informed the server that it understands gzip and deflate encodings (Accept-Encoding: gzip,deflate) and the server responded with gzip-encoded content (Content-Encoding: gzip).
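
On the server side, the same negotiation boils down to inspecting the Accept-Encoding value. Here's a minimal sketch of that check; clientAcceptsGzip() is a hypothetical helper, and it's deliberately simplified (it ignores the x-gzip alias and the * wildcard). In a PHP page, the header value would typically come from $_SERVER['HTTP_ACCEPT_ENCODING'].

```php
<?php
// Hypothetical helper: decide whether a client accepts gzip, based on the
// value of its Accept-Encoding request header.
function clientAcceptsGzip($acceptEncoding)
{
    foreach (explode(',', strtolower($acceptEncoding)) as $part) {
        $pieces = array_map('trim', explode(';', $part));
        if ($pieces[0] !== 'gzip') {
            continue;
        }
        // "gzip;q=0" means the client explicitly refuses gzip.
        foreach (array_slice($pieces, 1) as $param) {
            if (preg_match('/^q\s*=\s*0(\.0*)?$/', $param)) {
                return false;
            }
        }
        return true;
    }
    return false;
}

var_dump(clientAcceptsGzip('gzip,deflate')); // bool(true)
var_dump(clientAcceptsGzip('deflate'));      // bool(false)
var_dump(clientAcceptsGzip('gzip;q=0'));     // bool(false)
```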

There's one gotcha when it comes to serving gzipped content: you must make sure that proxies do not get in your way. If an ISP's proxy caches your gzipped content and serves it to all of its customers, chances are that someone with a browser that doesn't support compression will receive your compressed content.

To avoid this, you can use the Vary: Accept-Encoding response header to tell the proxy to cache this response only for clients that send the same Accept-Encoding request header. In the example above, the browser said it supports gzip and deflate, and the server responded with some extra information for any proxy between the server and client, saying that gzip-encoded content is okay for any client that sends the same Accept-Encoding header value.
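
One way to picture what Vary does is to think of the proxy's cache key. The sketch below is a simplified model, not how any particular proxy is implemented: proxyCacheKey() is a hypothetical function that builds the key from the URL plus the value of each request header named in Vary.

```php
<?php
// Hypothetical model of a proxy cache key: the URL plus the value of every
// request header named in the response's Vary header.
function proxyCacheKey($url, $varyHeader, array $requestHeaders)
{
    // Header names are case-insensitive, so normalise them first.
    $requestHeaders = array_change_key_case($requestHeaders, CASE_LOWER);

    $key = $url;
    foreach (explode(',', strtolower($varyHeader)) as $name) {
        $name = trim($name);
        if ($name === '') {
            continue;
        }
        $value = isset($requestHeaders[$name]) ? $requestHeaders[$name] : '';
        $key  .= '|' . $name . '=' . $value;
    }
    return $key;
}

$url    = 'http://i1.example.org/site.js';
$modern = array('Accept-Encoding' => 'gzip,deflate');
$legacy = array(); // a client that sends no Accept-Encoding header

// With Vary: Accept-Encoding the two clients get separate cache entries.
var_dump(proxyCacheKey($url, 'Accept-Encoding', $modern)
     === proxyCacheKey($url, 'Accept-Encoding', $legacy)); // bool(false)
```

Without the Vary header, both clients would share one key in this model, which is exactly the situation in which a gzipped copy could be served to a browser that can't read it.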

There is one additional problem here: some browsers (IE 5.5 and IE 6 SP 1, for instance) claim they support gzip, but can actually experience problems reading it (as described on the Microsoft downloads site, and the support site). If you care about people using these browsers (they usually account for less than 1% of a site's visitors), you can use a different header -- Cache-Control: private -- which eliminates proxy caching completely. Another way to prevent proxy caching is to use the header Vary: *.

To gzip or to Deflate?

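If you're confused by the two Accept-Encoding values that browsers send, think of deflate as just another method for encoding content, one that's less popular and has historically been handled less consistently by browsers and servers. Both methods rely on the same underlying compression algorithm, so deflate offers no real advantage, and gzip is preferred.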

Make Sure you Send gzipped Content

Okay, now let's see what you can do to start serving gzipped content in accordance with what your host allows.

Option 1: mod_gzip for Apache Versions Earlier than 2

If you're using Apache 1.2 or 1.3, the mod_gzip module is available. To verify the Apache version, you can check Firebug's Net panel and look for the Server response header of any request. If you can't see it, check your provider's documentation or create a simple PHP script to echo this information to the browser, like so:

<?php echo apache_get_version(); ?>

In the Server header signature, you might also be able to see the mod_gzip version, if it's installed. It might look something like this:

Server: Apache/1.3.37 (Unix) mod_gzip/1.3.26.1a.....
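
If you'd rather check this from a script, here's a rough sketch that pulls the two version numbers out of a Server header value. parseServerHeader() is a hypothetical helper, and the sample string is just the visible part of the header shown above; real signatures vary from host to host.

```php
<?php
// Hypothetical helper: pull the Apache version and the mod_gzip version (if
// advertised) out of a Server response header value.
function parseServerHeader($server)
{
    $info = array('apache' => null, 'mod_gzip' => null);

    if (preg_match('#Apache/(\d+(?:\.\d+)*)#', $server, $m)) {
        $info['apache'] = $m[1];
    }
    if (preg_match('#mod_gzip/([0-9a-z.]+)#i', $server, $m)) {
        $info['mod_gzip'] = rtrim($m[1], '.'); // drop any trailing dots
    }
    return $info;
}

$info = parseServerHeader('Apache/1.3.37 (Unix) mod_gzip/1.3.26.1a');
echo $info['apache'], "\n";   // 1.3.37
echo $info['mod_gzip'], "\n"; // 1.3.26.1a

// Is this an Apache version earlier than 2?
var_dump(version_compare($info['apache'], '2.0', '<')); // bool(true)
```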

Okay, so we've established that we want to compress all text content, PHP script output, static HTML pages, JavaScripts and style sheets before sending them to the browser. To implement this with mod_gzip, create in the root directory of your site an .htaccess file that includes the following:

mod_gzip_on Yes  
 
mod_gzip_item_include mime ^application/x-javascript$  
mod_gzip_item_include mime ^application/json$  
mod_gzip_item_include mime ^text/.*$  
 
mod_gzip_item_include file \.html$  
mod_gzip_item_include file \.php$  
mod_gzip_item_include file \.js$  
mod_gzip_item_include file \.css$  
mod_gzip_item_include file \.txt$  
mod_gzip_item_include file \.xml$  
mod_gzip_item_include file \.json$  
 
Header append Vary Accept-Encoding

The first line enables mod_gzip. The next three lines set compression based on MIME-type. The next section does the same thing, but on the basis of file extension. The last line sets the Vary header to include the Accept-Encoding value.

If you want to send the Vary: * header, use:

Header set Vary *

Note that some hosting providers will not allow you to use the Header directive. If this is the case, you may be able to substitute the last line with this one:

mod_gzip_send_vary On

This will also set the Vary header to Accept-Encoding.

Be aware that there might be a minimum size condition on gzip, so if your files are too small (less than 1kB, for example), they might not be gzipped even though you've configured everything correctly. If this happens, your host has probably decided that the overhead of the gzipping process isn't worthwhile for very small files.
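
You can see why very small files are often skipped by compressing a tiny string yourself: the gzip format adds a fixed header and checksum trailer, so the output can end up larger than the input. This sketch assumes that PHP's zlib extension (which provides gzencode()) is available; the sample strings are made up.

```php
<?php
// Sketch, assuming PHP's zlib extension (gzencode) is available.
$tiny  = 'ok';
$large = str_repeat('<li class="item">Some repetitive markup</li>', 200);

// The tiny string grows once the gzip header and trailer are added...
printf("tiny:  %d bytes -> %d bytes gzipped\n", strlen($tiny), strlen(gzencode($tiny)));
// ...while the larger, repetitive text shrinks considerably.
printf("large: %d bytes -> %d bytes gzipped\n", strlen($large), strlen(gzencode($large)));
```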
