
Cost-Effective Website Acceleration


Part 3 - Server Side Modifications

In the first part of this series, we introduced the two basic laws of Web performance. To refresh your memory, these are:

  1. Send as little data as possible

  2. Send it as infrequently as possible

In that article, we focused on rule one and offered twenty tips to squeeze every byte out of delivered pages through code optimization, looking well beyond the obvious bandwidth-hogging images, to JavaScript, HTML, CSS, and even file name optimizations.

In the second installment, we turned to rule two and saw how to enhance site performance by using cache control headers. We discovered that the best approach to cache control was at the server level.

So, in this final installment, we'll see what other server-side changes can be made in order to speed up site delivery, starting with HTTP compression.

What Exactly Is HTTP Compression?

HTTP compression is a long-established Web standard that is only now receiving the attention it deserves. The basic idea of HTTP compression is that a standard gzip or deflate encoding method is applied to the payload of an HTTP response, significantly compressing the resource before it is transported across the Web.

Interestingly, the technology has been supported in all major browser implementations since early in the 4.X generation (for Internet Explorer and Netscape), yet few sites actually use it. A study by Port80 Software showed that less than 4% of Fortune 1000 Websites employ HTTP compression on their servers. However, on leading Websites like Google, Amazon, and Yahoo!, HTTP content encoding is nearly ubiquitous. Given that it provides significant bandwidth savings to some of the biggest sites on the Web, progressive administrators owe it to themselves to explore the idea of HTTP compression.

The key to HTTP content encoding can be found in the Accept request headers sent by a browser. Consider the request from Mozilla Firefox below, and note in particular the Accept, Accept-Language, Accept-Encoding, and Accept-Charset headers:

GET / HTTP/1.1    
Host: www.port80software.com    
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.6) Gecko/20040206 Firefox/0.8    
Accept: text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,video/x-mng,image/png,image/jpeg,image/gif;q=0.2,*/*;q=0.1    
Accept-Language: en-us,en;q=0.5    
Accept-Encoding: gzip,deflate    
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7    
Keep-Alive: 300    
Connection: keep-alive

These "accept" values can be used by the server to determine the appropriate content to send back using Content Negotiation -- a very powerful feature that allows Web servers to return different languages, character sets, and even technologies based on user characteristics. Content negotiation is a very broad topic, so we'll focus solely on the element which relates to server-side compression. The Accept-Encoding header indicates the type of content encoding that the browser can accept beyond the standard plain text response, in this case gzip- and deflate-compressed content.
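As a rough illustration of how a server might read this header, here is a minimal Python sketch. The function names accepted_encodings and client_accepts are hypothetical, chosen for this example only, and the parsing is simplified; a production server should rely on its platform's own header handling.

```python
def accepted_encodings(header_value):
    """Return a dict mapping each listed content coding to its q-value."""
    codings = {}
    for item in header_value.split(","):
        item = item.strip()
        if not item:
            continue
        name, _, params = item.partition(";")
        q = 1.0  # a coding listed without a q parameter defaults to 1.0
        for param in params.split(";"):
            key, _, value = param.strip().partition("=")
            if key.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # treat a malformed weight as "not acceptable"
        codings[name.strip().lower()] = q
    return codings


def client_accepts(header_value, coding):
    """True if the header lists the coding (or a wildcard) with a nonzero q-value."""
    codings = accepted_encodings(header_value or "")
    if coding in codings:
        return codings[coding] > 0
    return codings.get("*", 0.0) > 0


# The Accept-Encoding value sent by the Firefox request shown earlier.
print(client_accepts("gzip,deflate", "gzip"))
# A client that sends no Accept-Encoding header at all.
print(client_accepts(None, "gzip"))
```

The same q-value syntax appears in the Accept, Accept-Language, and Accept-Charset headers, which is why the parser handles it even though the browsers shown here send no weights for encodings.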

Looking at Internet Explorer's request headers, we see similar Accept-Encoding values:

GET / HTTP/1.1    
Host: www.google.com    
User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; .NET CLR 1.1.4322)    
Accept:image/gif,image/x-xbitmap,image/jpeg,image/pjpeg,application/vnd.ms-excel,application/vnd.ms-powerpoint,application/msword,application/x-shockwave-flash,*/*    
Accept-Encoding: gzip,deflate    
Accept-Language: en-us    
Connection: keep-alive

Given that nearly every major browser in use today supports gzip and deflate encoding (and that those few that don't should not be sending the Accept-Encoding headers), we can easily modify Web servers to return compressed content to some browsers and standard content to others. As an example (illustrated below), if our browser tells Google that it does not accept content encoding, we get back 3,358 bytes of data; however, if we do send the Accept-Encoding header, we get back compressed data of just 1,213 bytes along with a response header saying Content-Encoding: gzip. You won't see any differences between the pages if you "view source," but if you have a network trace, you will notice that the response is different:

[Figure: Google Compressed / Uncompressed Comparison]

While the files in this case are small, the reduction is still significant: the compressed response is roughly 64% smaller (1,213 bytes versus 3,358). Through a combination of HTML, CSS, and JavaScript code optimization (as discussed in Part I of this series) and HTTP content encoding, Google achieves an impressive feat -- fitting its page into a single TCP response packet!
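The server-side decision described above can be sketched in a few lines of Python. This is not how Google or any particular Web server implements it; build_response is a hypothetical helper, the sample page is invented, and the Vary header is an addition not mentioned in the article (it tells caches that the response depends on Accept-Encoding).

```python
import gzip


def build_response(request_headers, body):
    """Return (headers, payload), gzip-compressing only when the client asks for it."""
    accept = request_headers.get("Accept-Encoding", "")
    # Simplified check: ignores q-values, which a production server should honor.
    offered = [c.split(";")[0].strip().lower() for c in accept.split(",")]
    # Vary is an assumption of this sketch, not something the article covers.
    headers = {"Content-Type": "text/html", "Vary": "Accept-Encoding"}
    if "gzip" in offered:
        payload = gzip.compress(body)
        headers["Content-Encoding"] = "gzip"
    else:
        payload = body  # standard, uncompressed content
    headers["Content-Length"] = str(len(payload))
    return headers, payload


# Hypothetical page body, used only to show the size difference.
page = b"<html><body>" + b"<p>Hello, compressed Web!</p>\n" * 100 + b"</body></html>"

for request in ({"Accept-Encoding": "gzip,deflate"}, {}):
    headers, payload = build_response(request, page)
    print(headers.get("Content-Encoding", "identity"), len(payload), "bytes")
```

As in the Google example, the decompressed page is byte-for-byte identical in both cases; only what travels over the wire differs.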

While Google may have bandwidth concerns far beyond those of the average Website, HTTP content encoding can decrease HTML, CSS, JavaScript, and plain text file size by 50% or more. Unfortunately, HTTP content encoding (the terms "compression" and "content encoding" are roughly synonymous) really only applies to text content, as compressing binary formats like image files generally provides no value. Even assuming that binary files make up the bulk of the payload of the average site, you should still see, on average, a 15-30% overall reduction in page size if HTTP content encoding is used.
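To see why text benefits and binary formats do not, consider the following Python sketch. The markup is invented, random bytes stand in for already-compressed image data, and the 30 KB / 70 KB page mix with a 60% text saving is an assumption chosen purely to illustrate the arithmetic.

```python
import gzip
import os


def compressed_ratio(data):
    """Return compressed size divided by original size (lower is better)."""
    return len(gzip.compress(data)) / len(data)


def overall_saving(text_bytes, binary_bytes, text_saving):
    """Fraction of total page weight saved when only the text portion shrinks."""
    return text_bytes * text_saving / (text_bytes + binary_bytes)


# Hypothetical markup: repetitive, like most real HTML.
html = b"".join(
    b'<tr class="row"><td>Item %d</td><td>In stock</td></tr>\n' % i
    for i in range(500)
)
# Random bytes stand in for already-compressed formats such as JPEG or GIF.
binary = os.urandom(len(html))

print("text ratio:   %.2f" % compressed_ratio(html))
print("binary ratio: %.2f" % compressed_ratio(binary))
# Assumed page mix: 30 KB of text that shrinks by 60%, plus 70 KB of images.
print("overall saving: %.0f%%" % (100 * overall_saving(30_000, 70_000, 0.60)))
```

Under those assumed numbers, the overall saving works out to 18% of total page weight, which falls inside the 15-30% range quoted above.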

Server Support for HTTP Content Encoding

If you're already convinced of the value of HTTP compression, the next big question is: how do you employ it? In the case of the Apache Web server, it is possible to add HTTP content encoding using either mod_gzip or mod_deflate.

In the case of Microsoft IIS, things can get a little sticky. While IIS 5 includes native support for gzip encoding, it is a notoriously buggy implementation, especially considering the fine-grained configuration changes that must be made to overcome a wide variety of browser nuances. So, in the case of IIS 5, third-party compression add-ons in the form of ISAPI filters, such as httpZip, PipeBoost, and XCompress, are most often the best way to go.

IIS 6 built-in compression is much faster and more flexible, but it is still difficult to configure in more than a basic manner without getting into the IIS Metabase. ZipEnable represents the first tool designed to allow for truly fine-grained management of IIS 6 built-in compression.

The Real Deal with Server-Side Content Encoding

There is an important trade-off to be considered when you implement HTTP compression: if you configure your server to compress content on the way out, you may reduce bandwidth usage, but at the same time, you'll increase CPU load. In most cases, this is not a problem, since serving pages rarely keeps a Web server's CPU fully busy.

However, in the case of a very highly trafficked Website running a large amount of dynamic content on servers that are already at the limit of available CPU cycles, the downsides of compression may actually outweigh the advantages. Adding extra server hardware would, of course, alleviate the problem and allow you to enjoy the substantial bandwidth savings offered by compression. It's up to you to determine whether the reduction in bandwidth expenses and other infrastructure costs (fewer routers, switches, and dedicated lines) outweighs the upfront investment in new hardware.
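One way to get a feel for this trade-off before buying hardware is to time compression on a representative payload. The Python sketch below is a rough measurement aid, not a benchmark: the payload is invented, and the compression level is a tuning knob this article does not otherwise discuss. Actual timings depend entirely on your hardware and content.

```python
import gzip
import time


def measure(data, level):
    """Compress data at the given level; return (seconds, compressed_size)."""
    start = time.perf_counter()
    compressed = gzip.compress(data, compresslevel=level)
    elapsed = time.perf_counter() - start
    return elapsed, len(compressed)


# Hypothetical payload standing in for a dynamically generated page.
page = b"".join(
    b"<li><a href='/product/%d'>Product number %d</a></li>\n" % (i, i)
    for i in range(20000)
)

for level in (1, 6, 9):
    seconds, size = measure(page, level)
    print("level %d: %7.2f ms, %d bytes" % (level, seconds * 1000, size))
```

Multiplying the per-response time by your peak request rate gives a first estimate of how much CPU headroom compression would consume.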

Ultimately, though, the most interesting aspect of HTTP compression is the gap between what developers and administrators expect to see when rolling it out and what they actually see. While you will definitely find that bandwidth utilization decreases, not all of your users will enjoy dramatically faster page loads. Because of the extra CPU work that compression adds on the server (and decompression adds in the browser), time to first byte (TTFB) generally increases; thus, the browser can't start painting the page until slightly later.

For a user with a slow (that is, a low bandwidth) connection, this is still a good trade-off; because the data is compressed into fewer, smaller packets, it will be delivered much faster, so the slight initial delay is far outweighed by the faster overall page paint. Broadband users, on the other hand, will probably not see a perceptible performance improvement with HTTP compression. In both cases, you will save money through bandwidth reduction. But if perceived response time is your primary goal and you serve a lot of dial-up traffic, you may want to first focus on caching (discussed in Part II) as a performance enhancement strategy.

Another potential problem with HTTP content encoding relates to server-load from script-generated pages, such as those in PHP or ASP. The challenge in this case is that the page content may have to be recompressed for every request (rather than being compressed once and then cached), which will add significant load to the server beyond that added by the compression of static content. If all your pages are generated at page load time, you should therefore be careful when adding HTTP content encoding. Fortunately, many commercial compression add-ons will know to cache generated content when possible, but be aware that some cheaper solutions lack this vital feature. However, this "problem" does point to a second, obvious server-side performance improvement -- page pre-caching.
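The idea of compressing once and reusing the result can be sketched as follows. CompressedCache is a hypothetical class written for this example and does not reflect how any commercial add-on works; it keys the cache on a hash of the generated output, so identical pages are compressed only once.

```python
import gzip
import hashlib


class CompressedCache:
    """Cache gzip output keyed by a hash of the uncompressed body."""

    def __init__(self):
        self._store = {}
        self.compressions = 0  # counts how often real compression work was done

    def get(self, body):
        # Hashing is far cheaper than compressing, so a cache hit saves CPU.
        key = hashlib.sha256(body).hexdigest()
        if key not in self._store:
            self._store[key] = gzip.compress(body)
            self.compressions += 1
        return self._store[key]


cache = CompressedCache()
# Hypothetical script output that happens to be identical across requests.
generated = b"<html><body>" + b"<p>Catalog entry</p>\n" * 500 + b"</body></html>"

for _ in range(3):
    payload = cache.get(generated)

print("requests served: 3, compressions performed:", cache.compressions)
```

This sketch never evicts entries, so a real deployment would need a size limit or expiry policy. It also only helps when the generated output actually repeats; pages that differ on every request still pay the full compression cost.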
