
About the Author

Matt Thornton

Matt is a full-time student at the University of Southampton in the UK, doing a PhD in satellite image processing. Also a part-time Web developer, he operates the freelance programming Website Thornworx.


Cache or Check?

By Matt Thornton

April 19th, 2002


Slow-loading pages are the biggest grievance of today's Internet users. Given that the majority of people are still on a dial-up modem, this affects a lot of users. The "56k" label on your modem stands for "56,000 bits (of data) per second". Since one character takes around 10 bits to transmit, a (static) 5k file (about 5,000 characters) should take approximately 1 second to load. Realistically, though, the speed of your connection on an analogue phone line is going to be around 40k (40,000 bits per second), and if you consider that the index page of the Sitepoint Forums is around 80k in size (about 80,000 characters, or 800,000 bits), on a 40k dial-up connection it'll take around 20 seconds to load.
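If you'd like to check the arithmetic yourself, here is a minimal Python sketch of the estimate. The function name download_seconds and its parameters are made up for illustration, and the figure of 10 bits per character is the same rough assumption used above; real connections vary.

```python
def download_seconds(size_chars, line_bits_per_second, bits_per_char=10):
    """Rough transfer time: total bits to send divided by the line speed."""
    return size_chars * bits_per_char / line_bits_per_second


# A 5k static file on a nominal 56k modem: just under one second.
small_file = download_seconds(5_000, 56_000)

# An 80k page on a realistic 40k connection.
forum_index = download_seconds(80_000, 40_000)

print(round(small_file, 2))
print(forum_index)
```

The estimate ignores connection setup and any compression the modem applies, so treat it as a ballpark figure only.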

What this means is that surfing the net can be extremely frustrating as you wait eternities for a page to slowly appear before your eyes - it can be even worse if the page you're loading is image-heavy.

Time is of the Essence

One tool that an Internet Service Provider (ISP) can employ to help overcome this problem is the use of a cache (pronounced "cash"). The concept of the cache has been employed in computer science for many years now and is one reason why today's computers appear to be as quick as they are.

What is a Cache?

Essentially all a cache does is store copies of (or pointers to) previously accessed data. The main implementation in computer architecture is to use a small area of very fast memory (SRAM) to store copies of recently accessed information from your main memory (RAM) or hard drive, which are a lot slower.

For example, open a fairly large local file (say around 500k). Depending on the speed of your system, the first time you access this it could take anywhere between 2 and 10 seconds to open, as the computer looks for the application it needs to open the file, then checks the file to make sure it's OK, and then finally opens it (and this could take even longer if you have anti-virus software installed).

Now, close the file, then open it again, and it should appear magically before your eyes a lot faster. Why? Because your system's built-in cache has kept a copy of the recently used data in fast memory, so much of it no longer has to be read from the comparatively slow hard drive.

The same applies to the Internet. Your ISP almost certainly uses some sort of cache, for the simple reason that it improves the speed at which Web pages are delivered to your screen. If you have ever wondered how the Internet works, here is a basic synopsis. The user dials up to a server, which is "plugged in" to the Net. The user then types in the URL (Uniform Resource Locator, or the address) of a page they want to view. This sends a request to the server of the ISP. This server then looks up the specified address across the Internet. If it doesn't find anything, it returns an error (normally a 404 Not Found error). If it finds a matching address, it retrieves a copy of the document you want from the host server and returns it to your browser, which displays it on your screen.

Admittedly, the lookup itself usually takes only a fraction of a second, but assuming your ISP has found the file you're after, this is when the speed of your dial-up comes into play. Big files equal slow download times, and a Web cache can speed things up by sitting between you and the server that hosts the page.

How Does a Cache Work?

Consider the routine that just occurred -- you sent a request for a file, the server then went and hunted for the file, and if it found it, it fetched a copy for you. Now this would seem to be the only way for such an operation to work, but in fact it's pretty inefficient. Imagine that at any one time there might be thousands of users all requesting the same page. Without a cache, the ISP's server has to keep going back to the same address and getting the document for each individual request. However, if a Web cache is used, the routine is altered slightly.

A Web cache is basically a large hard drive where copies of documents are stored. So with a Web cache in use, the operation to retrieve a Web page changes. It now works like this: The first time there's a request for a page, the server of your ISP has a look in the cache for a copy of that page. But as this is the first request for the page, the server won't find it, so it heads off to the actual address and returns a copy to you, the user, but this time it also saves a copy of the file in its cache. Now, the second time the file is requested, the ISP's server again looks in the cache, and hey presto, there is the requested file! It then simply sends the copy back to the user.

As you can see, this process is a lot more efficient, as the time taken to fetch a file is dramatically reduced, and the server can go back to finding and delivering other pages, rather than having to go and hunt around on the Net for this frequently-used page.
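The routine just described can be sketched in a few lines of Python. This is only a toy illustration, not how any particular ISP's software works: the WebCache class, the fake_origin function and the example.com address are all made up for the purpose, and a dictionary in memory stands in for the cache's hard drive.

```python
class WebCache:
    """Toy cache: keeps copies of documents, keyed by their URL."""

    def __init__(self, fetch_from_origin):
        # fetch_from_origin is a hypothetical function: URL in, document out.
        self._fetch_from_origin = fetch_from_origin
        self._store = {}
        self.origin_fetches = 0  # how often we had to go out to the Net

    def get(self, url):
        if url in self._store:
            # Cache hit: hand back the stored copy straight away.
            return self._store[url]
        # Cache miss: fetch from the real address and keep a copy.
        self.origin_fetches += 1
        document = self._fetch_from_origin(url)
        self._store[url] = document
        return document


def fake_origin(url):
    # Stand-in for a real network request to the host server.
    return "<html>contents of " + url + "</html>"


cache = WebCache(fake_origin)
first = cache.get("http://example.com/index.html")   # miss: goes to the origin
second = cache.get("http://example.com/index.html")  # hit: served from the cache
print(cache.origin_fetches)
```

However many users ask for the same page after the first request, the counter stays at one, which is exactly the saving the cache is there to provide.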

Seems simple, doesn't it?

Well, the concept is, but in practice it can get tricky. The above example describes what happens for dial-up users. Other stages exist in this process too, though. For example, most Windows 9x and NT users will have a cache on their machine, labelled either Cache or the "Microsoft friendly" term, Temporary Internet Files.

To test this, find a site on the Internet which has a static HTML file. Let the page load fully, then close your browser, and kill your Internet connection. Open your browser again, ensuring it is set to "Work Offline" and put the Web address back into your browser. If all is working as it should, then the page should be displayed, even though you have no current connection to the Internet.

We can see then that the routine for viewing a Web page has changed again. The process is now like this: Page request from your computer, check your computer cache, then to ISP, ISP checks its cache, then finally it goes and gets the page from the actual origin. Getting worried? I'll cover the problems this produces later on.
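To make the new lookup order concrete, here is a small Python sketch that chains two caches in front of a pretend origin server. Everything in it (make_cached, origin, the example.com address) is invented for illustration, and it deliberately ignores the question of stale copies.

```python
def make_cached(fetch_next):
    """Wrap a fetch function with a simple dictionary cache.

    Returns the caching fetch function and the dictionary behind it.
    """
    store = {}

    def fetch(url):
        if url not in store:
            # Miss at this level: ask the next stage and keep a copy.
            store[url] = fetch_next(url)
        return store[url]

    return fetch, store


origin_requests = []


def origin(url):
    # Stand-in for the real host server; records every request it receives.
    origin_requests.append(url)
    return "<html>contents of " + url + "</html>"


# Chain: your computer's cache -> ISP's cache -> actual origin.
isp_fetch, isp_store = make_cached(origin)
browser_fetch, browser_store = make_cached(isp_fetch)

url = "http://example.com/page.html"
browser_fetch(url)     # misses both caches, so the origin is contacted
browser_fetch(url)     # answered by the cache on your own computer
browser_store.clear()  # simulate emptying Temporary Internet Files
browser_fetch(url)     # your cache misses, but the ISP's cache answers
print(len(origin_requests))
```

The origin server is contacted only once across all three requests; every later request is answered by whichever cache is closest to you and still holds a copy.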
