Get XSL To Do Your Dirty Work
Writing content management systems (CMSs) for a living is a messy business, especially if you rely exclusively on server-side scripting languages like PHP. No matter how well you write the code for a CMS, no matter how much object-oriented modularity you throw into it, you're still going to have to get up to your elbows in troublesome, unreliable code when the time comes to format your content for display.
If this sounds familiar to you, if you find yourself messing with tiresome code based on complex regular expressions every time you need to tweak the formatting of your site's articles, it might just be time to take a look at XSL.
Reinventing the Wheel
If you're anything like me, you've worked on a number of content-driven sites and have come up with a pretty standard formula for the design of the content management systems they rely on:
- You create a simple, custom set of tags for users to format their articles (tutorials, FAQs, reviews, or what have you) with.
- You store the text of the articles, peppered with these custom tags, in a database.
- When a visitor to the site views one of the articles, you have a big mess of code that translates that tagged text into a neatly formatted HTML page for them to view.
This structure is illustrated in Fig. 1 for an average PHP/MySQL-based site:
Fig. 1: A Typical Content-Based Site Design
Now, systems like this certainly work, and usually work very well. So what's the problem? Depending on what sort of developer you are, you're likely to run into one of two problems with this approach:
- Lack of robustness
- Complex code for a relatively simple task
Let's look at an example. The [b]...[/b] tag illustrated in Fig. 1 is a fairly straightforward matter to convert to the equivalent HTML syntax for display. Here's how it might be done in PHP:
$document = str_replace('[b]','<b>',$document);
$document = str_replace('[/b]','</b>',$document);
No-brainer, right? But what if someone forgets to type the closing [/b] tag? What if he or she uses two [b] tags instead? Well, either you live with the invalid HTML output that will result from such mistakes, or you step up your code to the next level of complexity, by using a regular expression to detect only valid pairs of tags:
// Convert only complete [b]...[/b] pairs; unmatched tags are left untouched
$document = preg_replace('#\[b\](.*?)\[/b\]#s','<b>$1</b>',$document);
Better, but this code still doesn't point out coding mistakes like typing an invalid [v] tag when [b] was intended; it just ignores them. To catch mistakes like those would require even more complex code... and all this to process what is likely to be one of the simplest tags in your system! Imagine the nightmare involved in making sure that a [list] tag contained one or more [*] tags, and that [*] tags didn't occur outside of [list] tags!
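To give a feel for how quickly that "even more complex code" piles up, here's a rough sketch of a checker that reports unknown tags and unbalanced pairs. The function name findTagErrors() and the list of allowed tags are made up for illustration, and it doesn't even attempt the [list]/[*] nesting rules:

```php
<?php
// Hypothetical helper: report problems in [tag]...[/tag] markup.
// The function name and the default whitelist are assumptions for this sketch.
function findTagErrors(string $document, array $allowed = ['b', 'i', 'list']): array
{
    $errors = [];
    $stack = []; // names of the tags that are currently open

    // Collect every [name] or [/name] token in document order.
    preg_match_all('/\[(\/?)([a-z]+)\]/i', $document, $matches, PREG_SET_ORDER);

    foreach ($matches as $m) {
        $closing = ($m[1] === '/');
        $name = strtolower($m[2]);

        if (!in_array($name, $allowed, true)) {
            $errors[] = "Unknown tag [$name]";
            continue;
        }
        if (!$closing) {
            $stack[] = $name;            // remember the open tag
        } elseif (end($stack) === $name) {
            array_pop($stack);           // matches the most recent open tag
        } else {
            $errors[] = "Unexpected closing tag [/$name]";
        }
    }

    // Anything still on the stack was opened but never closed.
    foreach ($stack as $name) {
        $errors[] = "Tag [$name] is never closed";
    }
    return $errors;
}

// Example: a mistyped [v] tag, a stray [/b], and an unclosed [b].
print_r(findTagErrors('Some [v]bold[/b] text and [b]more'));
```

That's some thirty lines just to complain about mistakes, before a single tag has been converted to HTML.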
Most sites that follow the design pattern discussed above will settle on simple, custom tag processing code that lacks the robustness to enforce these types of constraints and prevent operator error.
So, what's the alternative? Don't reinvent the wheel! Not only will a system built with XSL do all the parsing and checking for you, but it was designed from the ground up to convert custom tag-based documents into HTML pages and other popular document formats with a minimum of fuss. Sound good? Read on!
Kevin began developing for the Web in 1995 and is a highly respected technical author. He wrote