
Get XSL To Do Your Dirty Work


Your First XSL Stylesheet

Like the documents they are created to format, XSL stylesheets are XML documents. You can therefore write your first XSL stylesheet in any text editor that you find convenient. Type the following in and save it as docbook.xsl:

<?xml version="1.0" encoding="UTF-8"?>    
<xsl:stylesheet version="1.0"    
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform">    
   
 <xsl:output method="xml" indent="yes" encoding="utf-8"    
   doctype-public="-//W3C//DTD XHTML 1.0 Strict//EN"    
   doctype-system="http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd" />    
   
 <!-- templates go here -->    
   
</xsl:stylesheet>

This is the basic shell for any XSL stylesheet that outputs XHTML documents. Once again, you'll see that it starts with the optional (but advisable) <?xml ...?> declaration that brands this as an XML file. You need not bother with a <!DOCTYPE> declaration, since the XSL processor already knows which XSL tags are and are not allowed without the help of a DTD.

The <xsl:stylesheet> tag should be the outer element of every XSL file. The version attribute indicates that we are using XSL version 1.0 syntax. The xmlns:xsl attribute sets up an XML namespace for all our XSL tags. Basically, this attribute says that all tags that start with xsl: (this is called a prefix) are related to the URL http://www.w3.org/1999/XSL/Transform. If you try going to that URL in your Web browser, you'll see the message "This is the XSLT namespace." This page doesn't actually provide any information to the XSL processor; the URL serves only as a unique name, and XSL processors treat as XSL instructions only those tags associated with it. This lets you use tags such as <stylesheet> and <output> (with no prefix) in your own documents without the XSL processor mistaking them for XSL tags. You can use any prefix you like to associate the XSL tags in your document with that URL (e.g. if the attribute were xmlns:exesel="http://www.w3.org/1999/XSL/Transform", then all XSL tags would have to begin with the prefix exesel:), but xsl: is the de facto standard.
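To make the point about prefixes concrete, here is a sketch of the same shell written with the exesel: prefix from the example above. It should behave the same as the xsl: version, since the processor looks at the namespace URL rather than the prefix. The <xsl:output> attributes are trimmed here for brevity.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Same shell as before, but the prefix "exesel" is bound to the
     XSLT namespace URL instead of the usual "xsl". -->
<exesel:stylesheet version="1.0"
  xmlns:exesel="http://www.w3.org/1999/XSL/Transform">

  <exesel:output method="xml" indent="yes" encoding="utf-8"/>

  <!-- templates go here, also written with the exesel: prefix -->

</exesel:stylesheet>
```

In practice there is little reason to do this; sticking with xsl: keeps your stylesheets recognizable to other readers.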

Inside <xsl:stylesheet> there is only one tag in our basic 'shell': <xsl:output ... />. This tells the XSL processor that this stylesheet will output an XHTML document, as opposed to, say, a text file. The attributes of this tag may look a little complicated, but really they're just setting the values that will appear in the <?xml ...?> and <!DOCTYPE> tags in the XHTML document generated. The / on the end of this tag indicates that it is an empty tag, and so a closing tag is not needed.

As in HTML, comments in XML documents are written between <!-- and -->; thus, the comment <!-- templates go here --> will be ignored by XSL processors.

Let's take a look at what happens when we apply this simple stylesheet to the article we created in the previous section (docbook.xml). With most XSL processors, we would specify the document and stylesheet we want to process, and the processor would spit out the resulting HTML file. In XSL-aware browsers like Internet Explorer 6+ and Netscape 6+, however, you need to add a line to your XML document to tell the browser which stylesheet to use when displaying the document. At the top of docbook.xml, just after the <!DOCTYPE> tag and just before <article>, add the following line:

<?xml-stylesheet href="docbook.xsl" type="text/xsl"?>

This is a processing instruction that tells browser-based XSL processors (and some standalone processors that support it) where to find an XSL stylesheet that is appropriate for this document. In this example, we have told it to use "docbook.xsl", located in the same directory as the current document. Save this change, make sure that the two files are in the same directory, then view docbook.xml in either IE6+ or NS6+. Fig. 5 demonstrates how it should look in MSIE 6.
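For reference, the top of docbook.xml might look something like this after the change. This is only a sketch: the <!DOCTYPE> shown is an assumption (a DocBook XML 4.1.2 declaration), so keep whatever <!DOCTYPE> your own file already has, and the sections of the article are left out.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Assumption: a DocBook XML 4.1.2 DOCTYPE. Keep the one already in your file. -->
<!DOCTYPE article PUBLIC "-//OASIS//DTD DocBook XML V4.1.2//EN"
  "http://www.oasis-open.org/docbook/xml/4.1.2/docbookx.dtd">
<!-- The new line: after the DOCTYPE, before the opening <article> tag. -->
<?xml-stylesheet href="docbook.xsl" type="text/xsl"?>
<article>
  <title>A Sample Article</title>
  <!-- the <section> tags of the article follow here -->
</article>
```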

Fig. 5: A DocBook with Minimal Style

As you can see, the default behavior of an XSL stylesheet is to go through the XML document a tag at a time and print out the text contained therein. To change this behavior and make our document readable, we need to add some rules to our stylesheet. In the language of XSL, these rules are called templates. Here's an example of a template:

 <xsl:template match="/article">    
 <html>    
 <head>    
 <title><xsl:value-of select="title"/></title>    
 </head>    
 <body>    
 <h1><xsl:value-of select="title"/></h1>    
 <xsl:apply-templates select="section"/>    
 </body>    
 </html>    
 </xsl:template>

As you can see, this is a mix of XSL tags (identified by the xsl: prefix) and familiar HTML tags, all contained within an <xsl:template> tag. The majority of XSL templates work by matching tags that appear in the XML document to be processed. The tag(s) to match are specified in the match attribute of the <xsl:template> tag.

In this case, our template is set to match /article. This is an XPath expression (remember, XPath is the standard for pointing to tags in an XML document). The leading / indicates the 'root' of the XML document, so /article means that this template should match any <article> tag that appears in the root of the XML document. Since our DocBook document begins with an <article> tag, this template will match that tag.
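To see what the leading / changes, it may help to compare a few match values side by side. The fragments below are a sketch for comparison only; they are not part of docbook.xsl, and the template bodies are left as placeholder comments.

```xml
<!-- "/" on its own matches the root of the document: the invisible
     starting point that sits above the outermost <article> tag. -->
<xsl:template match="/">
  <!-- output for the whole document would go here -->
</xsl:template>

<!-- "/article" matches an <article> tag only when it sits directly
     under the root, i.e. when it is the outermost tag. -->
<xsl:template match="/article">
  <!-- output for the top-level article would go here -->
</xsl:template>

<!-- "article" (no leading slash) matches an <article> tag
     wherever it appears in the document. -->
<xsl:template match="article">
  <!-- output for any article would go here -->
</xsl:template>
```

For our sample document, /article and article would both match the same single tag; the leading / simply makes the intent explicit.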

So the XSL processor sees that there is a template that matches the <article> tag in the root of our document. Now what? The processor looks inside the <xsl:template> tag to see what to do about it. The template begins with three HTML tags: <html>, <head> and <title>. Since these are not XSL tags (they don't begin with the xsl: prefix), the processor writes these tags straight to the output document.

The next tag is an XSL tag: <xsl:value-of select="title"/>. The <xsl:value-of> tag lets you pick out a tag with an XPath expression and output the text it contains (its value) at a particular point in the file. The tag to output the value of is specified with the select attribute. In this case, we have select="title". This says that we want to choose the <title> tag that is inside the current tag (the current tag is the tag that matched the template: <article>). Looking back at the sample document, you should find that the <article> tag contains a <title> tag with the article's title in it ("A Sample Article"). So what we've just done is take that title and use it as the page title in the HTML document to be created!

Note that, since the <xsl:value-of> tag doesn't contain any text or tags, we have made the closing </xsl:value-of> tag part of the opening tag by ending it with a slash (/). Without this shortcut, we would have had to type <xsl:value-of select="title"></xsl:value-of>.
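The select attribute accepts other XPath expressions too. The sketch below is an illustration rather than part of docbook.xsl; it assumes the sample document's structure of <section> tags that each contain a <title>, and wraps each value in a <p> tag purely to keep the results apart.

```xml
<xsl:template match="/article">
  <!-- Relative path: the <title> directly inside the current <article>. -->
  <p><xsl:value-of select="title"/></p>

  <!-- A longer path: <title> tags inside <section> tags. Several tags
       match, but in XSL 1.0 <xsl:value-of> outputs only the first one. -->
  <p><xsl:value-of select="section/title"/></p>

  <!-- A position in square brackets picks one specific <section>:
       here, the title of the second section. -->
  <p><xsl:value-of select="section[2]/title"/></p>
</xsl:template>
```

Because <xsl:value-of> stops at the first match, it suits one-off values like a page title; for repeating tags such as <section>, a tag that processes every match is the better tool.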

After a few more HTML tags (</title>, </head>, <body>), we have another <xsl:value-of> tag surrounded by HTML <h1>...</h1> tags. This tag is identical to the one used for the title of the page, so once again it will print out the title of our document, but this time between <h1>...</h1> tags, so that it is displayed in big letters at the top of the page.

The next XSL tag in the document is <xsl:apply-templates select="section"/>. This powerful tag tells the XSL processor to take any and all <section> tags that appear within the current tag (<article>) and apply any matching templates to them. At this stage, we only have this one template in our XSL stylesheet, so the default behavior of outputting the contents of the tags and any subtags takes effect.
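The default behavior mentioned above comes from a set of built-in template rules that XSL 1.0 processors apply to anything your own templates don't match. Written out as ordinary templates, they are roughly equivalent to the sketch below. You never need to add these to docbook.xsl; they are shown only to explain why unmatched tags still have their text printed.

```xml
<!-- For the root and for any tag: output nothing for the tag itself,
     but carry on processing everything inside it. -->
<xsl:template match="*|/">
  <xsl:apply-templates/>
</xsl:template>

<!-- For text and attributes: output the value as-is. -->
<xsl:template match="text()|@*">
  <xsl:value-of select="."/>
</xsl:template>

<!-- For comments and processing instructions: output nothing. -->
<xsl:template match="processing-instruction()|comment()"/>
```

Any template you write for the same tag takes over from these built-in rules.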

Once the two <section> tags are processed in this manner, the XSL processor returns here to finish this template by outputting the </body> and </html> tags. Having reached the end of the document (there are no more tags after the closing </article> tag), the XSL processor terminates.

Here's the HTML document that is produced from our sample document by our XSL stylesheet:

<?xml version="1.0" encoding="utf-8"?>    
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"    
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html>    
  <head>    
     <title>A Sample Article</title>    
  </head>    
  <body>    
     <h1>A Sample Article</h1>    
     Article Section 1    
   
     This is the first section of the article. Nothing terribly    
     interesting here, though.    
   
     Another Section    
   
     Just so you can see how these things work, here's an    
     itemized list:    
   
   
     The first item in the list    
   
     The second item in the list    
   
     The third item in the list    
   
   
  </body>    
</html>

If you update your copy of docbook.xsl with the above template and then view docbook.xml in your browser again, you'll see this HTML document displayed as in Fig. 6.

Fig. 6: A Slightly More Stylish DocBook

Let's add a few more templates to the stylesheet:

 <xsl:template match="section">    
   <xsl:apply-templates/>    
   <hr/>    
 </xsl:template>

This template matches <section> tags, and will be triggered by the <xsl:apply-templates select="section"/> tag in our previous template above. For each <section> in the <article>, it will apply templates (or the default behavior) to any sub-tags and then output an <hr/> tag.

 <xsl:template match="section/title">    
   <h2><xsl:apply-templates/></h2>    
 </xsl:template>

This template matches <title> tags that occur inside <section> tags (i.e. it will not match the <title> tag at the top of the <article>), and outputs the contents of the tag (applying any applicable templates) between <h2>...</h2> tags.

These remaining three templates should be quite self-explanatory:

 <xsl:template match="para">    
   <p><xsl:apply-templates/></p>    
 </xsl:template>    
   
 <xsl:template match="itemizedlist">    
   <ul><xsl:apply-templates/></ul>    
 </xsl:template>    
   
 <xsl:template match="listitem">    
   <li><xsl:apply-templates/></li>    
 </xsl:template>
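Put together, the shell and the templates from this section might be assembled as follows. This is a sketch built only from the listings above, so the author's completed file may differ in layout or detail.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="xml" indent="yes" encoding="utf-8"
    doctype-public="-//W3C//DTD XHTML 1.0 Strict//EN"
    doctype-system="http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd" />

  <!-- The outermost <article>: builds the page skeleton. -->
  <xsl:template match="/article">
    <html>
      <head>
        <title><xsl:value-of select="title"/></title>
      </head>
      <body>
        <h1><xsl:value-of select="title"/></h1>
        <xsl:apply-templates select="section"/>
      </body>
    </html>
  </xsl:template>

  <!-- Each <section>: its contents, followed by a horizontal rule. -->
  <xsl:template match="section">
    <xsl:apply-templates/>
    <hr/>
  </xsl:template>

  <!-- Section titles become second-level headings. -->
  <xsl:template match="section/title">
    <h2><xsl:apply-templates/></h2>
  </xsl:template>

  <!-- Paragraphs and lists map onto their HTML counterparts. -->
  <xsl:template match="para">
    <p><xsl:apply-templates/></p>
  </xsl:template>

  <xsl:template match="itemizedlist">
    <ul><xsl:apply-templates/></ul>
  </xsl:template>

  <xsl:template match="listitem">
    <li><xsl:apply-templates/></li>
  </xsl:template>

</xsl:stylesheet>
```

The order of the templates inside <xsl:stylesheet> does not matter here, since each one matches a different tag.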

If you have any doubts about how this should all fit together, download the completed docbook.xsl file. Then view the docbook.xml file one more time in your favorite browser, this time with the full complement of templates. It should display as shown in Fig. 7.

Fig. 7: The Fully Styled Article

If you're interested in seeing an even more stylish version of your document, the DocBook Open Repository contains an official XSL stylesheet distribution that aims to support and format all of the tags defined in the DocBook standard. Download the latest stable version and see what the article looks like with that stylesheet applied.
