Script Smarter: Quality JavaScript from Scratch

Chapter 5. Navigating the Document Object Model

Browsers give JavaScript programs access to the elements on a web page via the Document Object Model (DOM) -- an internal representation of the headings, paragraphs, lists, styles, IDs, classes, and all the other data to be found in the HTML on your page.

The DOM can be thought of as a tree consisting of interconnected nodes. Each tag in an HTML document is represented by a node; any tags that are nested inside that tag are nodes that are connected to it as children, or branches in the tree. Each of these nodes is called an element node. (Strictly speaking, each element node represents a pair of tags—the start and end tags of an element (e.g., <p> and </p>)—or a single self-closing tag (e.g., <br>, or <br/> in XHTML).) There are several other types of nodes; the most useful are the document node, text node, and attribute node. The document node represents the document itself, and is the root of the DOM tree. Text nodes represent the text contained between an element's tags. Attribute nodes represent the attributes specified inside an element's opening tag. Consider this basic HTML page structure:

<html>  
 <head>  
   <title>Stairway to the stars</title>  
 </head>  
 <body>  
   <h1 id="top">Stairway to the stars</h1>  
   <p class="introduction">For centuries, the stars have been  
     more to humankind than just burning balls of gas ...</p>  
 </body>  
</html>

The DOM for this page could be visualized as Figure 5.1, "The DOM structure of a simple HTML page, visualized as a tree hierarchy".

Every page has a document node, but its descendants are derived from the content of the document itself. Through the use of element nodes, text nodes, and attribute nodes, every piece of information on a page is accessible via JavaScript.
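To make the tree idea concrete, here is a minimal sketch of a function that walks a node and its descendants and builds an indented outline. The name describeTree is hypothetical, not part of the DOM; the function relies only on the standard nodeType, nodeName, nodeValue, and childNodes properties. Note that attribute nodes are not children of their element, so they do not appear in this outline.

```javascript
// Hypothetical helper: builds an indented outline of a DOM subtree.
// nodeType 3 identifies a text node; elements and the document node
// are labelled with their nodeName instead.
function describeTree(node, depth)
{
  depth = depth || 0;

  // Two spaces of indentation per level of nesting
  var indent = new Array(depth + 1).join("  ");
  var lines = [];

  if (node.nodeType == 3)
  {
    // Text nodes keep their content in nodeValue
    lines.push(indent + '#text "' + node.nodeValue + '"');
  }
  else
  {
    lines.push(indent + node.nodeName);
  }

  // Recurse into the children, if there are any
  var children = node.childNodes || [];
  for (var i = 0; i < children.length; i++)
  {
    lines = lines.concat(describeTree(children[i], depth + 1));
  }

  return lines;
}
```

In a browser, describeTree(document).join("\n") should produce an outline that mirrors the tree in Figure 5.1, possibly with extra text nodes for the whitespace in the source.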

The DOM isn't just restricted to HTML and JavaScript, though. Here's how the W3C DOM specification site explains the matter:

The Document Object Model is a platform- and language-neutral interface that will allow programs and scripts to dynamically access and update the content, structure and style of documents.

So, even though the mixture of JavaScript and HTML is the most common combination of technologies in which the DOM is utilized, the knowledge you gain from this chapter can be applied to a number of different programming languages and document types.

In order to make you a "master of your DOMain," this chapter will explain how to find any element you're looking for on a web page, then change it, rearrange it, or erase it completely.

Figure 5.1. The DOM structure of a simple HTML page, visualized as a tree hierarchy

Accessing Elements

Access provides control, control is power, and you're a power programmer, right? So you need access to everything that's on a web page. Fortunately, JavaScript gives you access to any element on a page using just a few methods and properties.

Solution

Although it's possible to navigate an HTML document like a road map -- starting from home and working your way towards your destination one node at a time -- this is usually an inefficient way of finding an element because it requires a lot of code, and any changes in the structure of the document will usually mean that you have to rewrite your scripts. If you want to find something quickly and easily, the method that you should tattoo onto the back of your hand is document.getElementById.

Assuming that you have the correct markup in place, getElementById will allow you immediately to access any element by its unique id attribute value. For instance, imagine your web page contains this code:

Example 5.1. access_element.html (excerpt)  
 
<p>  
 <a id="sirius" href="sirius.html">Journey to the stars</a>  
</p>

You can use the a element's id attribute to get direct access to the element itself:

Example 5.2. access_element.js (excerpt)  
 
var elementRef = document.getElementById("sirius");

The variable elementRef now refers to the a element -- any operations that you perform on elementRef will affect that exact hyperlink.
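When no element has the requested ID, getElementById returns null, and any later attempt to use the result will fail. The following sketch shows one way to catch that early. The helper name requireElementById and its doc parameter are assumptions made for illustration; doc would normally be document.

```javascript
// Hypothetical helper: wraps getElementById so that a missing element
// is reported immediately instead of causing an error later on.
// "doc" is the document object to search (normally document).
function requireElementById(doc, id)
{
  var element = doc.getElementById(id);

  if (!element)
  {
    throw new Error('No element with id "' + id + '" was found');
  }

  return element;
}
```

With the markup from Example 5.1, requireElementById(document, "sirius") would return the same reference as the direct call above.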

getElementById is good for working with a specific element; however, sometimes you'll want to work with a group of elements. In order to retrieve a group of elements on the basis of their tag names, you can use the method getElementsByTagName.

As can be seen from its name, getElementsByTagName takes a tag name and returns all elements of that type. Assume that we have this HTML code:

Example 5.3. access_element2.html (excerpt)  
 
<ul>  
 <li>  
   <a href="sirius.html">Sirius</a>  
 </li>  
 <li>  
   <a href="canopus.html">Canopus</a>  
 </li>  
 <li>  
   <a href="arcturus.html">Arcturus</a>  
 </li>  
 <li>  
   <a href="vega.html">Vega</a>  
 </li>  
</ul>

We can retrieve a collection that contains each of the hyperlinks like so:

Example 5.4. access_element2.js (excerpt)  
 
var anchors = document.getElementsByTagName("a");

The value of the variable anchors will now be a collection of a elements. Collections are similar to arrays in that each of the items in a collection is referenced using square bracket notation, and the items are indexed numerically starting at zero. The collection returned by getElementsByTagName sorts the elements by their source order, so we can reference each of the links thus:

anchors[0]
   the a element for "Sirius"
anchors[1]
   the a element for "Canopus"
anchors[2]
   the a element for "Arcturus"
anchors[3]
   the a element for "Vega"

Using this collection you can iterate through the elements and perform an operation on them, such as assigning a class using the element nodes' className property:

Example 5.5. access_element2.js (excerpt)  
 
var anchors = document.getElementsByTagName("a");  
 
for (var i = 0; i < anchors.length; i++)  
{  
 anchors[i].className = "starLink";  
}
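Collections resemble arrays, but they are not true arrays, and the collection returned by getElementsByTagName is live: it updates as matching elements are added to or removed from the document. If you need a fixed snapshot, or want to use array methods, one option is to copy the items into a real array. This is a sketch only, and the name collectionToArray is hypothetical.

```javascript
// Hypothetical helper: copies the items of an array-like collection
// into a true array, preserving their order.
function collectionToArray(collection)
{
  var result = [];

  for (var i = 0; i < collection.length; i++)
  {
    result.push(collection[i]);
  }

  return result;
}
```

In a browser, collectionToArray(document.getElementsByTagName("a")) would give an array of the links that exist at the moment of the call.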

Unlike getElementById, which may be called on the document node only, the getElementsByTagName method is available from every single element node. You can limit the scope of the getElementsByTagName method by executing it on a particular element. getElementsByTagName will only return elements that are descendants of the element on which the method was called.

If we have two lists, but want to assign a new class to the links in one list only, we can target those a elements exclusively by calling getElementsByTagName on their parent list:

Example 5.6. access_element3.html (excerpt)  
 
<ul id="planets">  
 <li>  
   <a href="mercury.html">Mercury</a>  
 </li>  
 <li>  
   <a href="venus.html">Venus</a>  
 </li>  
 <li>  
   <a href="earth.html">Earth</a>  
 </li>  
 <li>  
   <a href="mars.html">Mars</a>  
 </li>  
</ul>  
<ul id="stars">  
 <li>  
   <a href="sirius.html">Sirius</a>  
 </li>  
 <li>  
   <a href="canopus.html">Canopus</a>  
 </li>  
 <li>  
   <a href="arcturus.html">Arcturus</a>  
 </li>  
 <li>  
   <a href="vega.html">Vega</a>  
 </li>  
</ul>

To target the list of stars, we need to obtain a reference to the parent ul element, then call getElementsByTagName on it directly:

Example 5.7. access_element3.js (excerpt)  
 
var starsList = document.getElementById("stars");  
var starsAnchors = starsList.getElementsByTagName("a");

The value of the variable starsAnchors will be a collection of the a elements inside the stars unordered list, instead of a collection of all a elements on the page.
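The scoped lookup and the class-assigning loop can be combined into one reusable function. The sketch below assumes a hypothetical helper named setClassOnDescendants; root can be any node that offers getElementsByTagName.

```javascript
// Hypothetical helper: assigns a class to every descendant of "root"
// that has the given tag name, and returns how many were changed.
function setClassOnDescendants(root, tagName, className)
{
  var matches = root.getElementsByTagName(tagName);

  for (var i = 0; i < matches.length; i++)
  {
    matches[i].className = className;
  }

  return matches.length;
}
```

For the two lists above, setClassOnDescendants(document.getElementById("stars"), "a", "starLink") would affect only the links in the stars list.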

DOM 0 Collections

Many "special" elements in an HTML document can be accessed by even more direct means. The body element of the document can be accessed as document.body. A collection of all the forms in a document may be found in document.forms. All of the images in a document may be found in document.images.

In fact, most of these collections have been around since before the DOM was standardized by the W3C, and are commonly referred to as DOM 0 properties.

Because the initial implementations of these features were not standardized, these collections have occasionally proven unreliable in browsers that are moving towards standards compliance. Early versions of some Mozilla browsers (such as Firefox), for example, did not support these collections on XHTML documents.

Today's browsers generally do a good job of supporting these collections; however, if you do run into problems, it's worth trying the more verbose getElementsByTagName method of accessing the relevant elements. Instead of document.body, for example, you could use:

var body = document.getElementsByTagName("body")[0];
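If you want both approaches in one place, the fallback could be wrapped in a small function like the one sketched here. The name getBody and the doc parameter are assumptions made for illustration.

```javascript
// Hypothetical helper: prefers the DOM 0 property document.body, and
// falls back to getElementsByTagName when that property is unavailable.
function getBody(doc)
{
  if (doc.body)
  {
    return doc.body;
  }

  var bodies = doc.getElementsByTagName("body");

  // A document without a body element yields null
  return bodies.length > 0 ? bodies[0] : null;
}
```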

Discussion

If you really need to step through the DOM hierarchy element by element, each node has several properties that enable you to access related nodes:

  • node.childNodes - a collection that contains source-order references to each of the children of the specified node, including both elements and text nodes
  • node.firstChild - the first child node of the specified node
  • node.lastChild - the last child node of the specified node
  • node.parentNode - a reference to the parent node of the specified node
  • node.nextSibling - the next node in the document that has the same parent as the specified node
  • node.previousSibling - the previous node in the document that has the same parent as the specified node

If a related node does not exist (e.g., the last child of a parent has no next sibling), the corresponding property will have a value of null. The one exception is childNodes, which is simply an empty collection when a node has no children.
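Because parentNode eventually becomes null (the document node has no parent), these properties lend themselves to simple loops. As a sketch, the hypothetical function getAncestors below collects every ancestor of a node by following parentNode until it runs out.

```javascript
// Hypothetical helper: returns an array of a node's ancestors,
// starting with its parent and ending with the topmost node.
function getAncestors(node)
{
  var ancestors = [];
  var current = node.parentNode;

  // parentNode is null once we step past the top of the tree
  while (current)
  {
    ancestors.push(current);
    current = current.parentNode;
  }

  return ancestors;
}
```

Called on an element in a page, the resulting array would begin with the element's parent and end with the document node.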

Take a look at this simple page:

Example 5.8. access_element4.html (excerpt)  
 
<div id="outerGalaxy">  
 <ul id="starList">  
   <li id="star1">  
     Rigel  
   </li>  
   <li id="star2">  
     Altair  
   </li>  
   <li id="star3">  
     Betelgeuse  
   </li>  
 </ul>  
</div>

Ignoring any whitespace text nodes between the list items (see "Whitespace Nodes" below), the list item with ID star2 could be referenced using any of these expressions:

document.getElementById("star1").nextSibling;  
document.getElementById("star3").previousSibling;  
document.getElementById("starList").childNodes[1];  
document.getElementById("star1").parentNode.childNodes[1];

Whitespace Nodes

Some browsers will create whitespace nodes between the element nodes in any DOM structure that was interpreted from a text string (e.g., an HTML file). Whitespace nodes are text nodes that contain only whitespace (tabs, spaces, new lines) to help format the code in the way it was written in the source file.

When you're traversing the DOM node by node using the above properties, you should always allow for these whitespace nodes. Usually, this means checking that the node you've retrieved is an element node, not just a whitespace node that's separating elements.

There are two easy ways to check whether a node is an element node or a text node. The nodeName property of a text node will always be "#text", whereas the nodeName of an element node will identify the element type. However, in distinguishing text nodes from element nodes, it's easier to check the nodeType property. Element nodes have a nodeType of 1, whereas text nodes have a nodeType of 3. You can use this knowledge as a test when retrieving elements:

Example 5.9. access_element4.js (excerpt)  
 
var star2 = document.getElementById("star1").nextSibling;

// Skip text (whitespace) nodes; stop if we run out of siblings
while (star2 && star2.nodeType == 3)
{
 star2 = star2.nextSibling;
}
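The same test can be packaged into reusable functions. The two helpers sketched below, nextElement and getElementChildren, are hypothetical names; they check for a nodeType of 1 rather than 3, so they also skip any other non-element nodes, such as comments.

```javascript
// Hypothetical helper: returns the next sibling that is an element
// node, or null if there is no such sibling.
function nextElement(node)
{
  var sibling = node.nextSibling;

  while (sibling && sibling.nodeType != 1)
  {
    sibling = sibling.nextSibling;
  }

  return sibling;
}

// Hypothetical helper: returns an array that holds only the element
// children of a node, leaving out text nodes and other node types.
function getElementChildren(parent)
{
  var elements = [];

  for (var i = 0; i < parent.childNodes.length; i++)
  {
    if (parent.childNodes[i].nodeType == 1)
    {
      elements.push(parent.childNodes[i]);
    }
  }

  return elements;
}
```

With the markup from Example 5.8, both nextElement(document.getElementById("star1")) and getElementChildren(document.getElementById("starList"))[1] should refer to the star2 list item, whether or not the browser has created whitespace nodes.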

Using these DOM properties, it's possible to start your journey at the root html element, and end up buried in the legend of some deeply-nested fieldset -- it's all just a matter of following the nodes.

Creating Elements and Text Nodes

JavaScript doesn't just have the ability to modify existing elements in the DOM; it can also create new elements and place them anywhere within a page's structure.

Solution

createElement is the aptly named method that allows you to create new elements. It only takes one argument -- the type (as a string) of the element you wish to create -- and returns a reference to the newly-created element:

Example 5.10. create_elements.js (excerpt)  
 
var newAnchor = document.createElement("a");

The variable newAnchor will be a new a element, ready to be inserted into the page.

Specifying Namespaces in Documents with an XML MIME Type

If you're coding JavaScript for use in documents with a MIME type of application/xhtml+xml (or some other XML MIME type), you should use the method createElementNS, instead of createElement, to specify the namespace for which you're creating the element:

var newAnchor = document.createElementNS(  
   "http://www.w3.org/1999/xhtml", "a");

This distinction applies to a number of DOM methods, such as removeAttribute/removeAttributeNS and getAttribute/getAttributeNS; however, we won't use the namespace-enhanced versions of these methods in this book.

Simon Willison provides a brief explanation of working with JavaScript and different MIME types on his web site.
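If a script has to run in both kinds of document, one possible approach is to use createElementNS when the browser provides it and fall back to createElement otherwise. This is only a sketch; the helper name createHtmlElement and the constant XHTML_NS are assumptions made for illustration.

```javascript
// The XHTML namespace, as used in the createElementNS call above
var XHTML_NS = "http://www.w3.org/1999/xhtml";

// Hypothetical helper: creates an element in the XHTML namespace when
// createElementNS is available, and uses createElement otherwise.
function createHtmlElement(doc, tagName)
{
  if (doc.createElementNS)
  {
    return doc.createElementNS(XHTML_NS, tagName);
  }

  return doc.createElement(tagName);
}
```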

The text that goes inside an element is actually a child text node of the element, so it must be created separately. Text nodes are different from element nodes, so they have their own creation method, createTextNode:

Example 5.11. create_elements.js (excerpt)  
 
var anchorText = document.createTextNode("monoceros");

If you're modifying an existing text node, you can access the text it contains via the nodeValue property. This allows you to get and set the text inside a text node:

var textNode = document.createTextNode("monoceros");  
var oldText = textNode.nodeValue;  
textNode.nodeValue = "pyxis";

The value of the variable oldText is now "monoceros", and the text inside textNode is now "pyxis".
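Since an element's text lives in its child text nodes, reading all the text inside an element means visiting those nodes and joining their nodeValue strings. The function below is a minimal sketch of that idea; the name collectText is hypothetical.

```javascript
// Hypothetical helper: gathers the text of a node and all of its
// descendants by concatenating the nodeValue of every text node.
function collectText(node)
{
  if (node.nodeType == 3)
  {
    return node.nodeValue;
  }

  var text = "";
  var children = node.childNodes || [];

  for (var i = 0; i < children.length; i++)
  {
    text += collectText(children[i]);
  }

  return text;
}
```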

You can insert either an element node or a text node as the last child of an existing element using its appendChild method. This method will place the new node after all of the element's existing children.

Consider this fragment of HTML:

Example 5.12. create_elements.html (excerpt)  
 
<p id="starLinks">  
 <a href="sirius.html">Sirius</a>  
</p>

We can use DOM methods to create and insert another link at the end of the paragraph:

Example 5.13. create_elements.js (excerpt)  
 
var anchorText = document.createTextNode("monoceros");  
 
var newAnchor = document.createElement("a");  
newAnchor.appendChild(anchorText);  
 
var parent = document.getElementById("starLinks");  
var newChild = parent.appendChild(newAnchor);

The value of the variable newChild will be a reference to the newly inserted element.

If we were to translate the state of the DOM after this code had executed into HTML code, it would look like this:

<p id="starLinks">  
 <a href="sirius.html">Sirius</a><a>monoceros</a>  
</p>

We didn't specify any attributes for the new element, so it doesn't link anywhere at the moment. The process for specifying attributes is explained shortly in the section called "Reading and Writing the Attributes of an Element".
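Creating an element and giving it some text is a common pairing, so the two steps could be bundled together. The sketch below assumes a hypothetical helper named createElementWithText; doc would normally be document.

```javascript
// Hypothetical helper: creates an element of the given type and gives
// it a single child text node that contains the supplied text.
function createElementWithText(doc, tagName, text)
{
  var element = doc.createElement(tagName);

  element.appendChild(doc.createTextNode(text));

  return element;
}
```

The link in Example 5.13 could then be built and inserted with parent.appendChild(createElementWithText(document, "a", "monoceros")).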

Discussion

There are three basic ways by which a new element or text node can be inserted into a web page. The approach you use will depend upon the point at which you want the new node to be inserted: as the last child of an element, before another node, or as the replacement for a node. The process of appending an element as the last child was explained above. You can insert the node before an existing node using the insertBefore method of its parent element, and you can replace a node using the replaceChild method of its parent element.

In order to use insertBefore, you need to have references to the node you're going to insert, and to the node before which you wish to insert it. Consider this HTML code:

Example 5.14. create_elements2.html (excerpt)  
 
<p id="starLinks">  
 <a id="sirius" href="sirius.html">Sirius</a>  
</p>

We can insert a new link before the existing one by calling insertBefore from its parent element (the paragraph):

Example 5.15. create_elements2.js (excerpt)  
 
var anchorText = document.createTextNode("monoceros");  
 
var newAnchor = document.createElement("a");  
newAnchor.appendChild(anchorText);  
 
var existingAnchor = document.getElementById("sirius");  
var parent = existingAnchor.parentNode;  
var newChild = parent.insertBefore(newAnchor, existingAnchor);

The value of the variable newChild will be a reference to the newly inserted element.

If we were to translate into HTML the state of the DOM after this operation, it would look like this:

<p id="starLinks">  
 <a>monoceros</a><a id="sirius" href="sirius.html">Sirius</a>  
</p>

Alternatively, we could replace the existing link entirely using replaceChild:

Example 5.16. create_elements3.js (excerpt)  
 
var anchorText = document.createTextNode("monoceros");  
 
var newAnchor = document.createElement("a");  
newAnchor.appendChild(anchorText);  
 
var existingAnchor = document.getElementById("sirius");  
var parent = existingAnchor.parentNode;  
// replaceChild returns the node that was removed (the old link), not the new one
var replacedChild = parent.replaceChild(newAnchor, existingAnchor);

The DOM would then look like this:

<p id="starLinks">  
 <a>monoceros</a>  
</p>
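The DOM offers insertBefore but no matching method for inserting after a node. One common workaround, sketched here under the hypothetical name insertAfter, is to insert the new node before the reference node's next sibling.

```javascript
// Hypothetical helper: inserts newNode immediately after referenceNode
// and returns the inserted node.
function insertAfter(newNode, referenceNode)
{
  var parent = referenceNode.parentNode;

  // When referenceNode is the last child, nextSibling is null, and
  // insertBefore with a null reference places the node at the end.
  return parent.insertBefore(newNode, referenceNode.nextSibling);
}
```

With the markup from Example 5.14, insertAfter(newAnchor, document.getElementById("sirius")) would place the new link after the existing one.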
