
About the Author

James Edwards

James (aka brothercake) is a front-end developer based in the UK, specializing in advanced JavaScript programming and accessible web site development. He is an outspoken advocate of standards-based development, and contributing author to SitePoint's The Art & Science of JavaScript.


Accessible JavaScript: Beyond the Mouse

By James Edwards

December 13th, 2006


In my last article for SitePoint, I questioned whether AJAX scripting techniques can be made accessible to screen readers, and discovered that, for the most part, they can't. It's disappointing to do that -- to point out a problem and not be able to offer any answers. But I really had no choice, because as far as I could tell, there weren't any concrete solutions to offer. (Although since then other developers have pushed the envelope further; of particular significance is the work that Gez Lemon and Steve Faulkner are doing in this area.)

But accessibility isn't always difficult! I'm very sensitive to the fact that it's seen by many people as a load of problems, when in fact accessibility is merely another design challenge that, in general, is no more difficult or problematic than any other. AJAX is a particularly awkward example. Most of the time, though, providing for accessibility really isn't that hard.

You can't always get what you want; but if you try sometimes, you just might find, you get what you need.

-- Rolling Stones

In this article, I'd like to provide a little gratification to those attempting to make their web applications accessible. To achieve this, I'll talk about some of the more basic, solvable issues relating to JavaScript accessibility, by way of an introduction to device-independent scripting.

Keyboard Navigation

Most of us use a mouse for the majority of our graphic interface navigation, but some people can't, and must therefore navigate using the keyboard instead. For a person who has a hand tremor, for example, the precision control required to use a mouse effectively may simply be impossible. For users of assistive technologies such as screen readers, the keyboard is the primary method of interaction. After all, it's rather difficult to use a mouse when you can't see the pointer!

Providing for keyboard access also creates better usability, because many people who can use a mouse nonetheless prefer to use a keyboard for certain tasks or at certain times. These tend to be power users -- people who are generally more familiar with how their computers work, and expect to be able to interact with functionality using either the mouse or the keyboard as their needs dictate.

If you're not in the habit of navigating sites with the keyboard, try it now! Spend some time on your own site, and on other sites you visit regularly, to get a feel for what it's like to surf without a mouse. Discover where difficulties arise, and think about how those issues could be avoided.

Device Independence

Referring to "keyboard" access is ever-so-slightly misleading, because it's not just the keyboard we're talking about per se. We're talking about trying to provide for device independence, so that whatever a user's mode of interaction, they're able to use a script.

Mouse events, for example, may not be generated by a mouse at all. They might arise from the movement of a trackball, or the analog stick on a handheld gaming console. Focus events might be generated by a keyboard user navigating with the Tab key, or as the result of navigation commands spoken by an Opera user making use of the browser's voice control functionality.

In theory, we would like to be able to support any mode of interaction, regardless of the input device. But in practice, all these forms of interaction generally boil down to one of two basic types: "mouse" (clicking on, or moving an interface element) and "keyboard" (providing input or instructions via character input). These deal with two fairly discrete subsets of the events exposed by the browser, ignoring the majority of programmatic events (loading, errors, etc.).

Three Pillars

I'm going to assume that you're already quite familiar with scripting for mouse events, and look only at scripting for keyboard events. (If you need an introduction to events, and detailed coverage of the real-world use of modern JavaScript techniques, you might like to check out my book.) To that end, there are three core things that I want to discuss -- three "pillars" you might say -- that together provide a foundation for device independence:

  1. Provide accessible interactive elements.

  2. Choose appropriate trigger elements.

  3. Aim to pair scripting hooks, not event hooks.

These terms may not make sense now, but will by the time you finish reading this article.

I'd also like you to bear in mind, as we go through these points, that catering to accessibility is about providing equivalence, which is not the same as equality. It doesn't necessarily matter if we provide different paths for different users, so long as everyone has a path to an equivalent end result.

When we look at some practical examples later on, we'll see how even radically different approaches can result in solid equivalence overall.

Providing Accessible Interactive Elements

First and foremost, if we want to capture input from the keyboard, we'll need to use elements that can accept the focus: primarily links (<a>) and form controls (<input>, <select>, <textarea> and <button>). Note that it's also possible to assign focus to the <area> elements in an image-map, a <frame> or <iframe>, in some cases an <object> (depending on what type of data it embeds), and in most browsers, the document or documentElement itself.
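As a rough illustration of the list above, here is a minimal sketch of a check for the focusable element types just mentioned. The helper name isNativelyFocusable is an assumption made for this example, not part of any library, and it deliberately ignores edge cases such as <object>, hidden or disabled controls, and the tabindex attribute.

```javascript
// Tag names listed above as able to accept focus by default.
// <object> and the document itself are left out because their
// behavior depends on the browser and the embedded content.
var FOCUSABLE_TAGS = ['input', 'select', 'textarea', 'button', 'frame', 'iframe'];

// Hypothetical helper: accepts a DOM element, or any object that has
// a tagName property and a hasAttribute() method.
function isNativelyFocusable(element) {
  var tag = String(element.tagName).toLowerCase();

  // Links and image-map areas only accept focus when they have an href.
  if (tag === 'a' || tag === 'area') {
    return element.hasAttribute('href');
  }
  return FOCUSABLE_TAGS.indexOf(tag) !== -1;
}
```

In a browser this could be called as isNativelyFocusable(document.getElementById('some-id')), where the id is whatever your own markup uses.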

The only events we can handle for these interactions are events that the keyboard can actually generate: primarily focus, blur (triggered when the currently focused element loses focus), click (activating a link or button with the keyboard is programmatically the same as clicking it with a mouse), and the three key-action events, keydown, keyup and keypress.
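To make that list concrete, the sketch below attaches only the handlers whose event type a keyboard can generate, and reports the rest so that an equivalent path can be provided for them. The helper name addKeyboardReachableHandlers is hypothetical, and the use of the standard addEventListener method is an assumption; older browsers without it would need their own attachment method.

```javascript
// The event types named above as ones the keyboard can generate.
var KEYBOARD_EVENTS = ['focus', 'blur', 'click', 'keydown', 'keyup', 'keypress'];

// Hypothetical helper: `handlers` maps event types to functions.
// Keyboard-reachable handlers are attached to `element`; the types
// that were not attached are returned as an array.
function addKeyboardReachableHandlers(element, handlers) {
  var skipped = [];
  for (var type in handlers) {
    if (!Object.prototype.hasOwnProperty.call(handlers, type)) {
      continue;
    }
    if (KEYBOARD_EVENTS.indexOf(type) !== -1) {
      element.addEventListener(type, handlers[type], false);
    } else {
      skipped.push(type);
    }
  }
  return skipped;
}
```

A returned type such as mouseover is not an error. It is simply a reminder that keyboard users will need another route to the same result.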

In addition to these direct input events, we can use programmatic events -- that is, events that fire indirectly in response to state changes. Examples of programmatic events include the infamous window.onload event and the onreadystatechange event of an XMLHttpRequest object.

We can also use events that are mode independent, i.e. those for which the user's mode of interaction doesn't have any effect on how or when they fire, such as the submit event of a form.
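As a small sketch of a mode-independent hook, the following function runs a validation check on a form's submit event rather than on a click of its submit button, so it applies however the form is submitted. The names validateOnSubmit and isValid are assumptions made for this example, as is the availability of addEventListener.

```javascript
// Hypothetical helper: `form` is a <form> element and `isValid` is a
// caller-supplied function that returns true when the form may be sent.
function validateOnSubmit(form, isValid) {
  form.addEventListener('submit', function (event) {
    // submit fires whether the user clicked a submit button or
    // pressed Enter in a field, so one check covers both.
    if (!isValid(form)) {
      event.preventDefault(); // block the submission
    }
  }, false);
}
```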

However -- and this is a significant caveat -- that doesn't mean we have to consign mouse-specific events to the trash, nor relegate non-focusable elements to the sidelines altogether. It just means we'll have to rethink our approach to some tasks. Remember, it's about equivalence, not equality. All paths are good, so long as every user can access at least one of them.

Choosing Appropriate Trigger Elements

I'm using the term "trigger element" to refer to any element that's used to trigger a behavioral response. A trigger element is something a user interacts with in order to cause something else to happen. It could be a simple link to "Add a tag" to a photo on flickr:

[Image: the "Add a tag" link on a flickr photo page]

Or it could comprise a series of icons at the top of a photo, designed to allow users to perform actions like adding a photo to their favorites:

[Image: the row of icons above a flickr photo]

But as we've already noted, the choice of elements we have available to implement these triggers is limited.

Now, the <button> element is a particular favourite of mine because it's so amazingly flexible: it can be styled as much as any other element, it can contain other HTML, it can be enabled or disabled and report that state to user-agents, and it can work as an active trigger element without having a value. However, like all form controls, its only valid context is inside a <form>.

By contrast, the problem with using links as triggers is that although you can have them appear any way you like, they always have to have a value of some kind: a link with nothing in its href attribute is not accessible to the keyboard.

The generally accepted best practice is to use progressive enhancement -- include a default href attribute that points to equivalent, non-scripted functionality -- but that's not necessarily appropriate when we're working in an entirely scripted environment (for example, in dealing with a link which itself was generated with scripting, in an application that caters to non-script users elsewhere). This situation often results in the need for links to have "#" or "javascript:void(null)", or a similar -- essentially junk -- href.
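The progressive enhancement approach described above can be sketched as follows: the link keeps a real href for non-script users, and a click handler takes over when scripting is available. The function name enhanceLink, the id "add-tag", the URL and the openTagEditor callback are all made up for this illustration.

```javascript
// Hypothetical enhancer: `link` is an <a> whose href points at
// equivalent non-scripted functionality; `action` is the scripted
// version, and receives that href as its argument.
function enhanceLink(link, action) {
  link.addEventListener('click', function (event) {
    // click fires for a mouse click and for keyboard activation,
    // so a single handler serves both modes of interaction.
    event.preventDefault(); // stay on the page; the script takes over
    action(link.getAttribute('href'));
  }, false);
}

// Possible usage in a browser (all names here are illustrative):
//   <a id="add-tag" href="/photos/tags/add">Add a tag</a>
//   enhanceLink(document.getElementById('add-tag'), openTagEditor);
```

Because the href is real, the link stays reachable from the keyboard and still does something useful if the script never runs.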

All of this is somewhat beside the point, though, as our choice of element should be based on what the trigger actually is, and on what it does. We can't just use a <button> for convenience, and to avoid the problem with links, or vice versa. We have to consider semantics, and try to make sure that a trigger element is what it appears to be, and that its appearance is consistent with its function.

This is not always easy; the flickr icons example is a particularly tricky one. Let's look at that again:

[Image: the row of icons above a flickr photo]

The overall appearance of these icons suggests that they're buttons, like the toolbar buttons in Photoshop or MS Office. But functionally speaking, the first three are scripted actions, while the last one is actually a link to another page.

So, should the first three be <button> elements while the last is an <a>? Maybe "all sizes" should be a separate link that's not part of this toolbar at all?

What about the "Add a tag" link?

[Image: the "Add a tag" link on a flickr photo page]

Shouldn't that be -- and look like -- a button, since it's a scripted action, not a page view? (And, while we're at it, shouldn't it do something if JavaScript is not available ...?)

Perhaps the overall conclusion in this case is that flickr's interface design, like so much of the Web 2.0 genre, is just a little haphazard and not properly thought through.

But all of this really does matter -- semantics aren't just an exercise in navel-gazing. The choice of elements matters a great deal to user agents, as they depend on markup semantics to identify what the content is, which in turn, matters to ordinary users hoping to use that content effectively.

In case you still feel that this is nothing more than an academic discussion of semantic purity, let's look at a practical example of why trigger element choice matters in the real world: Opera's keyboard navigation.

Opera uses different keys for navigating form elements than it does for navigating links (form elements use the Tab key, while link navigation uses "A" and "Q" for "next anchor" and "previous anchor" respectively). So if we use interface elements that look like buttons for links, or vice versa, we'll create a cognitive and usability problem for Opera users who navigate using the keyboard.

As another example, let's examine what Basecamp does in its Writeboard application:

[Image: the "Edit this page" control in Writeboard]

"Edit this page" looks like a button, so we should be able to Tab to it just like any other; but we can't, because it isn't a button at all. It's a styled link.

Perhaps it should be a <button> after all, since that's what it looks like. Or should it just be (and look like) a simple link, since what it actually does is load a whole new page? In this case, I think the latter.

Like I said, this aspect is not always easy, but it has to be considered if an application is to be as intuitive with the keyboard as it is with the mouse. In general, I think that links should be used for actions that load a new page without posting any data (i.e. GET requests), and that buttons or other appropriate form widgets should be used for everything else. (What is an application, after all, other than a complex form?) This view is echoed by the HTTP/1.1 specification, which states that GET requests should not be used for actions that will change a resource, such as deleting, creating, or updating content.
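The rule of thumb above is simple enough to write down as a tiny decision function. This is only a sketch of that one guideline: the function name suggestTriggerElement and the two property names on its argument are assumptions made for this example, and real interfaces will need more judgment than two flags can capture.

```javascript
// Hypothetical helper: `action` describes what the trigger does.
//   loadsNewPage - true if activating it loads a whole new page
//   sendsData    - true if it posts data or changes a resource
// Returns the tag name of the suggested trigger element.
function suggestTriggerElement(action) {
  if (action.loadsNewPage && !action.sendsData) {
    return 'a'; // a plain page view, i.e. a GET request
  }
  return 'button'; // scripted actions and anything that changes data
}

// Applied to the examples discussed in this article:
//   "all sizes" on flickr loads another page    -> link
//   "Add a tag" is a scripted action            -> button
//   "Edit this page" loads a whole new page     -> link
var allSizes = suggestTriggerElement({ loadsNewPage: true, sendsData: false });
var addATag = suggestTriggerElement({ loadsNewPage: false, sendsData: false });
```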

But in all cases, a trigger element must look like what it is.