HTML or XHTML: Does it Really Matter?
Arise HTML 5?
One of the primary aims of HTML 5 is to define how exceptions should be handled, so that malformed or invalid markup will be treated in a predictable way. Unlike previous versions of HTML, the specification is anchored firmly in the real world: it is based on observation of the content and implementations that are already out there, and it will not be considered final until there are at least two interoperable implementations.
HTML 5 syntax is compatible with both HTML 4 and XHTML 1. HTML 5 documents that use HTML 4 syntax must be served as text/html, while those that use XHTML syntax must be served as XML. The DOCTYPE has also been greatly simplified, and is used only to switch a browser into standards mode (rather than to refer to a DTD); the DOCTYPE is not required for XML documents, which are always rendered in standards mode.
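As a rough illustration of the simplified DOCTYPE, here is a minimal sketch of an HTML 5 document in HTML syntax, intended to be served as text/html. The title and paragraph text are placeholders invented for this example.

```html
<!DOCTYPE html>
<!-- The short DOCTYPE above refers to no DTD; it only triggers standards mode. -->
<html lang="en">
  <head>
    <meta charset="utf-8">
    <title>Placeholder title</title>
  </head>
  <body>
    <p>Placeholder content.</p>
  </body>
</html>
```

Written in XHTML syntax instead, the same document would declare the XHTML namespace (xmlns="http://www.w3.org/1999/xhtml") on the root element, close every element explicitly, and be served with an XML media type such as application/xhtml+xml; the DOCTYPE could then be left out.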
And, perhaps most significantly for authors, the specification adds a whole new raft of elements, attributes, and scriptable APIs. These include:
- new structural content elements, such as <article>, <section>, <header>, and <footer>
- new embedding elements, such as <figure>, <audio>, and <video>
- new semantics for common data structures, such as <time> and <datagrid>
- elements designed specifically for building web applications, such as <output> (for the output of a scripted process), <progress> (for showing the progress of a long process), and <event-source> (used for handling server-sent events), as well as a range of new <input> types, such as datetime, range, email, and url
- a range of new scripting methods for addressing documents and embedded content, such as a 2D-drawing API for the <canvas> element, a drag-and-drop API for the draggable attribute, and additional DOM methods like getElementsByClassName() and getSelection()
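To give a feel for how a few of these additions fit together, here is a small sketch of a page that combines some of the structural elements with the canvas 2D-drawing API and getElementsByClassName(). All text, the date, the id and class names, and the colour are placeholders chosen for this example, and browser support is assumed rather than guaranteed.

```html
<!DOCTYPE html>
<html lang="en">
  <head>
    <meta charset="utf-8">
    <title>Placeholder article page</title>
  </head>
  <body>
    <article>
      <header>
        <h1>Placeholder headline</h1>
        <!-- The date is an arbitrary placeholder. -->
        <p>Published <time datetime="2008-01-01">1 January 2008</time></p>
      </header>
      <section class="summary">
        <p>Placeholder summary text.</p>
      </section>
      <figure>
        <canvas id="chart" width="200" height="100">
          Fallback text for browsers without canvas support.
        </canvas>
      </figure>
      <footer>
        <p>Placeholder footer.</p>
      </footer>
    </article>
    <script>
      // Draw a filled rectangle with the 2D-drawing API, if it is available.
      var canvas = document.getElementById("chart");
      if (canvas.getContext) {
        var context = canvas.getContext("2d");
        context.fillStyle = "#336699";
        context.fillRect(10, 10, 120, 60);
      }

      // Look up elements by class name and annotate the first match.
      var summaries = document.getElementsByClassName("summary");
      if (summaries.length > 0) {
        summaries[0].setAttribute("title", "Found by class name");
      }
    </script>
  </body>
</html>
```

The script sits at the end of the body so that the elements it looks up already exist when it runs.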
However, along with these additions are a number of controversial removals, many of which concern accessibility features: think of the alt attribute of <img>, and the summary and headers attributes of <table> markup. The main rationale for removing these is that in practice they are barely used, or barely used correctly. I contend that this isn’t a good enough reason. These are valuable and necessary accessibility features, and removing them without a specific good reason is not okay. Sure, it’s a shame that these attributes are so seldom used correctly, but that is a reason to educate developers better on their correct use, not a reason to remove them.
NOTE:
Within the HTML 5 working group, the discussions about this seem to be mostly focused on the needs of authoring tools. But frankly, I don’t see why we should care about their needs. Those who’ve shown a determination to work to standards don’t need to be persuaded about their value; those who haven’t are not going to be persuaded now. The commercial agendas of individual companies should not in any way inform the standards-making process.
Conversely, some presentational elements and attributes have been retained because they’re so commonly used in practice, effectively sanctioning the use of non-semantic markup. These include <hr>, <b>, and <small>. Yet at the same time, other elements such as <big> and <center> have been removed on the basis that they’re purely presentational. I’m not sure how this distinction was arrived at other than by reference to popular usage, but it’s bogus in my view. When defining a specification, we should consider the usefulness and relevance of a particular piece of markup—not ratify the incorrect ways in which people are already using it.
The situation for <table> markup is especially disappointing. Since the headers attribute has been removed, there’s no longer any way to describe the internal structure of a complex table for assistive technologies, where scope isn’t enough. Many developers have written about this issue, most notably Gez Lemon.
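For readers who haven’t used it, this is roughly what the HTML 4 headers attribute looks like in a table with two levels of column headings, the kind of structure where scope alone struggles. It is only a sketch: the caption, ids, and cell contents are placeholders invented for this example.

```html
<table>
  <caption>Placeholder timetable</caption>
  <tr>
    <td></td>
    <!-- Top-level heading that spans both time slots. -->
    <th id="mon" colspan="2">Monday</th>
  </tr>
  <tr>
    <td></td>
    <!-- Second-level headings point back to their parent heading. -->
    <th id="mon-am" headers="mon">Morning</th>
    <th id="mon-pm" headers="mon">Afternoon</th>
  </tr>
  <tr>
    <th id="room-a">Room A</th>
    <!-- Each data cell lists the ids of every heading that applies to it. -->
    <td headers="room-a mon mon-am">Workshop</td>
    <td headers="room-a mon mon-pm">Free</td>
  </tr>
</table>
```

Each headers value is a space-separated list of ids, which lets assistive technologies announce the full heading context for a cell regardless of where that cell sits in the grid.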
The specification aims to provide a “focus on accessibility as a built-in concept for new features,” which implies a desire to have accessibility baked in. But how does removing accessibility features—and not replacing them with alternatives—achieve that aim? HTML 5 does not have accessibility baked in; it barely considers it as an afterthought, because its true emphasis is on providing semantics for visual web application interfaces, rather than structured and mode-independent documents.
In many ways HTML 5 is an exciting development, as it offers a huge and comprehensive range of new semantics and APIs. This has to be a good thing—we’ve outgrown HTML 4 and it’s no longer fit for our purposes. However, the lack of serious focus on accessibility, the over-emphasis on the needs of authoring tools and RPC applications, and the excessively pragmatic attempt to sanction existing bad practices are all causes for concern.
But what is particularly interesting, I think, is how HTML 5 came about in the first place. The W3C didn’t initiate it; rather, it was drafted and developed by an independent group called WHATWG (Web Hypertext Application Technology Working Group) and only later embraced by the W3C.
I find it extremely pertinent to note how such a major development was beyond the vision of the W3C and had to be kicked into life independently. We saw the same situation with microformats, and both of these instances suggest the W3C has grown incapable of innovating. This stagnation is possibly a facet of its excessive bureaucracy, a tendency of all large and established organizations.
Avast XHTML 2?
It seems unlikely that XHTML 2 will gain serious traction. It is implementations that make or break a technology—and nobody (not even Mozilla) seems interested in implementing XHTML 2.
There are some good ideas behind XHTML 2 (particularly, in my view, the ability for any element to take a src attribute), but it’s insanely complicated, and it absolutely requires that its documents be served as XML. XHTML 2 isn’t designed to be backward compatible, and although that sits well in an academic sense, it’s a minefield of problems in the real world. Just one problem is that site owners would have to perform live transforms of XHTML 2 into XHTML 1 during the transitional period.
But here’s the thing: XHTML 1 was supposed to be that transitional period! Yet without ubiquitous uptake we can never move beyond it, so it appears to be highly unlikely we ever will. XHTML 2 is still a Working Draft, and will probably never progress—what’s the point?
Conclusion
This article has sought to show that the move back from XHTML 1 to HTML 4 was a retrograde step. I’ve demonstrated how there is value in using XHTML, even if only some browsers can truly benefit from it, and this continues to underpin my belief that XHTML is better than HTML 4.
I have my concerns about HTML 5, but am nonetheless impressed and excited by the innovation. I look forward to a time when we can actually use it, which for me means that the specification is stable and all major browsers implement the larger part of it. But long before that time can come about, HTML 5 needs to offer a decent level of accessibility markup, which currently it does not. If the issues enumerated earlier are not resolved, I may consider ignoring HTML 5 and sticking with XHTML 1.
Either way, I will continue to use XHTML syntax and reap the benefits of XML wherever possible. And that’s my advice to you.