The WHATWG Blog — Mark Pilgrim, Google

Author Archive

This Week in HTML 5 – Episode 3

Friday, August 22nd, 2008

Welcome back to "This Week in HTML 5," where I'll try to summarize the major activity in the ongoing standards process in the WHATWG and W3C HTML Working Group.

The biggest news this week is the birth of the event loop.

To coordinate events, user interaction, scripts, rendering, networking, and so forth, user agents must use event loops as described in this section.

... An event loop has one or more task queues. A task queue is an ordered list of tasks, which can be:

Events

Asynchronously dispatching an Event object at a particular EventTarget object is a task.

Parsing

The HTML parser tokenising a single byte, and then processing any resulting tokens, is a task.

Callbacks

Calling a callback asynchronously is a task.

Using a resource

When an algorithm fetches a resource, if the fetching occurs asynchronously then the processing of the resource once some or all of the resource is available is a task.

Reacting to DOM manipulation

Some elements have tasks that trigger in response to DOM manipulation, e.g. when that element is inserted into the document.

The purpose of defining an event loop is to unify the definition of things that happen asynchronously. (I want to avoid saying "events" since that term is already overloaded.) For example, if an image defines an onload callback function, exactly when does it get called? Questions like this are now answered in terms of adding tasks to a queue and processing them in an event loop.
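To make the queue-and-process idea concrete, here is a minimal sketch of a single task queue. It is not the spec's algorithm, and the names EventLoopSketch, queueTask, and drain are made up for illustration. It only shows that an image's load callback runs when its task reaches the front of the queue, not at the moment the image arrives.

```typescript
// A task is a unit of deferred work; "source" is only a label for this demo.
type Task = { source: string; run: () => void };

class EventLoopSketch {
  // One FIFO task queue. The spec allows an event loop to have several.
  private queue: Task[] = [];

  queueTask(source: string, run: () => void): void {
    this.queue.push({ source, run });
  }

  // Run tasks one at a time, oldest first, until the queue is empty.
  // Tasks queued by a running task are picked up in the same drain.
  drain(): string[] {
    const order: string[] = [];
    let task = this.queue.shift();
    while (task !== undefined) {
      order.push(task.source);
      task.run();
      task = this.queue.shift();
    }
    return order;
  }
}

const log: string[] = [];
const loop = new EventLoopSketch();

// The image data has arrived: queue the load callback as a new task
// instead of calling it on the spot.
loop.queueTask("image fetch", () => {
  log.push("image available");
  loop.queueTask("load event", () => log.push("onload called"));
});
loop.queueTask("timer callback", () => log.push("timeout fired"));

const order = loop.drain();
console.log(order.join(" -> "));
console.log(log.join(" -> "));
```

In this toy model the timer callback runs before the onload callback, because the load task joined the queue behind it. Real browsers have more than one queue, so treat the ordering here as a property of the sketch only.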

Revision 2074 defines event loops and task queues (as quoted above).
Revisions 2076, 2079, 2080, 2081, 2082, and 2083 define the behavior of media elements (like <audio> and <video>) in terms of the event loop.
Revision 2084 defines the behavior of template and ref attributes, local database storage, and remote events in terms of the event loop.
Revision 2085 defines the behavior of web sockets, postMessage, message ports, and setTimeout in terms of the event loop.
Revision 2097 defines the behavior of an image's load event in terms of the event loop.

The other major news this week is the addition of the hashchange event, which occurs when the user clicks an in-page link that goes somewhere else on the same page, or when a script programmatically sets the location.hash property. This is primarily useful for AJAX applications that wish to maintain a history of user actions while remaining on the same page. As a concrete example, executing a search of your messages in Gmail takes you to a list of search results, but does not change the base URL, just the hash; clicking the Back button takes you back to the previous view within Gmail (such as your inbox), again without changing the base URL (just the hash). Gmail employs some nasty hacks to make this work in all browsers; the hashchange event is designed to make those hacks slightly less nasty. Microsoft Internet Explorer 8 pioneered the hashchange event, and its definition in HTML 5 is designed to match Internet Explorer's behavior.
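Here is a rough sketch of how a page might react to hashchange. The "#search/..." fragment layout and the viewFromHash helper are invented for this example and are not how Gmail actually does it. The listener is only registered when the script runs somewhere that provides addEventListener and location, so the routing function can also be exercised outside a browser.

```typescript
// Hypothetical views for a mail-like application.
type View = { name: "inbox" } | { name: "search"; query: string };

// Map a fragment such as "#search/html5" to a view; anything else is the inbox.
function viewFromHash(hash: string): View {
  const fragment = hash.replace(/^#/, "");
  const prefix = "search/";
  if (fragment.startsWith(prefix) && fragment.length > prefix.length) {
    return { name: "search", query: decodeURIComponent(fragment.slice(prefix.length)) };
  }
  return { name: "inbox" };
}

// Describe only the two browser features this sketch relies on.
const browser = globalThis as unknown as {
  addEventListener?: (type: string, listener: () => void) => void;
  location?: { hash: string };
};

const loc = browser.location;
if (browser.addEventListener && loc) {
  // Fires after an in-page link is followed, after the Back button changes
  // the fragment, or after a script assigns to location.hash.
  browser.addEventListener("hashchange", () => {
    const view = viewFromHash(loc.hash);
    console.log("now showing:", view.name);
  });
}
```

Keeping the fragment-to-view mapping in a plain function means the Back button needs no special handling: whatever fragment the browser restores, the same function decides what to show.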

Other interesting changes this week:

In last week's episode, I mentioned revision 2063, which allows HTML documents to contain both xml:lang and lang attributes as long as they are identical. Revision 2091 relaxes this restriction slightly to allow the xml:lang and lang attributes to differ by case (i.e. one could be uppercase and the other could be lowercase, and that is no longer an error). Discussion: xml:lang="" and lang=""
Revision 2092 defines the parsing algorithm for empty table rows.
Revision 2094 clarifies the meaning of whitespace by deferring to the Unicode definitions.
Revision 2096 forbids content sniffing for SVG images. In order to use an SVG image in an <img src=""> attribute, the web server must ensure that the SVG image is served with a Content-Type: image/svg+xml HTTP header.
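If you want to verify that a server meets the SVG requirement above, a check along these lines could go in a deployment test. This is a sketch, and isSvgContentType is a made-up helper name. It assumes you already have the raw Content-Type header value as a string, or null when the header is missing.

```typescript
// Return true only when the header's media type is image/svg+xml.
// Parameters after ";" (such as charset) are ignored, and the comparison
// is case-insensitive.
function isSvgContentType(contentType: string | null): boolean {
  if (contentType === null) return false;
  const [mediaType = ""] = contentType.split(";");
  return mediaType.trim().toLowerCase() === "image/svg+xml";
}

console.log(isSvgContentType("image/svg+xml; charset=utf-8"));
console.log(isSvgContentType("text/plain"));
```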

Tune in next week for another exciting episode of "This Week in HTML 5."

Posted in Weekly Review | 4 Comments »

This Week in HTML 5 – Episode 2

Thursday, August 14th, 2008

Welcome back to "This Week in HTML 5," where I'll try to summarize the major activity in the ongoing standards process in the WHATWG and W3C HTML Working Group.

The biggest news this week is revision 2020, which standardizes the navigator object:

The navigator attribute of the Window interface must return an instance of the Navigator interface, which represents the identity and state of the user agent (the client), and allows Web pages to register themselves as potential protocol and content handlers.

Currently, HTML 5 defines four properties and two methods:

appName
appVersion
platform
userAgent
registerProtocolHandler
registerContentHandler

This is only a subset of navigator properties and methods that browsers already support. See Navigator Object on Google Doctype for complete browser compatibility information.
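As a small illustration, the four properties can be modeled as a plain interface. The interface name, the helper, and the sample values below are all invented; in a browser you would pass window.navigator instead of the stand-in object.

```typescript
// The four identity properties listed above. The interface name is hypothetical.
interface NavigatorIdentity {
  appName: string;
  appVersion: string;
  platform: string;
  userAgent: string;
}

// Build a one-line summary from the four properties.
function describeUserAgent(nav: NavigatorIdentity): string {
  return `${nav.appName} ${nav.appVersion} on ${nav.platform} (${nav.userAgent})`;
}

// Made-up values standing in for a real browser's navigator object.
const sampleNavigator: NavigatorIdentity = {
  appName: "ExampleBrowser",
  appVersion: "1.0",
  platform: "ExampleOS",
  userAgent: "ExampleBrowser/1.0 (ExampleOS)",
};

const summary = describeUserAgent(sampleNavigator);
console.log(summary);
```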

Next up: Content-Language. No, not the HTTP header, not even the <html lang> attribute, but the <meta> tag! As reported by Henri Sivonen,

It seems that some authoring tools and authors use <meta http-equiv='content-language' content='languagetag'> instead of <html lang='languagetag'>.

This led to revision 2057, which defines the <meta http-equiv="Content-Language"> directive and its relationship with lang, xml:lang, and the Content-Language HTTP header.

In the continuing saga of the alt attribute, the new syntax for alternate text of auto-generated images (which I covered in last week's episode) has generated some followup discussion. Philip Taylor is concerned that it will increase complexity for authoring tools; others feel the complexity is worth the cost. James Graham suggested a no-text-equivalent attribute; similar proposals have been discussed before and rejected.

Switching to the new Web Workers specification (which I also covered last week), Aaron Boodman (one of the developers of Google Gears) posted his initial feedback. This kicked off a long discussion and led to the creation of the Worker object.

Other interesting changes this week:

Revision 2034 and revision 2035 define the outerHTML property, and revision 2040 defines the insertAdjacentHTML method. Both originally appeared in Microsoft Internet Explorer 5 (outerHTML on MSDN, insertAdjacentHTML on MSDN).
Revision 2044 disallows scripts executing while an alert is displayed.
Revision 2046 requires that <script src="javascript:"> not execute script (reported by Simon Pieters)
Revision 2063 allows an HTML document to declare xml:lang if and only if it also declares lang, to ease migration between HTML and XHTML. The language values must be identical. (Reported by Simon Pieters.)
Revision 2064 defines the behavior when calling document.open("text/plain"). Re: type parameter of Document.open() (detailed review of the DOM) documents the incompatibilities in existing browsers.
Revision 2066 defines the order for getElementsByName() (reported by Maciej Stachowiak)
Revision 2068 defines window.frameElement
Revision 2069: don't require Document.location to do anything when the Document isn't in a Window (reported by Anne van Kesteren)

Administrivia: "This Week in HTML 5" now has its own feed.

Tune in next week for another exciting episode of "This Week in HTML 5."

Posted in Weekly Review | 3 Comments »

This Week in HTML 5 – Episode 1

Wednesday, August 6th, 2008

Welcome to a new semi-regular column, "This Week in HTML 5," where I'll try to summarize the major activity in the ongoing standards process in the WHATWG and W3C HTML Working Group.

The biggest news is the birth of the Web Workers draft specification. Quoting the spec, "This specification defines an API that allows Web application authors to spawn background workers running scripts in parallel to their main page. This allows for thread-like operation with message-passing as the coordination mechanism." This is the standardization of the API that Google Gears pioneered last year. See also: initial Workers thread, announcement of new spec, response to Workers feedback.

Also notable this week: even more additions to the Requirements for providing text to act as an alternative for images. Four new cases were added:

A link containing nothing but an image
A group of images that form a single larger image
An image not intended for the user (such as a "web bug" tracking image)
Text that has been rendered to a graphic for typographical effect

Additionally, the spec now tries to define what authors should do if they know they have an image but don't know what it is. Quoting again from the spec:

If the src attribute is set and the alt attribute is set to a string whose first character is a U+007B LEFT CURLY BRACKET character ({) and whose last character is a U+007D RIGHT CURLY BRACKET character (}), the image is a key part of the content, and there is no textual equivalent of the image available. The string consisting of all the characters between the first and the last character of the value of the alt attribute gives the kind of image (e.g. photo, diagram, user-uploaded image). If that value is the empty string (i.e. the attribute is just "{}"), then even the kind of image being shown is not known.

If the image is available, the element represents the image specified by the src attribute.

If the image is not available or if the user agent is not configured to display the image, then the user agent should display some sort of indicator that the image is not being rendered, and, if possible, provide to the user the information regarding the kind of image that is (as derived from the alt attribute).
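The curly-bracket rule quoted above is easy to express in code. This sketch covers only that one rule and treats every other alt value as ordinary alternative text, which glosses over the other cases the spec defines. The function and type names are made up.

```typescript
// Either ordinary alt text, or "no text equivalent" with an optional kind.
type AltInfo =
  | { type: "text"; text: string }
  | { type: "no-equivalent"; imageKind: string | null };

function interpretAlt(alt: string): AltInfo {
  const braced =
    alt.length >= 2 &&
    alt.charAt(0) === "{" &&
    alt.charAt(alt.length - 1) === "}";
  if (!braced) {
    return { type: "text", text: alt };
  }
  // Everything between the first and last character names the kind of image.
  const kind = alt.slice(1, -1);
  // "{}" means that even the kind of image is unknown.
  return { type: "no-equivalent", imageKind: kind === "" ? null : kind };
}

console.log(interpretAlt("{photo}"));
console.log(interpretAlt("{}"));
console.log(interpretAlt("A red bicycle"));
```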

Other interesting changes this week:

revision 1951: define window.top
revision 1956: "User agents must not run executable code embedded in the image resource."
revision 1958: more notes on what is a valid image (a surprisingly difficult question)
revision 1965: allow <a> elements to straddle paragraphs
revision 1998: define what happens when you set onclick='' on a document outside a Window
revision 1999: define javascript: in Window-less environments
revision 2001: define 'directionality' in terms of the dir='' attribute for cases where the 'direction' property has no computed value
revision 2002: define processing for the second argument to toDataURL() for image/jpeg
revision 2004: specify how to handle transparent images in the toDataURL() method
revision 2008: make patterns required in the <canvas> API
revision 2016: when <script type=''> is given, it must match the type of the script, even if the script is JavaScript
revision 2019: remove autosubmit='' from the <menu> element

Tune in next week for another exciting episode of "This Week in HTML 5."

Posted in Processing Model, Weekly Review, WHATWG | 21 Comments »

The longdesc lottery

Friday, September 14th, 2007

Let's talk about the longdesc attribute. In HTML 4, it's defined as a pointer to a long description for a complex image. Anyone can learn how to write a good long description. There's only one problem: virtually no one bothers, and virtually everyone who does bother gets it wrong.

Let's quantify that. In August 2007, Ian Hickson analyzed a sample of 1 billion <img> elements in Google's index. Approximately 1.3 million (0.13%) had a longdesc attribute. That's OK, you say, not every image needs a longdesc attribute. And you would be right. But regardless of whether it's needed or not, it's not being used that often: just over one in a thousand images.

Now let's look at how often the longdesc attribute is actually used correctly. Of course this is a more subjective question, but we can spot some obvious errors. Out of those 1.3 million images with a longdesc attribute, let's subtract the ones where the longdesc attribute...

is blank
is not a valid URL
points to the image itself (i.e. the same URL as the src attribute)
points to the page you're already on
points to the root level of another domain
is the same as a parent link's href attribute (i.e. the longdesc is redundant because you could just follow the image link instead)

That knocks out a whopping 1.25 million (about 96%) right off the bat. That's not 96% of all the images on the web; that's 96% of the 0.13% of images that included a longdesc attribute in the first place. And when you take a closer look at the remaining 50,000 (4% of 1.3 million), the results get even worse: links to other images, links gone 404, links to one-line text descriptions identical to the alt attribute, and links to pages that describe the image size but not its contents (Wikipedia, I'm looking at you). Extrapolating back to 1.3 million, that 50,000 shrinks to about 10,000. That means that fewer than 1% of images that provide a longdesc attribute are actually useful. No more than one in a hundred get it right, of one in a thousand that even try.
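For anyone who wants to check the arithmetic, here it is spelled out. The inputs are the rounded figures quoted above; the variable names are mine.

```typescript
const sampledImages = 1e9;     // <img> elements in the sample
const withLongdesc = 1.3e6;    // images with a longdesc attribute
const obviouslyWrong = 1.25e6; // blank, invalid, self-referencing, and so on
const estimatedUseful = 1e4;   // estimate left after the closer look

// Percentage of "whole" represented by "part".
const percent = (part: number, whole: number): number => (100 * part) / whole;

const shareWithLongdesc = percent(withLongdesc, sampledImages);
const shareObviouslyWrong = percent(obviouslyWrong, withLongdesc);
const remaining = withLongdesc - obviouslyWrong;
const shareUseful = percent(estimatedUseful, withLongdesc);

console.log(shareWithLongdesc.toFixed(2) + "% of images have longdesc");
console.log(shareObviouslyWrong.toFixed(1) + "% of those are obviously wrong");
console.log(remaining + " remain after the obvious errors");
console.log(shareUseful.toFixed(2) + "% of longdesc attributes are useful");
```

The last figure comes out a little under 1%, which is where "one in a hundred" comes from.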

Meanwhile, the very people advocating for keeping the longdesc attribute have recently conducted some user testing. That is, testing how well an actual blind person with an actual screen reader can read actual web pages. It turned out that the test subject didn't know that longdesc even existed before the tester told him about it. Can you blame him? 99.87% of the images he'd ever encountered had no longdesc attribute at all. Even if he had known about it, and he had actually stumbled across one, he would still be up against 99 to 1 odds that following it would be worth his time. He has a better chance of winning the lottery.

I'm not saying there isn't a real problem to be solved here. There is. People can publish complex images that require complex text alternatives. Charts, graphs, detailed photographs. Whatever. "A picture is worth 1000 words," and all that. The longdesc attribute is, theoretically, a solution to this problem. But that doesn't mean it's a good solution, and it's certainly not the only solution. We've been living with longdesc for 10 years now, and let me tell you, it's not working out. So can we please get past the grandstanding and start talking about a better solution?

Posted in Elements | 40 Comments »
