The WHATWG Blog — accessibility

This Week in HTML 5 – Episode 17

Tuesday, December 30th, 2008

Welcome back to "This Week in HTML 5," where I'll try to summarize the major activity in the ongoing standards process in the WHATWG and W3C HTML Working Group.

The big news this week is a major revamp of table headers, following up from the last major edits last March. Ian summarizes the most recent round of changes:

Header cells can now themselves have headers.

I have reversed the way the algorithm is presented, such that it starts from a cell and reports the headers rather than generating the list of headers for each cell on a header-by-header basis.

If headers="" points to a <td> element, the association is set up, but I have left this non-conforming to help authors catch mistakes.

Header cells that are automatically associating do not stop associating when they hit equivalent cells unless they have also hit a <td> first.

The "col" and "row" scope values now act like the implied auto value except that they force the direction.

Empty header cells don't get automatically associated.

I have removed the wide header cell heuristic.

I have made headers="" use the same ID discovery mechanism as getElementById(), to avoid implementations having to support multiple such mechanisms.

Finally, I have made the spec define if a header is a column header or a row header in the case where scope="" is omitted.

I haven't added summary="" on table; nothing particularly new has been raised on the topic since the last times I looked at this.

Accessibility advocates are disappointed by the continued non-inclusion of the summary attribute. Their reasoning is that "the summary attribute is a very, very practical and useful attribute," despite their own user testing that shows otherwise. As Ian put it, "I am hesitant to include a feature like summary="" when all evidence seems to point to it being widely misused by authors and ignored by the users it intends to help." As with all issues, this is not the final word on the matter, but it's where we stand today.

In other news, r2566 addresses a very subtle issue with fetching images. The problem stems from the following (arguably pointless) markup: <img src=""> A fair number of web pages actually try to declare an image with an empty src attribute. According to the HTTP and URL specifications, this markup means that there is an image at the same address as the HTML document -- a theoretically possible but highly unlikely scenario. Internet Explorer apparently catches this mistake and just silently drops the image. Other browsers do not; they will actually try to fetch the image, which results in a "duplicate" request for the page (once to successfully retrieve the page, and again to unsuccessfully retrieve the image).

Boris Zbarsky, a leading Mozilla developer, states

We (Gecko) have had 28 independent bug reports filed (with people bothering to create an account in the bug database, etc) about the behavior difference from IE here. That's a much larger number of bug reports than we usually get about a given issue. I can't tell you why this pattern is so common (e.g. whether some authoring frameworks produce it in some cases), but it seems that a number of web developers not only produce markup like this but notice the requests in their HTTP logs and file bugs about it.

r2566 addresses the issue by special-casing <img src> to allow browsers to ignore an image if its fetch request would result in fetching exactly the same URL as its HTML document:

When an img is created with a src attribute, and whenever the src attribute is set subsequently, the user agent must fetch the resource specifed by the src attribute's value, unless the user agent cannot support images, or its support for images has been disabled, or the user agent only fetches elements on demand, or the element's src attribute has a value that is an ignored self-reference.

The src attribute's value is an ignored self-reference if its value is the empty string, and the base URI of the element is the same as the document's address.

Other interesting tidbits this week:

r2568 adds a storageArea attribute to StorageEvent object. [StorageEvent deficiency]
r2556 changes the processing model of the <meta charset> attribute by requiring that it appear in the first 512 bytes of the document. For those of you playing along at home, <meta charset="..."> is the new <meta http-equiv="Content-Type" content="text/html; charset=...">. Both forms are fully supported in all major browsers. [Comparing conformance requirements against real-world docs]
r2557, r2559, r2560, r2562, r2563, and r2604 add a variety of common markup errors to the list of errors that HTML validators may treat as minor. [Re: comparing conformance requirements against real-world docs]
r2561 allows the height and width attributes on <input type="image">, a construct that is already supported by all major browsers. [Re: comparing conformance requirements against real-world docs]
r2601 adds an example of something that all browsers do anyway -- killing scripts that run too long.
r2597 removes the notification API, which was kicked around in 2006 but never saw significant interest from either authors or browser vendors. [Notifications API removed]
r2596 defines window.close(), window.focus(), and window.blur(). The focus() and blur() methods have historically been used to produce "pop-up" and "pop-under" windows containing advertisements. Most modern browsers now control how and whether scripts can do this, and the HTML 5 specification goes so far as to recommend that "[u]ser agents are encouraged to ignore calls to this blur() method entirely."
r2552 gives an example of embedding RDF metadata in XHTML. As the spec notes, this is not possible in HTML, although you could always use RDFa.
r2595 gives an example of marking up a tag cloud.

Tune in next week for another exciting episode of "This Week in HTML 5."

Posted in Weekly Review | 12 Comments »

This week in HTML 5 – Episode 16

Thursday, December 18th, 2008

Welcome back to "This Week in HTML 5," where I'll try to summarize the major activity in the ongoing standards process in the WHATWG and W3C HTML Working Group.

The big news this week is r2529, which makes so many changes that I had to ask Ian to explain it to me. This is what he said:

Someone asked for onbeforeunload, so I started fixing it. Then I found that there was some rot in the drywall. So I took down the drywall. Then I found a rat infestation. So I killed all the rats. Then I found that the reason for the rot was a slow leak in the plumbing. So I tried fixing the plumbing, but it turned out the whole building used lead pipes. So I had to redo all the plumbing. But then I found that the town's water system wasn't quite compatible with modern plumbing techniques, and I had to dig up the entire town. And that's basically it.

"Amusing, in a quiet way," said Eeyore, "but not really helpful."

Basically, the way that scripts are defined has changed dramatically. Not in an terribly incompatible way, just a clearer definition that paves the way for better specification of certain properties of script (and noscript). Let's start with the new definition of a script:

A script has:

A script execution environment

The characteristics of the script execution environment depend on the language, and are not defined by this specification.

In JavaScript, the script execution environment consists of the interpreter, the stack of execution contexts, the global code and function code and the Function objects resulting, and so forth.

A list of code entry-points

Each code entry-point represents a block of executable code that the script exposes to other scripts and to the user agent.

Each Function object in a JavaScript script execution environment has a corresponding code entry-point, for instance.

The main program code of the script, if any, is the initial code entry-point. Typically, the code corresponding to this entry-point is executed immediately after the script is parsed.

In JavaScript, this corresponds to the execution context of the global code.

A relationship with the script's global object

An object that provides the APIs that the code can use.

This is typically a Window object. In JavaScript, this corresponds to the global object.

When a script's global object is an empty object, it can't do anything that interacts with the environment.

A relationship with the script's browsing context

A browsing context that is assigned responsibility for actions taken by the script.

When a script creates and navigates a new top-level browsing context, the opener attribute of the new browsing context's Window object will be set to the script's browsing context's Window object.

A character encoding

A character encoding, set when the script is created, used to encode URLs. If the character encoding is set from another source, e.g. a document's character encoding, then the script's character encoding must follow the source, so that if the source's changes, so does the script's.

A base URL

A URL, set when the script is created, used to resolve relative URLs. If the base URL is set from another source, e.g. a document base URL, then the script's base URL must follow the source, so that if the source's changes, so does the script's.

Membership in a script group

A group of one or more scripts that are loaded in the same context, which are always disabled as a group. Scripts in a script group all have the same global object and browsing context.

A script group can be frozen. When a script group is frozen, any code defined in that script group will throw an exception when invoked. A frozen script group can be unfrozen, allowing scripts in that script group to run normally again.

The most interesting part of this new definition is the script group, a new concept which now governs all scripts. When a Document is created, it gets a fresh script group, which contains all the scripts that are defined (or are later created somehow) in the document. When the user navigates away from the document, the entire script group is frozen, and browsers should not execute those scripts anymore. This sounds like an obvious statement if you think of documents as individual browser windows (or tabs), but consider the case of a document with multiple frames, or one with an embedded iframe. Suppose that the user clicks some link within the iframe that only navigates to a new URL within the iframe (i.e. the parent document stays the same). The parent document may have some reference to functions defined in the old iframe. Should it still be able to call these functions? IE says no; other browsers say yes. HTML 5 now says no, because when the iframe navigates to a new URL, the old iframes script group is frozen -- even if there are active references to those scripts (say, from the parent document), browsers shouldn't allow the page to execute them.

The main benefit of this new concept of script groups is that it removes a number of complications faced by the non-IE browsers. For example, it prevents the problem of scripts suddenly discovering that their global object is no longer the object that they think of as the Window object. Script groups are also frozen when calling document.open(). Freezing script groups also defines the point at which timers and other callbacks are reset, which is something that previous versions of HTML had never defined.

And after all of this ripping up and redefining, HTML 5 now defines the onbeforeunload event, which is already supported by major browsers.

Other interesting tidbits this week:

r2533 adds support for passing structured data between documents with postMessage(). [structured data discussion]
r2536 defines the NameCreator, NameDeleter, NameGetter, NameSetter, IndexGetter, and IndexSetter anonymous methods, which are used by browsers internally to manage lists of named or indexed properties (e.g. form.elements, per-element custom data attributes, or the pixel data of a canvas).
r2537 explains that you can not click something while you're already in the process of clicking it. (Technically speaking, it makes the click() method non-reentrant.) [nested click() discussion]
r2538 clarifies that non-interactive elements that are not usually focusable, but that do currently have focus (via the tabindex attribute), should simulate onclick events when the user presses ENTER. This may seem like a minor point, but it is important for building keyboard-accessible web applications. [onclick discussion]
r2539 notes that buttons (and their values) are not submitted with other form data unless they were the button that submitted the form. [button submission discussion]
Silvia Pfeiffer posts thoughts on video accessibility and links to this collection of video accessibility requirements on the Mozilla wiki.
Nine years in the making, the second major edition of the Web Content Accessibility Guidelines is now officially a W3C Recommendation. The guidelines are supplemented by a comprehensive techniques document, for example Using alt attributes on img elements. HTML 5 also includes a section on using the alt attribute, but in general you should defer to WCAG 2.0 because it was written by experts.

Tune in next week for another exciting episode of "This Week in HTML 5."

Posted in Weekly Review, WHATWG | 2 Comments »

This Week in HTML 5 – Episode 13

Tuesday, November 18th, 2008

Welcome back to "This Week in HTML 5," where I'll try to summarize the major activity in the ongoing standards process in the WHATWG and W3C HTML Working Group.

The big news this week is a major revamping of how browsers should process multimedia in the <audio> and <video> elements.

r2404 makes a number of important changes. First, the canPlayType() method has moved from the navigator object to HTMLMediaElement (i.e. a specific <audio> or <video> element), and it now returns a string rather than an integer. [canPlayType() discussion]

The canPlayType(type) method must return the string "no" if type is a type that the user agent knows it cannot render; it must return "probably" if the user agent is confident that the type represents a media resource that it can render if used in with this audio or video element; and it must return "maybe" otherwise. Implementors are encouraged to return "maybe" unless the type can be confidently established as being supported or not. Generally, a user agent should never return "probably" if the type doesn't have a codecs parameter.

Wait, what codecs parameter? That's the second major change: the <source type> attribute (which previously could only contain a MIME type like "video/mp4", which is insufficient to determine playability) can now contain a MIME type and a codecs parameter. As specified in RFC 4281, the codecs parameter specifies the specific codecs used by the individual streams within the audio/video container. The section on the type attribute contains several examples of using the codecs parameter.

Third, the <source type> attribute is now optional. If you aren't sure what kind of video you're serving, you can just throw one or more <source> elements into a <video> element and the browser will try each of them in the order specified [r2403] until it finds something it can play. [load() algorithm discussion] Of course, if you include a type attribute (and codecs parameter within it), the browser may use it to determine playability without loading multiple resources, but this is no longer required.

The final change (this week) to multimedia elements is the elimination of the start, end, loopstart, loopend, and playcount attributes. They are all replaced by a single attribute, loop, which takes a boolean. To handle initially seeking to a specific timecode (like the now-eliminated start attribute), the HTML 5 spec vaguely declares, "For example, a fragment identifier could be used to indicate a start position." This obviously needs further specification.

One multimedia-related issue that did not change in the spec this week is same-origin checking for media elements. Robert O'Callahan asked whether video should be allowed to load from another domain, noting (correctly) that it could lead to information leakage about files posted on private intranets. Chris Double outlines the issues and some proposed solutions. However, contrary to Chris' expectation, HTML 5 will not (yet) mandate cross-site restrictions for audio/video files. This is an ongoing discussion. [WHATWG discussion thread, Theora discussion thread]

In other news, Ian Hickson summarized the discussion around the <input placeholder> attribute (which I first mentioned in This Week in HTML 5 Episode 8) and committed r2409 that defines the new attribute:

The placeholder attribute represents a short hint (a word or short phrase) intended to aid the user with data entry. A hint could be a sample value or a brief description of the expected format.

For a longer hint or other advisory text, the title attribute is more appropriate.

The placeholder attribute should not be used as an alternative to a label.

User agents should present this hint to the user only when the element's value is the empty string and the control is not focused (e.g. by displaying it inside a blank unfocused control).

Read the section on the placeholder attribute for an example of its proper use.

Other interesting tidbits this week:

Ian Hickson explains why the <link rev> attribute was dropped.
Lots of feedback about the Web Workers draft.
Lots of feedback about the WebSocket interface.
r2413 adds a resolveURL() method to the window.location object. [resolveURL() discussion]
r2406 defines negative pixelratio.
r2407 defines how often browsers should fire the timeupdate event while playing audio or video.
r2416 simplifies the content model of the <menu> element. [<menu> discussion]

Around the web:

The W3C published an editor's draft of Lachlan Hunt's Web Developer's Guide to HTML 5.
Austin Chau posted a demo of HTML 5 cross-document messaging. Further discussion: Using HTML5 postMessage, postMessage API changes, and the unfortunately-named xssinterface library which implements a postMessage-like API in browsers that do not yet support it.
Ryan Tomayko posted an excellent summary of things caches do, specifically HTTP caches like Squid and rack-cache.
Joe Clark posted This is How the Web Gets Regulated, a call to action on video accessibility.
mv_embed is a GPL-licensed Javascript shim that takes <video> elements that point to Ogg Theora video files and replaces them with plugin-specific markup to play the video through oggplay, vlc-plugin, Java cortado, mplayer, Totem, or Apple Quicktime (if Xiph's Ogg Theora Quicktime component is installed). A demo page demonstrates the technique.
Everyone should go admire my new dog Beauregard, then scroll down to read "dave"'s non-Beau-related but extremely interesting comment on an experimental Ogg Theora video encoder. From there, I learned about this Ogg Vorbis audio decoder written in pure ActionScript (Flash), leading to the tantalizing but as-yet-unrealized possibility of a Javascript shim like mv_embed that could take <audio> elements that point to Ogg Vorbis audio files and replace them with a Flash wrapper that could play the audio file, even in browsers that do not support the <audio> element or the Ogg Vorbis audio codec.

Tune in next week for another exciting episode of "This Week in HTML 5."

Posted in Weekly Review | 4 Comments »

This Week in HTML 5 – Episode 10

Monday, October 20th, 2008

Welcome back to "This Week in HTML 5," where I'll try to summarize the major activity in the ongoing standards process in the WHATWG and W3C HTML Working Group.

The big news this week is offline caching. This has been in HTML 5 for a while, but this week Ian Hickson caught up with his email and integrated all outstanding feedback. He summarizes the changes:

Made the online whitelist be prefix-based instead of exact match. [r2337]

Removed opportunistic caching, leaving only the fallback behavior part. [r2338]

Made fallback URLs be prefix-based instead of only path-prefix based (we no longer ignore the query component). [r2343]

Made application caches scoped to their browsing context, and allowed iframes to start new scopes. By default the contents of an iframe are part of the appcache of the parent, but if you declare a manifest, you get your own cache. [r2344]

Made fallback pages have to be same-origin (security fix). [r2342]

Made the whole model treat redirects as errors to be more resilient in the face of captive portals when offline (it's unclear what else would actually be useful and safe behavior anyway). [r2339]

Fixed a bunch of race conditions by redefining how application caches are created in the first place. [r2346]

Made 404 and 410 responses for application caches blow away the application cache. [r2348]

Made checking and downloading events fire on ApplicationCache objects that join an update process midway. [r2353]

Made the update algorithm check the manifest at the start and at the end and fail if the manifest changed in any way. [r2350]

Made errors on master and dynamic entries in the cache get handled in a non-fatal manner (and made 404 and 410 remove the entry). [r2348]

Changed the API from .length and .item() to .items and .hasItem(). [r2352]

And now, a short digression into video formats...

You may think of video files as "AVI files" or "MP4 files". In reality, "AVI" and "MP4" are just container formats. Just like a ZIP file can contain any sort of file within it, video container formats only define how to store things within them, not what kinds of data are stored. (It's a little more complicated than that, because container formats do limit what codecs you can store in them, but never mind.) A video file usually contains multiple tracks -- a video track (without audio), one or more audio tracks (without video), one or more subtitle/caption tracks, and so forth. Tracks are usually inter-related; an audio track contains markers within it to help synchronize the audio with the video, and a subtitle track contains time codes marking when each phrase should be displayed. Individual tracks can have metadata, such as the aspect ratio of a video track, or the language of an audio or subtitle track. Containers can also have metadata, such as the title of the video itself, cover art for the video, episode numbers (for television shows), and so on.

Individual video tracks are encoded with a certain video codec, which is the algorithm by which the video was authored and compressed. Modern video codecs include H.264, DivX, VC-1, but there are many, many others. Audio tracks are also encoded in a specific codec, such as MP3, AAC, or Ogg Vorbis. Common video containers are ASF, MP4, and AVI. Thus, saying that you have sent someone an "MP4 file" is not specific enough for the recipient to determine if they can play it. The recipient needs to know the container format (such as MP4 or AVI), but also the video codec (such as H.264 or Ogg Theora) and the audio codec (such as MP3 or Ogg Vorbis). Furthermore, video codecs (and some audio codecs) are broad standards with multiple profiles, so saying that you have sent someone an "MP4 file with H.264 video and AAC audio" is still not specific enough. An iPhone can play MP4 files with "baseline profile" H.264 video and "low complexity" AAC audio. (These are well-defined technical terms, not laymen's terms.) Desktop Macs can play MP4 files with "main profile" H.264 video and "main profile" AAC audio. Adobe Flash can play MP4 files with "high profile" H.264 video and "HE" AAC audio. Of course, it's a little more complicated than that.

Thus...

r2332 adds a navigator.canPlayType() method. This is intended for scripts to query whether the client can play a certain type of video. There are two major problems with this: first, MIME types are not specific enough, as they will only describe the video container. Learning that the client "can play" MP4 files is useless without knowing what video codecs it supports inside the container, not to mention what profiles of that video codec it supports. The second problem is that, unless the browser itself ships with support for specific video and audio codecs (as Firefox 3.1 will do with Ogg Theora and Ogg Vorbis), it will need to rely on some multimedia library provided by the underlying operating system. Windows has DirectShow, Mac OS X has QuickTime, but neither of these libraries can actually tell you whether a codec is supported. The best you can do is try to play the video and notice if it fails. [WHATWG thread]

Other interesting changes and discussions this week:

r2333 changes the data type of the width and height attributes on <embed>, <object>, and <iframe> elements to match current browser behavior. These attributes reflect strings, not integers. No one knows why.
Ian Hickson kicked off another round of video accessibility discussion, with this philosophical statement:

Fundamentally, I consider <video> and <audio> to be simply windows onto pre-existing content, much like <iframe>, but for media data instead of for "pages" or document data. Just as with <iframe>s, the principle I had in mind is that it should make sense for the user to take the content of the element and view it independent of its hosting page. You should be able to save the remote file locally and open it in a media player and you should be able to write a new page with a different media player interface, without losing any key aspect of the media. In particular, any accessibility features must not be lost when doing this. For example, if the video has subtitles or PiP hand language signing, or multiple audio tracks, or a transcript, or lyrics, or metadata, _all_ of this data should survive even if the video file is saved locally without the embedding page.

In other words, video accessibility should be handled within the video container, not in the surrounding HTML markup. On the plus side, all modern video containers can handle subtitle tracks, secondary audio tracks, and so forth. Unfortunately, authors may be hesitant to add to their bandwidth costs by including tracks that must be downloaded by everyone but appreciated (or even noticed) by very few.

[W3C discussion thread on video accessibility]
Sander van Zoest noticed the pixelaspectratio attribute of the <video> element, and he asked why it was a float instead of a ratio of two rationals (as is standard practice in the video authoring world). Ultimately, he agreed with Eric Carlson that pixelaspectratio should be dropped from HTML 5 because it doesn't really give enough information about how to scale the video properly. As with so many other things in the video world, the problem is much more complicated that it first appears.

Around the web:

Chris Double posted an update on Firefox 3.1's Ogg Theora video support and a list of demo sites.
WGBH outlines the regulations surrounding closed captioning and video description and the state of the art for captioning and description for the Web.
Joe Clark, in his book Building Accessible Websites, states "Video is still a bit of a boondoggle. And making video accessible is so difficult you had best leave the job to the experts. And at present, there is no way for you the Web developer to become an expert." The book was published in 2002, but the point remains true today.
A list of use cases for video accessibility
Multimedia accessibility proposals

Tune in next week for another exciting episode of "This Week in HTML 5."

Posted in Weekly Review | 5 Comments »

This Week in HTML 5 – Episode 6

Tuesday, September 23rd, 2008

Welcome back to "This Week in HTML 5," where I'll try to summarize the major activity in the ongoing standards process in the WHATWG and W3C HTML Working Group.

There is no big news this week. Work continued on last week's orgy of Web Forms-related check-ins. This week adds the <label> element and the jack-of-all-forms <input> element. [r2191, r2192, r2197, r2200, r2202, r2204, r2205, r2207, r2211, r2212, r2213, r2214, r2218, r2219, r2220, r2222, r2223]

Laura Carlson and others have begun to review the accessibility of multimedia on the web. Most accessibility discussions revolve around the needs of visually impaired users, but hearing impaired users are also important and too often ignored. There was a long discussion last month (and continuing into this month) about the accessibility implications of the <audio> and <video> elements for hearing impaired users. YouTube (owned by Google, my employer) recently announced support for captions on YouTube videos and published a tutorial on adding them to your own videos.

Ian Hickson (the HTML 5 editor) gave an interview about HTML 5 in which he reiterated his goal of having two independent, complete, interoperable implementations of HTML 5 by 2022. (By contrast, HTML 4.0 was "finalized" 11 years ago but still doesn't have two independent, complete, interoperable implementations.) This led to a mini-firestorm among bloggers who misunderstood "2022" as "the date when I can start using HTML 5 features." It bears repeating that the "2022" date has no significance at all for web developers. Most browser vendors are actively involved in HTML 5, several browsers are already shipping HTML 5 features, and developers who are holding their breath until 2022 are going to find themselves seriously behind the curve.

On that note, Brenton Strine asks a very good question: "Is there some place that documents the parts of HTML 5 that are already up and running? Can I use <canvas> or <video>? In which browsers? What other tags can I use? What other fancy HTML 5 stuff can I do today in 2008?" On the video front, Mozilla will be shipping Ogg Theora support in Firefox 3.1. (You can read more about why Ogg matters.) Last year, Opera released experimental builds with Ogg Theora support, and they now have video-enabled builds on 3 platforms. The Wikimedia Foundation has a few Theora-encoded videos you can watch.

Tune in next week for another exciting episode of "This Week in HTML 5."

Posted in Weekly Review | 6 Comments »