On August 7, 2009, Adrian Bateman did what no man or woman had ever done before: he gave substantive feedback on the current editor's draft of HTML5 on behalf of Microsoft. His feedback was detailed and well-reasoned, and it spawned much discussion. See also: Adrian's followup on <progress> (and more here); his followup on <datagrid> (which was actually dropped from HTML5 just minutes after Adrian posted his initial feedback, for unrelated reasons); his discussion with a Mozilla developer about <bb> (which was also subsequently dropped); discussion about <dialog> (which has now been dropped -- more on that in just a minute); Adrian's followup on <keygen>, with additional concerns listed here; his followup on the new input types; and last but not least, his position on <audio> (and followup about best-choice algorithms).
As you might expect, much of the discussion since August 7 has been driven by Microsoft's feedback. After five years of virtual silence, nobody wants to miss the opportunity to engage with a representative of the world's still-dominant browser. I want to focus on two discussions that have led to recent spec changes.
The rise and fall of <keygen>
The <keygen> element was invented by Netscape and subsequently reverse-engineered by every other browser vendor except Microsoft. It had never been part of any HTML specification before; indeed, it wasn't well-documented anywhere. It was added to HTML5 earlier this year (and covered in episode 12 and episode 31). The spec text borrows heavily from this incredibly detailed documentation posted to the WHATWG mailing list last July.
Adrian, on behalf of Microsoft, has stated in no uncertain terms that Microsoft has no intention of ever implementing the <keygen> element, and they would like it to be removed from HTML5. But what does "implementing" <keygen> mean? Well, the point of <keygen> is to provide a cryptography API, but as Ian Hickson points out, the element itself "integrates tightly with the form submission model, it affects the DOM APIs of other elements, it affects the parser, it affects the form control validity model -- it's not a feature that can be sensibly considered 'optional' if our goal is cross-browser interoperability. However, there is an alternative that I think would still satisfy Microsoft's desires to not implement <keygen>'s cryptographic features while still bringing interoperability to the platform in every other respect: we could make the support of each individual signature algorithm optional."
r3843 does just that -- it makes the cryptography parts of <keygen> optional. It is important that the element itself be implemented (even without the crypto bits), because it interacts with the DOM and the parsing model in strange ways.
As a postscript of sorts, I should point out that a recent change to the keytype attribute (r3868) allows client-side script to detect whether the crypto bits are actually supported. Detecting features is important, and this subtle change will allow authors to write feature detection scripts instead of relying on browser sniffing to decide whether to use keygen-based cryptography.
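Here is a rough sketch of what such a feature detection script might look like. It rests on an assumption I have not verified against r3868: that a browser without support for a given key type will not reflect that value back through the keytype property. The names KeygenLike, supportsKeygenType, and makeFakeKeygen are made up for illustration, and the fake element exists only so the sketch can run outside a browser.

```typescript
// Minimal shape of the one property this sketch touches on a <keygen> element.
interface KeygenLike {
  keytype: string;
}

// Hypothetical helper: returns true if the element keeps the requested key type.
// ASSUMPTION: a browser without support for that key type will not reflect
// the value back through the keytype property.
function supportsKeygenType(el: KeygenLike, wanted: string = "rsa"): boolean {
  el.keytype = wanted;
  return el.keytype.toLowerCase() === wanted.toLowerCase();
}

// Stand-in for an element whose keytype property only accepts known values.
// This is a test double, not a description of any real browser.
function makeFakeKeygen(known: string[]): KeygenLike {
  let current = "";
  return {
    get keytype(): string {
      return current;
    },
    set keytype(value: string) {
      const lowered = value.toLowerCase();
      current = known.indexOf(lowered) !== -1 ? lowered : "";
    },
  };
}

// One fake that knows about "rsa" and one that knows about nothing.
const withCrypto: boolean = supportsKeygenType(makeFakeKeygen(["rsa"]));
const withoutCrypto: boolean = supportsKeygenType(makeFakeKeygen([]));
console.log(withCrypto, withoutCrypto);
```

In a real page you would pass in an element created with document.createElement("keygen") instead of the fake, and fall back to some other approach when the helper returns false. The point is only that the script asks the browser what it supports rather than guessing from the user agent string.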
The art of conversation is, like, dead and stuff
Another hot topic this week is the removal of the <dialog> element. As I mentioned at the beginning of this article, Microsoft questioned the wisdom of a specialized element for marking up dialog. Other people have suggested that the element does not actually go far enough -- it lets you mark up basic conversation (people talking), but provides no semantics for stage directions, actions, thoughts, voiceover narration, and so on. (See also: Unwebbable, "The screenplay problem.")
To decide this burning question, the <dialog> element was removed, and conversations are now listed in the section on Common idioms without dedicated elements, with an example of how one might mark up a conversation with more generic markup, if one were inclined to do so. Predictably, this has already caused some backlash from the pro-
The conversation about marking up conversations also intersects with another burning question: can you use the <cite> element to mark up a person's name? HTML 4 said yes, and even provided an example that used the <cite> element that way. Dan Connolly, who added the <cite> element to HTML 2 (yes, 2), says "I consider that a bug in the HTML 4 spec. I wish I had reviewed it more closely." Still, specs are normative, not what their authors say about them after-the-fact, and the web has collectively had 12 years of HTML 4 which explicitly blessed the technique. I've used <cite> to mark up people's names for years in my own markup, and I'm certainly not going to go back and change all those blog entries to conform to somebody's sense of purity.
Speaking of examples, the HTML5 spec just got a whole lot more of them. To wit:
- r3793 and r3856:
- r3801 and r3806:
- r3863: example of how to get the filename out of
- r3799: example of how to mark up comments on a weblog using
Other spec changes this week:
- r3852 allows
- r3846 clarifies that <button type="reset"> is excluded from form validation.
- r3841 clarifies that <aside> may be used for sidebars, advertising, or groups of <nav> elements. It can still be used for pull quotes, for example the sentence "Everything you thought you knew about strings is wrong" on a page about string processing.
- r3837 defines the concept of "being rendered." As with many things in HTML5, what scares me most about this change is that we survived so long without this being defined at all.
- r3853 adds a note about another willful violation, this time stripping the Byte Order Mark character (U+FEFF) even from places where it is not being used as a byte order mark.
Around the web:
- Francisco Ryan Tolmasky: On HTML 5 Drag and Drop.
- Jamie Newman: How to Draw with HTML 5 Canvas
- WHATWG wiki: Differences between HTML and XHTML is chock full of interesting comparisons.
- WHATWG wiki: Web Encodings, "an attempt at fixing the Web encoding problem." Good luck with that.
- Edward O'Connor: Blog templates: a case study in using HTML5’s sectioning elements. "The HTML5 spec introduces several new sectioning elements. ... There's widespread confusion about when to use these elements." No kidding.
- Mark Pilgrim: Detecting HTML5 Features. I'm currently writing a book on HTML5, to be published next February by O'Reilly and Google Press. This is the second chapter I've written so far. (The first chapter I wrote was Let's Call It a Draw(ing Surface), a still-unfinished tutorial on using the canvas API.)
- Aten Design Group: A Brief History of HTML.
- Jeremy Keith: The devil in the details, on the continuing saga of marking up conversations.
- Edward O'Connor: HTML5: normativity & authoring guides. "Not only is [HTML5] big, but it's full of complicated algorithms and other browser implementation details that even standards-aware web authors probably don't care about." Indeed.
- Sam Ruby: First Polyglot Validator Check Deployed. I personally believe this is a waste of time, since there are approximately 3 people in the world who actually serve polyglot documents, all of whom seem to be doing fine without this option, and the presence of the option will just serve to confuse the other 3 million validator users who think they're serving XHTML when they're not. But one of those 3 people has commit access to the HTML5 validator, so I guess we'll get to see whether I'm right or wrong.
Quote of the week:
Richard Schwerdtfeger, Distinguished Engineer and Chief Accessibility Architect at IBM, co-author of Accessible Rich Internet Applications (WAI-ARIA) 1.0, XHTML Access Module, and XHTML Role Attribute Module, in a discussion of
longdesc was a disaster.
On a personal note, I'm writing this on a new laptop with a sticky "3" key, a minor annoyance turned major by the fact that this week's revisions are numbered in the 3000 range. While many people complain about the pace of changes in HTML5, as far as I'm concerned, revision 4000 cannot come quickly enough.
Tune in next week for another exciting edition of "This Week in HTML 5."