The WHATWG Blog

Please leave your sense of logic at the door, thanks!

This Week in HTML 5 – Episode 28

Thursday, April 2nd, 2009

Welcome back to "This Week in HTML 5," where I'll try to summarize the major activity in the ongoing standards process in the WHATWG and W3C HTML Working Group.

The big news for the week of March 23rd is that SVG can once again be included directly in HTML 5 documents served as text/html:

I've made the following changes to HTML5:

  • Uncommented out the XXXSVG bits, reintroducing the ability to have SVG content in text/html.
  • Defined <script> processing for SVG <script> in text/html by deferring to the SVG Tiny 1.2 spec and blocking synchronous document.write(). The alternative to this is to integrate the SVG script processing model with the (pretty complicated) HTML script processing model, which would require changes to SVG and might result in a dependency from SVG to HTML5. Anne would like to do this, but I'm not convinced it's wise, and it certainly would be more complex than what we have now. If we ever want to add async="" or defer="" to SVG scripts, then this would probably be a necessary part of that process, though.
  • Added a paragraph suggesting: "To enable authors to use SVG tools that only accept SVG in its XML form, interactive HTML user agents are encouraged to provide a way to export any SVG fragment as a namespace-well-formed XML fragment."
  • Added a paragraph defining the allowed content model for SVG <title> elements in text/html documents.

r2904 (and, briefly, r2910) give all the details of this solution. There are still a number of differences between the text in HTML 5 and the proposal brought by the SVG working group. Some of these are addressed further down in the announcement.

Doug Schepers, who has been the SVG working group's HTML 5 liaison, does not like this solution:

To be honest, I think it's not a good use of the SVG WG's time to provide feedback when Ian already has his mind made up, even if I don't believe that he is citing real evidence to back up his decision. What I see is this: one set of implementers and authors (the SVG WG) and the majority of the author and user community (in public comments) asking for some sort of preservation of SVG as an XML format, even if it's looser and error-corrected in practice, and a few implementers (Jonas and Lachy, most notably) disagreeing, and Ian giving preference to the minority opinion. Maybe there is sound technical rationale for doing so, but I haven't been satisfied on that score.

Turning to technical matters, one of the features of web forms in HTML 5 is allowing the attributes for form submission on either the <form> element (as in HTML 4) or on the submit button (new in HTML 5). Originally, the attributes for submit buttons were named action, enctype, method, novalidate, and target, which exactly mirrored the attribute names that could be declared on the <form> element.

However, in January 2008, Hallvord R. M. Steen (Opera developer) noted that "INPUT action [attribute] breaks web applications frequently. Both GMail and Yahoo mail (the new Oddpost-based version) use input/button.action and were seriously broken by WF2's action attribute."

Following up in November 2008, Ian Hickson replied, "I notice that Opera still supports 'action' and doesn't seem to have problems in GMail; is this still a problem?" to which Hallvord replied, "GMail fixed it on their side a while ago. It is still a problem with Yahoo mail, breaking most buttons in their UI for a browser that supports 'action'. We work around this with a browser.js hack. ('Still a problem' means 'I tested this again a couple of weeks ago and things were still broken without this patch'.)"

Ian replied, "This is certainly problematic. It's unclear what we should do. It's hard to use another attribute name, since the whole point is reusing existing ones... can we trigger this based on quirks mode, maybe? Though I hate to add new quirks." Hallvord did not like that idea: "In my personal opinion, I don't see why re-using attribute names is considered so important if we can find an alternative that feels memorable and usable. How does this look? <input type="submit" formaction="http://www.example.com/">"

Finally, in March 2009, Ian replied:

That seems reasonable. I've changed "action", "method", "target", "enctype" and "novalidate" attributes on <input> and <button> to start with "form" instead: "formaction", "formmethod", "formtarget", "formenctype" and "formnovalidate".

And thus we have r2890: Rename attributes for form submission to avoid clashes with existing usage.
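To make the override behavior concrete, here is a minimal sketch of how a submit button's form* attributes might take precedence over the values declared on the <form>. The interface and function names (FormAttrs, SubmitterAttrs, resolveSubmission) are hypothetical and do not come from the spec; only the attribute names are taken from the change described above. How the two boolean attributes combine is also an assumption.

```typescript
// Hypothetical model of the submission attributes declared on a <form>.
interface FormAttrs {
  action?: string;
  method?: string;
  enctype?: string;
  target?: string;
  novalidate?: boolean;
}

// Hypothetical model of the renamed attributes on <input type="submit"> or <button>.
interface SubmitterAttrs {
  formaction?: string;
  formmethod?: string;
  formenctype?: string;
  formtarget?: string;
  formnovalidate?: boolean;
}

// Prefer the value on the submit button when present, else fall back to the form's.
function resolveSubmission(formAttrs: FormAttrs, submitter: SubmitterAttrs = {}): FormAttrs {
  return {
    action: submitter.formaction ?? formAttrs.action,
    method: submitter.formmethod ?? formAttrs.method,
    enctype: submitter.formenctype ?? formAttrs.enctype,
    target: submitter.formtarget ?? formAttrs.target,
    // Assumption: either boolean attribute being present turns validation off.
    novalidate: Boolean(submitter.formnovalidate || formAttrs.novalidate),
  };
}

const saveForm: FormAttrs = { action: "/save", method: "post" };
const viaButton = resolveSubmission(saveForm, { formaction: "http://www.example.com/" });
console.log(viaButton.action, viaButton.method); // http://www.example.com/ post
```

Because the button-level names no longer match the form-level ones, a script that stores its own data in a button's action property (the pattern that broke GMail and Yahoo mail) no longer collides with a submission attribute.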

Other interesting changes this week:

Tune in next week for another exciting episode of "This Week in HTML 5."

Posted in Weekly Review | 3 Comments »

This Week in HTML 5 – Episode 27

Thursday, April 2nd, 2009

Welcome back to "This Week in HTML 5," where I'll try to summarize the major activity in the ongoing standards process in the WHATWG and W3C HTML Working Group. In this episode, I'd like to highlight some of the discussions that I've missed in previous episodes.

There has also been a vigorous debate about the license of the specification itself.

[Further reading: Discussions with plh, Draft W3C Excerpt License]

Tune in next week for another exciting episode of "This Week in HTML 5."

Posted in Weekly Review | 1 Comment »

This Week in HTML 5 – Episode 26

Thursday, April 2nd, 2009

Welcome back to "This Week in HTML 5," where I'll try to summarize the major activity in the ongoing standards process in the WHATWG and W3C HTML Working Group.

The big news for the week of March 16th is this announcement from Ian Hickson:

I've now split out the Server-sent Events and Storage APIs out of HTML5, and I've removed the text for Web Sockets, which was split out earlier. By popular demand I've also done some tweaks to the styling of these specs.

  • HTML5: http://dev.w3.org/html5/spec/
  • Server-Sent Events: http://dev.w3.org/html5/eventsource/
  • Web Storage: http://dev.w3.org/html5/webstorage/
  • Web Workers: http://dev.w3.org/html5/workers/
  • Web Sockets: http://dev.w3.org/html5/websockets/ and http://tools.ietf.org/html/draft-hixie-thewebsocketprotocol

It is my understanding that the desire is to publish the Server-Sent Events, Web Storage, Web Workers, and Web Sockets specs through the Web Apps working group, so that is what I put into the "status of this document" sections.

I would like to be able to put more permissive licenses (ideally MIT) on these drafts, rather than the W3C license.

The following sections still haven't been split out:

  • URLs: I'll remove this section as soon as DanC's draft is published.
  • Content-Type sniffing: I'll remove this section once Adam's draft is on a standards track.
  • Timeout API: This section is lacking an active editor.
  • Origin: I'm unsure what will happen with this section.

In IRC, Ian explained that all of these documents are generated from one master file:

# [21:02] <hixie> the source document is run through a bunch of scripts to generate the output documents
# [21:03] <hixie> from that one file i now generate one whatwg spec, four w3c specs, and an rfc

In other news, r2876 (WARNING: VERY LARGE) adds alternate stylesheets to the HTML 5 specification itself. If you view it in a browser that supports switching stylesheets (such as Firefox, under the View → Page Style submenu), you can choose between "Complete specification" (default), "Author documentation only," or "Highlight implementation requirements." The "Author documentation only" stylesheet hides all of the client parsing algorithms and focuses on the elements, attributes, and scripting features that web authors need to know about.

For example, the "author documentation" of the <img> element highlights the required attributes, how to create a new Image() dynamically, and the detailed requirements for providing alternate text, while completely hiding any mention of how image fetching fits into the client's task queue, the gory details of how clients resolve image URLs, or the security risks of allowing pages on the public internet to attempt to load images on the local network. On the flip side, "highlight implementation requirements" highlights these exact issues.

Critics who complained that the HTML 5 specification should be "just a markup language" will be able to have their cake and eat it too. Those who complained that HTML 5 was "too bloated" will have a little less to complain about now that several parts of it have been published as separate documents. On the other hand, critics who complained about these things as a cover for other agendas will have to continue complaining a little while longer.

Tune in next week for another exciting episode of "This Week in HTML 5."

Posted in Weekly Review | 2 Comments »

This Week in HTML 5 – Episode 25

Tuesday, March 31st, 2009

Welcome back to "This Week in HTML 5," where I'll try to summarize the major activity in the ongoing standards process in the WHATWG and W3C HTML Working Group.

The big news for the week of February 23rd (yes, I'm that far behind) is a collection of changes about how video is processed. The changes revolve around the resource selection algorithm to handle cases where the src attribute of a <video> element is set dynamically from script.

Background reading on the resource selection algorithm: Re: play() sometimes doesn't do anything now that load() is async, r2849: "Change the way resources are loaded for media elements to make it actually work", r2873: "make all invokations of the resource selection algorithm asynchronous."

Other video-related changes this week:

Another big change this week is the combination of r2859, r2860, and r2861: you can now declare the character encoding of XHTML documents served with the application/xhtml+xml MIME type by using the charset attribute on a <meta> element, but only if the value is "UTF-8". Also, the charset attribute must appear in the first 512 bytes of the document. Previously, the only ways to control the character encoding of an application/xhtml+xml document were to set the charset parameter on the HTTP Content-Type header or to use an encoding declaration in the XML prolog.

In practice, this will make no difference to encoding detection algorithms; other UTF-* encodings are detected earlier (with a Byte Order Mark), and any other encoding would require an XML prolog. This is mainly to address the desire of a few overly vocal authors to be able to serve the same markup in both text/html and application/xhtml+xml modes. Background reading: Bug 6613: Allow <meta charset="UTF-8"/> in XHTML.
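The two constraints (value must be "UTF-8", attribute must sit in the first 512 bytes) can be sketched as a simple check over the raw bytes of a document. This is only an illustration: the function names are hypothetical, the regular expression is a crude stand-in for a real parser, and treating the value as case-insensitive and requiring the whole attribute to fit inside the 512-byte prefix are both assumptions.

```typescript
// Number of leading bytes in which the charset attribute must appear (from the text above).
const PRESCAN_LIMIT = 512;

// Demo helper: turn an ASCII string into bytes.
function asciiBytes(text: string): Uint8Array {
  const bytes = new Uint8Array(text.length);
  for (let i = 0; i < text.length; i++) bytes[i] = text.charCodeAt(i) & 0xff;
  return bytes;
}

// Decode at most `limit` leading bytes as ASCII-compatible text; enough to spot markup.
function decodePrefix(bytes: Uint8Array, limit: number): string {
  let out = "";
  const end = Math.min(bytes.length, limit);
  for (let i = 0; i < end; i++) out += String.fromCharCode(bytes[i]);
  return out;
}

// True when a quoted <meta charset> inside the prefix names UTF-8.
// False when the value is anything else or no such attribute fits inside the prefix.
function hasAllowedMetaCharset(bytes: Uint8Array): boolean {
  const prefix = decodePrefix(bytes, PRESCAN_LIMIT);
  const match = /<meta\s+charset\s*=\s*(?:"([^"]*)"|'([^']*)')/.exec(prefix);
  if (match === null) return false;
  const value = match[1] ?? match[2] ?? "";
  return value.toLowerCase() === "utf-8"; // assumption: case-insensitive comparison
}

const sampleXhtml =
  '<html xmlns="http://www.w3.org/1999/xhtml"><head><meta charset="UTF-8"/></head></html>';
console.log(hasAllowedMetaCharset(asciiBytes(sampleXhtml))); // true
```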

Other interesting changes this week:

Tune in next week for another exciting episode of "This Week in HTML 5."

Posted in Weekly Review | 1 Comment »

This Week Day in HTML 5 – Episode 24

Tuesday, March 10th, 2009

Welcome back to "This Week in HTML 5," where I'll try to summarize the major activity in the ongoing standards process in the WHATWG and W3C HTML Working Group. The pace of HTML 5 changes has reached a fever pitch, so I'm going to split out these episodes into daily (!) rather than weekly summaries until things calm down.

The big news for February 13, 2009 is r2814/r2815, which adds a .value property on file upload controls:

On getting, [the .value property] must return the string "C:\fakepath\" followed by the filename of the first file in the list of selected files, if any, or the empty string if the list is empty. On setting, it must throw an INVALID_ACCESS_ERR exception.

According to bug 6529, Opera already implements something close to this, and is willing to modify their implementation to match the spec text.
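The quoted getter and setter rules are small enough to sketch directly. The function names below are hypothetical, the list of selected files is modeled as a plain array of filenames, and an ordinary Error stands in for the INVALID_ACCESS_ERR DOM exception.

```typescript
// Prefix required by the quoted spec text.
const FAKE_PATH_PREFIX = "C:\\fakepath\\";

// Value reported on getting: the prefix plus the first selected filename,
// or the empty string when nothing is selected.
function fileInputValue(selectedFilenames: readonly string[]): string {
  return selectedFilenames.length > 0 ? FAKE_PATH_PREFIX + selectedFilenames[0] : "";
}

// Setting is always rejected. A plain Error stands in for the DOM exception here.
function setFileInputValue(_newValue: string): never {
  throw new Error("INVALID_ACCESS_ERR");
}

console.log(fileInputValue(["report.pdf", "notes.txt"])); // C:\fakepath\report.pdf
console.log(JSON.stringify(fileInputValue([])));          // ""
```

Only the first filename is exposed, and never the real directory, so scripts that expect a path-like string keep working without learning anything about the user's file system.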

The other big news of the day is r2804/r2805, which defines what happens when a focused element becomes hidden.

For example, this might happen because the element is removed from its Document, or has a hidden attribute added. It would also happen to an input element when the element gets disabled.

But let's back up a step. What does it mean for an element to be "focused" in the first place? The spec has a whole section on the concept of focus. The short answer is that focus is just what you think it is: it's the element that responds when you type. As you tab around a form, the focus moves between form fields, where you can type information; as you tab around a page, the focus moves between links, which you can follow; if you tab long enough, focus moves out of the page and back to the location bar, and so forth.

The not-so-short answer is that everything in the previous paragraph is platform-specific and, depending on your browser, highly customizable. The HTML 5 spec respects these platform differences, and defers the definition of what elements should be focusable to the browser, which ultimately respects the conventions of the underlying operating system.

The long answer is that virtually anything can be focusable, because HTML 5 standardizes a crucial accessibility feature that most modern browsers now implement, namely that any element can have a tabindex attribute.

It may not sound like it, but this is a really, really important feature for building accessible web applications. HTML 4 restricted the tabindex attribute to links and form fields. For reasons unknown, Internet Explorer ignored that and respected a tabindex attribute on any element. Starting with Firefox 2, Mozilla co-opted this implementation and wrote up a rough "standard" about using the tabindex attribute on anything. Once it was implemented in the browser, IBM contracted with major screenreader vendors to support Firefox's (and IE's) behavior. This provided the foundation for pure-JavaScript widgets to become keyboard-accessible (also implemented by IBM). Not just theoretically, but actually, in real shipping browsers and real shipping screenreaders. Under the covers, Dojo's complex widgets are marked up with semantically meaningless <div>s and <span>s, yet they are still focusable and keyboard-navigable. The controls are in the tab order, so you can focus with the keyboard, then you can use the keyboard to further manipulate them, change their state, and so forth.

In HTML 4, there was no way to put custom controls into the tab order without breaking markup validity (unless you futzed around with custom DTDs or scripting hacks), because the tabindex attribute could only be used on links and form fields. HTML 5 "paves the cowpaths" and standardizes this definition and behavior of tabindex-on-anything. This is a huge step forward for web accessibility. The concept of focus is central to accessibility, and HTML 5 gives it the attention it deserves. (There's more to making controls accessible than just keyboard navigability, but if you don't have keyboard navigability, nothing else really matters. If you're creating your own custom JavaScript widgets, you must read the (non-vendor-specific) DHTML Style Guide for implementing keyboard accessibility in custom controls.)
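As a rough illustration of what tabindex-on-anything means for the tab order, the sketch below models the commonly described ordering: positive tabindex values first in ascending order, then tabindex="0" and natively focusable elements in document order, with negative values focusable from script but skipped by Tab. The FocusCandidate type and tabOrder function are hypothetical, and which elements count as natively focusable is passed in as a flag because, as noted above, that is left to the browser and platform.

```typescript
// Hypothetical, simplified view of an element for ordering purposes.
interface FocusCandidate {
  id: string;
  // Parsed tabindex value, or undefined when the attribute is absent.
  tabindex?: number;
  // True for elements the platform treats as focusable by default (links, form fields).
  nativelyFocusable: boolean;
}

// Input must be in document order. Returns ids in the order Tab would visit them.
function tabOrder(elements: readonly FocusCandidate[]): string[] {
  const positive: { el: FocusCandidate; pos: number }[] = [];
  const natural: FocusCandidate[] = [];
  elements.forEach((el, pos) => {
    if (el.tabindex !== undefined && el.tabindex > 0) {
      positive.push({ el, pos });
    } else if (el.tabindex === 0 || (el.tabindex === undefined && el.nativelyFocusable)) {
      natural.push(el);
    }
    // Negative tabindex: focusable from script, but skipped by Tab.
    // No tabindex and not natively focusable: not in the tab order at all.
  });
  // Ascending tabindex, ties broken by document order.
  positive.sort((a, b) => a.el.tabindex! - b.el.tabindex! || a.pos - b.pos);
  return positive.map((p) => p.el.id).concat(natural.map((el) => el.id));
}

const pageElements: FocusCandidate[] = [
  { id: "link", nativelyFocusable: true },
  { id: "slider-div", tabindex: 0, nativelyFocusable: false },
  { id: "plain-div", nativelyFocusable: false },
  { id: "skip-nav", tabindex: 1, nativelyFocusable: true },
  { id: "menu-item", tabindex: -1, nativelyFocusable: false },
];
console.log(tabOrder(pageElements).join(" > ")); // skip-nav > link > slider-div
```

The "slider-div" entry is the interesting one: a plain <div> joins the tab order purely because it carries tabindex="0", which is exactly what custom widgets rely on.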

Now then, why is r2804 important? Well, if the element that has focus suddenly can't have focus anymore -- because it was programmatically hidden or disabled, or it was removed from the DOM altogether -- then it is vitally important to specify where the focus goes. So the HTML 5 specification lays it all out:

When an element that is focused stops being a focusable element, or stops being focused without another element being explicitly focused in its stead, the user agent should run the focusing steps for the body element, if there is one; if there is not, then the user agent should run the unfocusing steps for the affected element only.
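The quoted rule boils down to a two-way decision, which can be sketched over a deliberately tiny model of a document. The FocusState type and focusFixup function are hypothetical; the real focusing and unfocusing steps also fire events and do other work that is not modeled here.

```typescript
// Hypothetical minimal document model: which element is focused, and whether a body exists.
interface FocusState {
  hasBody: boolean;
  focusedId: string | null;
}

// Apply the quoted rule when `lostId` stops being focusable (hidden, disabled, or removed).
function focusFixup(state: FocusState, lostId: string): FocusState {
  // Some other element holds focus, so nothing needs fixing.
  if (state.focusedId !== lostId) return state;
  return state.hasBody
    ? { ...state, focusedId: "body" } // run the focusing steps for the body element
    : { ...state, focusedId: null };  // no body: only unfocus the affected element
}

const beforeHide: FocusState = { hasBody: true, focusedId: "save-button" };
console.log(focusFixup(beforeHide, "save-button").focusedId); // body
```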

[Background reading: Re: Lose Focus When Hidden? (SVG ISSUE-2031)]

Other interesting changes of the day:

Tune in... well, sometime soon-ish for another exciting episode of "This Week Day In HTML 5."

Posted in Weekly Review | 1 Comment »