What’s next in HTML, episode 2: who’s been peeing in my sandbox?

January 26th, 2010 by Mark Pilgrim, Google

Welcome back to “What’s Next in HTML,” where I’ll try to summarize the major activity in the ongoing standards process in the WHAT Working Group. With HTML5 in Last Call, the WHATWG has moved to an unversioned development model for HTML. While browser vendors are busy implementing HTML5, let’s talk about what’s next.

The big news in HTML this week is r1643. ... Well, technically that revision is over 20 months old, but there has been a flurry of updates that affect the underlying feature. What feature, you might ask? Sandboxing untrusted content.

The sandbox attribute, when specified [on an <iframe> element], enables a set of extra restrictions on any content hosted by the iframe. ... When the attribute is set, the content [hosted by the iframe] is treated as being from a unique origin, forms and scripts are disabled, links are prevented from targeting other browsing contexts, and plugins are disabled.

This could be useful for all kinds of scenarios. The HTML5 spec lists some examples of blog comments, but I think that’s mostly a red herring. Think about what’s hosted in iframes today: third-party advertising and third-party widgets. In each case, a web author wants to embed something on their page that they have little or no control over. In practice, that usually works fine. Advertising iframes don’t do anything (except display ads). Most widgets are well-behaved, and most widget frameworks (like Google Gadgets) enforce terms of service that forbid widgets from “taking over” the parent page in which they are embedded. Still, that’s a social/legal solution, not a technical one. Sandboxing is a complementary technical solution, where the parent page can actually tell the browser “Hey, I don’t fully trust this thing, but I’m embedding it anyway. Can you reduce its privileges?”

What privileges? Well, by default, “sandboxed” iframes cannot run scripts, use forms, or be treated as coming from the same origin as the parent page.

There are ways for the parent page to add back each of these privileges, if the third-party content needs it.

[The sandbox attribute’s] value must be an unordered set of unique space-separated tokens. The allowed values are allow-same-origin, allow-forms, and allow-scripts. The allow-same-origin keyword allows the content to be treated as being from the same origin instead of forcing it into a unique origin, and the allow-forms and allow-scripts keywords re-enable forms and scripts respectively (though scripts are still prevented from creating popups).

So it’s a security feature. You could restrict an advertising iframe to have no privileges whatsoever, but you could give a widget iframe privileges to execute its own scripts or embed its own forms.
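To make the ad-versus-widget distinction concrete, here is a minimal sketch that builds the sandbox attribute from a list of privileges to add back. The helper names (sandboxValue, sandboxedIframe) and the example.com URLs are made up for illustration; the token list is the one quoted from the spec above.

```javascript
// Tokens listed in the spec text quoted above.
const ALLOWED_TOKENS = ["allow-same-origin", "allow-forms", "allow-scripts"];

// Hypothetical helper: builds the value of a sandbox attribute from a list
// of privileges to add back. Duplicates are dropped; unknown tokens throw.
function sandboxValue(privileges) {
  const unique = [];
  for (const token of privileges) {
    if (!ALLOWED_TOKENS.includes(token)) {
      throw new Error("Unknown sandbox token: " + token);
    }
    if (!unique.includes(token)) {
      unique.push(token);
    }
  }
  return unique.join(" ");
}

// Hypothetical helper: returns iframe markup for a URL and its privileges.
// The URL is assumed to be already safe for use in an attribute value.
function sandboxedIframe(src, privileges) {
  const value = sandboxValue(privileges);
  const attr = value === "" ? "sandbox" : 'sandbox="' + value + '"';
  return "<iframe " + attr + ' src="' + src + '"></iframe>';
}

// An ad gets no privileges at all; a widget may run scripts and use forms.
const adFrame = sandboxedIframe("http://ads.example.com/banner", []);
const widgetFrame = sandboxedIframe("http://widgets.example.com/w", [
  "allow-scripts",
  "allow-forms",
]);
```

Rejecting unknown tokens is a choice made for this sketch, so that a typo fails loudly instead of silently granting nothing.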

If it’s a security feature, won’t older browsers still be insecure?

Yes. Well, no more than they are now. In fact, very few browsers support the sandbox attribute today, so we’re not just talking about users of older browsers — we’re talking about pretty much everyone. But that’s OK. The sandbox attribute is designed to be an incremental security feature. It’s an additional layer of security, not the only layer. Browsers have supported iframes for a long time, and thousands of web authors are using them despite the very real risks of embedding untrusted content. Advertising networks can and have been hacked; malicious widgets can and have been published; bad actors can and do try to do bad things to as many people as possible until they’re caught and taken down. You need to keep doing all the things you’re doing now to prevent iframe-based attacks. Then add sandbox, too.

I can’t do any filtering or sanitizing. Can I rely solely on browser-based sandboxing?

Someday, you might — might! — be able to throw out all your sanitizing code and rely solely on the sandbox attribute. Of course, you can’t do that today, because users of older browsers would still be vulnerable. So we need a “clean break” solution — a way to serve untrusted content to supporting browsers while absolutely, positively, 100% ensuring that older browsers never render the untrusted content under any circumstances. Enter the text/html-sandboxed MIME type.

All HTML pages are served with the text/html MIME type. It’s part of the HTTP headers, normally invisible to end users, but nevertheless sent by web servers every time a client requests a page. Every resource type (images, scripts, CSS files) has its own MIME type. Untrusted content could have its own MIME type. And this is where text/html-sandboxed comes in. If my web server serves up an HTML page with a MIME type of text/html, your browser will render it. If my web server serves up the same HTML page with a MIME type of text/html-sandboxed, your browser will download it (or offer to download it). Your browser doesn’t recognize that MIME type, so it falls back to the default action, which is to download it and save it as a file on your local disk. We can use this behavior to our advantage.
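On the server side, the only thing that changes is the Content-Type header. Here is a minimal sketch, assuming a response object that behaves like Node's http.ServerResponse (writeHead and end are real methods there); the function name serveUntrusted is made up, and the article does not prescribe any particular server.

```javascript
// Hypothetical handler: `res` is assumed to behave like Node's
// http.ServerResponse. The markup is sent unchanged; only the MIME type
// differs from a normal HTML response.
function serveUntrusted(res, html) {
  res.writeHead(200, { "Content-Type": "text/html-sandboxed" });
  res.end(html);
}
```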

As browsers start supporting the sandbox attribute, they can also start supporting the text/html-sandboxed MIME type. What does it mean to “support” this new MIME type? If a user navigates directly to a page served with the new MIME type, don’t do anything special. Just download it, which is what happens already. BUT... if the user navigates to a page that includes an <iframe> element, AND the iframe has a sandbox attribute, AND the src of the iframe points to an HTML page that is served with the text/html-sandboxed MIME type, THEN render the iframe as normal (but still subject to the restrictions listed in the sandbox attribute).
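The rules in the last few paragraphs can be sketched as a small decision function. This is a model of the behavior described here, not an algorithm from any browser or from the spec, and every parameter name is made up.

```javascript
// Sketch of the described behavior. It only models the two HTML MIME types;
// anything else is treated as "download" for simplicity.
function handleResponse(options) {
  const { mimeType, inIframe, iframeHasSandbox, browserSupportsSandbox } = options;
  if (mimeType === "text/html") {
    // Rendered as usual; any sandbox attribute on the iframe still applies.
    return "render";
  }
  if (mimeType === "text/html-sandboxed") {
    if (browserSupportsSandbox && inIframe && iframeHasSandbox) {
      return "render-sandboxed";
    }
    // Direct navigation, an iframe without sandbox, or an older browser.
    return "download";
  }
  return "download";
}
```

Note that an older browser and a direct navigation end up on the same branch, which is the whole point of the “clean break.”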

Older browsers will download (or offer to download) the untrusted content. From a security perspective, that’s a good thing — at least, it means the content won’t be rendered as HTML. From a usability perspective, that’s terrible. Who wants to go to a page and suddenly have the browser offering to download a bunch of useless files? That means that you won’t really be able to use this technique until all users have upgraded to a browser that supports both the sandbox attribute and the text/html-sandboxed MIME type. That will be... a while. But it might happen someday!

Iframes suck. Can’t I just include the untrusted content inline?

There have been a number of proposals for a <sandbox> element, which you could wrap around untrusted content. All such proposals suffer fatal flaws, stemming from how today’s browsers parse HTML markup. You, the author who wants to “wrap” untrusted content, would need to ensure that the content did not “break out” of the sandbox. For instance, it could include a </sandbox> end tag. (Hey, it’s untrusted! That’s why we’re here in the first place.) There are a surprising number of variations of markup that are recognized as end tags (having to do with inserting whitespace characters in strange places), and you would be responsible for sanitizing all of these variations. Furthermore, you would need to ensure that the untrusted content did not include a script that called document.write(), which could be used for writing out a matching </sandbox> end tag programmatically. Think about the number of ways that script could be obfuscated, and pretty soon you’re asking individual web authors to solve the halting problem just to wrap some untrusted content.
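To see how quickly the sanitizing problem gets out of hand, here is a deliberately naive filter for the proposed (and purely hypothetical) <sandbox> element. It removes the exact string </sandbox> and nothing else; the function name and the sample inputs are made up for illustration.

```javascript
// Naive filter: removes the exact string "</sandbox>" and nothing else.
function naiveStripEndTag(untrusted) {
  return untrusted.split("</sandbox>").join("");
}

const attempts = [
  "</sandbox>",           // removed, as intended
  "</sandbox >",          // whitespace before ">" slips through
  "</SANDBOX>",           // HTML tag names are case-insensitive; slips through
  "</sand</sandbox>box>", // removing the inner match reassembles the end tag
];
const results = attempts.map(naiveStripEndTag);
```

Only the first attempt is neutralized. The other three still leave something a browser would treat as an end tag, and none of them even needs a script.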

If a wrapper element is the wrong solution, what’s the right one? This is where the “flurry of updates” has been happening. The current solution is r4619: the srcdoc attribute (with minor updates in r4623, r4624, and r4626). The best way to explain it is by example:

<iframe sandbox srcdoc="<p>Markup in an attribute, woohoo!</p>"></iframe>

Yeah, that’s pretty janky. But it has the following nice qualities:

It also has the following not-so-nice qualities:

There is one exception to that last rule. There are a few comment systems that are entirely client-side. That is, the comments are not part of the page markup that comes down from the web server; they are programmatically added after the page is rendered. Such comment systems could use JavaScript-based feature detection to check whether the browser supported the srcdoc attribute, and write out the appropriate markup either way. I wrote the book on HTML5 feature detection. (No really! A whole fscking book!) Detecting srcdoc support would use detection technique #2:

if ("srcdoc" in document.createElement("iframe")) { ... }

But this would only help in the case where you were adding untrusted content to the page at runtime, on the client side. Server-side cases will have to wait until everybody upgrades.
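Here is a minimal sketch of what such a client-side comment system might do. The detection is the same check as above, but it takes the document as a parameter so it can be exercised outside a browser. The helper names are made up, and the fallback (showing the comment as inert text) is an assumption of this sketch; a real system would pick its own.

```javascript
// Same check as above, with the document passed in explicitly.
function supportsSrcdoc(doc) {
  return "srcdoc" in doc.createElement("iframe");
}

// Escapes text for use inside a double-quoted HTML attribute value.
function escapeAttribute(value) {
  return value.replace(/&/g, "&amp;").replace(/"/g, "&quot;");
}

// Escapes text so that it is displayed rather than parsed as markup.
function escapeText(value) {
  return value
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

// Hypothetical helper: returns the markup to write out for one comment.
function commentMarkup(doc, untrustedHtml) {
  if (supportsSrcdoc(doc)) {
    return (
      '<iframe sandbox srcdoc="' + escapeAttribute(untrustedHtml) + '"></iframe>'
    );
  }
  // Fallback chosen for this sketch: show the comment as inert text.
  return "<p>" + escapeText(untrustedHtml) + "</p>";
}
```

In a browser you would call commentMarkup(document, comment). Escaping ampersands before quotes matters; doing it the other way around would double-escape the quotes.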

So when can I use all this stuff?

Hahahahahaha. You must be new here.

No really, when?

There are several pieces here, each with their own compatibility story.

  1. The sandbox attribute, for reducing privileges of untrusted content. Chromium and Google Chrome support the sandbox attribute (I tested the dev channel version 4.0.302.3); Safari, Firefox, Internet Explorer, and Opera ignore it. So you can start using the sandbox attribute today — just be sure to test in Chromium or Google Chrome to ensure you’ve set the sandbox privileges properly. It won’t have any effect in other browsers, but that’s OK. Remember, the sandbox attribute isn’t designed to be your only line of defense; it’s a complement to your existing defenses. Keep doing whatever you’re doing now (sanitizing input, auditing code, enforcing legal terms with your partners, etc), then add sandbox for extra protection.
  2. The text/html-sandboxed MIME type, for ensuring that users can’t navigate to untrusted content. There are two parts to this. First, browsers must not render pages served with a text/html-sandboxed MIME type if you navigate to the page directly. This part works in all browsers today; they all download (or offer to download) the page markup instead of rendering it. Second, browsers that support the sandbox attribute need to render iframes served with the text/html-sandboxed MIME type (subject to the privilege restrictions listed in the sandbox attribute). No browser supports this yet, not even Google Chrome. (It renders the parent page but downloads the iframe content instead of rendering it within the frame.) So you can’t use this technique until Google updates Chrome to support it. (In theory, other browser vendors will implement support for this at the same time they implement support for the sandbox attribute, but I suppose we’ll just have to wait and see.)
  3. The srcdoc attribute, for including untrusted content inline. Since the fallback behavior in legacy browsers for this feature is “render nothing at all” (by design), this attribute won’t be useful until pretty much all of your visitors upgrade to browsers that support the attribute. At the moment, no current browser supports the srcdoc attribute, so it’ll be a while. If I had to guess, I’d say January 29, 2022, at 4:37pm. Plus or minus 10 years.

And now you know “What’s Next in HTML.”

Posted in What's Next | 11 Comments »

What’s Next in HTML, episode 1

January 13th, 2010 by Mark Pilgrim, Google

Welcome to "What's Next in HTML," where I'll try to summarize the major activity in the ongoing standards process in the WHAT Working Group. Wait... what happened to This Week in HTML5? Hell, what happened to HTML5? Well, nothing. It took over five years to create, but it's in Last Call now. By all measures, it has already been wildly successful. Browser vendors are implementing it, books are being written, we have a kick-ass validator, web developers are slowly catching on, and there's still plenty of time to send us your feedback. But in the meantime, the WHAT Working Group has begun work on new, experimental features for the next version of HTML.

The next version of HTML doesn't have a name yet. In fact, it may never have a name, because the working group is switching to an unversioned development model. Various parts of the specification will be at varying degrees of stability, as noted in each section. But if all goes according to plan, there will never be One Big Cutoff that is frozen in time and dubbed "HTML6." HTML is an unbroken line stretching back almost two decades, and version numbers are a vestige of an older development model for standards that never really matched reality very well anyway. HTML5 is so last week. Let's talk about what's next.

The big news in HTML is r4439, which adds the device element. What's a <device>? I'm glad you asked.

The device element represents a device selector, to allow the user to give the page access to a device, for example a video camera.

The type attribute allows the author to specify which kind of device the page would like access to.

So it's for video conferencing, something you can currently only do with Adobe Flash or other proprietary plugins that sit on top of your browser. In fact, most of the pieces for browser-based video chat are already in place. The idea is that a device element would go hand in hand with a video element and a web socket. The device records a video stream (using the also-newly-defined Stream API) and sends the stream of video along a web socket to the other party (perhaps via an intermediate server) which renders the stream in a video element. And like the video element, the device element would be native to your browser, so browser vendors would not have to wait for third parties to add specific support for their platform.

Does all that work yet? Hell no. We don't even have a standard video codec yet! Google Chrome is the only browser that has shipped an implementation of web sockets (although it's part of WebKit, so presumably Apple could ship it in a future version of Safari if they choose). And the entire device API is still in its infancy. Nobody has even started implementing a prototype of that piece yet, and the whole idea might be scrapped by my next episode. But that's life on the bleeding edge.

And now you know "What's Next in HTML."

Posted in What's Next | 35 Comments »

Implementation progress on the HTML5 <ruby> element

November 13th, 2009 by MikeSmith

If you don't know what the HTML5 ruby element is, you might want to take a minute to first read the section about the ruby element in the HTML5 specification and/or the Wikipedia article on ruby characters. To quote from the HTML5 description of the ruby element:

The ruby element allows one or more spans of phrasing content to be marked with ruby annotations. Ruby annotations are short runs of text presented alongside base text, primarily used in East Asian typography as a guide for pronunciation or to include other annotations. In Japanese, this form of typography is also known as furigana.

I give a specific example further down, but for now I want to first say that the really great news about the ruby element is that last week, Google Chrome developer Roland Steiner checked in a change (r50495, and see also related bug 28420) that adds ruby support to the trunk of the WebKit source repository, thus making the ruby feature available in WebKit nightlies and Chrome dev-channel releases.

A simple example

The following is a simple example of what you can do with the ruby element; make sure to view it in a recent WebKit nightly or Chrome dev-channel release. Note that the text is an excerpt from the source of a ruby-annotated online copy of the short story Run, Melos, Run by the writer Osamu Dazai, which I came across by way of Piro's info page for his XHTML Ruby add-on for Firefox (and which I mention a bit more about further below).

きのうの豪雨で山の水源地は<ruby>氾濫<rp>(</rp>
<rt>はんらん</rt><rp>)</rp></ruby>し、濁流
<ruby>滔々<rp>(</rp><rt>とうとう</rt><rp>)</rp>
</ruby>と下流に 集り、猛勢一挙に橋を破壊し、どうどうと 
響きをあげる激流が、<ruby>木葉微塵<rp>(</rp>
<rt>こっぱみじん</rt><rp>)</rp></ruby>に<ruby>橋桁
<rp>(</rp><rt>はしげた</rt><rp>)</rp></ruby>
を跳ね飛ばしていた。

If you don't happen to have Japanese fonts installed, here's a screenshot of the source for reference:

[Screenshot: ruby source markup]

Notice that the actual annotative ruby text (which I've highlighted in yellow in the source just for the sake of emphasis) is marked up using the rt element as a child of the ruby element, and the text being annotated is the node that's a previous sibling to that rt content as a child of the ruby element. The final new element in the mix is the rp element, which is simply a way to wrap the annotative ruby text in parentheses, for graceful fallback in browsers that don't support ruby.

So here's the rendered view of that same text:

見よ、前方の川を。きのうの豪雨で山の水源地は氾濫はんらんし、濁流滔々とうとうと下流に集り、猛勢一挙に橋を破壊し、どうどうと響きをあげる激流が、木葉微塵こっぱみじん橋桁はしげたを跳ね飛ばしていた。

And here is a screenshot of how it should look in a recent WebKit nightly or Chrome dev-channel release:

[Screenshot: ruby rendered view]

Notice that the annotative ruby text is displayed above the ruby base it annotates. If you instead view this page in a browser that doesn't support the ruby feature, you'll see that the ruby text is just shown inline, in parentheses following the ruby base it annotates. So the feature falls back gracefully in older browsers.
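The markup pattern and its fallback can be sketched in a few lines. The helper names are made up, and the fallback function only approximates what a non-supporting browser does (ignore the unknown tags and show their text inline); both inputs are assumed to be plain text with nothing that needs escaping.

```javascript
// Hypothetical helper: wraps a base text and its reading in ruby markup,
// with rp parentheses for browsers that do not support ruby.
function rubyMarkup(base, reading) {
  return "<ruby>" + base + "<rp>(</rp><rt>" + reading + "</rt><rp>)</rp></ruby>";
}

// Approximates the view in a browser without ruby support: the tags are
// ignored and their text content is shown inline.
function fallbackText(markup) {
  return markup.replace(/<[^>]*>/g, "");
}

const word = rubyMarkup("氾濫", "はんらん");
const shown = fallbackText(word);
```

Here shown comes out as the base text followed by the reading in parentheses, which matches the fallback described above.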

Support in other browsers

Current versions of Microsoft Internet Explorer also have native support for ruby, and you can also get ruby support in Firefox by installing Piro's XHTML Ruby add-on (and for more details, see his XHTML ruby add-on info page) — so we are well on the way to seeing the HTML5 ruby feature supported across a range of browsers. If you're not accustomed to reading printed books and magazines and such in Japanese, that might not sound like such a big deal. But for authors and developers and content providers in Japan who want to finally be able to use on the Web this very common feature of Japanese page layout from the print world, getting ruby support into another major browser engine is a huge win, and something to be very excited about.

Posted in Browsers, Elements | 4 Comments »

HTML5 at Last Call

October 27th, 2009 by Ian Hickson

For a brief period today, there were no outstanding e-mails or bugs on the specs, and so I took that opportunity to transition us here at the WHATWG to the next stage of HTML5's development: Last Call! This affects three specs at the WHATWG:

There's also a version of the spec called Web Applications 1.0 (for nostalgic reasons) that has all of the above as well as a number of other specs, namely Web Storage, Web Database, Server-sent Events, and the Web Sockets API and protocol, all together in one document. With the exception of the Web Database spec, they're all now in last call at the WHATWG.

So if you've been waiting to see if someone else would report the problem that you had seen, well, if it's not fixed, they didn't! So you should now send that feedback in yourself.

There are two ways to send feedback. If your feedback is something short and simple, you can just load up the spec in your browser, click on the section with the problem, then type in your message using the review comments box that appears at the bottom of the window, and hit the "Submit Review Comments" button. This works for the HTML5 and Web Applications 1.0 specs. (Thanks to the W3C HTML Working Group for making their bug database available to us for this purpose.)

If your feedback is more elaborate, then you should subscribe to the mailing list and then send your feedback there.

Note: Lest there be any confusion, the W3C HTML WG has not yet transitioned HTML5 to Last Call at the W3C. HTML5 is a joint effort of W3C and WHATWG groups, but we have different issues lists and different criteria for going to Last Call. For more details on the W3C HTML WG's processes, see the W3C HTML WG charter.

Posted in WHATWG | 11 Comments »

This Week in HTML5 – Episode 38

October 20th, 2009 by Mark Pilgrim, Google

Welcome back to "This Week in HTML 5," where I'll try to summarize the major activity in the ongoing standards process in the WHATWG and W3C HTML Working Group.

This week, there were some more refinements to microdata. r4139 changes the names of the DOM properties that reflect microdata markup. r4140 renames the content property to itemValue. Since no browser has actually implemented this API yet, these changes shouldn't make any difference. Standards are like sex; one mistake, and you're stuck supporting it forever! r4141 and r4147 fix up some microdata examples, in particular this example from Gavin Carothers about marking up O'Reilly's book catalog. Hooray for real-world examples!

There were also some noteworthy changes to the <video> and <audio> API. r4131 says that setting the src attribute on one of those elements should call its load() method. r4132 removes the load event for multimedia elements, and r4133 removes the "in progress" events (loadstart, loadend, and progress) that used to be fired while the video/audio file was downloading.

Other noteworthy changes this week:

Around the web:

Tune in next week for another exciting edition of "This Week in HTML5."

Posted in Weekly Review | 6 Comments »