The WHATWG Blog

Archive for the ‘WHATWG’ Category

Infra

Wednesday, November 16th, 2016

Welcome to the newest standard maintained by the WHATWG: the Infra Standard! Standards such as DOM, Fetch, HTML, and URL have a lot of common low-level infrastructure and primitives. As we go about defining things in more detail we realized it would be useful to gather all the low-level functionality and put it one place. Infra seemed like a good name as it’s short for infrastructure but also means below in Latin, which is exactly where it sits relative to the other work we do.

In the long term this should help align standards in their vocabulary, make standards more precise, and also shorten them as their fundamentals are now centrally defined. Hopefully this will also make it easier to define new standards as common operations such as “ASCII lowercase” and data structures such as maps and sets no longer need to be defined. They can simply be referenced from the Infra Standard.

We would love your help improving the Infra Standard on GitHub. What language can further be deduplicated? What is common boilerplate in standards that needs to be made consistent and shared? What data types are missing? Please don’t hesitate to file an issue or write a pull request!

Posted in What's Next, WHATWG | Comments Off on Infra

Sunsetting the JavaScript Standard

Monday, August 15th, 2016

Back in 2012, the WHATWG set out to document the differences between the ECMAScript 5.1 specification and the compatibility and interoperability requirements for ECMAScript implementations in web browsers.

A specification draft was first published under the name of “Web ECMAScript”, but later renamed to just “JavaScript”. As such, the JavaScript Standard was born.

Our work on the JavaScript Standard consisted of three tasks:

figuring out implementation differences for various non-standard features;
filing browser bugs to get implementations to converge;
and finally writing specification text for the common or most sensible behavior, hoping it would one day be upstreamed to ECMAScript.

That day has come.

Some remaining web compatibility issues are tracked in the repository for the ECMAScript spec, which javascript.spec.whatwg.org now redirects to. The rest of the contents of the JavaScript Standard have been upstreamed into ECMAScript, Annex B.

This is good news for everyone. Thanks to the JavaScript Standard, browser behavior has converged, increasing interoperability; non-standard features got well-defined and standardized; and the ECMAScript standard more closely matches reality.

Highlights:

The infamous “string HTML methods”: String.prototype.anchor(name), String.prototype.big(), String.prototype.blink(), String.prototype.bold(), String.prototype.fixed(), String.prototype.fontcolor(color), String.prototype.fontsize(size), String.prototype.italics(), String.prototype.link(href), String.prototype.small(), String.prototype.strike(), String.prototype.sub(), and String.prototype.sup(). Browsers implemented these slightly differently in various ways, which in one case lead to a security issue (and not just in theory!). It was an uphill battle, but eventually browsers and the ECMAScript spec matched the behavior that the JavaScript Standard had defined.
Similarly, ECMAScript now has spec text for String.prototype.substr(start, length).
ECMAScript used to require a fixed, heavily outdated, version of the Unicode Standard for determining the set of whitespace characters, and what’s a valid identifier name. The JavaScript Standard required the latest available Unicode version instead. ECMAScript first updated the Unicode version number and later removed the fixed version reference altogether.
ECMAScript Annex B, which specifies things like escape and unescape, used to be purely informative and only there “for compatibility with some older ECMAScript programs”. The JavaScript Standard made it normative and required for web browsers. Nowadays, the ECMAScript spec does the same.
The JavaScript Standard documented the existence of HTML-like comment syntax (). As of ECMAScript 2015, Annex B fully defines this syntax.
The __defineGetter__, __defineSetter__, __lookupGetter__, and __lookupSetter__ methods on Object.prototype are defined in ECMAScript Annex B, as is __proto__.

So long, JavaScript Standard, and thanks for all the fish!

Posted in WHATWG | Comments Off on Sunsetting the JavaScript Standard

Defining the WindowProxy, Window, and Location objects

Thursday, May 12th, 2016

The HTML Standard defines how navigation works inside a browser tab, how JavaScript executes, what the overarching web security model is, and how all these intertwine and work together. Over the last decade, we’ve made immense progress in specifying previously-unspecified behavior, reverse-engineering and precisely documenting the de-facto requirements for a web-compatible browser. Nevertheless, there are still some corners of the web that are underspecified—sometimes because we haven’t yet discovered the incompatibility, and sometimes because specifying the behavior in a way that is acceptable to all implementers is really, really hard. What follows is an account of the latter.

Until recently, the HTML Standard lacked a precise definition of the WindowProxy, Window, and Location objects. As you might imagine, these are fairly important objects, so having them be underdefined was not great for the web. (Note that the global object used for documents is the Window object, though due the way browsers evolved it is never directly exposed. Instead, JavaScript code accesses the WindowProxy object, which serves as a proxy and security boundary for the Window object.)

Each navigable frame (top-level tab, <iframe> element, et cetera) is called a browsing context in the HTML Standard. A browsing context has an associated WindowProxy and Window object. As you navigate a browsing context, the associated Window object changes. But the whole time, the WindowProxy object stays the same. Ergo, one WindowProxy object is a proxy for many Window objects.

To make matters more interesting, scripts in these different browsing contexts can access each other, through frame.contentWindow, self.opener, window.open(), et cetera. The same-origin policy generally forbids code from one origin from accessing code from a different origin, which prevents evil.com from prying into bank.com. The two legacy exceptions to this rule are the WindowProxy and Location objects, which have some properties that can be accessed across origins.

document.domain makes this even trickier, as it effectively allows you to observe a WindowProxy or Location object as cross-origin initially, and same-origin later, or vice versa. Since the object remains the same during that time, the same-origin versus cross-origin logic needs to be part of the same object and cannot be spread across different classes.

As JavaScript has many ways to inspect objects, WindowProxy and Location objects were forced to be exotic objects and defined in terms of JavaScript’s “meta-object protocol”. This means they have custom internal methods (such as [[Get]] or [[DefineOwnProperty]]) that define how they respond to the low-level operations that govern JavaScript execution.

Defining this all in detail has been a multi-year effort spearheaded by Bobby Holley, Boris Zbarsky, Ian Hickson, Adam Barth, Domenic Denicola, and Anne van Kesteren, and completed in the “define security around Window, WindowProxy, and Location objects properly” pull request. The basic setup we ended up with is that WindowProxy and Location objects have specific cross-origin branches in their internal method implementation. These take care to only expose specific properties, and even for those properties, generating specific accessor functions per origin. This ensures that cross-origin access is not inadvertently allowed through something like Object.getOwnPropertyDescriptor(otherWindowProxy, "window").get. After filtering, a WindowProxy object will forward to its Window object as appropriate, whereas a Location object simply gives access to its own properties.

Having these objects defined in detail will make it easier for implementations to refactor, and for new novel implementations like Servo to achieve web-compatibility. It will reduce debugging time for web developers after implementations have converged on the edge cases. And it drastically simplifies extending these objects, as well as placing new restrictions upon them, within this well-defined subsystem. Well-understood, stable foundations are the key to future extensions.

(Many thanks to Bobby Holley for his contributions to this post.)

Posted in WHATWG | 3 Comments »

Adding JavaScript modules to the web platform

Wednesday, April 13th, 2016

One thing we’ve been meaning to do more of is tell our blog readers more about new features we’ve been working on across WHATWG standards. We have quite a backlog of exciting things that have happened, and I’ve been nominated to start off by telling you the story of <script type="module">.

JavaScript modules have a long history. They were originally slated to be finalized in early 2015 (as part of the “ES2015” revision of the JavaScript specification), but as the deadline drew closer, it became clear that although the syntax was ready, the semantics of how modules load each other were still up in the air. This is a hard problem anyway, as it involves extensive integration between the JavaScript engine and its “host environment”—which could be either a web browser, or something else, like Node.js.

The compromise that was reached was to have the JavaScript specification specify the syntax of modules, but without any way to actually run them. The host environment, via a hook called HostResolveImportedModule, would be responsible for resolving module specifiers (the "x" in import x from "x") into module instances, by executing the modules and fetching their dependencies. And so a year went by with JavaScript modules not being truly implementable in web browsers, as while their syntax was specified, their semantics were not yet.

In the epic whatwg/html#433 pull request, we worked on specifying these missing semantics. This involved a lot of deep changes to the script execution pipeline, to better integrate with the modern JavaScript spec. The WHATWG community had to discuss subtle issues like how cross-origin module scripts were fetched, or how/whether the async, defer, and charset attributes applied. The end result can be seen in a number of places in the HTML Standard, most notably in the definition of the script element and the scripting processing model sections. At the request of the Edge team, we also added support for worker modules, which you can see in the section on creating workers. (This soon made it over to the service workers spec as well!) To wrap things up, we included some examples: a couple for <script type="module">, and one for module workers.

Of course, specifying a feature is not the end; it also needs to be implemented! Right now there is active implementation work happening in all four major rendering engines, which (for the open source engines) you can follow in these bugs:

Blink: issue 594639
Gecko: bug 1240072
WebKit: bug 148897

And there's more work to do on the spec side, too! There's ongoing discussion of how to add more advanced dynamic module-loading APIs, from something simple like a promise-returning self.importModule, all the way up to the experimental ideas being prototyped in the whatwg/loader repository.

We hope you find the addition of JavaScript modules to the HTML Standard as exciting as we do. And we'll be back to tell you more about other recent important changes to the world of WHATWG standards soon!

Posted in Elements, Processing Model, WHATWG | 5 Comments »

Standards development happening on GitHub

Monday, February 1st, 2016

(This post is a rough copy of an email I sent to the mailing list.)

I wanted to remind the community that currently all WHATWG standards are being developed on GitHub. This enables everyone to directly change standards through pull requests and start topic-based discussion through issues.

GitHub is especially useful now that the WHATWG covers many more topics than “just” HTML, and using it has already enabled many folks to contribute who likely would not have otherwise. To facilitate participation by everyone, some of us have started identifying relative-easy-to-do issues across our GitHub repositories with the label “good first issue”. (See also good first bugs on Bugzilla. New issues go to GitHub, but some old ones are still on Bugzilla.) And we will also continue to help out with any questions on #whatwg IRC.

You should be able to find the relevant GitHub repository easily from the top of each standard the WHATWG publishes. Once you have a GitHub account, you can follow the development of a single standard using the “Watch” feature.

There are no plans to decommission the mailing list — but as you might have noticed, new technical discussion there has become increasingly rare. The mailing list is still a good place to discuss new standards, overarching design decisions, and more generally as a place to announce (new) things.

When there’s a concrete proposal or issue at hand, GitHub is often a better forum. IRC also continues to be used for a lot of day-to-day communications, support, and quick questions.

Posted in WHATWG | Comments Off on Standards development happening on GitHub