The WHATWG Blog — 2008

Archive for June, 2008

Experience the HTML5 parsing algorithm in the Live DOM Viewer

Monday, June 30th, 2008

If you’ve investigated how browsers parse HTML, you’ve probably used Hixie’s Live DOM Viewer to see what happens. Wouldn’t it be cool, though, if you could experiment with the HTML5 parsing algorithm in the same UI? Well, now you can.

I was looking for a way to experiment with document.write() in the code base of the Validator.nu HTML Parser, and for a way to let people see the parse tree output of the HTML5 parsing algorithm more easily. Instead of writing a test harness fully in Java, I thought it would be better to use the Live DOM Viewer and a browser engine as the test harness. The good news is that Google Web Toolkit makes it possible to put these pieces together, and the trunk of the Validator.nu HTML Parser now comes with a document.write()-aware tokenizer driver and a tree builder subclass for GWT.

The bad news is that the Java-to-JavaScript compiler of GWT has a bug that blocks me from putting the result online as JavaScript. The Hosted Mode of GWT works, though.

Here’s how you can run the Validator.nu HTML Parser in the Live DOM Viewer locally in the Hosted Mode of GWT (on Mac or Linux):

Check out the source: svn co http://svn.versiondude.net/whattf/htmlparser/trunk/ htmlparser
Download and untar GWT 1.5 RC1
On Linux, install libstdc++5 and a JDK (Ubuntu's OpenJDK-based package worked for me).
Edit the paths in HtmlParser-shell (Mac) or HtmlParser-linux (Linux) to point to the location of GWT.
Run HtmlParser-shell (Mac) or HtmlParser-linux (Linux)

Known problems:

The Linux version of GWT runs an outdated version of Gecko, and the rendered view doesn't work. The DOM view does.
The Mac version of GWT runs a Web Inspector-enabled version of WebKit, but SVG does not draw.
document.write() semantics are right only for inline scripts.
Copying and pasting using keyboard shortcuts doesn’t work. (Use the context menu.)
On Linux, GWT prints a lot of harmless warnings about not finding annotations. (I don’t know why that happens. The annotations should be among translatables.)
Gecko (used by GWT on Linux) doesn't allow the creation of xmlns attributes in no namespace, so things stop working if you try to put an attribute called xmlns on HTML elements.
The DOM view on Linux doesn't report names with colons in them per the HTML5 spec.

(Aside: This code could have applicability beyond testing the parser. If the compiler bug were fixed or worked around, a script could document.write() a math element and an svg element to sniff whether they are parsed according to HTML5. If they aren't, the script could move aside load event handlers, document.write() <plaintext style='display:none'>, wait until DOMContentLoaded, load the already created html, head and body elements onto the tree builder stack and head pointer of the HTML5 parser, reparse the content of the plaintext element as HTML5, and call the load event handlers. See Philip Taylor’s proof of concept with S-expressions.)
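
The sniffing step in the aside can be sketched in JavaScript. Under the HTML5 parsing algorithm, an svg start tag in HTML content produces an element in the SVG namespace and a math start tag produces one in the MathML namespace, so checking namespaceURI is one way to tell. The helper name parsedPerHtml5 is hypothetical, innerHTML stands in for the document.write() call the aside describes, and the sketch covers only detection, not the reparsing.

```javascript
// Namespace URIs the HTML5 tree builder assigns to foreign elements.
const SVG_NS = "http://www.w3.org/2000/svg";
const MATHML_NS = "http://www.w3.org/1998/Math/MathML";

// Hypothetical helper: given the elements produced by parsing "<svg>" and
// "<math>", report whether both landed in the namespaces HTML5 requires.
function parsedPerHtml5(svgElement, mathElement) {
  return Boolean(svgElement && mathElement) &&
    svgElement.namespaceURI === SVG_NS &&
    mathElement.namespaceURI === MATHML_NS;
}

// Browser-only usage: let the host parser build the elements, then inspect them.
// Assumption: innerHTML is used here instead of document.write() for brevity.
if (typeof document !== "undefined") {
  const probe = document.createElement("div");
  probe.innerHTML = "<svg></svg><math></math>";
  console.log(parsedPerHtml5(probe.firstChild, probe.lastChild));
}
```

A page would fall back to the script-driven reparse only when this check returns false.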

Posted in Syntax

HTML5 Presentation at @media 2008

Wednesday, June 18th, 2008

Lachlan Hunt and I recently gave a presentation entitled Getting Your Hands Dirty with HTML5 at the @media 2008 conference in London. The audience was mainly front-end developers, the kind of people who use HTML to make a living, so it was a great chance to get the message out about some of the new features that have been under development.

The talk covered the Design Principles under which HTML5 is being developed, how some of the features of HTML5 can be used to enhance common web sites, and how people can get involved with the development of HTML5.

The presentation seemed to go reasonably well, especially given that we had not met until the morning of the talk, although we had fewer demos than I would have liked, due both to technical problems during the talk and to a lack of time to prepare. So, for those who were at the talk (as well as those who were not), here is a somewhat random collection of demos of the HTML5 features we mentioned:

A video demo I wrote (requires a Theora-supporting, video-enabled browser such as these preview builds of Firefox or Opera; both of those pages have more, and probably better, demos).
Canvex: Philip Taylor's 3D game demo in JavaScript + <canvas>
John Resig's Cross Document Messaging Demo
A simple document outline viewer to get a feel for how the new section elements work (note: doesn't seem to work in Safari for some reason)
The Safari team's demo of client-side database storage (requires a recent WebKit-based browser)

If anyone who saw the presentation is reading this and would like to provide constructive criticism on the talk, I would really appreciate it; giving talks is fun so it would be nice to get better at it 🙂

Posted in WHATWG

Offline Web Applications

Tuesday, June 3rd, 2008

Since HTML5 is a large specification, Ian and I, encouraged by Dan Connolly from the W3C, wrote an introductory document on the offline Web application features in HTML5, titled Offline Web Applications, which the W3C published earlier today. In summary, it explains the SQL API, the offline application cache API, and some related APIs, such as the online and offline events.
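
As a rough illustration of the online and offline events mentioned above, the following sketch mirrors the connectivity state in a plain object. The helper name trackConnectivity is hypothetical, and the sketch assumes the events are delivered to the target passed in (window in a browser), with navigator.onLine supplying the starting value.

```javascript
// Hypothetical helper: mirror the browser's connectivity state by listening
// for the "online" and "offline" events on the given event target.
function trackConnectivity(target, initiallyOnline) {
  const state = { online: initiallyOnline };
  target.addEventListener("online", () => { state.online = true; });
  target.addEventListener("offline", () => { state.online = false; });
  return state;
}

// Browser-only usage: window receives the events and navigator.onLine
// provides the initial value (both are assumptions about the host page).
if (typeof window !== "undefined" && typeof navigator !== "undefined") {
  const connectivity = trackConnectivity(window, navigator.onLine);
  window.addEventListener("offline", () => {
    console.log("online?", connectivity.online);
  });
}
```

An application could consult the returned object before deciding whether to sync with the server or to keep working against its local store.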

Posted in W3C