Please leave your sense of logic at the door, thanks!

Author Archive

HTML Parser 1.4 Available

Friday, June 8th, 2012

A new release of the HTML Parser is available. Version 1.4 contains minor adjustments to spec compliance and fixes for notable Java-specific problems (crashes and infinite loops). Also, the parser is again available from the Maven Central Repository (groupId: nu.validator.htmlparser, artifactId: htmlparser, version: 1.4).

Upgrading to the newest version is recommended for all users of all previous versions.


Posted in WHATWG | 1 Comment »

HTML Parser Version 1.3.1 Released

Wednesday, March 9th, 2011

There is now a new release of the HTML Parser. It contains files that were accidentally left out of the previous release package. It also contains one tree builder correctness fix and one error reporting improvement.

Posted in Syntax | Comments Off on HTML Parser Version 1.3.1 Released

Saturday, January 22nd, 2011

As support for WebM is ramping up, Web authors can start using it. However, since not everyone has a WebM-enabled browser yet, using WebM on your site poses the problem of having to explain to your visitors how they can view WebM. It is inefficient for everyone to have to do this from scratch on their own sites. Also, chances are that per-site help text will be incomplete and soon out of date.

To address this problem, with hosting and domain name help from Anne van Kesteren, I have made a page as a place to pool the effort. When you publish WebM content, instead of explaining which browsers support WebM, you can simply link to the page and it will detect whether the user’s browser supports WebM. If the browser doesn’t support WebM, the page will suggest upgrading the browser to a new version that supports WebM, installing a WebM decoder if the browser supports 3rd-party decoders and one is available, switching to another browser or using another operating system (as applicable and in that order).
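The post doesn’t say how the page performs its detection. As a rough sketch only, one common way to ask a browser whether it can play WebM is the standard `canPlayType` method of the `video` element. The helper name `supportsWebM`, the `CanPlayTypeProbe` interface and the exact codecs string below are assumptions made for illustration, not taken from the page itself.

```typescript
// The only part of the media element API this sketch relies on.
interface CanPlayTypeProbe {
  // Browsers answer "", "maybe" or "probably".
  canPlayType(type: string): string;
}

// Assumed MIME type: WebM with VP8 video and Vorbis audio.
const WEBM_TYPE = 'video/webm; codecs="vp8, vorbis"';

// Hypothetical helper: true when the probe reports any level of WebM support.
function supportsWebM(probe: CanPlayTypeProbe | null): boolean {
  if (probe === null || typeof probe.canPlayType !== "function") {
    // No usable <video> element at all, so no WebM either.
    return false;
  }
  // An empty string means the browser is sure it cannot play the type.
  return probe.canPlayType(WEBM_TYPE) !== "";
}

// In a browser, the probe would be a real video element:
// const canPlay = supportsWebM(document.createElement("video"));
```

The helper takes the element as a parameter rather than creating it, so it can be exercised outside a browser. Note that `canPlayType` only reports whether playback is possible; a check like this says nothing about which of the suggestions (upgrade, decoder, other browser) applies.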

The dull visual appearance of the page is a known problem. Visual design isn’t my strong point. I have also avoided using logos without permission. If you’d like to contribute nicer CSS or a nicer-looking (but still short and on-topic) test clip, please find hsivonen on the #whatwg IRC channel on Freenode. Also, if you can contribute accurate advice for platforms that aren’t already covered (e.g. FreeBSD, AIX or OS/2), please drop a line on IRC or in the comments here. (You can view source on the page to see what is already covered.)

Posted in WHATWG | 16 Comments »

Version 1.3 of the HTML Parser Released

Thursday, January 13th, 2011

After over a year without proper releases, there is now a new release of the HTML Parser. There have been numerous changes to the HTML5 spec and, consequently, to the parser since the previous release. All users of the parser should update to the latest release in order to run a version that corresponds to the current spec.

Posted in Syntax | Comments Off on Version 1.3 of the HTML Parser Released

Spelling HTML5

Thursday, September 10th, 2009

What’s the right way to spell “HTML5”? The short answer is: “HTML5” (without a space).

People in the WHATWG community have commonly referred to HTML5 as “HTML5” for quite a while. However, when the W3C HTML WG voted on adopting “Web Applications 1.0”, the question about the title said “HTML 5”. Thus, the W3C HTML WG voted to adopt “HTML 5” as the title, but the vote wasn’t for or against the space; it was about “HTML” and “5” as opposed to e.g. “Web Applications 1.0”. Anyway, as a result, the spec was retitled literally “HTML 5”.

This led to inconsistency. Sometimes people kept writing “HTML5” and sometimes “HTML 5” (even on …). This kind of inconsistency is bad for branding. It was the first issue the Super Friends pointed out.

Now both the WHATWG Draft Standard and W3C Editor’s Draft spell it “HTML5”.

Posted in WHATWG | 29 Comments »