The WHATWG Blog

Please leave your sense of logic at the door, thanks!

Author Archive

Microdata (part 1)

Friday, July 31st, 2009

One of the features we've added in HTML5 is a way to include machine-readable annotations that people can scrape in a simple and well-defined way. This means that if a site wants to make the information available, you don't have to rely on brittle screen-scraping to get the information out.

This is easiest to understand with an example.

Suppose that you had an issue tracking database like Bugzilla, and that you wanted other tools to be able to pull information about issues in that database.

Today, Bugzilla exposes an XML file for each bug, but this means maintaining two parallel formats for the bug page. Instead of providing such a separate interface, you can use microdata, the new attributes in HTML5. That way, even as your issue tracker changes its interface from version to version, the underlying data can still be reliably readable from the same HTML page.

Imagine the markup today looks like this:

<body>
 <h1>Issue 12941: Too many pies in the pie factory</h1>
 <dl>
  <dt>Reporter</dt>
  <dd>ian@hixie.ch</dd>
  <dt>Priority</dt>
  <dd>AAA</dd>
  ...

To annotate this with microdata, we just mint some names, and then label each field with those names. The names are in "reverse-DNS" form; if the bug system were at "example.net", then the names would be "net.example.bug", "net.example.number", and so on. Thus we get:

<body item="net.example.bug">
 <h1>Issue <span itemprop="net.example.number">12941</span>:
  <span itemprop="net.example.title">Too many pies in the pie factory</span></h1>
 <dl>
  <dt>Reporter</dt>
  <dd itemprop="net.example.reporter">ian@hixie.ch</dd>
  <dt>Priority</dt>
  <dd itemprop="net.example.priority">AAA</dd>
  ...
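The reverse-DNS names used in this markup can be minted mechanically from the site's host name. A minimal sketch in Python, where `mint_name` is a hypothetical helper invented for illustration and not part of HTML5 or any library:

```python
def mint_name(host, field):
    """Build a reverse-DNS name such as 'net.example.bug' (hypothetical helper)."""
    # Split the host into labels, reverse them, then append the field name.
    labels = host.lower().strip(".").split(".")
    return ".".join(reversed(labels)) + "." + field


# Names for the fields of the example bug page.
names = [mint_name("example.net", f) for f in ("bug", "number", "title")]
```

Because the prefix comes from a domain you control, names minted this way should not collide with names minted by other sites.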

The item="net.example.bug" attribute says "here is a bug". The various itemprop attributes provide name/value pairs for the bug. The snippet above would result in the following tree of data:

net.example.bug:
  net.example.number = "12941"
  net.example.title = "Too many pies in the pie factory"
  net.example.reporter = "ian@hixie.ch"
  net.example.priority = "AAA"

Now it doesn't matter if the page changes dramatically; the same data can still be made unambiguously available:

<body>
 <h1>Example.Net Bugs Database</h1>
 <section item="net.example.bug">
  <h1 itemprop="net.example.title">Too many pies in the pie factory</h1>
  <p>#<span itemprop="net.example.number">12941</span>; reported
  by <span itemprop="net.example.reporter">ian@hixie.ch</span>.</p>
  <p>PRIORITY: <strong itemprop="net.example.priority">AAA</strong>.</p>
  ...
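To illustrate why the layout no longer matters to a consumer, here is a rough sketch of a scraper in Python. It relies only on the standard library's `html.parser` and on the `item` and `itemprop` attributes as they are written in this post. The names `ItemExtractor` and `extract_item` are made up for this example, the sample pages have been completed with closing tags so they parse on their own, and the code is a simplification: it assumes one item per page and does not check that properties are nested inside it, so it is not the extraction algorithm from the specification.

```python
from html.parser import HTMLParser

ORIGINAL_PAGE = """
<body item="net.example.bug">
 <h1>Issue <span itemprop="net.example.number">12941</span>:
  <span itemprop="net.example.title">Too many pies in the pie factory</span></h1>
 <dl>
  <dt>Reporter</dt>
  <dd itemprop="net.example.reporter">ian@hixie.ch</dd>
  <dt>Priority</dt>
  <dd itemprop="net.example.priority">AAA</dd>
 </dl>
</body>
"""

REDESIGNED_PAGE = """
<body>
 <h1>Example.Net Bugs Database</h1>
 <section item="net.example.bug">
  <h1 itemprop="net.example.title">Too many pies in the pie factory</h1>
  <p>#<span itemprop="net.example.number">12941</span>; reported
  by <span itemprop="net.example.reporter">ian@hixie.ch</span>.</p>
  <p>PRIORITY: <strong itemprop="net.example.priority">AAA</strong>.</p>
 </section>
</body>
"""


class ItemExtractor(HTMLParser):
    """Collects itemprop name/value pairs (simplified; one item per page)."""

    def __init__(self):
        super().__init__()
        self.item_type = None  # value of the first `item` attribute seen
        self.props = {}        # property name -> text content
        self._stack = []       # open elements: (tag, prop name or None, text chunks)

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if "item" in attrs and self.item_type is None:
            self.item_type = attrs["item"]
        self._stack.append((tag, attrs.get("itemprop"), []))

    def handle_data(self, data):
        # Text belongs to every open element that carries an itemprop.
        for _tag, name, chunks in self._stack:
            if name is not None:
                chunks.append(data)

    def handle_endtag(self, tag):
        if not any(open_tag == tag for open_tag, _, _ in self._stack):
            return  # stray closing tag: ignore it
        while self._stack:
            open_tag, name, chunks = self._stack.pop()
            if name is not None:
                self.props[name] = "".join(chunks).strip()
            if open_tag == tag:
                break


def extract_item(html):
    """Return (item type, {property name: value}) for a page."""
    parser = ItemExtractor()
    parser.feed(html)
    parser.close()
    return parser.item_type, parser.props
```

Calling `extract_item` on either page yields the same item type and the same four properties, even though the headings, lists, and paragraphs around them are completely different.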

This concludes this brief introduction to microdata! Some future blog posts will introduce a few aspects of microdata that I didn't discuss here.

Posted in WHATWG | 19 Comments »

Help us review HTML5!

Thursday, April 2nd, 2009

Are you interested in reviewing HTML5 for errors?

  1. Jump in! All feedback is welcome, from anyone.
  2. Open the specification: the one-page version, the multipage version, or the PDF copy (A4, Letter).
  3. Start reading! See below for ideas of what to look for.

If you find a problem, either send an e-mail to the WHATWG list (whatwg@whatwg.org, subscription required), file a bug (registration required), send an e-mail to the public-html-comments@w3.org list (no subscription required), or send an e-mail directly to ian@hixie.ch.

If everything goes according to plan, all issues will get a response from the editor before October. You can track how many issues remain to be responded to on our graph.

What to look for

The plan is to see whether we can shake down the spec and get rid of all the minor problems that have so far been overlooked. That covers typos, confusing passages, and cross-reference errors, as well as mistakes in examples, errors in the definitions, and major errors like security bugs or contradictions.

Anyone who helps find problems in the spec — however minor — will get their name in the acknowledgements section.

You don't really need any experience to find the simplest class of problems: things that are confusing! If you don't understand something, then that's a problem. Not all the introduction sections and examples are written yet, but if a section has an introduction that isn't clear, then you've found an issue: let us know!

Other good things to search for are typos, spelling errors, grammar errors, and the like. Don't hesitate to send e-mails even for minor typos; all feedback, even on such small issues, is very welcome.

If you have a specific need as a Web designer, then try to see if the need is met. If it isn't, and you haven't discussed this need before, then send an e-mail to the list. (So for example, if you want HTML to support date picker widgets, you'd look in the spec to see if it was covered. As it turns out, that one is!)

If you have some specific expertise that lets you review a particular part of the spec for correctness, then that's another thing to look for. For example, if you know about graphics, then reviewing the 2D Canvas API section would be a good use of your time. If you know about scripting, then looking at the "Web browsers" section would be a good use of your time.

Staying in touch

You are encouraged to join our IRC channel #whatwg on Freenode to stay in touch with what other people are doing, but this is by no means required. You are also encouraged to post in the Discussion section on the wiki page for this review project, or in the blog comments below, to let people know what you are reviewing. You can get news updates by following @WHATWG on Twitter.

Posted in WHATWG | 17 Comments »

Google Tech Talk: HTML5 demos

Friday, September 26th, 2008

I gave a talk at Google on Monday demonstrating the various features of HTML5 that are implemented in browsers today. The video is now on YouTube, so now you too can watch and laugh at my lame presentation skills!

The segments of this talk are as follows. Some of the demos are available online for you to play with and are linked to from the following list:

  1. Introduction
  2. <video> (00:35)
  3. postMessage() (05:40)
  4. localStorage (15:20)
  5. sessionStorage (21:00)
  6. Drag and Drop API (29:05)
  7. onhashchange (37:30)
  8. Form Controls (40:50)
  9. <canvas> (56:55)
  10. Validation (1:07:20)
  11. Questions and Answers (1:09:35)

If you're very interested in watching my typos, the high quality version of the video on the YouTube site is clear enough to see the text being typed. More details about the demos can be found on the corresponding demo page.

Posted in Browser API, Browsers, Conformance Checking, DOM, Elements, Events, Forms, Multimedia, Syntax, WHATWG | 7 Comments »

Exploring new vocabularies for HTML

Monday, March 24th, 2008

The four hottest topics in the WHATWG Issues List are the video codec, tables, Web Forms 2, and namespaces.

The video codec issue is being actively worked on, but we're not close to a good solution yet (it's mostly an economic and political issue, not a technical one, which is why we don't have any transparency on this issue, sadly). I recently responded to most of the table-related feedback. Web Forms 2 work is waiting for a decision from the W3C's forms task force on whether WF2 will be integrated as-is into HTML5 or whether it will be changed before being merged. The namespace issue is the one I'm working on now.

The first thing I have to do is work out what the problem is! There has been a lot of discussion, but not much of it is focussed on a problem; most of it is focussed on possible solutions. One can't evaluate a solution without knowing what it's trying to solve, though. To this end, I have created a wiki page where I will note down any problem descriptions I can find as I read all 367 of the e-mails in this folder.

Feel free to help! If you want to coordinate, I'm Hixie in #whatwg on Freenode IRC.

Posted in WHATWG | Comments Off on Exploring new vocabularies for HTML

The WHATWG at the W3C technical plenary

Wednesday, November 7th, 2007

The W3C is having its technical plenary day today, and a number of WHATWG contributors are there. It's hard to participate remotely in this event, but you can watch and listen: the W3C is publishing an audio stream (in Ogg; a Java applet alternative is available too), and has commissioned realtime captioning for the event. There's also a W3C IRC channel on the topic on irc.w3.org, port 6665, channel #tp (no password anymore). You can also chat with WHATWG contributors who are present at the event on our own IRC channel.

The agenda for the day is available from the W3C site. Don't forget to adjust the times from the Boston timezone to your timezone if you want to listen to a particular session.

Posted in W3C, WHATWG | Comments Off on The WHATWG at the W3C technical plenary