Archive for March, 2007
html5lib 0.9 is now available for your parsing pleasure.
html5lib is an implementation of the WHATWG HTML parsing algorithm in Python, released under an MIT license. It enables malformed HTML to be parsed into standard minidom and ElementTree structures, in a way that is highly compatible with the behavior of major desktop web browsers. As well as parsing to trees, html5lib contains a DOM to SAX converter; it is hoped that by supporting these standard APIs, toolchains based on draconian XML parsers can be repurposed to process HTML content with minimal effort.
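To make the DOM to SAX idea concrete, here is a minimal sketch using only the Python standard library. It is not html5lib's own converter: the `dom_to_sax` function and the `EventLogger` handler are hypothetical names made up for this illustration, and the minidom tree is built from well-formed input as a stand-in for a tree that an HTML parser would produce.

```python
from xml.dom import minidom, Node
from xml.sax.handler import ContentHandler
from xml.sax.xmlreader import AttributesImpl


def dom_to_sax(node, handler):
    """Walk a minidom node and replay it as SAX events on handler."""
    if node.nodeType == Node.DOCUMENT_NODE:
        handler.startDocument()
        for child in node.childNodes:
            dom_to_sax(child, handler)
        handler.endDocument()
    elif node.nodeType == Node.ELEMENT_NODE:
        attrs = {name: value for name, value in node.attributes.items()}
        handler.startElement(node.tagName, AttributesImpl(attrs))
        for child in node.childNodes:
            dom_to_sax(child, handler)
        handler.endElement(node.tagName)
    elif node.nodeType == Node.TEXT_NODE:
        handler.characters(node.data)
    # Comments, doctypes and other node types are skipped in this sketch.


class EventLogger(ContentHandler):
    """Hypothetical handler that records the events it receives."""

    def __init__(self):
        super().__init__()
        self.events = []

    def startElement(self, name, attrs):
        self.events.append(("start", name, dict(attrs.items())))

    def endElement(self, name):
        self.events.append(("end", name))

    def characters(self, content):
        self.events.append(("text", content))


document = minidom.parseString('<p class="intro">Hello <b>world</b></p>')
logger = EventLogger()
dom_to_sax(document, logger)
for event in logger.events:
    print(event)
```

Any existing `ContentHandler` written for an XML toolchain could take the place of `EventLogger` here, which is the kind of reuse the paragraph above has in mind.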
In addition to the HTML parsing capability, html5lib 0.9 contains an experimental liberal XML parser based on the WHATWG algorithm without the HTML-specific error handling. This is suitable for parsing XML from sources that cannot guarantee well-formedness, such as web feeds.
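For contrast, the following sketch shows the draconian behaviour that a liberal parser is meant to avoid. It uses the standard library's ElementTree parser, not html5lib, and the feed fragment is made up for the example.

```python
import xml.etree.ElementTree as ET

# Made-up feed fragment: the bare "&" makes it ill-formed XML.
feed = "<feed><title>News & views</title></feed>"

try:
    ET.fromstring(feed)
    strict_parse_ok = True
except ET.ParseError as error:
    # A strict XML parser stops at the first well-formedness error.
    strict_parse_ok = False
    print("strict parser gave up:", error)
```

A single unescaped character is enough to lose the whole document, which is why feeds from sources that cannot guarantee well-formedness are a natural use case for a more forgiving parser.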
The 0.9 release is expected to be the last major release before 1.0, and no new features will be added before 1.0 is released. Instead we will work on any remaining correctness issues, other bugs, and on improving the messages reported when parse errors are encountered. Bug reports are very much appreciated. Users or people looking to get involved are encouraged to join the mailing list or visit the #WHATWG channel on freenode.net.
this post refers to the "Write" interface from WordPress used to post comments to WHATWG blogs.
there is well-intentioned but mis-implemented markup in the edit form; namely, improper implementation of the FIELDSET element.
for a proper FIELDSET, one needs to do 4 things:
- open the FIELDSET (which this form does)
- define a LEGEND for the FIELDSET (which this form does NOT); the natural candidates for LEGEND are the level 3 headers (H3) classed dbx-handle, so instead of repeatedly hearing "click to open this box", i would also get the LEGEND of the pseudo-box (which i would call a sub-form) as an indicator of what i am about to open or close. i would also make the alt text device independent: instead of "click here to open this box", i would propose "show sub-form" and "hide sub-form"
- bind individual FORM controls to their textual labels by use of the LABEL element and the for/id mechanism that ties the form control (which takes the "id") to a LABEL (which takes the "for") or multiple labels; the LABEL should contain the actual, textual label, and NOT the FORM control, as in this form; this form has the attributes set correctly to bind the LABEL to the FORM control, but since the LABEL element is opened PRIOR to the INPUT element, no labeling is available to the user. in my case (i use a screen reader) the sub-forms that appear when one opens a FIELDSET to reveal a FORM appear unlabeled to my screen reader, because of invalid markup.
- close the FIELDSET (which this form does)
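The checklist above can be sketched as a small audit script. This is only an illustration using Python's standard `html.parser`: the `FormAudit` class, the `audit` helper, and both markup samples are made up for this example and are not taken from the WordPress form. It checks three things: that each FIELDSET has a LEGEND, that each control's "id" matches some LABEL's "for", and that no control sits inside a LABEL.

```python
from html.parser import HTMLParser


class FormAudit(HTMLParser):
    """Hypothetical checker for the FIELDSET/LEGEND/LABEL points above."""

    CONTROLS = {"input", "select", "textarea"}

    def __init__(self):
        super().__init__()
        self.fieldset_stack = []        # one flag per open FIELDSET: legend seen?
        self.fieldsets_missing_legend = 0
        self.label_depth = 0            # > 0 while inside a LABEL element
        self.label_targets = set()      # values of LABEL "for" attributes
        self.control_ids = []           # "id" of each control (None if absent)
        self.controls_inside_label = 0

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "fieldset":
            self.fieldset_stack.append(False)
        elif tag == "legend" and self.fieldset_stack:
            self.fieldset_stack[-1] = True
        elif tag == "label":
            self.label_depth += 1
            if attrs.get("for"):
                self.label_targets.add(attrs["for"])
        elif tag in self.CONTROLS and attrs.get("type") != "hidden":
            self.control_ids.append(attrs.get("id"))
            if self.label_depth:
                self.controls_inside_label += 1

    def handle_endtag(self, tag):
        if tag == "label" and self.label_depth:
            self.label_depth -= 1
        elif tag == "fieldset" and self.fieldset_stack:
            if not self.fieldset_stack.pop():
                self.fieldsets_missing_legend += 1

    def unlabeled_controls(self):
        """Ids of controls that no LABEL points at via "for"."""
        return [cid for cid in self.control_ids if cid not in self.label_targets]


def audit(markup):
    parser = FormAudit()
    parser.feed(markup)
    parser.close()
    return parser


# Made-up markup following the four steps above.
good = """
<fieldset>
  <legend>Post status</legend>
  <label for="status">Status</label>
  <input type="text" id="status">
</fieldset>
"""

# Made-up markup with the problems described: no LEGEND, control inside LABEL.
bad = """
<fieldset>
  <h3 class="dbx-handle">Post status</h3>
  <label for="status"><input type="text" id="status"> Status</label>
</fieldset>
"""

good_report = audit(good)
bad_report = audit(bad)
print(good_report.fieldsets_missing_legend, good_report.controls_inside_label)
print(bad_report.fieldsets_missing_legend, bad_report.controls_inside_label)
```

A check like this only inspects markup structure; it says nothing about how a particular screen reader announces the form, so it complements rather than replaces testing with assistive technology.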
The W3C today publicly announced that they are restarting an HTML specification effort. This is great news and a clear validation of the WHATWG effort, which has been leading the maintenance and development of HTML since 2004.
Surprisingly, the W3C never actually contacted the WHATWG during the chartering process. However, the WHATWG model has clearly had some influence on the creation of this group, and the charter says that the W3C will try to "actively pursue convergence with WHATWG". Hopefully they will get in contact soon.
In the meantime, apparently anyone can actually join the W3C effort. The instructions to join the group are as follows:
- Fill in the Public Access Request Form; in the "Reason" field, put: "To apply for participation in the HTML Working Group as an Invited Expert."
- Within about five minutes you'll receive a confirmation code by e-mail. Follow the instructions in that e-mail.
- You should get a reply back from that within two days, giving you a username and password. Fill in the W3C Invited Expert Application form. Under "Financial Support", if you're not going to attend any meetings or if you're going to attend meetings on your own dime, just put "Self-supported". Under "Possible W3C Membership", if you're employed but your employer doesn't know you're doing this, or doesn't care, just pick "My employer does not intend to join".
- E-mail Dan Connolly and Karl Dubost (email@example.com, firstname.lastname@example.org) asking for approval. (Just say "Hi, I'd like to join the HTML working group. Thanks.")
- You should get a reply back within about ten days, at which point you can fill in the Joining the HTML Working Group form.
I would encourage everyone interested in working with the HTML working group to go through these steps as soon as possible, so that you will be a member of the group before the work starts.
Joining the group doesn't commit you to anything (e.g. you won't have to attend meetings or anything if you don't want to). The group's charter clearly says that all decisions will be made in ways that don't require attending meetings.
This post has been updated a few times to take into account new information about how to join the group.