The WHATWG Blog — 2008

Archive for April, 2008

Reverse Ordered Lists

Wednesday, April 23rd, 2008

One of the newly introduced features in HTML 5 is the ability to mark up reverse ordered lists. These are the same as ordered lists, but instead of counting up from 1, they instead count down towards 1. This can be used, for example, to count down the top 10 movies, music, or LOLCats, or anything else you want to present as a countdown list.

In previous versions of HTML, the only way to achieve this was to place a value attribute on each li element, with successively decreasing values.

<h3>Top 5 TV Series</h3>
<ol>
  <li value="5">Friends
  <li value="4">24
  <li value="3">The Simpsons
  <li value="2">Stargate Atlantis
  <li value="1">Stargate SG-1
</ol>

The problem with that approach is that manually specifying each value can be time consuming to write and maintain, and the value attribute was not allowed in the HTML 4.01 or XHTML 1.0 Strict DOCTYPEs (although HTML 5 fixes that problem and allows the value attribute)

The new markup is very simple: just add a reversed attribute to the ol element, and optionally provide a start value. If there’s no start value provided, the browser will count the number of list items, and count down from that number to 1.

<h3>Greatest Movies Sagas of All Time</h3>
<ol reversed>
  <li>Police Academy (Series)
  <li>Harry Potter (Series)
  <li>Back to the Future (Trilogy)
  <li>Star Wars (Saga)
  <li>The Lord of the Rings (Trilogy)
</ol>

Since there are 5 list items in that list, the list will count down from 5 to 1.

The reversed attribute is a boolean attribute. In HTML, the value may be omitted, but in XHTML, it needs to be written as: reversed="reversed".

The start attribute can be used to specify the starting number for the countdown, or the value attribute can be used on an li element. Subsequent list items will, by default, be numbered with the value of 1 less than the previous item.

The following example starts counting down from 100, but omits a few items from the middle of the list and resumes from 3.

<h3>Top 100 Logical Fallacies Used By Creationists</h3>
<ol reversed="reversed" start="100">
  <li>False Dichotomy</li>
  <li>Appeal to Ridicule</li>
  <li>Begging the Question (Circular Logic)</li>
  <!-- Items omitted here -->
  <li value="3">Strawman</li>
  <li>Bare Assertion Fallacy</li>
  <li>Argumentum ad Ignorantiam</li>
</ol>

Posted in Elements, Syntax | 26 Comments »

New Image Report Feature in Validator.nu

Friday, April 18th, 2008

There have been lots and lots of e-mail on the public-html mailing list about making the alt attribute syntactically required in HTML5. At the core of this debate is on one hand using HTML5 validators to send a strong message about accessibility and on the other hand of avoiding a situation where a simplified and idealistic strong message leads to behavior that is counterproductive considering the goal of making the Web accessible. As a policy debate, it is similar to abstinence-only sex education debates.

A validator is a computer program and cannot tell if a textual alternative is appropriate for a given image in a given context. That's why accessibility checking needs to be done by a person. A person may use a software tool to make the checking easier, but trusting on fully automated software to determine whether a page is accessible is misguided.

Given this basic problem, a policy that insists on the alt attribute always being present doesn’t necessarily lead to accessibility. In fact, considering that syntactic correctness and accessibility are different evaluation axes both in terms of computability and in terms of how HTML authors (other than accessibility advocates) tend to view things (judging from observations about the behavior of HTML authors who use validators), a policy that insists on the alt attribute being always present will likely cause people to put the attribute in there but with inappropriate content. In particular, putting an empty alt on images whose presence is important for understanding the context of other content is bad, because in that case the presence of those images is concealed from a non-graphical user. Also, a textual alternative that just says “image” is not an improvement over what, for example, Safari with VoiceOver says in the absence of alt, but would be worse than a smarter client-side heuristic.

Furthermore, there is a very real case where a textual alternative simply isn’t available to the HTML generator: a user uploads photos to a content management system and refuses to supply textual alternatives at the same moment. HTML 4 didn’t account for this case. In fact, requiring alt to under all circumstances assumes that markup is written by a person who knows what the images are at the time of writing markup. It doesn’t make sense to pretend that the case where the markup generator doesn’t have textual alternatives available doesn’t exist. The HTML 5 syntax needs to account for all use cases.

Expecting markup generators to knowingly emit markup that is not valid is not a winning proposition. Quoting me from 2006:

Authoring tools are judged by taking a page authored using the tool and running it through the W3C Validator or, presumably in the future, through an HTML5 conformance checker. Authoring tool makers who are capable of making their tool produce syntactically conforming documents will want to do so and minimize the chance that the users of their software tarnish the reputation of the tool in the eyes of people who use an automated test as a litmus test of authoring tool bogosity. (People who test tools that way will outnumber the people who make a more profound analysis due to the "validate, validate, validate" propaganda.)

To summarize: As a matter of principle, subjective checking or checking that is not applicable for all pages does not belong in the validation function. Practice is more important than principle, though. Baking the alt requirement into the validation function would be bad when the user of the validation function wants a clean report on syntax but isn’t as concerned with accessibility. It is bad for accessibility when authors put the simplest value that silences the validator into the attribute in order to make the validation report look clean, since doing so gives user agents like Safari with VoiceOver less information to work with. That's why I think the requirement to have an alt attribute present doesn’t belong in the validation function also as a practical matter.

It turns out, though, that some people think of validation as a first step toward accessibility, even though syntactic correctness and accessibility really are different evaluation axes. They expect a validator to help them flag images that are lacking a textual alternative. Moreover, the alt issue seems to be taken as the single most important web accessibility issue with the rest of issues somewhere in the long tail. When there is a demand for validators to flag images without alt, validators probably should meet the demand.

To this end, I have developed a new feature for Validator.nu: Image Report. This new feature is not part of the validation function. It also doesn’t do exactly want people are asking of the syntax definition in the long e-mail thread. (It is not a new idea for a validator user interface to offer tools that help a human perform an assessment about the page outside the validation function. For example, the W3C Validator has offered a “Show Document Outline” feature, which is also on file as a request for enhancement for Validator.nu.)

The new feature tries to address the issue of finding missing textual alternatives but it also seeks to address the issue of faulty textual alternatives. Furthermore, it seeks to address these in a way that doesn’t induce people to write bad textual alternatives in order to make the report look cleaner.

When you turn the feature on, it always lists all the images. There is no textual alternative you can fake to make the list look shorter. Instead, there are four categories and you can only change the category in which an image appears.

This has the benefit of removing the badge hunting problem: people trying to silence the validator without actually raising the quality of their page. However, it also has the benefit that the user can review the textual alternatives for appropriateness and the user can review that the right images have been marked as omitted from non-graphical presentation. Since this tool addresses more problems than simply making alt required on the syntax level, I believe this solution is much better than furiously staying entrenched in the status quo of HTML 4 validation, fearing so much a step backwards as to being too afraid to explore steps forward.

Finally, it should be noted that this feature is, by necessity, itself inaccessible to people who cannot view bitmap images. Yet, I think it is legitimate for this feature to be implemented with an HTML user interface. Also, this feature itself is a case where the generator of the user interface markup has no knowledge of the content of the images it is presenting to the user. Hence, it is itself an example of omitting the alt attribute. It would be truly ironic, if the syntax definition of HTML5 prevented Validator.nu from being self-validating.

Posted in Conformance Checking, Syntax | 4 Comments »

Validator.nu HTML Parser 1.0.7 Released

Saturday, April 5th, 2008

There is now a new release of the Validator.nu HTML Parser. Change highlights:

Adds optional support for heuristic encoding sniffing using the ICU4J sniffer, jchardet or both.
Adds support for rewinding and reparsing when becoming confident about the character encoding and the tentative encoding was wrong.
Performs encoding name matching per spec instead of using the JDK mechanism.
Implements spec changes up until just before SVG and MathML support. (Those will merit 1.1 or something.)
Warning: The semantics of the doctype token have changed in case you have your own token handler (unlikely).

Posted in Processing Model, Syntax | Comments Off on Validator.nu HTML Parser 1.0.7 Released