Archive for the ‘Conformance Checking’ Category
I see more and more people switch over to HTML5 these days, and to help you make sure you did things correctly, there are some tools at your disposal that might be good to know about.
To make sure you didn't misspell any tag or nest elements in a way that is not allowed, or find similar mistakes in your markup, you can use Validator.nu.
Alt text for images
The above-mentioned validator has a feature to help you quality-check the alternative text of your img elements. Check the Show Image Report checkbox.
You can also disable images in your browser or try a text-only browser; the information that the images convey should still be available (but in text form). Sometimes an image doesn't convey any further information than what the surrounding text already says, and in such cases you should use the empty value (alt="").
For further advice and examples on how to use the alt attribute, the HTML 5 spec has lots of information on the topic. If you're not going to read it all, just read the section called General guidelines.
The document outline is the structure of sections in the document, built from the h1 to h6 elements as well as the new sectioning elements (section, article, aside, nav). The document outline is more commonly known as the Table of Contents.
To make sure that you have used the new sectioning elements correctly, you can check that the resulting outline makes sense with the HTML5 Outliner.
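To get a feel for what an outline view gives you, here is a minimal Python sketch that lists the h1 to h6 headings of a page, indented by rank. It is a deliberate simplification and an assumption on my part: it ignores the sectioning elements entirely, so it is not the HTML5 outline algorithm and will not report untitled sections the way the HTML5 Outliner does. The names HeadingOutline and outline_lines are made up for this example.

```python
from html.parser import HTMLParser

HEADINGS = {"h1", "h2", "h3", "h4", "h5", "h6"}


class HeadingOutline(HTMLParser):
    """Collects (rank, text) pairs for h1-h6 elements in document order."""

    def __init__(self):
        super().__init__()
        self.entries = []   # list of (rank, heading text)
        self._rank = None   # rank of the heading currently open, if any
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag in HEADINGS:
            self._rank = int(tag[1])
            self._text = []

    def handle_data(self, data):
        if self._rank is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag in HEADINGS and self._rank is not None:
            self.entries.append((self._rank, "".join(self._text).strip()))
            self._rank = None


def outline_lines(html):
    """Return one indented line per heading (two spaces per rank level)."""
    parser = HeadingOutline()
    parser.feed(html)
    parser.close()
    return ["  " * (rank - 1) + text for rank, text in parser.entries]


if __name__ == "__main__":
    page = "<h1>Title</h1><p>Intro</p><h2>Part one</h2><h3>Detail</h3><h2>Part two</h2>"
    print("\n".join(outline_lines(page)))
```

If the indented list does not read like a sensible table of contents, the heading ranks are probably worth a second look before checking the full outline.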
If you see "Untitled Section" entries and didn't expect them, chances are that you should have used div instead of a sectioning element.
If you have a subtitle of a heading that shouldn't be part of the document outline, you should use the
<h1>The World Wide Web Consortium</h1>
<h2>Leading the Web to Its Full Potential...</h2>
In this example, only the h1 will show up in the document outline.
(This only applies to table elements used for tabular data, not for layout.)
HTML tables have two types of cells: header cells (th elements) and data cells (td elements). These cells are associated together in the table: a data cell in the middle of the table can have associated header cells, typically in the first row and/or the first column of the table. To a user who can see, this association seems obvious, but users who cannot see need some help from the computer to understand which cells are associated with which.
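The association described above can be sketched in a few lines of Python. This is only an illustration under strong assumptions: it handles a plain grid with no colspan or rowspan, it ignores any attributes on the cells, and it simply pairs each td with the th cells above it in the same column and to its left in the same row. The real association rules in the spec are more involved, and the names TableGrid and header_map are invented for this sketch.

```python
from html.parser import HTMLParser


class TableGrid(HTMLParser):
    """Reads a simple table into rows of [tag, text] cells."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._cell = None  # tag name of the cell currently open, if any

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self.rows.append([])
        elif tag in ("th", "td") and self.rows:
            self._cell = tag
            self.rows[-1].append([tag, ""])

    def handle_data(self, data):
        if self._cell is not None:
            self.rows[-1][-1][1] += data

    def handle_endtag(self, tag):
        if tag in ("th", "td"):
            self._cell = None


def header_map(html):
    """Map (row, column) of each td to the text of its header cells."""
    grid = TableGrid()
    grid.feed(html)
    grid.close()
    result = {}
    for r, row in enumerate(grid.rows):
        for c, (tag, _text) in enumerate(row):
            if tag != "td":
                continue
            # Header cells above this cell, in the same column.
            above = [grid.rows[i][c][1].strip() for i in range(r)
                     if c < len(grid.rows[i]) and grid.rows[i][c][0] == "th"]
            # Header cells to the left of this cell, in the same row.
            left = [row[j][1].strip() for j in range(c) if row[j][0] == "th"]
            result[(r, c)] = above + left
    return result
```

For a table whose first row is "Mon, Tue" and whose second row starts with the header "Alice", the cell under "Mon" maps to both "Mon" and "Alice", which is roughly what a non-visual user needs to hear announced.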
You should mark up your header cells with the th element and check that your cells get associated as you intended using the Table Inspector. If it isn't as you intended, you can consider simplifying or rearranging your table, or you can override the default association using
If you know about other tools for helping with quality assurance of HTML5, or if you have made your own, please share!
I gave a talk at Google on Monday demonstrating the various features of HTML5 that are implemented in browsers today. The video is now on YouTube, so now you too can watch and laugh at my lame presentation skills!
The segments of this talk are as follows. Some of the demos are available online for you to play with and are linked to from the following list:
- Drag and Drop API (29:05)
- Form Controls (40:50)
- Validation (1:07:20)
- Questions and Answers (1:09:35)
If you're very interested in watching my typos, the high quality version of the video on the YouTube site is clear enough to see the text being typed. More details about the demos can be found on the corresponding demo page.
Kai Hendry has written an HTML filetype plugin for Vim that allows you to use Henri Sivonen’s Validator.nu conformance checking (validation) service remotely to check the contents of any HTML document you edit in Vim and determine if the document is HTML5-conformant (valid).
The filetype plugin is also demonstrated in a screencast tutorial on editing Web applications, which Kai has written about in his "VIM IDE for Web applications" blog post (see the post for a link to the video).
All that you need to do to install the Vim filetype plugin is to download the plugin source and save it into ~/.vim/ftplugin/html.vim. To use it to check a document, first do :make within Vim, then use :cnext and such to locate the errors (for more details, read the section of the Vim docs that relates to those commands).
How and why it works
Vim has a set of “quickfix” commands that provide something that many development IDEs also have these days: A way to run a compiler or lint checker or other external tool on the contents of a file you are editing, and then to have any errors returned — along with the line and column numbers of the places in your file where the errors occur — as a list that you can then easily step through or jump through one-by-one and fix. It’s a very powerful feature.
Kai’s HTML filetype plugin provides a way to use Vim’s “quickfix” commands to do conformance checking of HTML5 files. The plugin is dead simple; it’s just two lines:
set makeprg=curl\ -s\ -F\ laxtype=yes\ -F\ parser=html5\
\ -F\ level=error\ -F\ out=gnu\ -F\ doc=@%\
(Note that I've just wrapped the first line for the purpose of readability in this post.)
The makeprg option in the first line tells Vim what “make program” you want to use when checking HTML files. And the errorformat option in the second line tells Vim the expected format of error messages from that “make program”, so that it can parse the error messages to get the line and column numbers of the places in your file where the errors occur. (The meanings of the various parts of the string used in that errorformat value are: %f, filename; %l, line number; %c, column number; %m, error message.)
Interaction with Validator.nu
What Kai’s HTML filetype plugin does is to use as the “make program” the curl command-line HTTP client, and in turn, to have curl send a POST request to Validator.nu. The contents of that POST request are set by the parameters and values specified by the -F options passed to curl. Essentially what this does is to emulate what would happen if you used the form-based interface at the Validator.nu website to manually set the values of the various form fields in that interface. (Note that wget could probably be used here, with different options, to do the same thing.)
What Validator.nu does in return is to send a response with the list of errors — in a format that allows the list of errors to be easily parsed by tools that have built-in support (like Vim’s “quickfix”) for reading error lists that are in a regular format and doing something with them.
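To make the request side more concrete, here is a rough Python sketch of the kind of multipart/form-data body that a series of -F options produces. It only builds the request body and does not send anything. The function name build_form_post, the text/html content type for the file part, and the sample file name are assumptions made for this illustration, not something taken from the plugin.

```python
import uuid


def build_form_post(fields, file_field, filename, content):
    """Build a multipart/form-data body, roughly what curl -F produces.

    fields: dict of plain form fields; content: the file's bytes.
    Returns (content_type_header_value, body_bytes). Nothing is sent.
    """
    boundary = uuid.uuid4().hex
    chunks = []
    for name, value in fields.items():
        head = f'--{boundary}\r\nContent-Disposition: form-data; name="{name}"\r\n\r\n'
        chunks.append(head.encode("utf-8") + value.encode("utf-8") + b"\r\n")
    # The file part carries a filename and its own Content-Type
    # (text/html here is an assumption for this sketch).
    head = (
        f'--{boundary}\r\n'
        f'Content-Disposition: form-data; name="{file_field}"; filename="{filename}"\r\n'
        "Content-Type: text/html\r\n\r\n"
    )
    chunks.append(head.encode("utf-8") + content + b"\r\n")
    # A final boundary with a trailing "--" closes the body.
    chunks.append(f"--{boundary}--\r\n".encode("utf-8"))
    return f"multipart/form-data; boundary={boundary}", b"".join(chunks)


if __name__ == "__main__":
    # Same field names and values as the -F options in the plugin.
    options = {"laxtype": "yes", "parser": "html5", "level": "error", "out": "gnu"}
    ctype, body = build_form_post(options, "doc", "page.html", b"<!DOCTYPE html><title>t</title>")
    print(ctype)
    print(body.decode("utf-8"))
```

Each -F option becomes one part of the body, and the doc=@% option becomes a file upload part holding the contents of the buffer's file.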
GNU-formatted error output
In this case, since the out=gnu parameter and value were passed to Validator.nu, the particular format in which Validator.nu returns the error list is the standard GNU error format that’s used by many applications (including that other editor, Emacs). This use case (enabling remote validation and error-evaluation with editing applications) is actually one of the main cases for which Henri added the GNU-formatted error-reporting option to Validator.nu.
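As an illustration of why a regular format is so convenient, here is a small Python sketch that pulls the file name, line, column and message out of lines shaped like the GNU convention "file:line.column: message" (optionally with a "-line.column" range end). The sample lines in the usage example are made up, and the exact text that Validator.nu emits may differ in detail, so treat the pattern as an assumption rather than a description of the service's output.

```python
import re

# file:line.column: message, optionally with a range end such as -12.20 or -20
GNU_LINE = re.compile(
    r"^(?P<file>.+?):(?P<line>\d+)\.(?P<col>\d+)(?:-[\d.]+)?: (?P<msg>.*)$"
)


def parse_gnu_errors(text):
    """Return (file, line, column, message) tuples for lines that match."""
    entries = []
    for raw in text.splitlines():
        match = GNU_LINE.match(raw.strip())
        if match is None:
            continue  # skip anything that is not an error line
        name = match.group("file").strip('"')  # tolerate quoted file names
        entries.append(
            (name, int(match.group("line")), int(match.group("col")), match.group("msg"))
        )
    return entries


if __name__ == "__main__":
    sample = (
        'index.html:12.5-12.20: error: Stray end tag "div".\n'
        "index.html:3.1: error: Bad value.\n"
    )
    for entry in parse_gnu_errors(sample):
        print(entry)
```

This is essentially the job that Vim's errorformat does for the quickfix list: turn each line into a location to jump to plus a message to show.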
Validator.nu + Vim = easy HTML5 conformance checking
The end result is that you get the error information back into Vim in a way that lets you more easily locate and fix the errors.
So setting just two options is all it takes in an editing application like Vim to enable Validator.nu to be used remotely like this (that is, to do integrated HTML5 conformance-checking and error-reporting within the editor). This seems to me to be a pretty good testament (another in a long list) to the utility of the Validator.nu service and to the foresight that’s gone into its design.
I guess it also says a lot about the utility of Vim and the foresight that’s gone into its design, but we all already know how great Vim is, right?
There has been lots and lots of e-mail on the public-html mailing list about making the alt attribute syntactically required in HTML5. At the core of this debate is, on one hand, using HTML5 validators to send a strong message about accessibility and, on the other hand, avoiding a situation where a simplified and idealistic strong message leads to behavior that is counterproductive considering the goal of making the Web accessible. As a policy debate, it is similar to abstinence-only sex education debates.
A validator is a computer program and cannot tell if a textual alternative is appropriate for a given image in a given context. That's why accessibility checking needs to be done by a person. A person may use a software tool to make the checking easier, but trusting fully automated software to determine whether a page is accessible is misguided.
Given this basic problem, a policy that insists on the alt attribute always being present doesn’t necessarily lead to accessibility. In fact, considering that syntactic correctness and accessibility are different evaluation axes both in terms of computability and in terms of how HTML authors (other than accessibility advocates) tend to view things (judging from observations about the behavior of HTML authors who use validators), a policy that insists on the alt attribute being always present will likely cause people to put the attribute in there but with inappropriate content. In particular, putting an empty alt on images whose presence is important for understanding the context of other content is bad, because in that case the presence of those images is concealed from a non-graphical user. Also, a textual alternative that just says “image” is not an improvement over what, for example, Safari with VoiceOver says in the absence of alt, but would be worse than a smarter client-side heuristic.
Furthermore, there is a very real case where a textual alternative simply isn’t available to the HTML generator: a user uploads photos to a content management system and refuses to supply textual alternatives at the same moment. HTML 4 didn’t account for this case. In fact, requiring alt to be present under all circumstances assumes that markup is written by a person who knows what the images are at the time of writing markup. It doesn’t make sense to pretend that the case where the markup generator doesn’t have textual alternatives available doesn’t exist. The HTML 5 syntax needs to account for all use cases.
Expecting markup generators to knowingly emit markup that is not valid is not a winning proposition. Quoting me from 2006:
Authoring tools are judged by taking a page authored using the
tool and running it through the W3C Validator or, presumably in the
future, through an HTML5 conformance checker. Authoring tool makers
are capable of making their tool produce syntactically conforming
documents will want to do so and minimize the chance that the users of
their software tarnish the reputation of the tool in the eyes of
who use an automated test as a litmus test of authoring tool bogosity.
(People who test tools that way will outnumber the people who make a
more profound analysis due to the "validate, validate, validate"
To summarize: As a matter of principle, subjective checking or checking that is not applicable for all pages does not belong in the validation function. Practice is more important than principle, though. Baking the alt requirement into the validation function would be bad when the user of the validation function wants a clean report on syntax but isn’t as concerned with accessibility. It is bad for accessibility when authors put the simplest value that silences the validator into the attribute in order to make the validation report look clean, since doing so gives user agents like Safari with VoiceOver less information to work with. That's why I think the requirement to have an alt attribute present doesn’t belong in the validation function also as a practical matter.
It turns out, though, that some people think of validation as a first step toward accessibility, even though syntactic correctness and accessibility really are different evaluation axes. They expect a validator to help them flag images that are lacking a textual alternative. Moreover, the alt issue seems to be taken as the single most important web accessibility issue with the rest of issues somewhere in the long tail. When there is a demand for validators to flag images without alt, validators probably should meet the demand.
To this end, I have developed a new feature for Validator.nu: Image Report. This new feature is not part of the validation function. It also doesn’t do exactly what people are asking of the syntax definition in the long e-mail thread. (It is not a new idea for a validator user interface to offer tools that help a human perform an assessment about the page outside the validation function. For example, the W3C Validator has offered a “Show Document Outline” feature, which is also on file as a request for enhancement for Validator.nu.)
The new feature tries to address the issue of finding missing textual alternatives but it also seeks to address the issue of faulty textual alternatives. Furthermore, it seeks to address these in a way that doesn’t induce people to write bad textual alternatives in order to make the report look cleaner.
When you turn the feature on, it always lists all the images. There is no textual alternative you can fake to make the list look shorter. Instead, there are four categories and you can only change the category in which an image appears.
This has the benefit of removing the badge hunting problem: people trying to silence the validator without actually raising the quality of their page. However, it also has the benefit that the user can review the textual alternatives for appropriateness and the user can review that the right images have been marked as omitted from non-graphical presentation. Since this tool addresses more problems than simply making alt required on the syntax level, I believe this solution is much better than furiously staying entrenched in the status quo of HTML 4 validation, fearing a step backwards so much as to be too afraid to explore steps forward.
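The "always list every image" idea can be sketched in Python as follows. To be clear about the assumptions: this is not how Validator.nu implements Image Report, and the three groups used here (no alt attribute, empty alt, has alt text) are a simplification chosen for the example rather than the four categories of the real feature. The point is only that every image lands in exactly one group, so no alt value makes the list shorter.

```python
from html.parser import HTMLParser


class ImageLister(HTMLParser):
    """Sorts every img element into exactly one group by its alt attribute."""

    def __init__(self):
        super().__init__()
        self.groups = {"no alt attribute": [], "empty alt": [], "has alt text": []}

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        attributes = dict(attrs)
        src = attributes.get("src", "")
        if "alt" not in attributes:
            self.groups["no alt attribute"].append((src, None))
        elif not attributes["alt"]:
            # Covers alt="" and a bare alt attribute without a value.
            self.groups["empty alt"].append((src, ""))
        else:
            self.groups["has alt text"].append((src, attributes["alt"]))


def image_report(html):
    """Return a dict mapping group name to a list of (src, alt) pairs."""
    lister = ImageLister()
    lister.feed(html)
    lister.close()
    return lister.groups
```

A person still has to look at each group: the images with alt text need their text read for appropriateness, and the images with empty alt need checking that they really add nothing for a non-graphical user.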
Finally, it should be noted that this feature is, by necessity, itself inaccessible to people who cannot view bitmap images. Yet, I think it is legitimate for this feature to be implemented with an HTML user interface. Also, this feature itself is a case where the generator of the user interface markup has no knowledge of the content of the images it is presenting to the user. Hence, it is itself an example of omitting the alt attribute. It would be truly ironic if the syntax definition of HTML5 prevented Validator.nu from being self-validating.
Due to implementation details, the HTML5 facet of Validator.nu used to ignore the content of obsolete elements such as center, because obsolete elements were simply unknown. This wasn’t particularly useful when assessing the HTML5-upgradeability of an existing design that wrapped everything in center, for example.
The HTML5 facet of Validator.nu now knows about obsolete container elements that existed as deprecated in HTML 4.01. This means that center is still an error, but the contents are now checked as HTML5.
Also, Validator.nu now allows legacy-style internal encoding declarations per the latest Editor’s Draft.