The WHATWG Blog — Processing Model

Archive for the ‘Processing Model’ Category

Why the Alt Attribute May Be Omitted (Polish)

Sunday, January 13th, 2008

This is a Polish translation of this article: Why the Alt Attribute May Be Omitted.

Recent work on the specification of the alt attribute aims to thoroughly improve its definition, including a detailed explanation of how to write appropriate alternate text and a clear statement of the authoring requirements.

These requirements describe the situations in which alternate text must be provided, in which an empty alt attribute should be used and, most surprisingly, in which the alt attribute may be omitted entirely. This is controversial because at first glance it looks like an attempt to encourage the bad, inaccessible habit of omitting the alt attribute, which seems to be yet another slap in the face for web accessibility. That reasoning is mistaken and needs to be examined with particular care to dispel any doubts that may arise. Although this may seem like a step backwards, it is in fact a very positive move.

In many situations alternate text is simply unavailable and nothing can be done about it. For example, most users of photo sharing sites such as Flickr would have no idea how, or why, to provide alternate text, even if Flickr gave them the option. Although everyone agrees that it would be wonderful if all users provided alternate text (the specification strongly recommends it), most of them simply will not.

The problem to consider is what to do when alt text is unavailable and there is really no way to supply it. Under the current HTML4 requirement to include the alt attribute, systems can be observed trying to satisfy the requirement by generating alternate text from the image’s metadata.

Flickr, for example, repeats the image’s title; Photobucket apparently combines the file name, its title and the author’s name; and Wikipedia needlessly repeats the image caption. The problem with this approach is that such values provide no additional or useful information about the image, which in some cases is worse than having no alternate text at all.

The benefit of requiring the alt attribute to be omitted, instead of simply leaving its value empty, is a clear distinction between an image that has no alternate text (such as a graphical or iconic representation of the surrounding text) and an image that is a key part of the content but for which alternate text is not available. Lynx and Opera reportedly already make this distinction. For images without an alt attribute Lynx displays the file name and Opera shows the text "Image", but neither displays anything for images with an empty alt attribute. It is still not entirely clear whether this distinction is really useful and whether browsers can realistically apply it. The question is open to discussion if anyone has arguments to offer.

It has also been suggested that dropping the unconditional requirement for the alt attribute will affect the ability of validators to notify authors of errors and take away a tool that helps promote accessibility. However, using validation error messages as an educational tool is neither the only nor the best solution to the problem.

While authors like to know when they have accidentally omitted the alt attribute, unconditionally enforcing its use with a tool as primitive as a validator has exactly the opposite effect, because it encourages the use of poor quality, automatically generated text. Besides, nothing will stop authoring tools and conformance checkers from notifying authors if they so wish.

By admitting that not everyone can be forced to provide alternate text and making the alt attribute optional for document conformance, no practical accessibility benefits are lost. Nobody claims that conformance to HTML5 is the same as conformance with accessibility requirements. Many things are considered technically conforming in HTML, yet improper use makes them inaccessible. Making the alt attribute optional does not conflict with accessibility requirements, nor does it greatly affect their promotion. I am merely describing reality here, in the hope of reducing the prevalence of poor quality, automatically generated alt text.

Posted in Browsers, Elements, Processing Model

Why the Alternative Text May Be Omitted (French)

Thursday, August 23rd, 2007

This article is a French translation of the article Why the Alt Attribute May Be Omitted.

The specification of the alt attribute was recently reworked to improve its definition, including an in-depth explanation of how to provide the most appropriate alternative text, together with real authoring requirements.

The specification describes situations where alternative text must be provided, where an empty alt attribute must be used and, more controversially, where the alt attribute may be omitted altogether. This omission may seem controversial because, at first glance, it looks like an attempt to justify the bad practice, contrary to the principles of accessibility, of leaving out the alt attribute, and thereby to stir up trouble. That is an unfortunate confusion, and it is worth pausing to dispel the doubts it may raise for many people. Although it may seem retrograde, the situation is in fact much better this way.

There are many cases where alternative text is simply unavailable and there is little that can be done to remedy the situation. For example, most users of photo sharing sites such as Flickr would certainly have no idea what to write as alternative text, even if Flickr gave them the option. And although everyone of course agrees that it would be wonderful if all users did so, as the specification encourages, most simply will not.

The problem just raised is this: what should be done when no alternative text has been specified and it remains virtually impossible to obtain? Many current systems try to satisfy the current requirement for alternative text by generating it from the image’s metadata.

Flickr, for example, repeats the image’s title; Photobucket appears to combine the image file name, the title and the user’s name; and Wikipedia redundantly reproduces the image caption. The problem with these approaches is that none of them provides useful additional information about the image and, in some cases, this is worse than providing no alternative text at all.

The benefit of allowing the alternative text to be omitted, rather than requiring an empty value, is that it creates a clear distinction between an image that has no alternative text (such as an icon or a graphical representation of the surrounding text) and an image that is an integral part of the content but for which no alternative text is available. It has been said that Lynx and Opera already make this distinction. For images without an alt attribute, Lynx shows the file name and Opera the text "Image", but neither shows anything for images whose attribute is left empty. It remains to be determined whether this distinction is actually useful when rendering real-world content, and there is a genuine debate to be had if you have evidence to contribute.

It has been suggested that removing the unconditional requirement for the alt attribute would affect the ability of validators to show users their errors and would take away a good tool for promoting accessibility. However, using validation errors as an accessibility evangelism tool is certainly not a good way to approach the problem.

While it is indeed very useful for authors to know when they have mistakenly left out an alt attribute, trying to force them to use it unconditionally with a tool as crude as a validator is counterproductive, because it encourages the use of poor quality, automatically generated text. Meanwhile, nothing will prevent validation and web authoring tools from reporting these errors to authors if they so choose.

None of the benefits of accessibility is lost by accepting that it is impossible to force everyone to provide alternative text and by making the alt attribute optional for the purposes of document conformance. Nobody claims that conformance to HTML 5 is equivalent to conformance with accessibility guidelines. Many things are considered technically conforming in HTML yet remain "inaccessible" if used poorly. Making the alt attribute technically optional does not stand in the way of accessibility guidelines, nor does it have any impact on accessibility evangelism. It simply accepts the reality of the situation we face, in the hope of reducing the proliferation of poor quality, automatically generated alternative text.

Posted in Browsers, Elements, Processing Model, WHATWG

Why the Alt Attribute May Be Omitted

Thursday, August 23rd, 2007

The specification of the alt attribute was recently reworked to thoroughly improve its definition, including an in-depth explanation of how to provide appropriate alternate text, with clear authoring requirements.

The requirements describe situations where alternate text must be provided, where an empty alt attribute must be used and, most controversially, where the alt attribute may be omitted entirely. This is controversial because at first glance, it seems like an attempt to endorse the bad and inaccessible practice of omitting the alt attribute, and thus yet another slap in the face for accessibility. That is an unfortunate misconception that needs to be carefully examined to settle any concerns people have. Although it may seem backwards, the situation is actually much more positive.

There are many observed cases where alternate text is simply unavailable and there’s little that can be done about it. For example, most users of photo sharing sites like Flickr wouldn’t have a clue how or why to provide alternate text, even if Flickr provided the ability. While everyone agrees that it would be wonderful if all users did – indeed, the spec strongly encourages that – most users simply won’t.

The problem being addressed is what should be done in those cases where no alt text has been provided and it is virtually impossible to acquire. With the current requirement for including the alt attribute in HTML4, it has been observed that many systems will attempt to fulfil the requirement by generating alternate text from the image’s metadata.

Flickr, for example, repeats the image’s title; Photobucket appears to combine the image’s filename, title and the author’s username; and Wikipedia redundantly repeats the image caption. The problem with these approaches is that using such values does not provide any additional or useful information about the image and, in some cases, this is worse than providing no alternate text at all.

The benefit of requiring the alt attribute to be omitted, rather than simply requiring the empty value, is that it makes a clear distinction between an image that has no alternate text (such as an iconic or graphical representation of the surrounding text) and an image that is a critical part of the content, but for which no alt text is available. It has been claimed that Lynx and Opera already use this distinction. For images without alt attributes Lynx shows the filename and Opera displays "Image", but neither shows anything for images with empty alt attributes. It is still somewhat questionable whether this distinction is actually useful and whether browsers can realistically make such a distinction with real-world content. That is certainly open to debate if you have further evidence to provide.
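The three cases can be told apart mechanically. The following is a minimal sketch in Python, not taken from any browser or from the spec: the class name AltClassifier, the function classify_images and the category labels are hypothetical, and the standard library's html.parser is used only for illustration (it does not implement the HTML5 parsing algorithm).

```python
from html.parser import HTMLParser


class AltClassifier(HTMLParser):
    """Collects a (src, category) pair for every img start tag seen."""

    def __init__(self):
        super().__init__()
        self.results = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        attributes = dict(attrs)
        src = attributes.get("src") or ""
        if "alt" not in attributes:
            # Attribute absent: no alternate text is available.
            category = "omitted"
        elif not attributes["alt"]:
            # alt="" (or a valueless alt): the image needs no alternate text.
            category = "empty"
        else:
            # Alternate text was supplied.
            category = "text"
        self.results.append((src, category))


def classify_images(markup):
    """Return the (src, category) pairs for all img elements in markup."""
    parser = AltClassifier()
    parser.feed(markup)
    parser.close()
    return parser.results
```

A user agent or conformance checker could then treat the "omitted" case differently from the "empty" case, which is the distinction Lynx and Opera are said to make.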

It has been suggested that taking away the unconditional requirement for the alt attribute will affect the ability of validators to notify authors of their mistakes and take away a useful tool for promoting accessibility. However, using validation errors as an accessibility evangelism tool is not necessarily the only, nor the best, way to address the issue.

While it is indeed very useful for authors to know when they have mistakenly omitted an alt attribute, attempting to unconditionally enforce its use, using a tool as blunt as a validator, is counterproductive since it encourages the use of poor quality, automatically generated text. Besides, nothing will prevent conformance checkers and authoring tools from notifying authors, if they so desire.

No practical accessibility benefits are lost by conceding that you cannot force everyone to provide alternate text and by making the alt attribute optional for the purpose of document conformance. No one is claiming that conformance to HTML5 equates to conformance with accessibility requirements. There are lots of things that are considered technically conforming in HTML, yet still inaccessible if used poorly. Making alt technically optional doesn’t stand in the way of accessibility requirements, nor does it greatly impact accessibility evangelism. It just acknowledges the reality of the situation in the hope of reducing the prevalence of poor quality, automatically generated alt text.

Posted in Browsers, Elements, Processing Model

Implementation of the HTML5 parsing algorithm in Java

Friday, August 17th, 2007

There is now an open-source implementation of the HTML5 parsing algorithm in Java 5: the Validator.nu HTML Parser. The parser can be used as a drop-in replacement for the XML parser in applications that use SAX, DOM or XOM APIs to read XHTML 1.x content with an XML parser.

Posted in Conformance Checking, Processing Model, Syntax

Table Integrity Checker

Tuesday, November 14th, 2006

I am working on a conformance checking service for (X)HTML5. The service is grammar-based for the most part with RELAX NG as the schema language. Some extra-grammatical constraints are expressed as Schematron assertions. Currently, as a Mozilla Foundation grantee, I am working on writing checkers (in Java) for spec features that cannot (practically or at all) be checked using RELAX NG or Schematron.

In a Web two-point-ohey perpetual beta fashion, I am deploying the new prototype features early to allow testing.

The first non-schema checker prototype is a table integrity checker. Since the table model for (X)HTML5 is still being specified, the prototype is speculatively based on the HTML 4.01 table model and browser behavior. The differences from HTML 4.01 are that colspan='0' is treated as colspan='1' and that headers must refer to th cells. The top left corner of each cell is placed in the first available slot on the row, which is browser-compatible but different from what the CSS2 spec says.
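The slot placement rule can be illustrated with a short sketch. This is Python written for this explanation, not code from the checker (which is written in Java): the function name place_cells and the input shape, a list of rows where each cell is a (colspan, rowspan) pair, are assumptions, and rowspan is assumed to be at least 1.

```python
def place_cells(rows):
    """Place each cell's top left corner in the first free slot of its row.

    rows is a list of rows; each row is a list of (colspan, rowspan) pairs.
    Returns a list of (row, col, colspan, rowspan) placements in source order.
    """
    occupied = set()  # (row, col) slots already covered by some cell
    placements = []
    for r, row in enumerate(rows):
        col = 0
        for colspan, rowspan in row:
            if colspan < 1:
                colspan = 1  # colspan='0' is treated as colspan='1'
            # Only the top left corner is checked, as described above, so a
            # wide cell may still run into a slot held by an earlier rowspan.
            while (r, col) in occupied:
                col += 1
            placements.append((r, col, colspan, rowspan))
            for dr in range(rowspan):
                for dc in range(colspan):
                    occupied.add((r + dr, col + dc))
            col += colspan
    return placements
```

For example, when the first cell of the first row has rowspan 2, the first cell of the second row lands in column 1 rather than column 0.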

The checker emits both warnings and errors. Depending on how the spec turns out, errors may become warnings or vice versa.

Currently, the errors are:

Table cell is overlapped by later table cell.
Table cell overlaps an earlier table cell. (Single overlap gets reported in both directions to show source location for both cells.)
Table cell spans past the end of its row group.
Row has no cells starting on it.
Table row column count is greater than the column count established by cols/colgroups.
Table row column count is less than the column count established by cols/colgroups.
The headers attribute doesn’t point to th elements in the same table.
Column has no cells starting on it. (Contiguous cell ranges established by a single element are coalesced to a single error to protect against denial of service attacks.)
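Two of these errors can be sketched as follows. This is an illustrative Python sketch rather than the checker's actual Java code: the function names are hypothetical, and cells are assumed to be already placed on the grid as (row, col, colspan, rowspan) tuples in source order.

```python
def find_overlaps(placements):
    """Report each overlap in both directions, keyed by cell index."""
    owner = {}  # (row, col) slot -> index of the first cell covering it
    errors = []
    for index, (row, col, colspan, rowspan) in enumerate(placements):
        hit = set()  # earlier cells that this cell runs into
        for r in range(row, row + rowspan):
            for c in range(col, col + colspan):
                earlier = owner.setdefault((r, c), index)
                if earlier != index:
                    hit.add(earlier)
        for earlier in sorted(hit):
            errors.append(
                (earlier, "Table cell is overlapped by later table cell."))
            errors.append(
                (index, "Table cell overlaps an earlier table cell."))
    return errors


def columns_without_starts(placements):
    """Return the columns that are covered by the table but start no cell."""
    width = max((col + colspan for _, col, colspan, _ in placements), default=0)
    starts = {col for _, col, _, _ in placements}
    return [c for c in range(width) if c not in starts]
```

Reporting a single overlap against both cells mirrors the note above about showing the source location for both.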

Currently, the warnings are:

colspan exceeds 1000, which is a magic number in Gecko (and according to comments in Gecko source, in IE and Opera, too)
rowspan exceeds 8190, which is a magic number in Gecko
Table row column count is greater than the column count established by the first row in the absence of cols/colgroups.
Table row column count is less than the column count established by the first row in the absence of cols/colgroups.
A col element causes a span attribute to be ignored on the parent colgroup. (Conforming in HTML 4 / XHTML 1.0; non-conforming in (X)HTML5. With (X)HTML5 there’s also a schema-level error.)
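The span and column count warnings amount to simple comparisons. A minimal Python sketch, assuming hypothetical function names and taking the limits 1000 and 8190 from the list above:

```python
GECKO_COLSPAN_LIMIT = 1000  # magic number quoted above
GECKO_ROWSPAN_LIMIT = 8190  # magic number quoted above


def span_warnings(colspan, rowspan):
    """Return warnings for spans beyond the limits quoted above."""
    warnings = []
    if colspan > GECKO_COLSPAN_LIMIT:
        warnings.append("colspan exceeds 1000")
    if rowspan > GECKO_ROWSPAN_LIMIT:
        warnings.append("rowspan exceeds 8190")
    return warnings


def row_width_warnings(row_widths):
    """Compare each row's column count with the first row's count.

    row_widths holds the column count of each row. This models the case
    where no cols/colgroups are present, so the first row sets the width.
    Returns (row_index, "greater" | "less") pairs.
    """
    if not row_widths:
        return []
    expected = row_widths[0]
    warnings = []
    for index, width in enumerate(row_widths[1:], start=1):
        if width > expected:
            warnings.append((index, "greater"))
        elif width < expected:
            warnings.append((index, "less"))
    return warnings
```

When cols/colgroups are present, the same comparison is made against the column count they establish, and the result is reported as an error instead of a warning.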

The table integrity checker only sees a projection of the document tree that contains nothing but table-significant elements. Crazy subtrees of table-significant elements in wrong places are silently pruned, because these are dealt with on the RELAX NG level. The table integrity checker assumes that it is being used together with a reasonable schema.

The table integrity checker is also enabled for the HTML 4.01 / XHTML 1.0 presets on the generic side of the service, so testing with today’s content is possible.

There’s a pseudo-schema called http://hsivonen.iki.fi/checkers/table/ which isn’t a schema but a magic URL that causes the system to instantiate the table integrity checker. There’s a pseudo-pseudo-schema called http://hsivonen.iki.fi/checkers/all/ which expands to all pseudo-schemas, but at the moment, there’s only one.

Please let me know if the table integrity checker does not work as advertised.

Posted in Conformance Checking, Processing Model