Archive for the ‘Weekly Review’ Category
Friday, February 27th, 2009
Welcome back to "This Week in HTML 5," where I'll try to summarize the major activity in the ongoing standards process in the WHATWG and W3C HTML Working Group. The pace of HTML 5 changes has reached a fever pitch, so I'm going to split out these episodes into daily (!) rather than weekly summaries until things calm down.
The big news for February 12, 2009 is the minting of the spellcheck attribute, which web authors can use to provide a hint about whether a particular form field expects the sort of input that would benefit from client-side spell checking. r2801 lays it out:
User agents can support the checking of spelling and grammar of editable text, either in form controls (such as the value of textarea elements), or in elements in an editing host (using contenteditable).
For each element, user agents must establish a default behavior, either
through defaults or through preferences expressed by the user. There
are three possible default behaviors for each element:
- true-by-default
  The element will be checked for spelling and grammar if its contents are editable.
- false-by-default
  The element will never be checked for spelling and grammar.
- inherit-by-default
  The element's default behavior is the same as its parent element's. Elements that have no parent element cannot have this as their default behavior.
The spellcheck attribute is an enumerated attribute whose keywords are true and false. The true keyword maps to the true state. The false keyword maps to the false state. In addition, there is a third state, the inherit state, which is the missing value default (and the invalid value default).
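The keyword-to-state mapping quoted above is small enough to sketch. This is only an illustration in Python, not anything from the spec itself: the function name spellcheck_state is made up, None stands in for a missing attribute, and the ASCII case-insensitive comparison is an assumption about how enumerated attribute keywords are matched.

```python
import string

# ASCII-only lowercasing table. Assumption: keywords of an enumerated
# attribute are compared ASCII case-insensitively.
_ASCII_LOWER = str.maketrans(string.ascii_uppercase, string.ascii_lowercase)


def spellcheck_state(attr_value):
    """Map a spellcheck attribute value to 'true', 'false', or 'inherit'.

    attr_value is None when the attribute is absent (hypothetical convention).
    """
    if attr_value is None:
        return "inherit"  # missing value default
    keyword = attr_value.translate(_ASCII_LOWER)
    if keyword in ("true", "false"):
        return keyword
    return "inherit"  # invalid value default
```

Anything that is not one of the two keywords, including a typo like "yes", falls back to the inherit state.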
Starting with version 2, Mozilla Firefox has offered built-in spell checking of <textarea> elements (on by default) and <input type=text> elements (off by default). You can change the default behavior by setting the spellcheck attribute. (test case)
The other big news of the day is the addition of the <form autocomplete> attribute, which lets web authors provide a hint about whether they would like browsers to save the form's contents and pre-fill the form the next time the user encounters it. r2798:
When an input element's resulting autocompletion state is on, the user agent may store the value entered by the user so that if the user returns to the page, the UA can prefill the form. Otherwise, the user agent should not remember the control's value.
... A user agent may allow the user to override the resulting autocompletion state and set it to always on (always allowing values to be remembered and prefilled), or always off, never remembering values. However, the ability to override the resulting autocompletion state to on should not be trivially accessible, as there are significant security implications for the user if all values are always remembered, regardless of the site's preferences.
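Here is a rough sketch of that decision in Python. The function name and the representation of the states as the strings "on" and "off" are assumptions made for illustration; the spec text only describes the behavior, and a real browser would hide the "always on" override somewhere inconvenient.

```python
def may_remember_value(resulting_state, user_override=None):
    """Return True if the browser may store the control's value.

    resulting_state: 'on' or 'off', the resulting autocompletion state.
    user_override: None (no override), 'on' (always on), or 'off'
    (always off). All three names are hypothetical.
    """
    if user_override in ("on", "off"):
        effective = user_override  # the user's choice wins over the site's hint
    else:
        effective = resulting_state
    return effective == "on"
```

With no override, the site's hint decides; with an override, the user's preference decides regardless of what the form asked for.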
<form autocomplete> is commonly used on sensitive login forms where the site does not want users to be able to store their password in their browser (which is generally done in an insecure way). Most browsers honor these hints by default, although there are ways to override them if you dislike the idea of web authors disabling useful bits of your browser's functionality.
Other interesting changes of the day:
- r2802 allows external JavaScript files to contain a BOM to facilitate identifying scripts in non-ASCII-compatible character encodings.
- r2796 adds some examples of using the unloved <small> element.
Discussion of the day: Gregory J. Rosmaita gives details on report of PFWG HTML5 actions ("PFWG" = Protocols and Formats Working Group). The original post was about accessibility issues, specifically a response to the <img alt> attribute becoming optional and the omission of the headers and summary attributes in the HTML 5 table model. But the thread was quickly hijacked by a discussion of the fact that the W3C published another working draft of HTML 5 on February 12.
Wait... what? Oh yes, in true "burying the lede" fashion, I suppose I should mention that the biggest news of February 12th is that the W3C published another working draft of HTML 5. Except that readers of this series will find it uninteresting, since it's just a snapshot of the progress-to-date. (The spec is "published" on whatwg.org every time it changes anyway.) Working drafts have no formal status; they are merely intended to encourage early and wide review. Still, the rest of the world might think it's important, so be sure to bring it up at this weekend's cocktail parties.
Tune in... well, sometime soon-ish for another exciting episode of "This Week Day In HTML 5."
Posted in Weekly Review | 2 Comments »
Wednesday, February 25th, 2009
Welcome back to "This Week in HTML 5," where I'll try to summarize the major activity in the ongoing standards process in the WHATWG and W3C HTML Working Group. The pace of HTML 5 changes has reached a fever pitch, so I'm going to split out these episodes into daily (!) rather than weekly summaries until things calm down.
The big news for February 11, 2009 is the addition of an algorithm to parse a color in an IE-compatible way. r2776 lays it all out:
Some obsolete legacy attributes parse colors in a more
complicated manner, using the rules for parsing a legacy color
value, which are given in the following algorithm. When
invoked, the steps must be followed in the order given, aborting at
the first step that returns a value. This algorithm will either
return a simple color or an error.
Let input be the string being parsed.
If input is the empty string, then return an error.
If input is an ASCII case-insensitive match for the string "transparent", then return an error.
If input is an ASCII case-insensitive match for one of the keywords listed in the SVG color keywords or CSS2 System Colors sections of the CSS3 Color specification, then return the simple color corresponding to that keyword. [CSS3COLOR]
If input is four characters long, and the first character in input is a U+0023 NUMBER SIGN (#) character, and the last three characters of input are all in the range U+0030 DIGIT ZERO (0) .. U+0039 DIGIT NINE (9), U+0041 LATIN CAPITAL LETTER A .. U+0046 LATIN CAPITAL LETTER F, and U+0061 LATIN SMALL LETTER A .. U+0066 LATIN SMALL LETTER F, then run these substeps:
- Let result be a simple color.
- Interpret the second character of input as a hexadecimal digit; let the red component of result be the resulting number multiplied by 17.
- Interpret the third character of input as a hexadecimal digit; let the green component of result be the resulting number multiplied by 17.
- Interpret the fourth character of input as a hexadecimal digit; let the blue component of result be the resulting number multiplied by 17.
- Return result.
Replace any characters in input that have a Unicode codepoint greater than U+FFFF (i.e. any characters that are not in the basic multilingual plane) with the two-character string "00".
If input is longer than 128 characters, truncate input, leaving only the first 128 characters.
If the first character in input is a U+0023 NUMBER SIGN character (#), remove it.
Replace any character in input that is not in the range U+0030 DIGIT ZERO (0) .. U+0039 DIGIT NINE (9), U+0041 LATIN CAPITAL LETTER A .. U+0046 LATIN CAPITAL LETTER F, and U+0061 LATIN SMALL LETTER A .. U+0066 LATIN SMALL LETTER F with the character U+0030 DIGIT ZERO (0).
While input's length is zero or not a multiple of three, append a U+0030 DIGIT ZERO (0) character to input.
Split input into three strings of equal length, to obtain three components. Let length be the length of those components (one third the length of input).
If length is greater than 8, then remove the leading length-8 characters in each component, and let length be 8.
While length is greater than two and the first character in each component is a U+0030 DIGIT ZERO (0) character, remove that character and reduce length by one.
If length is still greater than two, truncate each component, leaving only the first two characters in each.
Let result be a simple color.
Interpret the first component as a hexadecimal number; let the red component of result be the resulting number.
Interpret the second component as a hexadecimal number; let the green component of result be the resulting number.
Interpret the third component as a hexadecimal number; let the blue component of result be the resulting number.
Return result.
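If you find the prose hard to follow, here is a rough Python transcription of the steps above. Treat it as a sketch under stated assumptions, not as the spec's own code: the function name parse_legacy_color is invented, an error is represented by None, a simple color by an (r, g, b) tuple, and NAMED_COLORS is a tiny stand-in for the real keyword tables in the CSS3 Color specification. Python strings are sequences of code points, which is how "characters" is interpreted here.

```python
import string

HEX_DIGITS = set(string.hexdigits)  # 0-9, a-f, A-F
_ASCII_LOWER = str.maketrans(string.ascii_uppercase, string.ascii_lowercase)

# Hypothetical three-entry excerpt; the real keyword tables are much longer.
NAMED_COLORS = {"red": (255, 0, 0), "white": (255, 255, 255), "black": (0, 0, 0)}


def parse_legacy_color(value):
    """Return an (r, g, b) tuple, or None where the algorithm returns an error."""
    if value == "":
        return None
    lowered = value.translate(_ASCII_LOWER)  # ASCII case-insensitive matching
    if lowered == "transparent":
        return None
    if lowered in NAMED_COLORS:
        return NAMED_COLORS[lowered]
    # Three-digit shorthand such as "#f00": each digit is multiplied by 17.
    if len(value) == 4 and value[0] == "#" and all(c in HEX_DIGITS for c in value[1:]):
        return tuple(int(c, 16) * 17 for c in value[1:])
    # Characters outside the basic multilingual plane become "00".
    value = "".join("00" if ord(c) > 0xFFFF else c for c in value)
    value = value[:128]
    if value.startswith("#"):
        value = value[1:]
    # Every remaining non-hex character becomes "0".
    value = "".join(c if c in HEX_DIGITS else "0" for c in value)
    while len(value) == 0 or len(value) % 3 != 0:
        value += "0"
    length = len(value) // 3
    parts = [value[i * length:(i + 1) * length] for i in range(3)]
    if length > 8:
        parts = [p[length - 8:] for p in parts]
        length = 8
    # Strip leading zeros only while every component starts with one.
    while length > 2 and all(p[0] == "0" for p in parts):
        parts = [p[1:] for p in parts]
        length -= 1
    if length > 2:
        parts = [p[:2] for p in parts]
    return tuple(int(p, 16) for p in parts)
```

Feeding it a string with no business being a color, such as "chucknorris", shows the error recovery at work: the non-hex characters become zeros, the string is padded to twelve characters, and the three components are truncated to "c0", "00", and "00".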
Information on exactly which attributes are subject to this algorithm is scattered throughout the spec. Here is the complete list:
<font color>
<frame bordercolor>
<frameset bordercolor>
<hr color>
<table bgcolor>
<thead bgcolor>
<tfoot bgcolor>
<tbody bgcolor>
<tr bgcolor>
<td bgcolor>
<th bgcolor>
<body text>
<body link>
<body vlink>
<body alink>
<body bgcolor>
The other big news today is the addition of a section on matching HTML elements using selectors. Some of these (:link, :visited, :active) will be familiar to anyone who has written a CSS stylesheet, but there are a number of new selectors that correspond to concepts introduced in HTML 5.
- :link and :visited match hyperlinks (<a>, <area>, and <link> elements with an href attribute).
- :active matches certain elements while they are being activated, like a button between mousedown and mouseup (or keydown and keyup)
- :enabled and :disabled match hyperlinks and certain other elements that can be disabled, like form fields
- :checked matches checkboxes and radio buttons
- :indeterminate matches checkboxes in the indeterminate state
- :default matches default buttons in forms
- :valid and :invalid match form fields that have constraints
- :in-range and :out-of-range match form fields that have range-based constraints (i.e. they can either overflow or underflow)
- :required and :optional match certain form fields
- :read-write matches editable form fields and other editable elements, and :read-only matches any element that is not read-write
Other interesting changes of the day:
Discussion of the day: What's the problem? "Reuse of 1998 XHTML namespace is potentially misleading/wrong". Take it away, Lachlan:
I believe the issue is that the XHTML2 WG think they have change
control over that namespace URI and that we shouldn't be using it.
Additionally, the latest XHTML 2 editor's draft is now using the
namespace.
This issue has been discussed in depth around mid 2007. The problem
is that XHTML5 and XHTML2 are completely incompatible with each other
and they cannot possibly use the same namespace as each other.
But XHTML2 also has several major incompatibilities with XHTML1, which
would effectively make it impossible to implement both XHTML 1.x and 2
in the same implementation, if they share the same namespace. XHTML
5, on the other hand, has not only been designed with compatibility in
mind, success is dependent upon continuing to use the same namespace.
Basically, the only solution to this issue that should be considered is
that we continue using the namespace and the XHTML2 WG use a different
namespace.
I'm sure that will go over well with the 12 people who are still working on XHTML 2.
Tune in... well, sometime soon-ish for another exciting episode of "This Week Day In HTML 5."
Tuesday, February 10th, 2009
Welcome back to "This Week in HTML 5," where I'll try to summarize the major activity in the ongoing standards process in the WHATWG and W3C HTML Working Group.
The big news this week is more major work on the non-normative section on rendering HTML documents, including a lot of reverse-engineered documentation of legacy (invalid) attributes that users expect browsers to support.
- r2749: marginwidth and marginheight attributes on the <body> element
- r2750: hspace and vspace attributes on <table>
- r2751: the bgcolor attribute
- r2752: the <font> element
- r2753: the frames and rules attributes of <table>
- r2757: embedded content such as <audio>, <video>, <embed>, <iframe>, and <canvas>
- r2759: laying out a group of <frame>s within a <frameset>
- r2760: the <br> element
- r2761: default margins on <h1>, <h2>, <h3>, <h4>, <h5>, <h6>, and <figure>
- r2762: <bb>, <button>, and <details> elements
- r2763: the <hr> element (this change in particular has some WHATWG members very excited)
- r2764: the <fieldset> element
- r2765: <input type=text>
- r2766: <input type=date>, <input type=range>, and <input type=color>
- r2767: <input type=checkbox>, <input type=radio>, <input type=file>, <input type=submit>, <input type=reset>, and <input type=button>
- r2768: <select>, <progress>, and <meter>
- r2769: <textarea>
- r2770: <mark>
- r2772: printing HTML documents
- r2773: <link> elements
In addition, one major section was dropped from HTML 5 this week: an algorithm for determining what object is under the cursor (presuming, of course, that the cursor is within the region of the screen which contains an HTML document, and the current context has a screen, and the current context has a cursor). Ian Hickson has announced on www-style that, in accordance with that group's consensus, the algorithm would be better maintained in a future CSS specification.
Around the web:
Tune in next week for another exciting episode of "This Week in HTML 5."
Tuesday, February 3rd, 2009
Welcome back to "This Week in HTML 5," where I'll try to summarize the major activity in the ongoing standards process in the WHATWG and W3C HTML Working Group.
The big news this week is the beginning of the non-normative section on rendering HTML documents. For those of you not up on spec-writing lingo, "non-normative" means "you can ignore this and still claim to be in compliance with the specification." It's advice, not commands. On the other hand, it's generally useful advice, so ignoring it completely is probably not in your best interests.
Currently, the rendering section includes advice on
- Hidden elements. Things like <script> should always be hidden (in the sense that they should be executed, not have their source displayed in the page). Likewise, <meta>, <link>, <style>, and so on.
- Display types. Which elements should be rendered as block-level elements, which as tables, which as list items, and so on.
- Margins and padding. Default values for different elements, and also for the same element in different contexts (nested within other elements).
- Alignment. Table headers and captions are centered by default; <table align=left> is treated like float:left; etc.
- Fonts and colors. By default, links are blue, visited links are purple, and <code> is rendered in a monospace font.
- Punctuation and decorations. Links are underlined by default, acronyms are dotted-underlined, and <blink>, well, blinks.
- Resetting rules for inherited properties. Tables reset certain text properties; in quirks mode, they reset even more.
Scrolling through the rest of the (mostly empty) rendering section shows lots of potential for future advice on form controls, data grids, favicons, and even the <marquee> element.
Rendering-related revisions: r2734, r2735, r2736, r2737, r2738.
Switching back to the normative parts of the spec, we have r2720, which makes the outerHTML property and the insertAdjacentHTML() method work in XHTML. For the purposes of this discussion (indeed, for the purposes of the entire HTML 5 specification), "XHTML" means "content served with a Content-Type: application/xhtml+xml". In addition, the section The XHTML Syntax has been entirely reorganized and rewritten to consolidate the rules for parsing and serializing XHTML documents and fragments. [Background: Re: outerHTML/insertAdjacentHTML in XML mode]
Other interesting tidbits this week:
- r2712 mandates that browsers ignore any extraneous text on the first line of an application cache manifest file (after the file signature "CACHE MANIFEST"), to accommodate hard-core web authors who edit their manifest files manually in Emacs and want to include mode lines on the first line of the file.
- r2719 specifies that browsers should not allow scripts to set document.domain to anything on the Public Suffix List, such as "com" or "co.jp". Essential background reading on why this is dangerous: Untraceable XSS Attacks. Most browsers already block this attack, e.g. Firefox since 3.0. [Background: Re: Setting document.domain]
- r2711 addresses some security issues surrounding scripts that open windows with an address of about:blank.
- r2731 requires that floats be serialized using exponential notation, e.g. 1e+0. [Background: Floating point number feedback]
- r2725 is another chapter in a long and mostly boring saga surrounding the concept of a "legacy DOCTYPE." The official DOCTYPE of HTML 5 is simply <!DOCTYPE HTML> -- so simple, in fact, that some tools cannot generate it. Bug 54 tracks the issue to the point of obsession; I won't go into details here, but the issue has been bounced around since at least June 2008. I doubt this will be the last we hear about legacy DOCTYPEs. [More background: ISSUE-54: <!DOCTYPE HTML SYSTEM "about:legacy-compat">]
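To make the document.domain restriction from r2719 a little more concrete, here is a minimal Python sketch of the public-suffix check only. The function name is invented, PUBLIC_SUFFIXES is a two-entry excerpt taken from the examples above rather than the real list, and every other rule a browser applies when document.domain is set is deliberately left out.

```python
# Hypothetical two-entry excerpt; the real Public Suffix List has thousands of entries.
PUBLIC_SUFFIXES = frozenset({"com", "co.jp"})


def may_set_document_domain(new_value, public_suffixes=PUBLIC_SUFFIXES):
    """Return False if new_value is itself a public suffix such as 'com'."""
    # Normalize case and a trailing dot before comparing (an assumption
    # about how a browser would canonicalize the value).
    candidate = new_value.strip().lower().rstrip(".")
    return candidate not in public_suffixes
```

The point of the check is that a page on evil.example.com should never be able to declare its domain to be plain "com" and thereby share an origin with every other .com site.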
Tune in next week for another exciting episode of "This Week in HTML 5."
Friday, January 30th, 2009
Welcome back to "This Week in HTML 5," where I'll try to summarize the major activity in the ongoing standards process in the WHATWG and W3C HTML Working Group. Despite it being almost February already, this episode will focus on changes and discussion from the week of January 19th. Normal weekly updates will resume on Monday.
There are 3 pieces of big news for the week of January 19th. Big news #1: r2692, a major revamp of the way application caches are defined. Application caches are the heart of the offline web model which can be used to allow script-heavy web applications like Gmail to work even after you disconnect from the internet. Here is the new definition of how application caches work:
Each application cache has a completeness flag, which is either complete or incomplete.
An application cache group is a group of application caches, identified by the absolute URL of a resource manifest which is used to populate the caches in the group.
An application cache is newer than another if it was created after the other (in other words, application caches in an application cache group have a chronological order).
Only the newest application cache in an application cache group can have its completeness flag set to incomplete, the others are always all complete.
Each application cache group has an update status, which is one of the following: idle, checking, downloading.
A relevant application cache is an application cache that is the newest in its group to be complete.
Each application cache group has a list of pending master entries. Each entry in this list consists of a resource and a corresponding Document object. It is used during the update process to ensure that new master entries are cached.
An application cache group can be marked as obsolete, meaning that it must be ignored when looking at what application cache groups exist.
A Document initially is not associated with an application cache, but steps in the parser and in the navigation sections cause cache selection to occur early in the page load process.
Multiple application caches in different application cache groups can contain the same resource, e.g. if the manifests all reference that resource.
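The definitions above read a lot like a data model, so here is one way they might be sketched in Python. None of these class or field names come from the spec; in particular, the integer created counter is just a convenient stand-in for "was created after the other".

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ApplicationCache:
    created: int            # hypothetical creation counter; larger means newer
    complete: bool = False  # the completeness flag


@dataclass
class ApplicationCacheGroup:
    manifest_url: str                 # absolute URL identifying the group
    update_status: str = "idle"       # one of: idle, checking, downloading
    obsolete: bool = False
    caches: List[ApplicationCache] = field(default_factory=list)

    def relevant_cache(self) -> Optional[ApplicationCache]:
        """Return the newest cache in the group that is complete, if any."""
        complete = [c for c in self.caches if c.complete]
        return max(complete, key=lambda c: c.created, default=None)
```

A group whose newest cache is still downloading (and therefore incomplete) keeps serving from the previous complete cache, which is exactly what relevant_cache() picks out.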
The end result of this major work is actually pretty similar to how application caches worked before, but there were some edge cases (such as handling 404 errors when fetching the application manifest) which are now handled in a sane fashion. It also paved the way for r2693, which makes it possible for application caches to become "obsolete" (meaning they must be ignored when deciding which caches exist).
Big news #2: r2684, which redefines the on* attributes in a way that doesn't suck quite as much. Also, it defines the widely used (but poorly understood) onerror attribute in a way that matches what browsers actually do with it. Here is the meat of it:
All event handler attributes on an element, whether set to null or to a Function object, must be registered as event listeners on the element, as if the addEventListenerNS() method on the Element object's EventTarget interface had been invoked when the event handler attribute's element or object was created, with the event type (type argument) equal to the type described for the event handler attribute in the list above, the namespace (namespaceURI argument) set to null, the listener set to be a target and bubbling phase listener (useCapture argument set to false), the event group set to the default group (evtGroup argument set to null), and the event listener itself (listener argument) set to do nothing while the event handler attribute's value is not a Function object, and set to invoke the call() callback of the Function object associated with the event handler attribute otherwise.
The listener argument is emphatically not the event handler attribute itself.
When an event handler attribute's Function object is invoked, its call() callback must be invoked with one argument, set to the Event object of the event in question.
The handler's return value must then be processed as follows:
- If the event type is mouseover
  If the return value is a boolean with the value true, then the event must be canceled.
- If the event object is a BeforeUnloadEvent object
  If the return value is a string, and the event object's returnValue attribute's value is the empty string, then set the returnValue attribute's value to the return value.
- Otherwise
  If the return value is a boolean with the value false, then the event must be canceled.
The Function interface represents a function in the scripting language being used. It is represented in IDL as follows:
[Callback=FunctionOnly, NoInterfaceObject]
interface Function {
any call([Variadic] in any arguments);
};
The call(...) method is the object's callback.
In JavaScript, any Function object implements this interface.
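The three-way rule for processing a handler's return value is the part people tend to get wrong (mouseover really is backwards), so here is a small Python model of it. The Event and BeforeUnloadEvent classes and the process_handler_return function are simplified stand-ins invented for this sketch; return_value plays the role of the returnValue attribute.

```python
class Event:
    def __init__(self, type_):
        self.type = type_
        self.canceled = False


class BeforeUnloadEvent(Event):
    def __init__(self):
        super().__init__("beforeunload")
        self.return_value = ""  # stands in for the returnValue attribute


def process_handler_return(event, result):
    """Apply the quoted return-value rules to event and return it."""
    if event.type == "mouseover":
        # mouseover is the odd one out: returning true cancels the event.
        if result is True:
            event.canceled = True
    elif isinstance(event, BeforeUnloadEvent):
        # A string result is kept only if returnValue is still empty.
        if isinstance(result, str) and event.return_value == "":
            event.return_value = result
    elif result is False:
        # Everything else: returning false cancels the event.
        event.canceled = True
    return event
```

So a click handler that returns false cancels the click, while a mouseover handler that returns false does nothing at all.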
Big news #3: r2685 and r2686 define a whole slew of important events that are fired on the Window object, including onbeforeunload, onerror, and onload. Previously, some of these were defined on the <body> element, which didn't actually match current browser behavior.
The following are the event handler attributes that must be supported by Window objects, as DOM attributes on the Window object, and with corresponding content attributes and DOM attributes exposed on the body element:
- onbeforeunload
  Must be invoked whenever a beforeunload event is targeted at or bubbles through the element or object.
- onerror
  Must be invoked whenever an error event is targeted at or bubbles through the object.
  Unlike other event handler attributes, the onerror event handler attribute can have any value. The initial value of onerror must be undefined.
  The onerror handler is also used for reporting script errors.
- onhashchange
  Must be invoked whenever a hashchange event is targeted at or bubbles through the object.
- onload
  Must be invoked whenever a load event is targeted at or bubbles through the object.
- onmessage
  Must be invoked whenever a message event is targeted at or bubbles through the object.
- onoffline
  Must be invoked whenever an offline event is targeted at or bubbles through the object.
- ononline
  Must be invoked whenever an online event is targeted at or bubbles through the object.
- onresize
  Must be invoked whenever a resize event is targeted at or bubbles through the object.
- onstorage
  Must be invoked whenever a storage event is targeted at or bubbles through the object.
- onunload
  Must be invoked whenever an unload event is targeted at or bubbles through the object.
Other interesting tidbits from the week of January 19th:
- r2683 defines the concept of an override URL in order to prevent javascript: URLs (which you should never, ever use) from breaking through the cross-domain origin security policy.
- r2697 provides an algorithm for determining the character encoding of an external script referenced by a <script> element.
- r2698 clarifies that rel attributes are case-insensitive.
- r2703 tweaks the parsing algorithm for misplaced <frameset> elements to be more compatible with Internet Explorer.
Tune in next week for another exciting episode of "This Week in HTML 5."