The WHATWG Blog

Archive for the ‘Forms’ Category

Newline normalizations in form submission

Thursday, May 27th, 2021

If you work with form submissions, you might have noticed that form values containing newlines are normalized to CRLF, no matter whether the DOM value had LF or CR instead:

<form action="./post" method="post" enctype="application/x-www-form-urlencoded">
  <input type="hidden" name="hidden" value="a&#x0D;b&#x0A;c&#x0D;&#x0A;d" />
  <input type="submit" />
</form>

<script>
  // Checking that the DOM has the correct newlines and the normalization
  // happens during the form submission.
  const hiddenInput = document.querySelector("input[type=hidden]");
  console.log("%s", JSON.stringify(hiddenInput.value)); // "a\rb\nc\r\nd"
</script>

hidden=a%0D%0Ab%0D%0Ac%0D%0Ad

But although it might seem simple on the surface, newline normalization in form submissions is a topic that runs deeper than I thought, with bugs in the spec and differences across browsers. This post goes through what the spec used to do, what browsers (used to) implement, and how we went about fixing it.

First, some background on form submission

The data to submit from a form is modeled as an entry list – entries being pairs of names (strings) and values (either strings or File objects). This is a list rather than a map because a form can have multiple values for each name – which is is how <input type="file" multiple> and <select multiple> work – and their relative order with the rest of form entries matters.

The algorithm that does the job of going through every submittable element associated with a particular form and collecting their corresponding form entries is the "construct the entry list" algorithm. This algorithm does what you'd expect – discard disabled controls and buttons not pressed, and then for each control it calls the "append an entry" algorithm, which used to replace any newlines in the entry name and value (if that value is a string) with CRLF before appending the entry.

"Construct the entry list" is called early into the form submission algorithm, and the resulting entry list is then passed to the encoding algorithm for the corresponding enctype. Since only the multipart/form-data enctype supports uploading files, the algorithms for both application/x-www-form-urlencoded and text/plain encode the value's filename instead.

First signs of trouble

My first foray into the encoding of form payloads was in defining precisely how entry names (and filenames of file entry values) had to be escaped in multipart/form-data payloads, and since LF and CR have to be percent escaped, newlines came up during testing.

One thing I noticed is that, if you have newlines inside a filename – yes, that is something you can do – they're normalized differently than for an entry name or a string value.

<form id="form" action="./post" method="post" enctype="multipart/form-data">
  <input type="hidden" name="hidden a&#x0D;b" value="a&#x0D;b" />
  <input id="fileControl" type="file" name="file a&#x0D;b" />
</form>

<script>
  // A file with filename "a\rb", empty contents, and "application/octet-stream"
  // MIME type.
  const file = new File([], "a\rb");

  const dataTransfer = new DataTransfer();
  dataTransfer.items.add(file);
  document.getElementById("fileControl").files = dataTransfer.files;

  document.getElementById("form").submit();
</script>

Here is the resulting multipart/form-data payload in Chrome and Safari (newlines are always CRLF):

------WebKitFormBoundaryjlUA0jn3NUYxIh2A
Content-Disposition: form-data; name="hidden a%0D%0Ab"

a
b
------WebKitFormBoundaryjlUA0jn3NUYxIh2A
Content-Disposition: form-data; name="file a%0D%0Ab"; filename="a%0Db"
Content-Type: application/octet-stream


------WebKitFormBoundaryjlUA0jn3NUYxIh2A--

And this is in Firefox 88 (the current stable version as of this writing):

-----------------------------26303884030012461673680556885
Content-Disposition: form-data; name="hidden a b"

a
b
-----------------------------26303884030012461673680556885
Content-Disposition: form-data; name="file a b"; filename="a b"
Content-Type: application/octet-stream


-----------------------------26303884030012461673680556885--

As you can see, Firefox substitutes a space for any newlines (CR, LF or CRLF) in the multipart/form-data encoding of entry names and filenames, rather than percent-encoding them as do Chrome and Safari. This behavior was made illegal in the spec in pull request #6282, but it couldn't be fixed in Firefox until the spec decided on a normalization behavior. In the case of values, Firefox normalizes to CRLF as the other browsers do.

As for Chrome and Safari, here we see that newlines in entry names and string values are normalized to CRLF, but filenames are not normalized. From the entry list construction algorithm as described above, this makes sense because entry values are only normalized to CRLF when they are strings – files are unchanged, and so are their filenames.

Except that, if you change the form's enctype in the above example to application/x-www-form-urlencoded, you get this in every browser:

hidden+a%0D%0Ab=a%0D%0Ab&file+a%0D%0Ab=a%0D%0Ab

Since multipart/form-data is the only enctype that allows file uploads, other enctypes use their filenames instead. But here it seems like every browser is CRLF-normalizing the filenames, even though in the spec that substitution happens long after constructing the entry list.

Normalizations with `FormData` and `fetch`

The FormData class started out as a way to send multipart/form-data form payloads through the XMLHttpRequest and fetch APIs without having to generate that payload in JavaScript. As such, FormData instances are basically a JS-accessible wrapper over an entry list.

So let's try the same with FormData:

const formData = new FormData();
formData.append("hidden a\rb", "a\rb");
formData.append("file a\rb", new File([], "a\rb"));

// FormData objects in fetch will always be serialized as multipart/form-data.
await fetch("./post", { method: "POST", body: formData });

Safari sends the same form payload as above, with names and values normalized to CRLF, and so does Firefox 88 with values normalized to CRLF (and names and values having their newlines escaped as spaces). But Chrome keeps names, filenames and values unnormalized (here the ? character stands for CR):

------WebKitFormBoundarySMGkMfD8mVOnmGDP
Content-Disposition: form-data; name="hidden a%0Db"

a?b
------WebKitFormBoundarySMGkMfD8mVOnmGDP
Content-Disposition: form-data; name="file a%0Db"; filename="a%0Db"
Content-Type: application/octet-stream


------WebKitFormBoundarySMGkMfD8mVOnmGDP--

Since FormData is just a wrapper over an entry list, and fetch simply calls the multipart/form-data encoding algorithm, no normalizations should take place. So it looks like Chrome was following the spec here, while Firefox and Safari were apparently doing some newline normalization (for Firefox, on string values only) at the time of serializing as multipart/form-data.

With FormData you can also investigate what the "construct the entry list" algorithm does, since if you pass a <form> element to the FormData constructor, it will call that algorithm outside of a form submission context, and let you inspect the resulting entry list.

<form id="form">
  <input type="hidden" name="a&#x0D;b" value="a&#x0D;b" />
</form>

<script>
  const formData = new FormData(document.getElementById("form"));
  for (const [name, value] of formData.entries()) {
    console.log("%s %s", JSON.stringify(name), JSON.stringify(value));
  }
  // Firefox and Safari print: "a\rb" "a\rb"
  // Chrome prints: "a\r\nb" "a\r\nb"
  // These results don't depend on the form's enctype.
</script>

So it seems like Firefox and Safari are not normalizing as they construct the entry list, and instead normalize names and values at the time that they encode the form into an enctype. In particular, since the application/x-www-form-urlencoded and text/plain enctypes don't allow file uploads, file entry values are substituted with their filenames before the normalization. Entry lists that aren't created from the "construct an entry list" algorithm get normalized all the same.

Chrome instead follows the specification (as it used to be) in normalizing in "construct an entry list" and not normalizing later, even for entry lists created through other means. But that doesn't explain why filenames in the application/x-www-form-urlencoded and text/plain enctypes are normalized. Does Chrome also have an additional normalization layer?

Investigating late normalization with the `formdata` event

It would be great to investigate in more detail what Chrome and other browsers do after constructing the entry list. Since the entry list construction already normalizes entries, any further normalizations that might happen further down the line are obscured in the common case.

In the case of multipart/form-data, we can test this because using a FormData object with fetch doesn't invoke "construct an entry list", and so can see what happens to unnormalized entries. For other enctypes there is no way to create an entry list that doesn't go through "construct an entry list", but as it turns out, the "construct an entry list" algorithm itself offers two ways to end up with unnormalized entries: form-associated custom elements (only implemented in Chrome so far) and the formdata event (implemented in Chrome and Firefox). Here we'll only be covering the latter, since their results are equivalent.

One thing I skipped when I covered the "construct an entry list" algorithm above is that, at the end of the algorithm, after all entries corresponding to controls have been added to the entry list, a formdata event is fired on the relevant <form> element. This event has a formData attribute which allows you not only to inspect the entry list at that point, but to modify it.

<form
  id="form"
  action="./post"
  method="post"
  enctype="application/x-www-form-urlencoded"
>
  <!-- Empty -->
</form>

<script>
  const form = document.getElementById("form");
  form.addEventListener("formdata", (evt) => {
    evt.formData.append("string a\rb", "a\rb");
    evt.formData.append("file a\rb", new File([], "a\rb"));
  });
  form.submit();
</script>

For both Chrome and Firefox (not Safari because it doesn't support the formdata event), trying this with the application/x-www-form-urlencoded enctype gets you a normalized result:

string+a%0D%0Ab=a%0D%0Ab&file+a%0D%0Ab=a%0D%0Ab

Firefox shows the same normalizations for the text/plain enctype; Chrome instead normalizes only filenames, not names and values. And with multipart/form-data we get the same result as with fetch and FormData above: Chrome doesn't normalize anything, Firefox normalizes string values (with names and filenames being replaced with spaces).

So in short:

For application/x-www-form-urlencoded, all browsers perform an additional newline normalization at the moment of serializing the form payload, whether or not they normalize when constructing the entry list. Note that newline normalizations are idempotent, so normalizing an already normalized string doesn't change it.
For text/plain, Firefox and Safari seem to act just like for application/x-www-form-urlencoded. Chrome instead only normalizes filenames, probably at the same time as files are being substituted with their filenames.
For multipart/form-data, Chrome doesn't normalize anything. Safari instead normalizes names and string values, but not filenames. Firefox does the same as Safari for values, but replaces any newlines in names and filenames with a space.

Remember that these differences across browsers don't really affect the encoding of normal forms, they only matter if you're using FormData with fetch, the formdata event, or form-associated custom elements.

Fixing the spec

So which behavior do we choose? Firefox replacing newlines in multipart/form-data names and values with a space is illegal as per PR #6282, but anything else is fair game.

For text/plain, we have Firefox and Safari behaving in the same way, and Chrome disagreeing. Since text/plain cannot represent inputs unambiguously, is little used in any case, and you would need either form-associated custom elements or to use the formdata event to see a difference, it seems extremely unlikely that there is web content that depends on either case. So it makes more sense to treat text/plain just like application/x-www-form-urlencoded and normalize names, filenames and values.

For multipart/form-data, there is the added compatibility risk that you can observe this case by using FormData and fetch, so it's more likely to cause webcompat issues, no matter whether we went with Safari's or Chrome's behavior. In the end we choose to go with Safari's, in order to be consistent at normalizing all enctypes – although the multipart/form-data has to be different in not normalizing filenames, of course.

So in pull request #6287 we fixed this by:

Adding a new algorithm that runs before the application/x-www-form-urlencoded and text/plain serializers. This algorithm first extracts a filename from the entry value, if it's a file, and then CRLF-normalizes both the name and value.
We changed the multipart/form-data encoding algorithm to have a first step that CRLF-normalizes names and string values, leaving file values intact.

Do we need the early normalization though?

At the time that we decided the above changes to the spec, I still thought that the spec's (and Chrome's) behavior of normalizing in "construct the entry list" was correct. But later on I realized that, once we have the late normalizations mentioned above, the early normalization in "construct the entry list" doesn't matter for form submission, since the late normalizations do everything the early one can do and more. The only way you could observe whether that early normalization is present or not is through the FormData constructor. So it would make sense to remove that early normalization and standardize on Firefox and Safari's behavior here, as we did in pull request #6624.

One kink remained, though: <textarea> elements support the wrap="hard" attribute, which adds linebreaks into the submitted value corresponding to how the text is linewrapped in the UI. In the spec, this is done through the "textarea wrapping transformation", which takes the textarea's "raw value", normalizes it to CRLF, and when wrap="hard", adds CRLF newlines to wrap the contents. But if you test this on Safari, all newlines (both normalized and added) are LF – and Firefox currently doesn't implement wrap="hard", but it does normalize newlines to LF. So should this be changed?

I thought it was better to align on Firefox's and Safari's behavior, especially since this could simplify the mess that is the difference between "raw value", "API value" and "value" for <textarea> in the spec. Chrome disagreed at first – but as it turns out, Chrome's implementation of the textarea wrapping transformation normalizes to LF, and it's only the normalization in "construct an entry list" that normalizes those newlines to CRLF. So Chrome would align with Firefox and Safari in that area by just removing the early normalization.

Pull request #6697 fixes this issue with <textarea> in the spec, and the follow-up issue #6662 will take care of simplifying the textarea wrapping transformation in the spec, as well as ensure that it matches implementations.

Implementation fixes

After those three pull requests, the current spec mandates that no normalization happen in "construct the entry list", and that the entry lists to be encoded with some enctype go through a transformation that depends on the enctype:

For application/x-www-form-urlencoded and text/plain, this transformation first replaces file values with their filenames, and then CRLF-normalizes names and values.
For multipart/form-data, this transformation CRLF-normalizes names and string values, but not filenames.

Safari implements the current spec behavior, and so doesn't need any fixes as a result of these spec changes. However, when working on them I noticed that Safari had a preexisting bug in that it wasn't converting entry names and string values into scalar value strings in the "construct an entry list" algorithm, which could lead to FormData objects containing DOMString values despite the WebIDL declaration. What's more, those lone surrogates would remain there until the time of serializing the form, and might show up in the form payload as WTF-8 surrogate byte sequences. This is bug 225299.

Firefox acts much like Safari, except in that it escapes newlines in multipart/form-data names and filenames as spaces rather than CRLF-normalizing in the case of names and then percent-encoding. With the normalization now defined in the spec, this could now be changed. This is bug 1686765. I additionally also found the same bug as Safari with not converting strings into scalar value strings, except in this case the normalization did take place when encoding (bug 1709066). Both issues are now fixed in Firefox Nightly and will ship on Firefox 90.

Finally, Chrome would need to remove the normalization it performs in "construct an entry list", update the normalization it performs on the text/plain enctype to cover not only filenames but also names and values, and add an additional normalization for multipart/form-data names and values. This is covered in issue 1167095. Since Chrome is the browser that needs the most changes as a result of these spec changes, and they are understandably concerned with backwards compatibility, these fixes are expected to ship behind a flag so they can be quickly rolled back in case of trouble.

Conclusion

Although form submission is not necessarily seen as an easy topic, newline normalization in that context might not seem like such a big deal from the outside. But in a platform like the web, where spec and implementation decisions can easily snowball into compatibility concerns, even things that might look simple might have a history that makes standardizing on one behavior hard.

Posted in Browsers, Forms, WHATWG | Comments Off on Newline normalizations in form submission

Focusing on focus

Wednesday, October 16th, 2019

Focus behavior in HTML had been under-specified for the past few years, and it was also quite confusing due to a variety of subtle differences between focusing methods, UA-specific behaviors, relation to the tabindex attribute, relations to shadow DOM, etc.

A few months ago Domenic filed a meta-bug that contains a list of things to fix so that we would have a good foundation for further additions to focus-related stuff, and to hopefully make focus in HTML make sense to browser engineers and web authors!

Types of focus

You may know that you can focus on stuff by clicking on them, tabbing, or calling focus() on it — but did you know that things might be focusable with one method but not with others? Domenic and Mu-An made a giant interactive list of HTML elements to showcase how they react to different methods of focusing, and can be tested from various browsers to show the difference between them.

To reflect the varying behaviors of focus in various UAs in the spec, we classified focusability into three types: “programmatically focusable”, “click focusable”, and “sequentially focusable”.

If an element is programmatically focusable, it will get focused when calling the focus() method on it or when putting an autofocus attribute on the element. In all platforms, all elements that are either click focusable or sequentially focusable are also programmatically focusable, so “programmatically focusable” is interchangeable with “focusable”.

If an element is click focusable, the element will get focused when it’s clicked. This has the same set of elements as “programmatically focusable” in most UAs/platforms. A notable exception is Safari where non-editable form controls (checkboxes, etc.) are not click focusable by default.

If an element is sequentially focusable, the element can be focused through “tabbing” — in most UAs this means pressing Tab/Shift+Tab (and in Safari, Option+Tab too!).

Previously the spec didn’t clearly differentiate “programmatically focusable” and “sequentially focusable”, didn’t even mention “click focusable”, and used the “tabindex focus flag” concept which was slightly confusing due to its relations with the tabindex attribute. So we updated it in this PR.

`tabindex`

As you may know already, built-in elements like <button>, <input>, <a>, etc. all have a “default” focus behavior — they are focusable (or not) by default. When the tabIndex property getter is run on an element whose tabindex attribute is not explicitly set already, it will return 0 sometimes and -1 other times. Previously, the spec said to return 0 if the element is “focusable” by default (which type of focusable?), and -1 otherwise. But this wasn’t implemented anywhere, because of possible differences in which elements are focusable by default in various UAs. So now the spec actually checks if the tag name is one of the tag names included in a pre-defined list. This is quite awkward, but at least it’s interoperable! (PR is here.)

Now, what is the use of tabindex exactly? You can make an element focusable by setting its tabindex attribute to an integer. This will set the tabindex value, and thus impact the focusability of the element. If the integer is non-negative, the element is also sequentially focusable. You can’t, however, make an element not focusable at all through this attribute — there is no value you can set the tabindex attribute to on a <button> that will stop it from being focusable.

You can also modify the order of elements traversed with sequential navigation/tabbing — elements with a positive tabindex value will be traversed first, in ascending order (and in tree order in case of a tie), and then elements with a tabindex value of zero (or unspecified but the element is sequentially focusable by default), in tree order.

`autofocus`

The autofocus content attribute is useful if you want to set focus on a form control element on page load. However, it had some issues such as:

The autofocus attribute was available only for some form control elements, and unavailable for other focusable elements like <summary>, <div contenteditable="true">, and <span tabindex="0">.
The behavior with an autofocused element in a page accessed with a fragment identifier (e.g. https://example.com/#foo) was not interoperable.
What should happen if there are multiple autofocused elements, and one of them is not focusable?

The HTML specification and the SVG specification were updated to resolve these issues by changes 1, 2, and 3. Now the autofocus content attribute and the autofocus IDL attribute are available on all HTML and SVG elements. The autofocus content attribute is not processed if its document has a fragment identifier, or has a focused element. If an autofocus element is not focusable, the element is skipped and another autofocus element is handled.

Shadow DOM and `delegatesFocus`

All of the previously mentioned concepts were already specced in the HTML spec in some way, albeit a bit unclear, not reflecting the actual behavior, etc. Focus behavior with shadow DOM, though, had not been upstreamed from the old Shadow DOM spec at all, so this part of the effort took the most time overall. Since some parts of the Shadow DOM spec on focus were unclear and needed more explanation, we also took a look at how it’s implemented in Blink to get the exact behavior down, and discussed with other browser vendors and web developers on whether we want to keep the implemented behavior or not in some cases.

Sequential focus navigation got some significant additions (PR) with Shadow DOM. We now have a concept of “tabindex-ordered focus navigation scope”, “focus navigation scope owners”, etc. Essentially, elements are put in different focus navigation scopes, where those that belong to the same shadow tree/root, or are slotted to the same slot, are put in the same focus navigation scope. The tabindex order explained earlier now only applies to elements in the same scope, and then finally we flatten all of the scopes to get the final sequential focus navigation order.

A new concept added (PR 1, 2) by Shadow DOM is “delegates focus”, which is used when you want attempts to focus on a shadow host to not focus on the host itself, and instead delegate the focus to within the shadow tree (like <input type="date">!). In Blink, this delegation uses the sequential focus navigation order, but we think it is a bit weird and started a discussion on how this should actually work — finally changing the delegation to respect the protocol of whatever the focusing method is originally used. (That is, use sequential order if we used tabbing, otherwise use flat-tree order and respect click focusability if needed.)

We also added the DocumentOrShadowRoot’s activeElement property (PR). And we updated the :focus selector (PR), which will now match on shadow hosts if the focused element is a shadow-including inclusive descendant of it. (This is actually different than the behavior in Blink, and is a result of discussion in TPAC — where we also discussed “delegates focus”.)

The future, and outro

Now that we have a good-enough spec for focus, we have a good foundation for future additions to focus. One new relevant proposal is “custom element default focusability”. As we’ve mentioned, built-in elements have a “default” focus behavior — even though they don’t have a tabindex value explicitly, they are still focusable (or not). When you’re making custom elements, though, currently there is no way to make them focusable by default, without setting the tabindex attribute. The proposal listed various ways this might be solved, and it was talked about in TPAC with a relatively positive response from various parties. Do check it out if you’re interested!

In summary, focus was and is still quite complex to understand. But, at least now there’s a clear source of truth for it, and the browser vendors are working to make it interoperable — implementing new changes to the spec as soon as they can. There are still lots of focus-related things that need to be specced (we’ve heard people mention focusin/focusout, more CSS selectors, etc.), so if you’re intrigued by this post, know that you can also contribute to fix more things like these!

Having an old and mostly confusing focus spec, different types of focus, and multiple uses for tabindex made things quite complicated when starting out. However, one of the trickiest parts of focus, in my opinion, is the fact that what is focusable/click focusable/sequentially focusable might differ in different UAs, and it might be dynamic as well! (E.g. in Safari what’s sequentially focusable changes if you hold down the Option key!) This means we need to make sure the spec is written to allow for differences, but is still, um, specific enough to make things not too ambiguous.

Overall, we’re happy with the result of this effort. We’d like to thank all the parties involved that participated in various ways: experimenting, reviewing spec PRs, implementing the changes, commenting/participating in discussions, etc. Special thanks to Kent Tamura (who wrote some parts of this post and did the autofocus specs) and Domenic Denicola (who kickstarted this whole effort, reviewed all the PRs, and suggested + reviewed this post).

Now we can focus on other parts of HTML Standard...

Posted in Browser API, DOM, Forms, Processing Model | 2 Comments »

The state of fieldset interoperability

Wednesday, September 19th, 2018

As part of my work at Bocoup, I recently started working with browser implementers to improve the state of fieldset, the 21 year old feature in HTML, that provides form accessibility benefits to assistive technologies like screen readers. It suffers from a number of interoperability bugs that make it difficult for web developers to use.

Here is an example form grouped with a <legend> caption in a <fieldset> element:

Pronouns He/him She/her They/them Write your own

And the corresponding markup for the above example.

<fieldset>
 <legend>Pronouns</legend>
 <label><input type=radio name=pronouns value=he> He/him</label>
 <label><input type=radio name=pronouns value=she> She/her</label>
 <label><input type=radio name=pronouns value=they> They/them</label>
 <label><input type=radio name=pronouns value=other> Write your own</label>
 <input type=text name=pronouns-other placeholder=&hellip;>
</fieldset>

The element is defined in the HTML standard, along with rendering rules in the Rendering section. Further developer documentation is available on MDN.

Usage

Based on a query of the HTTP Archive data set, containing the raw content of the top 1.3 million web pages, we find the relative usage of each HTML element. The fieldset element is used on 8.41% of the web pages, which is higher than other popular features, such as the video and canvas elements; however, the legend element is used on 2.46% of web pages, which is not ideal for assistive technologies. Meanwhile, the form element appears on 70.55% of pages, and we believe that if interoperability bugs were fixed, correct and semantic fieldset and legend use would increase, and have a positive impact on form accessibility for the web.

Fieldset standards history

In January 1997, HTML 3.2 introduces forms and some form controls, but does not include the fieldset or legend elements.

In July 1997, the first draft of HTML 4.0 introduces the fieldset and legend elements:

The FIELDSET element allows form designers to group thematically related controls together. Grouping controls makes it easier for users to understand their purpose while simultaneously facilitating tabbing navigation for visual user agents and speech navigation for speech-oriented user agents. The proper use of this element makes documents more accessible to people with disabilities.

The LEGEND element allows designers to assign a caption to a FIELDSET. The legend improves accessibility when the FIELDSET is rendered non-visually. When rendered visually, setting the align attribute on the LEGEND element aligns it with respect to the FIELDSET.

In December 1999, HTML 4.01 is published as a W3C Recommendation, without changing the definitions of the fieldset and legend elements.

In December 2003, Ian Hickson extends the fieldset element with the disabled and form attributes in the Proposed XHTML Module: XForms Basic, later renamed to Web Forms 2.0.

In September 2008, Ian Hickson adds the fieldset element to the HTML standard.

In February 2009, Ian Hickson specifies rendering rules for the fieldset element. The specification has since gone through some minor revisions, e.g., specifying that fieldset establishes a block formatting context in 2009 and adding min-width: min-content; in 2014.

In August 2018, I proposed a number of changes to the standard to better define how it should work, and resolve ambiguity between browser implementer interpretations.

Current state

As part of our work at Bocoup to improve the interoperability of the fieldset and legend child element, we talked to web developers and browser implementers, proposed changes to the standard, and wrote a lot of tests. At the time of this writing, 26 issues have been reported on the HTML specification for the fieldset element, and the tests that we wrote show a clear lack of interoperability among browser engines.

Of the 26 issues filed against the specification, 17 are about rendering interoperability. These rendering issues affect use cases such as making a fieldset scrollable, which currently result in broken scroll-rendering in some browsers. These issues also affect consistent legend rendering which is causing web developers avoid using the fieldset element altogether. Since the fieldset element is intended to help people who use assistive technologies to navigate forms, the current situation is less than ideal.

HTML spec rendering issues

In April of this year, Mozilla developers filed a meta-issue on the HTML specification “Need to spec fieldset layout” to address the ambiguities which have been leading to interoperability issues between browser implementations. During the past few weeks of work on fieldset, we made initial proposed changes to the rendering section of the HTML standard to address these 17 issues. At the time of this writing, these changes are under review.

Proposal to extend -webkit-appearance

Web developers also struggle with changing the default behaviors of fieldset and legend and seek ways to turn off the “magic” to have the elements render as normal elements. To address this, we created a proposal to extend the -webkit-appearance CSS property with a new value called fieldset and a new property called legend that are together capable giving grouped rendering behavior to regular elements, as well as resetting fieldset/legend elements to behave like normal elements.

fieldset {
  -webkit-appearance: none;
  margin: 0;
  padding: 0;
  border: none;
  min-inline-size: 0;
}
legend {
  legend: none;
  padding: 0;
}

The general purpose proposed specification for an "unprefixed" CSS ‘appearance’ property, has been blocked by Mozilla's statement that it is not web-compatible as currently defined, meaning that implementing appearance would break the existing behavior of websites that are currently using CSS appearance in a different way.

We asked the W3C CSS working group for feedback on the above approach, and they had some reservations and will develop an alternative proposal. When there is consensus for how it should work, we will update the specification and tests accordingly.

We had also considered defining new display values for fieldset and legend, but care needs to be taken to preserve web compatibility. There are thousands of pages in HTTP Archive that set ‘display’ to something on fieldset or legend, but browsers typically behave as display: block was set. For example, specifying display: inline on the legend needs to render the same as it does by default.

In parallel, we authored an initial specification for the ‘-webkit-appearance’ property in Mike Taylor's WHATWG Compatibility standard (which reverse engineers web platform wonk into status quo specifications), along with accompanying tests. More work needs to be done for the ‘-webkit-appearance’ (or unprefixed ‘appearance’) to define what the values mean and to reach interoperability on the supported values.

Accessibility Issues

We have started looking into testing accessibility explicitly, to ensure that the elements remain accessible even when they are styled in particular ways.

This work has uncovered ambiguities in the specification, which we have submitted a proposal to address. We have also identified interoperability issues in the accessiblity mapping in implementations, which we have reported.

Implementation fixes

Meta bugs have been reported for each browser engine (Gecko, Chromium, WebKit, EdgeHTML), which depend on more specific bugs.

As of September 18 2018, the following issues have been fixed in Gecko:

In Gecko, the bug Implement fieldset/legend in terms of '-webkit-appearance' currently has a work-in-progress patch.

The following issues have been fixes in Chromium:

The WebKit and Edge teams are aware of bugs, and we will follow up with them to track progress.

Conclusion

The fieldset and legend elements are useful to group related form controls, in particular to aid people who use assistive technologies. They are currently not interoperable and are difficult for web developers to style. With our work and proposal, we aim to resolve the problems so that they can be used without restrictions and behave the same in all browser engines, which will benefit browser implementers, web developers, and end users.

(This post is cross-posted on Bocoup's blog.)

Posted in Browsers, Forms, WHATWG | 1 Comment »

Google Tech Talk: HTML5 demos

Friday, September 26th, 2008

I gave a talk at Google on Monday demonstrating the various features of HTML5 that are implemented in browsers today. The video is now on YouTube, so now you too can watch and laugh at my lame presentation skills!

The segments of this talk are as follows. Some of the demos are available online for you to play with and are linked to from the following list:

Introduction
<video> (00:35)
postMessage() (05:40)
localStorage (15:20)
sessionStorage (21:00)
Drag and Drop API (29:05)
onhashchange (37:30)
Form Controls (40:50)
<canvas> (56:55)
Validation (1:07:20)
Questions and Answers (1:09:35)

If you're very interested in watching my typos, the high quality version of the video on the YouTube site is clear enough to see the text being typed. More details about the demos can be found on the corresponding demo page.

Posted in Browser API, Browsers, Conformance Checking, DOM, Elements, Events, Forms, Multimedia, Syntax, WHATWG | 7 Comments »

Web Forms 2.0 Cross-Browser Implementation

Monday, August 20th, 2007

A cross-browser implementation of Web Forms 2.0 has been released. The W3C HTML Working Group is now reviewing the Web Forms 2.0 specification to serve as a basis for the next version HTML (as everyone knows). Some discussion on the HTML WG mailing list regarding the WF2 specification has been at times ignorant of the content of the specification and what it actually enables authors to do. Hopefully by using this implementation and examining its associated test suite, the meaning of the specification will be clarified, and members of the HTML WG will be more educated so as to more constructively review it.

I started work on this project with an standalone implementation of the repetition model, but I have since expanded it to include other WF2 features as well. I developed this library on the job for incorporation into the various development projects I was working on; I have found the functionality provided by WF2, especially the repetition model and the form validation model, to greatly ease development. Try using it in your own projects and see for yourself, and then drop a line to WHATWG and the HTML WG with your feedback on what works well and what needs to be improved.

This implementation works in Firefox 1+, MSIE 6+, and Safari 2+ (Opera, of course, has a native implementation). For more information on the implemented features, see the implementation details. This implementation will follow the HTML 5 specification that evolves from the work done by the HTML Working Group.

Posted in Forms | 1 Comment »

Archive for the ‘Forms’ Category

Newline normalizations in form submission

First, some background on form submission

First signs of trouble

Normalizations with FormData and fetch

Investigating late normalization with the formdata event

Fixing the spec

Do we need the early normalization though?

Implementation fixes

Conclusion

Focusing on focus

Types of focus

tabindex

autofocus

Shadow DOM and delegatesFocus

The future, and outro

The state of fieldset interoperability

Usage

Fieldset standards history

Current state

HTML spec rendering issues

Proposal to extend -webkit-appearance

Accessibility Issues

Implementation fixes

Conclusion

Google Tech Talk: HTML5 demos

Web Forms 2.0 Cross-Browser Implementation

Normalizations with `FormData` and `fetch`

Investigating late normalization with the `formdata` event

`tabindex`

`autofocus`

Shadow DOM and `delegatesFocus`