The WHATWG Blog

Please leave your sense of logic at the door, thanks!

The Road to HTML 5 – Episode 1: the section element

by Mark Pilgrim, Google in Tutorials, Weekly Review

Welcome to a new semi-regular column, "The Road to HTML 5," where I'll try to explain some of the new elements, attributes, and other features in the upcoming HTML 5 specification.

The element of the day is the <section> element.

The section element represents a generic document or application section. A section, in this context, is a thematic grouping of content, typically with a header, possibly with a footer. Examples of sections would be chapters, the various tabbed pages in a tabbed dialog box, or the numbered sections of a thesis. A Web site's home page could be split into sections for an introduction, news items, contact information.

Discussion of sections and headers dates back several years. In November 2004, Ian Hickson wrote:

Basically I want three things:

  1. It has to be possible to take existing markup (which correctly uses <h1>-<h6>) and wrap the sections up with <section> (and the other new section elements) and have it be correct markup. Basically, allowing authors to replace <div class="section"> with <section>, <div class="post"> with <article>, etc.
  2. It has to be possible to write new documents that use the section elements and have the headers be automatically styled to the right depth (and maybe automatically numbered, with appropriate CSS), and yet still be readable in legacy UAs, without having to think about old UAs. Basically, the header element has to be header-like in old browsers.
  3. It shouldn't be too easy to end up with meaningless markup when doing either of the above. So a random <h4> in the middle of an <h2> and an <h3> has to be defined as meaning _something_.

At the moment what I'm thinking of doing is this (most of these ideas are in the draft at the moment, but mostly in contradictory ways):

The section elements would be:

<body> <section> <article> <navigation> <sidebar>

The header elements would be:

<header> <h1> <h2> <h3> <h4> <h5> <h6>

<h1> gives the heading of the current section.

<header> wraps block-level content to mark the whole thing as a header, so that you can have, e.g., subtitles, or "Welcome to" paragraphs before a header, or "Presented by" kind of information. <header> is equivalent to an <h1>. The first highest-level header in the <header> is the "title" of the section for outlining purposes.

<h2> to <h6> are subsection headings when used in <body>, and equivalent to <h1> when used in one of the section elements.

<h1> automatically sizes to fit the current nesting depth. This could be a problem in CSS since CSS can't handle this kind of thing well -- it has no "or" operator at the simple selector level.

<h2>-<h6> keep their legacy renderings for compatibility.
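
The nesting rule in the email above can be made concrete with a small sketch. This is only an illustration, not code from the email or from any draft: the plain-object tree and the names SECTION_ELEMENTS and headingDepths are assumptions made for the example.

```javascript
// Section elements from the 2004 proposal quoted above.
var SECTION_ELEMENTS = { body: true, section: true, article: true, navigation: true, sidebar: true };

// Walk a toy tree of { name, children } objects and record, for every <h1>,
// how many section elements enclose it. That count is the nesting depth the
// proposal would use to size the heading.
function headingDepths(node, depth, out) {
  depth = depth || 0;
  out = out || [];
  if (Object.prototype.hasOwnProperty.call(SECTION_ELEMENTS, node.name)) {
    depth += 1;
  }
  if (node.name === "h1") {
    out.push({ text: node.text, depth: depth });
  }
  var children = node.children || [];
  for (var i = 0; i < children.length; i++) {
    headingDepths(children[i], depth, out);
  }
  return out;
}

// Hypothetical page: a top-level heading, a section, and a nested article.
var page = {
  name: "body",
  children: [
    { name: "h1", text: "Initech" },
    { name: "section", children: [
      { name: "h1", text: "News" },
      { name: "article", children: [
        { name: "h1", text: "TPS reports" }
      ] }
    ] }
  ]
};

var depths = headingDepths(page);
```

Here the three headings come out at depths 1, 2, and 3, even though all of them are written as <h1>.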


Fast-forward to modern times. Using the <section> element instead of, say, <div class="section">, seems like a no-brainer. Unfortunately, there's a catch. (Hey, it's the web; there's always a catch.) Not all modern browsers recognize the <section> element, which means that they fall back to their default handling of unknown elements.

A long digression into browsers' handling of unknown elements

Every browser has a master list of HTML elements that it supports. For example, Mozilla Firefox's list is stored in nsElementTable.cpp. Elements not in this list are treated as "unknown elements." There are two fundamental problems with unknown elements:

  1. How should the element be styled? By default, <p> has spacing on the top and bottom, <blockquote> is indented with a left margin, and <h1> is displayed in a larger font.
  2. What should the element's DOM look like? Mozilla's nsElementTable.cpp includes information about what kinds of other elements each element can contain. If you include markup like <p><p>, the second paragraph element implicitly closes the first one, so the elements end up as siblings, not parent-and-child. But if you write <p><span>, the span does not close the paragraph, because Firefox knows that <p> is a block element that can contain the inline element <span>. So the <span> ends up as a child of the <p> in the DOM.
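
The <p><p> versus <p><span> difference can be sketched with a toy tree builder. This is not Firefox's algorithm: the CAN_CONTAIN table and the function buildWithImplicitClose are hypothetical, and the input is reduced to a list of start tags.

```javascript
// Toy containment table: which child elements each parent may contain.
// Hypothetical and tiny; a real browser's table covers every known element.
var CAN_CONTAIN = {
  body: { p: true, span: true },
  p: { span: true },
  span: {}
};

// Build a tree from a list of start tags only. If the innermost open element
// cannot contain the new one, close open elements until one can.
function buildWithImplicitClose(startTags) {
  var root = { name: "body", children: [] };
  var open = [root];
  for (var i = 0; i < startTags.length; i++) {
    var name = startTags[i];
    while (open.length > 1 && !CAN_CONTAIN[open[open.length - 1].name][name]) {
      open.pop(); // implicit close
    }
    var node = { name: name, children: [] };
    open[open.length - 1].children.push(node);
    open.push(node);
  }
  return root;
}

var siblings = buildWithImplicitClose(["p", "p"]);   // second <p> closes the first
var nested = buildWithImplicitClose(["p", "span"]);  // <span> stays inside the <p>
```

The point of the sketch is that the outcome depends entirely on the table. An element with no entry in the table has no rule to apply, which is the "unknown element" problem.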

Different browsers answer these questions in different ways. (Shocking, I know.) Of the major browsers, Microsoft Internet Explorer's answer to both questions is the most problematic.

The first question should be relatively simple to answer: don't give any special styling to unknown elements. Just let them inherit whatever CSS properties are in effect wherever they appear on the page, and let the page author specify all styling with CSS. Unfortunately, Internet Explorer does not allow styling on unknown elements. For example, if you had this markup:

<style type="text/css">
  section { border: 1px solid red }
</style>
...
<section>
<h1>Welcome to Initech</h1>
<p>This is our <span>home page</span>.</p>
</section>

Internet Explorer (up to and including IE8 beta 2) will not put a red border around the section.

The second problem is the DOM that browsers create when they encounter unknown elements. Again, the most problematic browser is Internet Explorer. If IE doesn't explicitly recognize the element name, it will insert the element into the DOM as an empty node with no children. All the elements that you would expect to be direct children of the unknown element will actually be inserted as siblings instead. I've posted an ASCII graph that illustrates this mismatch.
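
To make the mismatch concrete, here is a rough simulation of the behavior described above. It is a sketch built only from that description, not from IE's actual parser: the token format and the function buildWithUnknownElements are assumptions, and the handling of unknown end tags is simplified.

```javascript
// Build a toy DOM from start/end tag tokens. `known` lists the element names
// the parser recognizes. An unknown start tag becomes an empty node that is
// never opened, so the elements that follow it land beside it as siblings.
function buildWithUnknownElements(tokens, known) {
  var root = { name: "body", children: [] };
  var open = [root];
  for (var i = 0; i < tokens.length; i++) {
    var token = tokens[i];
    var isKnown = known.indexOf(token.name) !== -1;
    if (token.type === "start") {
      var node = { name: token.name, children: [] };
      open[open.length - 1].children.push(node);
      if (isKnown) {
        open.push(node);
      }
    } else if (isKnown && open.length > 1) {
      open.pop(); // assumes well-nested input
    }
    // Unknown end tags are skipped in this sketch.
  }
  return root;
}

// Tokens for: <section><h1></h1><p></p></section>
var sectionTokens = [
  { type: "start", name: "section" },
  { type: "start", name: "h1" }, { type: "end", name: "h1" },
  { type: "start", name: "p" }, { type: "end", name: "p" },
  { type: "end", name: "section" }
];

var expectedTree = buildWithUnknownElements(sectionTokens, ["section", "h1", "p"]);
var unknownTree = buildWithUnknownElements(sectionTokens, ["h1", "p"]);
```

When "section" is known, the <h1> and <p> are children of the section. When it is not, the section node is empty and the <h1> and <p> sit next to it under the body.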

Sjoerd Visscher discovered a workaround for this problem: after you create a dummy element with that name, IE will recognize the element enough to let you style it with CSS. You can put the script in the <head> of your page, and there is no need to ever insert the dummy element into the DOM. Simply creating the element once (per page) is enough to teach IE to style the element it doesn't recognize. Sample code and markup:

<html>
<head>
<style type="text/css">
  section { display: block; border: 1px solid red }
</style>
<script type="text/javascript">
  document.createElement("section");
</script>
</head>
<body>
<section>
<h1>Welcome to Initech</h1>
<p>This is our <span>home page</span>.</p>
</section>
</body>
</html>

This hack works in IE 6, IE 7, and IE 8 beta 1, but it doesn't work in IE 8 beta 2. (bug report, test case) The purpose of this illustration is not to blame IE; there's no specification that says what the DOM ought to look like in this case, so IE's handling of the "unknown element" problem is not any more or less correct than any other browser's. With the createElement workaround, you can use the <section> element (or any other new HTML 5 element) in all browsers except IE 8 beta 2, for which I am not aware of any workaround.
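
If a page uses several new elements, the same trick can be applied in a loop. The following is a minimal sketch, assuming the workaround behaves for other element names as it does for section. The list of names and the helper registerElements are illustrative, and the stand-in document object exists only so the code can run outside a browser.

```javascript
// Element names to register. These four are illustrative; list whichever
// new elements the page actually uses.
var NEW_ELEMENTS = ["section", "article", "header", "footer"];

// Call createElement once per name. The created elements are discarded;
// per the workaround above, creating them is what matters.
function registerElements(doc, names) {
  for (var i = 0; i < names.length; i++) {
    doc.createElement(names[i]);
  }
  return names.length;
}

// In a page you would call: registerElements(document, NEW_ELEMENTS);
// Outside a browser, a stand-in object records which calls are made.
var created = [];
var fakeDocument = {
  createElement: function (name) {
    created.push(name);
    return { nodeName: name.toUpperCase() };
  }
};
registerElements(fakeDocument, NEW_ELEMENTS);
```

As in the sample markup, the call belongs in a script in the <head>, so that it runs before the parser reaches the elements in the <body>. The code sticks to var and a plain for loop so that it would also run in older script engines.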

And in conclusion

The <section> element is a very straightforward HTML 5 feature that you can't actually use yet.

8 Responses to “The Road to HTML 5 – Episode 1: the section element”

  1. This description of IE handling of custom elements doesn’t match my experience.
    If I use
    document.createElement("section")
    and my styling rules include the equivalent of
    section { display: block }
    then everything seems fine. I can only test with IE6, so maybe that is different.

    My test page is
    http://www.meekostuff.net/test/html/section.html
    It has the following styles
    section { display: block; border: 1px solid red; }
    section p { border: 1px solid green; }
    and after a second it spits out a text representation of the tree based at the section element.

  2. “Sjoerd Visscher discovered a workaround for this problem: after you create a dummy element with that name, IE’s CSS parser will recognize the element enough to let you style it with CSS.”

    It’s not about the CSS parser. It’s the HTML parser that changes behavior. IE will treat that element the same way it treats elements with colons in them: “/>” will self-close the element, case will be preserved in the DOM, and they can have children.

    http://software.hixie.ch/utilities/js/live-dom-viewer/?%3Cscript%3E%20document.createElement('section‘)%20%3C%2Fscript%3E%0D%0A%3Cstyle%3E%20section%20%7B%20border%3Asolid%20%7D%20%23foo%20%7B%20display%3Ablock%20%7D%20%3C%2Fstyle%3E%0D%0A%3Cbody%3E%0D%0A%20%3CSECTION%2F%3E%3Cp%3Efoo%3Cbr%3Ebar%3C%2Fp%3E%3C%2Fsection%3E%0D%0A%20%3Csection%3E%3Cp%3Ebar%3Cbr%3Ebaz%3C%2Fp%3E%3C%2Fsection%3E%0D%0A%20%3Csection%20id%3Dfoo%3E%3Cp%3Efoo%3Cbr%3Ebar%3C%2Fp%3E%3C%2Fsection%3E

    “We still won’t actually get a red border around the section. To understand why, we need to answer the second fundamental question.”

    Works for me in IE6 with the test above. Note that the default styling is inline so you need to set it to display:block to get a nice border around the element, and this applies to all browsers.

    “If IE doesn’t explicitly recognize the element name, it will insert the element into the DOM as an empty node with no children.”

    This is what happens when you don’t do the createElement trick. Although you also get an empty element for the end tag (named “/SECTION”).

  3. I’m working off of memory here, but in WAI-ARIA there are the ideas of roles and landmarks (landmarks are from XHTML, I think). There seems to be a lot of commonality with ARIA in the section element; in fact, I know there’s a section role in ARIA. If you are so inclined, I’d like to hear your thoughts on if and how the ARIA and HTML5 groups are working together. There seems to be a lot of overlap and possibility for confusion for developers in how to use both of these.

  4. I’ll blame IE anyway because the default handling exhibits a lack of foresight 😉 New tags were going to come up in the future.

    All these tags are fairly generic. From H1-H2, P, and these new section tags, there aren’t many special behaviors to worry about. Their value is strictly semantic. Visually, they can all be styled to look identical and behave identically. It’s not like textareas and anchors, which have unique functionality attached.

    So why not treat an unknown in the same manner you treat generic markup?

    The big problem I see is the temptation for developers to shoot themselves in the foot by abusing this ability and simply declaring their own subset of radical tags. We might just be doing it if in fact IE 6 and 7 behaved intuitively in this regard.