XHTML5 in a nutshell

July 25th, 2010 by Sergey Mavrody

The WHATWG Wiki has a nice section describing the differences between HTML and XHTML, as well as the specifics of a polyglot HTML document, that is, an HTML5 document that can also be served as a valid XML document. I'd like to review what it takes to turn a polyglot HTML5 document into a valid XHTML5 document; it appears that 'XHTML5' has finally become an official name.

The W3C first public working draft of the "Polyglot Markup" recommendation describes a polyglot HTML document as one that conforms to both the HTML and the XHTML syntax by using only the common subset of the two.

A polyglot document can be served as either HTML or XHTML, depending on browser support and MIME type. Polyglot HTML5 markup essentially becomes an XHTML5 document when it is served with the XML MIME type application/xhtml+xml. A basic XHTML5 document would look like this:
<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<title>XHTML5 example</title>
<meta charset="UTF-8" />

</head>

<body>
<svg xmlns="http://www.w3.org/2000/svg">
<rect stroke="black" fill="blue" x="45px" y="45px" width="200px" height="100px" stroke-width="2" />
</svg>
</body>
</html>
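
One quick way to convince yourself that a document like the one above is well-formed XML is to feed it to any XML parser. The following is only a sketch, assuming Python's standard xml.etree.ElementTree module as a stand-in for a browser's XML parser; the DOCUMENT constant is a copy of the sample with a placeholder title, and the namespaces_used helper is a hypothetical name introduced for illustration.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"
SVG_NS = "http://www.w3.org/2000/svg"

# Copy of the sample document; the title text is a placeholder.
DOCUMENT = """<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<title>XHTML5 example</title>
<meta charset="UTF-8" />
</head>
<body>
<svg xmlns="http://www.w3.org/2000/svg">
<rect stroke="black" fill="blue" x="45px" y="45px" width="200px" height="100px" stroke-width="2" />
</svg>
</body>
</html>"""


def namespaces_used(markup: str) -> set:
    """Parse markup as XML and return the set of element namespaces found.

    ET.fromstring raises ET.ParseError if the markup is not well-formed.
    """
    root = ET.fromstring(markup)
    found = set()
    for element in root.iter():
        tag = element.tag
        # ElementTree writes namespaced tags as "{namespace}localname".
        if isinstance(tag, str) and tag.startswith("{"):
            found.add(tag[1:].split("}", 1)[0])
    return found


if __name__ == "__main__":
    print(sorted(namespaces_used(DOCUMENT)))
```

A successful parse only shows well-formedness and namespace use; it says nothing about whether the document is valid XHTML5, which still needs a validator.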

The XML declaration <?xml version="1.0" encoding="UTF-8"?> is not required if the default UTF-8 encoding is used: an XHTML5 validator will not mind if it is omitted. However, it is strongly recommended to declare the encoding in the HTTP Content-Type header sent by the server; failing that, the character encoding can be declared in the document itself with a meta tag, <meta charset="UTF-8" />. A polyglot document needs this encoding declaration so that it is treated as UTF-8 whether it is served as HTML or as XHTML.
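
To make the MIME type and charset advice more concrete, here is a minimal sketch of how a server-side script might pick the Content-Type header for a polyglot document. It assumes the decision is based on the request's Accept header; the function name choose_content_type is hypothetical, and the logic is deliberately simplified (it ignores wildcards and the relative order of q-values, and only skips types explicitly refused with q=0).

```python
def choose_content_type(accept_header: str) -> str:
    """Return a Content-Type value for a polyglot HTML5/XHTML5 document.

    Simplified: serves application/xhtml+xml only when the client lists it
    explicitly with a non-zero q-value, and falls back to text/html otherwise.
    """
    accepted = set()
    for item in accept_header.split(","):
        media_type, _, params = item.strip().partition(";")
        quality = 1.0
        for param in params.split(";"):
            name, _, value = param.strip().partition("=")
            if name.strip().lower() == "q":
                try:
                    quality = float(value)
                except ValueError:
                    quality = 0.0  # treat an unparsable q-value as a refusal
        if quality > 0:
            accepted.add(media_type.strip().lower())

    if "application/xhtml+xml" in accepted:
        return "application/xhtml+xml; charset=UTF-8"
    return "text/html; charset=UTF-8"


if __name__ == "__main__":
    print(choose_content_type("text/html,application/xhtml+xml,*/*;q=0.8"))
    print(choose_content_type("text/html, */*"))
```

Either way the charset travels in the HTTP header, so the document does not have to rely on the meta tag alone.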

The Total Validator tool (Firefox plugin and desktop app) now has a user-selectable option for XHTML5-specific validation.

I would say that the main advantage of XHTML5 is the ability to extend HTML5 with XML-based technologies such as SVG and MathML. The disadvantages are the lack of Internet Explorer support, more verbose code, and strict error handling. Unless we need that extensibility, HTML5 is the way to go.
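
The error-handling difference can be illustrated with a small sketch. It assumes Python's standard library parsers as stand-ins for browser behaviour: xml.etree.ElementTree for an XML parser and html.parser for a forgiving HTML parser. The TRUNCATED sample and the helper names are made up for illustration; real browsers follow their own parsing rules.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

# A document cut off in the middle, as if the connection had dropped.
TRUNCATED = (
    '<html xmlns="http://www.w3.org/1999/xhtml"><body>'
    "<p>First paragraph</p><p>Second para"
)


class TextCollector(HTMLParser):
    """Collects the text content seen by the lenient HTML parser."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def parses_as_xml(markup: str) -> bool:
    """Return True if markup is well-formed XML, False otherwise."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


def text_seen_as_html(markup: str) -> list:
    """Return the text chunks the HTML parser recovers from markup."""
    collector = TextCollector()
    collector.feed(markup)
    collector.close()
    return collector.chunks


if __name__ == "__main__":
    print("well-formed XML:", parses_as_xml(TRUNCATED))
    print("text recovered as HTML:", text_seen_as_html(TRUNCATED))
```

The XML parser rejects the truncated input outright, while the HTML parser still hands back the text it has seen so far.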

This entry was posted on Sunday, July 25th, 2010 at 05:51 and is filed under Syntax, WHATWG, What's Next.

9 Responses to “XHTML5 in a nutshell”

  1. Ylodi says:

    You are overdoing it with “don’t use XHTML” several times in a row.

  2. XHTML5.NL says:

    The doctype is useless in non-polyglot XHTML5. It’s used for validation but with the HTML5 doctype the only thing you can validate is the name of the root element. Triggering quirks mode is impossible in XHTML.

    As far as I know, rendering XHTML in IE8 requires a trick with XSLT.

    Specifying the charset in the meta tag does not work in XHTML. This is only useful if it’s a polyglot. Then again, it doesn’t make sense, since the polyglot-handling script (e.g. .htaccess or PHP) on the server already changes the Content-Type header, so for text/html it can easily include the charset parameter. Note that the XML declaration is illegal in text/html.

    Except for IE8 and below, all browsers support XHTML. In some, XHTML even renders faster than HTML. If we had switched to XHTML 1 a long time ago, it would now be a lot easier to add new elements to the language without parsing problems. For programmers, scraping XHTML is a lot easier than scraping HTML due to the wide availability of XML parsers. Also, the error handling makes it easier to write valid pages. It’s always been denied that XHTML has a future, but since IE9 has support, XHTML should eventually win, for the web’s sake.

  3. Sebastian says:

    Aren’t SVG and MathML a rather insignificant motivator, since they’re both allowed in plain HTML5 documents as well? To a certain degree, XLink as well.

    There are certainly reasons to use XHTML documents, but the ability to embed SVG and MathML isn’t necessarily one of them.

  4. Sergey Mavrody says:

    XHTML5.NL, I have already said that the charset in the meta tag is useful if it’s a polyglot document; I did not say the same about the DOCTYPE, and I will correct that, thank you.
    I wonder, how did you come up with the idea that “XHTML even renders faster than HTML”? XHTML markup is more verbose and therefore it would render slower.

  5. Sergey Mavrody says:

    Sebastian, thanks for the input. Yes, inline SVG and MathML are supported by the HTML5 specification, but browser support for them in HTML5 is limited even among the latest non-beta browser versions: there is no support in IE8, FF3.6, Chrome 5.0, or Safari 5.0. This could be a problem, since large corporations get stuck with old browsers for years.

    And of course, there are other XML-based technologies not yet supported at all by HTML5.

  6. Michael says:

    The doctypes and the meta charset tags are useful if you want to test and validate your polyglot page before you have the opportunity to put it on a server.

    And XHTML can be faster for certain pages depending on the XML parser used. Since XML cannot be malformed there isn’t any code wasted on making it degrade gracefully. I don’t know that any current browsers use a fast XML parser though. Loading the file from the server would definitely be slower for the XML page, if the HTML page didn’t need all those IE hacks.

  7. XHTML5.NL says:

    “I wonder, how did you come up with the idea that “XHTML even renders faster than HTML”? XHTML markup is more verbose and therefore it would render slower.”

    I said that because of Chrome’s Accept header (something like application/xhtml+xml,text/html;q=0.9), but apparently that’s just a bug.

  8. John Thomas says:

    I think though the issue with extensibility isn’t if you need it, but if you might need it. Say some other markup language becomes as popular as say MathML and it would be really cool to start embedding it into documents. But maybe this markup had been developed outside of browsers for a while and used tags that conflict with some portions of HTML. Without namespaces, the only way to add this new functionality would be to make old pages obsolete or force detecting of old pages.

    Take a page from CSS, for example. Every browser has its own extensions, and they are supposed to use the namespace-like concept of browser prefixes. Except IE didn’t, which might have been fine except that SVG has a CSS rule called “filter”. Suddenly a lot of stylesheets are in trouble.

  9. billyswong says:

    To those who say XHTML could be faster than HTML:

    XHTML documents may indeed be a bit faster to parse, but the bottleneck is never the parser. It’s those stupid scripts (from advertisers) and network speed (which neither side may be able to control).

    Browsers currently rely heavily on incremental rendering to give users the best experience. Internet connections and server conditions are not always perfect. There are times when the whole HTML/XHTML file takes multiple seconds, or a significant fraction of a minute, to arrive. In the worst case, the connection ends and the file is incomplete and truncated.

    In the case of HTML, the issue is minor, as the browser degrades gracefully. But if an XHTML parser enforces strict conformance and won’t produce DOM trees for truncated documents, all the user can see is a blank page. I’ve experienced it once before and it doesn’t feel good. “You can’t show me what you’ve downloaded yet because there’s still 1% of the XHTML file stuck somewhere in the network tube and not yet received?” See, it will be very annoying when one is using an unstable connection.

    So in the end, that “no graceful degradation allowed” wonderland is never acceptable in the real world. Just stick with HTML; we need reliability. And forced-conformance XHTML will never feel faster to the general public, never.

    (Oh, by the way, is anybody willing to make a gracefully degrading XHTML parser and/or parsing rule? This may be the only opportunity for XHTML to shine.)
