The WHATWG Blog

Please leave your sense of logic at the door, thanks!

The Road to HTML 5: Link Relations

Friday, April 17th, 2009

Welcome back to my semi-regular column, "The Road to HTML 5," where I'll try to explain some of the new elements, attributes, and other features in the upcoming HTML 5 specification.

The feature of the day is link relations.

In this article:

What are link relations?

Regular links (<a href>) simply point to another page. Link relations are a way to explain why you're pointing to another page. They finish the sentence "I'm pointing to this other page because..."

And so on. HTML 5 breaks link relations into two categories:

Two categories of links can be created using the link element. Links to external resources are links to resources that are to be used to augment the current document, and hyperlink links are links to other documents. ...

The exact behavior for links to external resources depends on the exact relationship, as defined for the relevant link type.

Of the examples I just gave, only the first (rel=stylesheet) is a link to an external resource. The rest are hyperlinks to other documents. You may wish to follow those links, or you may not, but they're not required in order to view the current page.

Common link relations include <link rel=stylesheet> (for importing CSS rules) and <link rel=alternate type=application/atom+xml> (for Atom feed autodiscovery). HTML 4 defines several link relations; others have been defined by the microformats community. HTML 5 attempts to consolidate all the known link relations, clean up their definitions (if necessary), and then provide a central registry for future proposals.

How can I use link relations?

Most often, link relations are seen on <link> elements within the <head> of a page. Some link relations can also be used on <a> elements, but this is uncommon even when allowed. HTML 5 also allows some relations on <area> elements, but this is even less common. (HTML 4 did not allow a rel attribute on <area> elements.)

See the full chart of link relations to check where you can use specific rel values.

Changes to link relations since HTML 4

Link relations were added to the HTML 5 spec in November 2006. (Back then the spec was still called "Web Applications 1.0.") r319 kicked off a flurry of rel-related activity. The original additions were primarily based on research of existing web content in December 2005, using Google's cache of the web at the time. Since then, other relations have been added, and a few have been dropped.

rel=alternate

rel=alternate has always been a strange hybrid of use cases, even in HTML 4. In HTML 5, its definition has been clarified and extended to more accurately describe existing web content. For example, using rel=alternate in conjunction with the type attribute indicates the same content in another format. Using rel=alternate in conjunction with type=application/rss+xml or type=application/atom+xml indicates an RSS or Atom feed, respectively.

HTML 5 also puts to rest a long-standing confusion about how to link to translations of documents. HTML 4 says to use the lang attribute in conjunction with rel=alternate to specify the language of the linked document, but this is incorrect. The HTML 4 Errata lists four outright errors in the HTML 4 spec (along with several editorial nits); one of these outright errors is how to specify the language of a document linked with rel=alternate (The correct way, described in the HTML 4 Errata and now in HTML 5, is to use the hreflang attribute.) Unfortunately, these errata were never re-integrated into the HTML 4 spec, because no one in the W3C HTML Working Group was working on HTML anymore.

rel=archives

New in HTML 5

rel=archives "indicates that the referenced document describes a collection of records, documents, or other materials of historical interest. A blog's index page could link to an index of the blog's past posts with rel="archives"."

rel=author (and the removal of the rev attribute)

New in HTML 5

rel=author is used to link to information about the author of the page. This can be a mailto: address, though it doesn't have to be. It could simply link to a contact form or "about the author" page.

rel=author is equivalent to the rev=made link relation defined in HTML 3.2. Despite popular belief, HTML 4 does not include rev=made, effectively obsoleting it. (You can search the entire spec for the word "made" if you don't believe me.)

Given that rev=made was the only significant non-typo usage of the rev attribute, HTML 5 added rel=author to make up for the loss of rev=made in HTML 4, thus allowing the working group to obsolete the rev attribute altogether. Other than the un/semi/sortof-documented rev=made value, people typo the "rev" attribute more often than they intentionally use it, which suggests that the world would be better off if validators could flag it as non-conforming.

The decision to drop the rev attribute seems especially controversial. The same question flares up again and again on the working group's mailing list: "what happened to the rev attribute?" But in the face of almost-universal misunderstanding (among people who try to use it) and apathy (among everyone else), no one has ever made a convincing case for keeping it that didn't boil down to "I wish the world were different." Hey, so do I, man. So do I.

rel=external

New in HTML 5

rel=external "indicates that the link is leading to a document that is not part of the site that the current document forms a part of." I believe it was first popularized by WordPress, which uses it on links left by commenters. I could not find any discussion of it in the HTML working group mailing list archives. Both its existence and its definition appear to be entirely uncontroversial.

rel=feed?

New in HTML 5, but may not be long for this world

rel=feed "indicates that the referenced document is a syndication feed." Right away, you're thinking, "Hey, I thought you were supposed to use rel=alternate type=application/atom+xml to indicate that the referenced document is a syndication feed." In fact, that's what everyone does, and that's what all browsers support. Firefox 3 is the only browser that supports rel=feed. (It also supports rel=alternate type=application/atom+xml.) The rel=feed variant was proposed in the Atom working group in 2005 and somehow found its way into HTML 5. Just yesterday, I was discussing whether HTML 5 should drop rel=feed due to lack of browser implementation and complete and utter lack of author awareness.

rel=first, last, prev, next, and up

HTML 4 defined rel=start, rel=prev, and rel=next to define relations between pages that are part of a series (like chapters of a book, or even posts on a blog). The only one that was ever used correctly was rel=next. People used rel=previous instead of rel=prev; they used rel=begin and rel=first instead of rel=start; they used rel=end instead of rel=last. Oh, and -- all by themselves -- they made up rel=up to point to a "parent" page.

HTML 5 includes rel=first, which was the most variation of the different ways to say "first page in a series." (rel=start is a non-conforming synonym, for backward compatibility.) Also rel=prev and rel=next, just like HTML 4 (but mentioning rel=previous for back-compat). It also adds rel=last (the last in a series, mirroring rel=first) and rel=up.

The best way to think of rel=up is to look at your breadcrumb navigation (or at least imagine it). Your home page is probably the first page in your breadcrumbs, and the current page is at the tail end. rel=up points to the next-to-the-last page in the breadcrumbs.

rel=icon

New in HTML 5

rel=icon is the second most popular link relation, after rel=stylesheet. It is usually found together with shortcut, like so:

<link rel="shortcut icon" href="/favicon.ico">

All major browsers support this usage to associate a small icon with the page (usually displayed in the browser's location bar next to the URL).

Also new in HTML 5: the sizes attribute can be used in conjunction with the icon relationship to indicate the size of the referenced icon. [sizes example]

rel=license

New in HTML 5

rel=license was invented by the microformats community. It "indicates that the referenced document provides the copyright license terms under which the current document is provided."

rel=nofollow

New in HTML 5

rel=nofollow "indicates that the link is not endorsed by the original author or publisher of the page, or that the link to the referenced document was included primarily because of a commercial relationship between people affiliated with the two pages." It was invented by Google and standardized within the microformats community. The thinking was that if "nofollow" links did not pass on PageRank, spammers would give up trying to post spam comments on weblogs. That didn't happen, but rel=nofollow persists. Many popular blogging systems default to adding rel=nofollow to links added by commenters.

rel=noreferrer

New in HTML 5

rel=noreferrer "indicates that the no referrer information is to be leaked when following the link." No browser currently supports this. [rel=noreferrer test case]

rel=pingback

New in HTML 5

rel=pingback specifies the address of a "pingback" server. As explained in the Pingback specification, "The pingback system is a way for a blog to be automatically notified when other Web sites link to it. ... It enables reverse linking -- a way of going back up a chain of links rather than merely drilling down."

Blogging systems, notably WordPress, implement the pingback mechanism to notify authors that you have linked to them when creating a new blog post.

rel=prefetch

New in HTML 5

rel=prefetch "indicates that preemptively fetching and caching the specified resource is likely to be beneficial, as it is highly likely that the user will require this resource." Search engines sometimes add <link rel=prefetch href="URL of top search result"> to the search results page if they feel that the top result is wildly more popular than any other. For example: using Firefox, search Google for CNN; view source; search for the keyword "prefetch".

Mozilla Firefox is the only current browser that supports rel=prefetch.

New in HTML 5

rel=search "indicates that the referenced document provides an interface specifically for searching the document and its related resources." Specifically, if you want rel=search to do anything useful, it should point to an OpenSearch document that describes how a browser could construct a URL to search the current site for a given keyword.

OpenSearch (and rel=search links that point to OpenSearch description documents) is supported in Microsoft Internet Explorer since version 7 and Mozilla Firefox since version 2.

rel=sidebar

New in HTML 5

rel=sidebar "indicates that the referenced document, if retrieved, is intended to be shown in a secondary browsing context (if possible), instead of in the current browsing context." What does that mean? In Opera and Mozilla Firefox, it means "when I click this link, prompt the user to create a bookmark that, when selected from the Bookmarks menu, opens the linked document in a browser sidebar." (Opera actually calls it the "panel" instead of the "sidebar.")

Internet Explorer, Safari, and Chrome ignore rel=sidebar and just treat it as a regular link. [rel=sidebar test case]

rel=tag

New in HTML 5

rel=tag "indicates that the tag that the referenced document represents applies to the current document." Marking up "tags" (category keywords) with the rel attribute was invented by Technorati to help them categorize blog posts. Early blogs and tutorials thus referred to them as "Technorati tags." (You read that right: a commercial company convinced the entire world to add metadata that made the company's job easier. Nice work if you can get it!) The syntax was later standardized within the microformats community, where it was simply called "rel=tag".

Most blogging systems that allow associating categories, keywords, or tags with individual posts will mark them up with rel=tag links. Browsers do not do anything special with them, but they're really designed for search engines to use as a signal of what the page is about.

rel=contact

rel=contact was briefly part of HTML 5, but r1711 removed it because it conflicted with the same-named XFN relationship.

Extending rel even further

There seems to be an infinite supply of ideas for new link relations. In an attempt to prevent people from just making shit up, the WHATWG maintains a registry of proposed rel values and defines the process for getting them accepted.

Posted in Tutorials | 39 Comments »

The Road to HTML 5: getElementsByClassName()

Tuesday, November 11th, 2008

Welcome back to my semi-regular column, "The Road to HTML 5," where I'll try to explain some of the new elements, attributes, and other features in the upcoming HTML 5 specification.

The feature of the day is getElementsByClassName(). Long desired by web developers and implemented in Javascript libraries like Prototype, this function does exactly what it says on the tin: it returns a list of elements in the DOM that define one or more classnames in the class attribute. getElementsByClassName() exists as a method of the document object (for searching the entire DOM), as well as on each HTMLElement object (for searching the children of an element).

The HTML 5 specification defines getElementsByClassName():

The getElementsByClassName(classNames) method takes a string that contains an unordered set of unique space-separated tokens representing classes. When called, the method must return a live NodeList object containing all the elements in the document, in tree order, that have all the classes specified in that argument, having obtained the classes by splitting a string on spaces. If there are no tokens specified in the argument, then the method must return an empty NodeList. If the document is in quirks mode, then the comparisons for the classes must be done in an ASCII case-insensitive manner, otherwise, the comparisons must be done in a case-sensitive manner.

A Brief History of getElementsByClassName()

Can We Use It?

Yes We Can! As you can tell from the timeline, getElementsByClassName() is supported natively in Firefox 3, Opera 9.5, Safari 3.1, and all versions of Google Chrome. It is not available in any version of Microsoft Internet Explorer. (IE 8 beta 2 is the latest version as of this writing.) To use it in browsers that do not support it natively, you will need a wrapper script. There are many such scripts; I myself am partial to Robert Nyman's Ultimate GetElementsByClassName. It uses the native getElementsByClassName() method in modern browsers that support it, then falls back to the little-known document.evaluate() method, which is supported by older versions of Firefox (since at least 1.5) and Opera (since at least 9.27). If all else fails, Robert's script falls back to recursively traversing the DOM and collecting elements that match the given classnames.

And in conclusion

getElementsByClassName() is well-supported across all modern browsers except IE, and a performance-optimized open source wrapper script can cover IE and older browsers.

Posted in Tutorials, WHATWG | 7 Comments »