XHTML2 Explorations, Part I

You read that right: “XHTML2 Explorations”. Yes, the fun never stops here at goer.org.

I’ve decided to take a closer look at XHTML2, or more specifically, XHTML2 Working Draft 6. I’ll admit that I haven’t done a good job slogging through the thousands of messages on the W3C lists. I’m just a casual observer.

Fortunately, there’s been plenty of weblog chatter over <object> replacing <img>, <cite> getting dropped and then added back, the excitement over <blockcode>, the battle over the style attribute, navigation lists, the new <h> and <section> model, the “href on everything” model, and more. The Alphas have been discussing these issues for months, and we Gammas have been well-served by just listening in.

But even with all the healthy public discussion, XHTML2 is a big specification (430 KB and counting). At least for my own edification, it’s time to see how deep the rabbit hole goes.

A Swarm of Attributes

XHTML2 provides a huge number of common attributes that are divided up into collections. Most XHTML2 elements accept attributes from all collections. The most well-known example of this is the “everything is a hyperlink” concept, wherein you can turn any XHTML2 element into a link by applying the href attribute.

But of course there’s much more. Consider the set of “common” attributes in HTML 4.01: there’s id, style, class, dir, lang, title, and the “event” attributes (such as onmouseover). This yields a little over 15 attributes, depending on how you’re counting. In contrast, XHTML2 already provides around 30 common attributes. Let’s take a closer look.

The Edit Collection

The Edit Collection provides an edit attribute with four allowed values: inserted, deleted, changed, and moved. Presumably if something is moved, you can specify where it moved to by using the href attribute. There’s also a datetime attribute for specifying the timestamp, the format of which is defined in XML Schema, which references ISO 8601, which has since been revised. Whew.
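
As a rough sketch of how this might look in practice, assuming the attribute names and values described above (the paragraph text and timestamps are made up for illustration):

  <p edit="inserted" datetime="2003-06-01T09:30:00Z">
    This paragraph was added in a later revision.
  </p>

  <p edit="deleted" datetime="2003-06-02T14:00:00Z">
    This paragraph was removed, but remains in the source as a record.
  </p>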

The default presentation should be display: none for deleted markup, while the other three types should be displayed as-is. Assuming that the XHTML2 browsers of the future have solid CSS2 support, we can override that default and show deletions as struck-through red text instead:

  *[edit="deleted"] {
    display: inline;
    color: red;
    text-decoration: line-through;
  }
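
Along the same lines, here is a sketch of how the other three values could be styled. The particular colors and decorations are arbitrary choices for illustration, not anything the spec suggests:

  *[edit="inserted"] {
    color: green;
    text-decoration: underline;
  }
  *[edit="changed"] {
    background-color: yellow;
  }
  *[edit="moved"] {
    border-left: 2px solid blue;
  }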

I’ll admit I like the idea of having a simple change record facility. However, there doesn’t appear to be an editedby attribute, which seems like a bit of an oversight.

The Embedding Collection

Through the Embedding collection, each element can have a src attribute. The browser attempts to replace the element’s content with the embedded file or resource. If the embedding fails for some reason, the browser proceeds to process the contents of the element. The spec provides an example involving a table:

  <table src="temperature-graph.png" type="image/png">
    <caption>Monthly temperatures</caption>
    ... (lots of table rows and cells) ...
  </table>

The spec also declares,

Note that this behavior makes documents far more robust, and gives much better opportunities for accessible documents than the longdesc attribute present in earlier versions of XHTML, since it allows the description of the resource to be included in the document itself, rather than in a separate document.

I scratched my head over the new src attribute for a while… but if you think of it as a replacement for the longdesc attribute, then it sort of makes sense. A couple of points, though.

First, a quibble with the W3C’s example — I’m not quite sure how replacing a perfectly good XHTML table with a PNG image constitutes a great leap forward under any circumstances.

Second, the W3C recommends:

This collection causes the contents of a remote resource to be embedded in the document in place of the element’s content. If accessing the remote resource fails, for whatever reason (network unavailable, no resource available at the URI given, inability of the user agent to process the type of resource) the content of the element must be processed instead.

Maybe I don’t understand this statement correctly, but I take the “instead” to mean that browsers should not continue to process content if the remote resource is accessible. If so, I certainly hope that the browsers ignore this recommendation. The browser doesn’t need to display the child content directly, but it needs to process the entire document and provide access to the child content somehow. Otherwise, accessibility gets worse, not better.

Finally, this concept opens up all sorts of interesting UI issues. Take the <table> example above. If I use my browser’s “Find Text In This Page” function, will the browser search the text in the table cells? How will it highlight successful matches?
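
For what it’s worth, a use of src that seems less awkward than the table example would be attaching a real description to a photograph. A minimal sketch, assuming src and type work on <p> the same way they do on <table> in the spec’s example (the file name and description are hypothetical):

  <p src="golden-gate.jpg" type="image/jpeg">
    The Golden Gate Bridge at dusk, seen from the Marin Headlands,
    with fog rolling in under the roadway.
  </p>

If the image can’t be fetched or displayed, the reader gets the paragraph instead.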

The Cite Attribute

The Hypertext collection permits any element to have a cite attribute. At first glance, I thought that this made the <cite> element redundant. But as Mark Pilgrim points out in Semantic Obsolescence:

“No, the cite tag and the cite attribute are not the same thing. The cite attribute is a URL; the cite tag is wrapped around actual names within your text.”

Fair enough. However, I’ve been thinking that this might be an oversight on the part of the W3C, and the cite attribute should be allowed to contain arbitrary text. For example:

  <h1>Ask Dr. Science!</h1>

  <p>Q: Dr. Science, why is the sky blue? -- Ashley, age 8</p>

  <p>A: Glad you asked, Ashley!  The answer is simple, really:</p>

  <blockquote
    cite="Jackson, J.D. 1975. Classical Electrodynamics, 2nd. ed. 
          New York: John Wiley and Sons">
  <p>
    The scattering of light by gases, first treated quantitatively by Lord
    Rayleigh in his celebrated work on the sunset and blue sky, can be 
    discussed in the present framework.  Since the magnetic moments 
    of most gas molecules are negligible compared to the electric dipole
    moments, the scattering is purely electric dipole in character.  In 
    the previous section we have discussed the angular distribution and
    polarization of the individual scatterings (see Figure 9.6).  We 
    therefore confine our attention to the total scattering cross section
    and the attenuation of the incident beam.  The treatment is in two 
    parts...
  </p>
  </blockquote>

The cite attribute provides an elegant way to scope sections of a document as belonging somewhere else, so why limit it only to stuff on the web? As for the <cite> element, it would still be good for explicitly marking up citations (such as the ones found at the end of a journal article). Well, just a thought.
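
For comparison, here is a sketch of how the two are meant to be used as the draft stands, with the attribute holding a URI and the element wrapping a name in the text. The URL is a placeholder, not the real address of the post:

  <p>As <cite>Mark Pilgrim</cite> points out:</p>

  <blockquote cite="http://example.org/semantic-obsolescence">
  <p>
    No, the cite tag and the cite attribute are not the same thing.
  </p>
  </blockquote>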

New LinkType Options

The rel attribute, once the sole province of the <link> element and <a> element, is now universal. The allowed values are defined by the LinkType data type. There are three new options:

  • parent: You may now specify a link as a parent document. Strangely, you can’t specify your children or siblings. After all, it’s kinda hard to construct a full tree without information about the children. Oh heck, let’s just say it: won’t somebody think of the children?

  • meta: The link “provides metadata, for instance in RDF, about the current document.” This allows you to place your metadata directly in the body content of your document. Not sure why you would want to do this as opposed to using the good old-fashioned <meta> tag, but ours is not to question why.

    Speaking of the <meta> element, the spec states that “A common use for meta is to specify keywords that a search engine may use to improve the quality of search results. When several meta elements provide language-dependent information about a document, search engines may filter on the xml:lang attribute to display search results using the language preferences of the user.” The idea that people will provide accurate keywords in the first place, let alone scope these keywords appropriately according to language, seems a quaint notion at best.

  • p3pv1: When I first read this, the first thought that popped into my head was, “What’s the ‘v1’ doing in there?” The second thought that popped into my head was, “What’s the p3p doing in there?” I have no idea why we would want to reference a particular technology here, let alone a particular version of a particular technology. The W3C should rename this one to “privacy” post-haste.
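
To make the three new values concrete, here is a sketch using plain <link> elements. All of the href values are hypothetical:

  <link rel="parent" href="../index.html" />
  <link rel="meta" href="metadata.rdf" />
  <link rel="p3pv1" href="/w3c/p3p.xml" />

Since rel is now universal, the same values could presumably appear on other elements as well. For example, assuming href and rel combine on a paragraph the way they do on a link:

  <p rel="parent" href="../index.html">Back to the archive index</p>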

Note: Part II is now available.

17 thoughts on “XHTML2 Explorations, Part I”

  1. My two favorite answers to the question about the sky being blue:

    1) “The sky is blue because if it were green we wouldn’t know where to stop mowing our lawns.”
    -Hon. Harry Anderson on Night Court. circa 1985.

    2) “The sky is blue because if it weren’t, it would be white and the clouds would be camouflaged against it.”
    -My 4 year old nephew Toby. circa 3 months ago.
    (Toby is better known for his proclamation that the owner of this site was “Big Giant Huge Evan” when being compared to my then 1 year old nephew Evan.)

    Please excuse the fact that I ignored the point of the log. I would be hard pressed to sound funny OR erudite on the subject.

  2. Considering that Evan the Elder is, um, vertically challenged, as are all the members of my dear family, Toby’s comment regarding his uncle-by-marriage (sort of) points up that everything is relative, so to speak.

    — Henci

  3. Hello Karl,

    Quite so! I freely admit that HTML 4.01 is a big specification also. 🙂

    The reason I wrote this post is because although I am aware of some of the highlights of XHTML2 (mentioned in the preface), I am *much* less familiar with the nooks and crannies. Unlike XHTML1.0, XHTML2’s elements and attributes differ from HTML 4.01 in *many* substantive ways. Hence the spec looks big, to me anyway.

    You are also quite right in that the raw KB of a spec is a bad way to measure its size. It’s a bad choice in general, and it’s particularly bad now, while the XHTML2 specification still has a lot of open issues and sections to be filled in. My guess is that once they address these issues, provide a full schema, and so on, the two specs will be pretty similar in size however we measure it.

  4. About <meta> elements: it makes more sense when you consider that many people use them for local search engines.

  5. I think you’re looking at The Embedding Collection all wrong. Take the example you discussed here:

    <table src="temperature-graph.png" type="image/png">
    <caption>Monthly temperatures</caption>
    ... (lots of table rows and cells) ...
    </table>

    You said: Finally, this concept opens up all sorts of interesting UI issues. Take the example above. If I use my browser’s “Find Text In This Page” function, will the browser search the text in the table cells? How will it highlight successful matches?

    I don’t think the W3C is recommending we replace our tabular data with PNG images. A better way to look at it would be that the above example is replacing this:

    <p>Here’s a chart:
    <img src="chart.png" alt="A lot of data that you can’t access with your screen reader and stuff" />
    </p>

    So, even if the search function would be difficult to implement on the former mark-up example, it would still be more accessible than the latter example.

  6. Hello, Jim. I suppose if we’re considering local search engines, then we can assume that people won’t falsify their meta tags. That’s a start, anyway. But there’s still the “Laziness” problem, which plagues personal sites, corporate intranets, you name it. The only solution to the Laziness problem is to require all your web writers to use a tool that *forces* people to enter keywords. (And then we have to trust that people are smart enough to assign the *right* keywords.)

    And hello to you too, Spencer. I don’t think we disagree. The Embedding Collection does make sense to me (as long as I repeat to myself, “It’s a replacement for longdesc, it’s a replacement for longdesc.”) And I can see good uses for it, such as providing a well-marked up description for a photograph. I’m just quibbling with one of the W3C’s examples. The W3C’s example is saying, “Here’s a PNG image — and by the way, the alt text for that PNG image is an XHTML table.” I’m saying, “An XHTML table is generally superior to a PNG image of a table, so why EVER make it secondary?” Like I said, this is a quibble — that’s one bad usage of src, but surely there are many good ones.

    A slightly more serious issue: the spec says that the browser should stop processing the child content as soon as it is able to access and display the remote resource. If so, then we wouldn’t have *any* access to the child content at all… would we? (Of course, I could be reading this all wrong. Maybe I just don’t understand what the spec means when it says, “processed”.)

  7. 1) Whether the browser should allow text-search hits on “hidden” elements is not new to XHTML 2.0. Consider:

    <p style="display:none">Piss on you!</p>

    Should the browser find the bad words hidden on the page and, if so, how should it display them?

    2) With respect to the cite attribute, I’m not sure I like the idea of attributes which could be *either* a URL or arbitrary text.

    If you have an attribute which must be a URL, that’s an opportunity for browser writers to design an interface that allows users to jump to that URL.

    If you have an attribute that is arbitrary text, that’s an opportunity for browser writers to design an interface to display it (tooltip-style, say).

    If you have an attribute which could be either, I think that’s an invitation to browser writers to ignore it.

    3) <link rel="meta"> provides a *link* to an external metadata file, as opposed to embedding the metadata inline (which, in the case of RDF, is to this day a dodgy proposition).

  8. Oh, and a table is rarely a “superior” alternative to a graph. Most people have a hard time seeing significant features of a dataset when presented as a table; when presented as a graph, those features often stick out like a sore thumb. (See the graph linked to in my recent “Exotic Baryons” post.)

  9. Hey Jacques.

    1. Granted, it’s not the first instance of this problem.

    2. But don’t you think “cite” should mean “a citation”, not “a citation only for things with URLs”? I don’t think it would be such a nightmare to implement. (It could be a conditionally-clickable tooltip, who knows?) Besides, it’s not like the spec is all that friendly to our friends the browser writers. Consider the Image Map attribute collection, which turns every element into a potential image map. I’m not a programmer, but *that* one sounds like a PITA to me.

    3. <link rel="meta"> is not the issue. I was talking about <anyTag rel="meta">. rel="meta" now means, “This is meta information”, whether the content appears in the body or not. Also note that the <meta> tag now has an href attribute, making it functionally identical to the <link> in its ability to refer to remote meta information.

    4. But Jacques, this is XHTML2! In our not-too-distant Glorious Semantic Future, I’ll be able to grab your table, slice it, dice it, transform it, convert it to an SVG graph, find a best curve-fit… I can’t do that with a crummy PNG image, now can I? 🙂

  10. 2) I agree that “cite” probably means citation, which ought to be good for sources which don’t have URLs too. In fact, there are sources for which there is a formal ‘citation’ (a string) *and* a URL where the source can be found online. In cases like that, which would you want as the value of the “cite” attribute? Maybe there should be a “cite” attribute (value is a string) and a “citeurl” attribute (value is a URL).

    3) Sorry, I didn’t mean <link>. How about <div class="blogpost" rel="meta" href="[file with metadata about this post]">.

    4) Why should you have to do the work client-side? I’m doing it server-side. With nested object tags, I’m giving you an SVG graphic; if your browser can’t handle that, you can view a PNG; and if that doesn’t work for you, you can still get the raw data as an XHTML table.

    There does seem to be an ambiguity, though, as to whether the Spec allows a browser which *does* support one layer of object (say the PNG graphic) to dig deeper and get at the content (the XHTML table). Maybe you want to do that for accessibility reasons; maybe you want to do that so you can plug the raw data into your Mathematica XII plugin, which will do nifty things with it. Anyway, the Spec should clarify whether that’s allowed.

  11. 2. Hmmm, “cite” and “citeurl” — I like it!

    3. Okay. But I’m still confused about rel="meta". The href attribute is optional… so if I leave it off, does that mean that everything under <div rel="meta"> is meta information, even though it appears in the body of the document? Or does the rel attribute just not make any sense without the href attribute?

    4. The client-side processing is a key benefit of XML. I can do all sorts of nifty things with an XHTML table that you haven’t thought of; pushing the PNG image first denies me those possibilities. This concern of mine is moot if I can “dig deeper” and access the raw table, but as you say, the spec is (at best) unclear on this.

  12. Actually, the cite attribute contains a URI, not just a URL. Well, I haven’t a clue what that means in practice, but URIs are supposed to be able to indicate resources that aren’t just web addresses.

    If you want to include a plain-English annotation, that’s what the title attribute is for.

  13. Regarding the requirement in HTML 4 for the <tbody> element.

    From the spec:

    <!ELEMENT TBODY O O (TR)+ -- table body -->
    Start tag: optional, End tag: optional

    As the start tag is optional, it is implied in any place where a <tbody> element is needed.

    So in this example:

    <table>
    <tr><td>foo</td><td>bar</td></tr>
    <tr><td>baz</td><td>bop</td></tr>
    </table>

    The browser hits the first <tr> and thinks, “Oh! It’s a table row. Hang on, I haven’t created a tbody, thead, or tfoot yet! OK, create a tbody element to put the row in.” When it hits </table>, a similar process occurs: “Oh! The end of a table! I’d better close the <tbody> element first!”

    This is the same as </p> and </li> being optional. There are conditions which require the element to end before the next bit of code can be interpreted, and it is rather an unholy mess.

    XHTML 1.0 went a long way toward dealing with this by requiring well-formed markup (and disallowing optional end tags in the process). This appears to be taking it a step further and forbidding (at least some) optional start tags.

  14. In HTML 4, the spec says:

    * The TBODY start tag is always required except when the table contains only one table body and no table head or foot sections. The TBODY end tag may always be safely omitted.
    * The start tags for THEAD and TFOOT are required when the table head and foot sections are present respectively, but the corresponding end tags may always be safely omitted.

    So, no, if you just have a single block of table body, and no headers or footers, then you don’t need a <tbody>.

    I thought that in XHTML 1, the same was true, except for the bit about omitting the ending tags.
