In the last installment of XHTML2 Explorations, we touched on how XHTML2’s attribute collections provide a rich variety of behaviors for nearly all elements. Today’s installment focuses on the individual elements themselves. As with Part I, this post does not discuss “well-known” XHTML2 concepts.[1]
New Paragraph Model
XHTML2 gives us a new twist on our old friend the <p> tag. In previous versions of HTML and XHTML, you could not nest other block elements inside a paragraph. Now you can nest any block element except for another paragraph. In other words, you can now consider a list or table to be semantically part of a paragraph.
This leads to an interesting result (first brought to my attention by Jacques Distler). Consider the following (invalid) markup[2]:
<p style="color: red">
  Why is Gordon better for Dana than Casey?
  <ul>
    <li>Knows mayor Giuliani</li>
    <li>Can dress self w/o assistance from Wardrobe</li>
    <li>Obvious physical prowess</li>
    <li>Two!! post-graduate degrees</li>
  </ul>
</p>
Since a paragraph can’t contain lists, a standards-compliant browser would end the paragraph just before the start of the unordered list. Thus the words, “Why is Gordon…” would be colored red while the list items would remain unchanged.
However, under XHTML2’s new paragraph model this code is valid, and everything inside <p> and </p> would be red. I have to admit that from a coding standpoint, I like this. If nothing else, it matches my naive expectations a little better. (“Hey, why isn’t my list red?” the newbie web designer wonders…)
From a semantic standpoint, I’m a little less sure about this. I usually don’t think of my tables as being nested inside my paragraphs. Neither does FrameMaker, for that matter. Is FrameMaker broken? Am I broken? Well, maybe. The good thing about this model is that it’s totally optional: you can nest stuff inside paragraphs, or not. Works for me.
Scripting
You can now nest <script> elements and process them like the <object> element. If the browser understands the parent script, it executes it; otherwise it goes on to the child script(s). It’s a nifty model, albeit one that is not backwards compatible. Whoops, I forgot, it’s not fair for us to harp on that. Sorry. Anyway, the model itself looks good, particularly if you want to script with multiple languages. The spec does take great pains to mention that there are scripting languages besides JavaScript, such as… type="text/x-perl". Hmmm.
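For the curious, here is a rough sketch of what that fallback chain might look like. The file names are hypothetical, and the attribute names (type, src) are my assumption based on the draft, so treat this as an illustration rather than spec-approved markup.

```xml
<!-- The outer script is tried first. File names are hypothetical. -->
<script type="text/x-perl" src="greeting.pl">
  <!-- Fallback: only considered if the Perl script cannot be processed. -->
  <script type="text/javascript" src="greeting.js"/>
</script>
```

The nesting order is the preference order: a browser that can run the outer script never looks at the inner one.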
There is also a new declare attribute for both scripts and objects. This boolean attribute specifies whether the script or object is a “declaration only”, meaning that it is not to be processed until the user initiates some sort of action. And speaking of actions, there’s a brand-new events model defined by the XMLEVENTS standard (which is pleasantly short and easy to read). XMLEVENTS allows you to set any element as the observer or handler for standard DOM events. It also provides a generic <listener> element that can pass events off to handlers. The XMLEVENTS spec looks to be far more comprehensive and flexible than our current model, which involves slathering our code with onmouseover attributes and whatnot. The only catch is that XMLEVENTS has totally replaced the existing model, which means our current scripts all just went POOF! Argh, there I go again…
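To give a feel for the two styles the events spec offers, here is a minimal sketch. The ids (vote-button, cast-vote, popper) and the choice of the DOMActivate event are placeholders of mine; the element, attributes, and namespace are the ones I understand the XML Events spec to define.

```xml
<!-- Style 1: a standalone listener that wires an observer to a handler.
     "vote-button" and "cast-vote" are hypothetical ids. -->
<ev:listener xmlns:ev="http://www.w3.org/2001/xml-events"
             event="DOMActivate"
             observer="vote-button"
             handler="#cast-vote"/>

<!-- Style 2: the same idea expressed as attributes on the observer itself. -->
<a xmlns:ev="http://www.w3.org/2001/xml-events"
   href="doc.html"
   ev:event="DOMActivate"
   ev:handler="#popper">The document</a>
```

Either way, the event wiring lives in declarative markup rather than in an onmouseover-style attribute holding script text.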
The spec also declares the death of document.write:
Note that because of the XML processing model, where a document is first parsed before being processed, the form of dynamic generation used in earlier versions of HTML, using document.write cannot be used in XHTML2. To dynamically generate content in XHTML you have to add elements to the DOM tree using DOM calls [DOM] rather than using document.write to generate text that then gets parsed.
Well, that’s fair enough. The W3C seems to be saying, “Look guys, this is XML here. Not HTML, not some halfway-step that decays back into friendly tag-soup if you make a mistake; this is the real stuff here.” I suppose the real question is whether they’re going to forbid XHTML2 to be served up as text/html. I can’t find any discussion of MIME-type issues in the current spec, so it’ll be interesting to hear the W3C’s final decision on this.
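As for what “add elements to the DOM tree using DOM calls” looks like in practice, here is a minimal sketch. It is written as an XHTML 1.x page, since that is something today’s browsers can parse as XML; the "output" id and the message text are made up for the example.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<html xmlns="http://www.w3.org/1999/xhtml">
  <head>
    <title>DOM calls instead of document.write</title>
    <script type="text/javascript"><![CDATA[
      // Runs once the whole document has been parsed.
      window.onload = function () {
        var XHTML_NS = "http://www.w3.org/1999/xhtml";
        // Build the paragraph as DOM nodes rather than as markup text.
        var p = document.createElementNS(XHTML_NS, "p");
        p.appendChild(document.createTextNode("Generated with DOM calls."));
        // "output" is a hypothetical container id defined in the body.
        document.getElementById("output").appendChild(p);
      };
    ]]></script>
  </head>
  <body>
    <div id="output"></div>
  </body>
</html>
```

The script never emits text that has to be re-parsed; it only creates nodes and attaches them to the tree.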
Miscellany
- There’s a new <quote> element for marking up inline quotes. Unlike its ill-fated predecessor <q>, the <quote> element does not automagically insert localized quotation marks. “We give up,” the W3C is saying. “Insert your own damn quotation marks.”
- Tables are relatively untouched. The one noticeable change is that the summary attribute is now an element, presumably to provide a facility for richer descriptions. The spec makes no recommendations for how to display this content in visual browsers. I think we can assume that the default should be display: none, but you never know.

  I also noticed that the spec requires tables to have one or more <tbody> elements… but as it turns out, this requirement dates all the way back to HTML4! I must say this comes as quite a shock, given that the validator happily accepts pages with <tbody>-free tables as HTML 4.01 Strict. I’m looking at the spec again right now, and I’m still a bit freaked out over this. Do I not know how to read the spec? Am I hallucinating because I skipped lunch in preparation for the big Fourth of July BBQ? We report, you decide. (Edit: it turns out I can’t read the spec. Never post while under the influence of carbohydrate deprivation.)
- Finally, Ruby text is now a standard module. This is good news for hundreds of millions of Asian-language-speaking users… at least, I think. Actually, this raises the question: if Ruby text is a fundamental component of Asian typography, why are we forcing Asian users to go all the way to XHTML2 to use it? It seems like it would be useful and straightforward to retrofit Ruby text onto HTML. Then again, since HTML is officially a dead specification, the point is moot.
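In case Ruby text is unfamiliar, here is a small sketch of simple ruby markup as I understand it from the Ruby Annotation spec. The base text and annotation are just sample content.

```xml
<ruby>
  <!-- rb holds the base text being annotated. -->
  <rb>WWW</rb>
  <!-- rt holds the annotation; the rp elements supply fallback parentheses
       for user agents that cannot render ruby. -->
  <rp>(</rp><rt>World Wide Web</rt><rp>)</rp>
</ruby>
```

A ruby-aware browser would set “World Wide Web” in small type alongside “WWW”, while other browsers would presumably fall back to something like “WWW (World Wide Web)”.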
Conclusions
Errr… who needs conclusions? It’s barbeque time. Happy Fourth!
1. My definition of “well-known” is “I remember hearing about it vaguely on someone’s blog somewhere.”
2. As for whether the pro-Gordon argument is as invalid as the markup… well, I leave that as an exercise for the reader.
Oooh eck.
I slipped up and used the wrong comment link.
See https://www.goer.org/CMS/mt-comments.cgi?entry_id=304 for my comment. (It starts “Regarding the requirement in HTML 4 for the <tbody> element.” if you want to use Find-in-this-page on your browser)
In HTML 4, the spec says:
* The TBODY start tag is always required except when the table contains only one table body and no table head or foot sections. The TBODY end tag may always be safely omitted.
* The start tags for THEAD and TFOOT are required when the table head and foot sections are present respectively, but the corresponding end tags may always be safely omitted.
So, no, if you just have a single block of table body, and no headers or footers, then you don’t need a <tbody> tag.
I thought that in XHTML 1, the same was true, except for the bit about omitting the ending tags.
Whew! I was just looking at the “+” sign and reading that as a simple “one or more”.