I Think We Can All Be Proud

One of the problems with the XHTML 100 Laziness Test was that it was… well, lazy. Rather than simply validating three random secondary pages, a real Laziness Test would spider through all pages. I didn’t bother with this approach for two reasons. First, my version of the Laziness Test was reasonably effective at weeding people out (although not perfect). Second, writing a well-behaved spider is well beyond my rather limited programming skills.

Fortunately, all is not lost. We can build such a spider. We have the technology. Or rather, the Norwegians do. (And isn’t that scary? What else are those Norwegians hiding from us in those fjords? Does the Defense Department know about this?) Anyway. Via Zeldman, via Ben Meadowcroft, via… no, wait, that’s all the vias. Err… via all those people, researcher Dagfinn Parnas reports that 99.29% of the 2.4 million web pages in his sample fail validation.

In truth, Parnas did a much better job measuring general standards compliance than the XHTML 100, which was just a quick-and-dirty survey. Moreover, you can’t compare Parnas’s data directly with the XHTML 100 results. First of all, I was looking at the Alpha Geeks And Their Friends, while Parnas is looking at a much larger and much less geeky population. Second, Parnas is looking at HTML, while I was looking at XHTML only. Finally, Parnas’s analysis is fundamentally different in that he aggregates his data into one pool. For example, consider a site in the XHTML 100 where the first page validates, the first two secondary pages validate, but the last secondary page fails. Parnas would count that as three successes and one failure. I would count that as total success for Test #1 and total failure for Test #2.

But hey — let’s ignore all those distinctions. If you can’t compare apples and oranges on the Internet, where can you compare them? Parnas reports a 99% failure rate. In contrast, I report a Markup Geek failure rate (as measured by Test #2) of a staggeringly low 90%. I think we can all be proud.

If you do bother to read Parnas’s nearly 6 MB PDF paper, and I certainly recommend that you do, be sure to look at the breakdown of the various types of errors. The bulk of the errors (after “no DTD declared”) consist of “non-valid attribute specified” or “required attribute not specified”. Not surprising at all. From my own experience, very few people seem to know that the alt attribute is required or that <img border="0"> is illegal in XHTML 1.0 Strict. The only real puzzler in Parnas’s data was the relatively low fraction of pages with invalid entities. In the XHTML 100, invalid entities were a major killer. I don’t have a good explanation for this discrepancy, but hey.
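To make those two mistakes concrete, here’s a minimal sketch of the fix (the image path is hypothetical): alt is a required attribute on img in every flavor of XHTML, and under 1.0 Strict the presentational border attribute has to move into CSS.

    <!-- Invalid under XHTML 1.0 Strict: no alt attribute, presentational border attribute.
         <img src="/images/photo.jpg" border="0" /> -->

    <!-- Valid: the required alt attribute is present, and the border is handled by CSS. -->
    <img src="/images/photo.jpg" alt="A photo from the archive" style="border: 0;" />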

Parnas concludes:

As we have seen there is little correlation between the official HTML standard and the de-facto standard of the WWW.

The validation done here raises the question if the HTML standard is of any use on the WWW. It seems very odd to have a standard that only 0.7% of the HTML documents adhere to…

A good question. For me, the reason to validate is not ideological. Simply put, validation saves me time. For any page design, there are a huge number of possible glitches across the various browsers. Validation doesn’t reduce the set to zero, but it does make the set a lot smaller. Hey, I don’t know about you, but I need all the help I can get.

Look Ma, Another XHTML Post

The X-Philes grow ever larger. It’s become clear that I need to make some clarifications in how the tests work — that I have to state a couple of things explicitly, and that I might need to tighten up the rules to close some loopholes. In the meantime, we muddle through, as we must. Of particular interest in the latest batch: James Graham, a UK physics student who is embedding MathML in certain pages on his site. (Click to advance through the slides on the MathML page.) Also we must give due respect to St. Raphael Academy, which might very well be the most technologically advanced secondary school website ever. Certainly it beats the pants off of my alma mater’s site, which has lovely bordered frames, scrolling headline text courtesy of the <marquee> tag, and no DOCTYPE, presumably to spare us all from the horror. Of course it is a cash-starved California public school. I blame Gray Davis personally.1

I’ve also added a “Useful Reading” list to the Markup section. This list constitutes a small fraction of the external articles and documents that have been instrumental in shaping my thinking on XHTML and web standards. Of course I can’t begin to list all the articles I’ve read, nor can I provide access to all the personal emails and comments I’ve received of late. My thinking on this issue is still evolving. Still pulling things together. More to come.

1. Apparently so do others. Some people take their school’s website design awfully seriously.

The X-Philes

Whoa, a cluster bomb? I hope not! Sounds dangerous.

Well, it has been rather busy around here. I’ve decided to collect all posts that are even vaguely markup-related and display them in a central repository. I’ve also included a list of sites that pass the XHTML 100 test suite. Again, we’re only testing validation and MIME-types. I’m purposefully ignoring Test #4, the “Why Are You Doing This” Test. You could be one of those rarefied individuals who have actual technical reasons for using XHTML. Or you could be doing it for “softer” reasons: for political advocacy, as a personal learning experience, or simply to prove to yourself that you can do it. It’s all fine as far as I’m concerned.

Note that I tried to add the W3C Markup pages to the list, but failed. The main page validates as XHTML 1.0 Strict and provides the proper MIME-type to Mozilla. However, the second link I happened to grab is valid but serves up text/html. Ditto for the validator.1

The only downside is that on our sidebar we have to say goodbye to guest-blogger Byron Kubert. Byron’s adventures in Norwegian Viking School were gripping, but now he’s back in the States, and he hasn’t posted in months. He’ll still be accessible from the front page, though.2

Final note: I’d like to offer particular congratulations to stalwart young U.K. computer scientist Thomas Pike and his comrade and countryman, Thomas Hurst. Both of them serve up their pages as XHTML 1.1 to browsers that accept application/xhtml+xml and HTML 4.01 Strict to browsers that don’t — tags and everything. Now that, my friends, is real content negotiation. Gentlemen, I salute you.

1. On the plus side, you can validate the validation of the validator. What fun!

2. It’s a good thing Byron spent more time sailing ships and less time learning how to cleave skulls with an axe, or else I’d be a little worried about demoting him.

Put Your Shades Away for a Minute

Matthew Mullenweg writes that aside from his Photolog, his site passes all three tests in the infamous XHTML 100 survey. It does indeed. I now count one, two, now three sites that pass. Four, if you count the W3C’s own XHTML pages. Goodness, the results are getting less horrific by the minute! So, Matt — like Jacques Distler and Beandizzy before you, I salute your diligence. Matt also jokingly suggests that I start up a webring of invalid sites. To that I respond: a webring that constitutes nearly the entire World Wide Web has to be overkill at the very least.

In the meantime, Phil Ringnalda has posted a nice summary of recent XHTML-related events. Phil’s take on XHTML MIME-types boils down to this: he knows quite well that text/html has been given the W3C’s official stamp of glaring disapproval (a “SHOULD NOT”). However, he still chooses to serve up text/html to all browsers for a couple of reasons, which I will summarize and hopefully not mangle:

  1. He’s scraping his own site, and the parsing is easier if he uses valid XML.
  2. If he or someone else posts anything to his site that is invalid, XML-aware browsers will choke.

Therefore, it makes no sense for him to use application/xhtml+xml, because that MIME-type provides no benefit given what he’s actually choosing to do with his XHTML.1

The first part is interesting — I hadn’t considered this possibility. Clearly the oft-touted concept that “XHTML will allow us to use off-the-shelf XML parsers on the web” is an idea straight out of Wolkenkuckucksheim. But Phil is restricting his parsing to a known subset of valid XHTML — his own site. Hard to argue with that.
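To make the self-scraping idea concrete, here’s a minimal sketch in PHP (the URL is hypothetical, and I’m assuming the bundled expat-based XML extension). A strict parser strolls through valid XHTML without complaint and chokes at the first well-formedness error in tag soup:

    <?php
    // A minimal sketch: pull down a page and feed it to PHP's expat-based
    // XML parser. Valid XHTML parses cleanly; tag soup fails at the first
    // well-formedness error, which is exactly why keeping the markup valid
    // makes the scraping easy.
    $page = file_get_contents('http://example.org/blog/');
    $parser = xml_parser_create();
    if (xml_parse($parser, $page, true)) {
        echo "Well-formed -- safe to scrape.\n";
    } else {
        printf("Parser choked at line %d: %s\n",
            xml_get_current_line_number($parser),
            xml_error_string(xml_get_error_code($parser)));
    }
    xml_parser_free($parser);
    ?>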

As for the second part, the someone else is the key bit. It’s hard enough to keep your own XHTML pages valid, but when you have to clean up after other people too… ugh. I don’t have a great answer on this, although Jacques Distler might — he’s been struggling with this issue for weeks now. He’s finally compiled a big piece of the solution all in one place. If you’re an XHTML blogger, you need to read this article.

But Jacques is the type who is willing to hack into the MTValidate plugin to make sure that it validates properly. Not I. All I can do is reiterate what I’ve been mumbling for a while now: XHTML is hard. XHTML was designed to produce code that is stricter and more machine-friendly than its humbler predecessor, HTML. The machines don’t care if the error came from you or from some random person who sent you a Trackback. Either way, the parser will choke.

I’m still “thinking out loud” on this whole XHTML issue, but one thing is becoming clear. Reformulating your site as XHTML requires a major rethinking of everything you are doing. Proceed with caution. Ask yourself why. Phil Ringnalda has a reason: self-scraping. Jacques Distler has a reason: embedding MathML. Put your shades away for a minute: what’s your rationale?

1. I’d argue that Phil is therefore using the correct MIME-type. A) He does have a reason for marking his site up as valid XHTML, but B) he doesn’t really care whether you perceive his site to have any “XML-ness”.

The XHTML 100

In the spirit of Marko Karppinen’s The State of the Validation, here are the results of testing 119 XHTML sites for standards compliance. This is not a rigorous scientific exercise; the methodology had several shortcomings, some of which I detail below.

Most of the sites tested are the personal web pages of the “Alpha Geeks”, an elite group of well-linked web designers and programmers, along with some of their friends. Because these are individuals, I do not plan to “name names” by publishing the exact list of URLs tested. Sorry. However, the general sample group is pretty easy to reconstruct. If you’re the type of person who is interested in XHTML — if you’re the type of person who would waste time reading the rest of this post — just look at your own blogroll, and start validating. Your results should be roughly the same as mine.

This post is divided into three sections: Test Description, Data Collection, and Results.

Test Description

The tests derive from the first three criteria described in an earlier entry. I only tested sites that claimed to be XHTML — in other words, I only validated sites that provided an XHTML DOCTYPE (or something that was trying to be an XHTML DOCTYPE, anyway.) I ignored sites that provided an HTML DOCTYPE or that didn’t have a DOCTYPE at all. It would have been interesting to test HTML 4.01 standards compliance, but that wasn’t what I was interested in.

The “fourth” test described in the earlier entry gets at the question of, “Why are you using XHTML in the first place?” I think this is a good question to ponder… but for this survey I thought it best to focus on the first three tests, which are less philosophical and more straightforward and mechanical.

For the sake of brevity, as soon as a site failed, I stopped applying all further tests. One strike, you’re out.

  • Level 1: The “main” page must validate as XHTML. (“The Simple Validation Test”)

    This test is self-explanatory. Select the home page of the site and run it through the W3C validator. Note that in many cases the page I tested was not a top-level page, but a main journal/weblog page, as in http://domain.com/blog. The distinction doesn’t matter too much. We just want to validate the main entry point to the site… or the page that outsiders tend to link to in their blogrolls, anyway.

    The great majority of XHTML sites failed to pass Level 1.

  • Level 2: Three secondary pages must validate as XHTML. (“The Laziness Test”)

    I designed this test to weed out people who go to the effort to make sure their home page validates… and then simply slap an XHTML DOCTYPE on the top of the rest of their pages and call it quits.

    A “secondary” page is simply another page on the website that is only one or two clicks away from the main page, such as an “About” page, a “Contact” page, or a monthly archive page. These secondary pages often had images, forms, or other elements that were not present on the main page, thus providing a useful test of proper tag nesting and valid attribute usage. If the secondary page lacked an XHTML DOCTYPE I skipped it; if it had an XHTML DOCTYPE, it was fair game.

    Of course, a more thorough test would validate all pages on the site and then characterize the total results (somehow). I chose to validate just three pages. Basically, I figure that if I can quickly select three other pages that all validate, then you’ve done a pretty good job of making sure that your site is in solid shape. Of course, some people will pass this test based on the luck of the draw, and so clearly this test overestimates the number of people who have “perfectly valid” sites. Hey, I’m okay with that.

    The majority of XHTML sites that passed Level 1 failed to pass Level 2.

  • Level 3: The site must serve up the proper MIME-type (application/xhtml+xml) to conforming user agents. (“The MIME-type Test”)

    The “conforming user agent” I used to sniff for the MIME-type was Mozilla 1.3. Mozilla has been around long enough that its ability to handle application/xhtml+xml should be well-known. Furthermore, Mozilla indicates that it can handle the proper MIME-type through its HTTP Accept header. If the site served up text/html to other browsers, that was fine — I was just looking for some acknowledgment of this issue.

    If an author makes it past Test 2, he or she clearly knows a thing or two about XHTML. If he or she then fails Test 3, we can conclude one of two things:

    • The author is ignorant of the spec.
    • The author is willfully ignoring the spec.

    Either way, it’s a failure. XHTML is not simply about making sure all your tags are closed and your attributes are quoted. XHTML might look superficially like HTML, but it is an entirely different beast. Those who know enough to pass Test 2 should know enough to understand the MIME-type as well.

    Anyway, the great majority of XHTML sites that passed Level 2 failed to pass Level 3.

The reasons why you should serve up your XHTML as application/xhtml+xml are well-documented. First and foremost, the spec says so:

The ‘application/xhtml+xml’ media type [ RFC3236 ] is the [emphasis not mine] media type for XHTML Family document types, and in particular it is suitable for XHTML Host Language document types….

‘application/xhtml+xml’ SHOULD be used for serving XHTML documents to XHTML user agents. Authors who wish to support both XHTML and HTML user agents MAY utilize content negotiation by serving HTML documents as ‘text/html’ and XHTML documents as ‘application/xhtml+xml’.

Second, there’s Hixie’s famous article on the matter, which describes why you need to use the proper MIME-type. Personally, I think Hixie is a little too strict. He argues strenuously that serving up XHTML as text/html is wrong, and then relegates to Appendix B the concept of serving up different MIME-types to different user agents: “Some advanced authors are able to send back XHTML as application/xhtml+xml to UAs that support it, and as text/html to legacy UAs…” (A side note: this distinction about “advanced” authors is a little odd. First, as the results demonstrate, XHTML is hard enough that even advanced authors get it wrong most of the time. Second, configuring your server to do some minimal MIME-type negotiation really isn’t that tough. If you’re advanced enough to know what XHTML is, you’re advanced enough to add a few lines to your .htaccess file. Or add a little PHP snippet for your dynamic pages. Et cetera.)

Anyway, without Hixie’s Appendix B, we’re stuck. If you serve up your pages as application/xhtml+xml to all browsers, you’ll run into IE, which chokes on this MIME-type. The only non-suicidal thing to do is to serve text/html to the primitive browsers that don’t understand the proper MIME-type, and application/xhtml+xml to the ones that do.
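For what it’s worth, the dynamic-page version of that negotiation really is only a few lines. Here’s a minimal PHP sketch, assuming a plain substring check on the Accept header is good enough for your purposes:

    <?php
    // A minimal sketch of MIME-type negotiation: send application/xhtml+xml
    // to browsers that advertise support for it in their Accept header, and
    // fall back to text/html for everything else (read: Internet Explorer).
    // This has to run before any page output, since it sends an HTTP header.
    $accept = isset($_SERVER['HTTP_ACCEPT']) ? $_SERVER['HTTP_ACCEPT'] : '';
    if (strpos($accept, 'application/xhtml+xml') !== false) {
        header('Content-Type: application/xhtml+xml; charset=utf-8');
    } else {
        header('Content-Type: text/html; charset=utf-8');
    }
    ?>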

Data Collection

I collected results for 119 XHTML websites. I reviewed about half the sites on April 19, 2003, and the other half on April 20, 2003. I used Mozilla 1.3 to sniff for MIME-types, but for the majority of my testing I used Safari Beta 2, because of its superior speed and tab management. (A side note: for beta software, Safari performed extremely well, humming along smoothly with fifteen or twenty tabs open at once. It did consistently crash on a couple of URLs, which I plan to submit with the bug reporting tool.)

Finding 119 XHTML websites is not quite as easy as it first appears. At first I tried searching Google for terms such as “XHTML standards” or “XHTML DOCTYPE”. But as it turned out, sites that talk about XHTML standards and DOCTYPEs are surprisingly unlikely to be XHTML sites.

I finally hit upon a method that yielded a reasonable percentage of XHTML websites. I went to the blogs of several very well-known bloggers who write about web standards: the “Alpha Geeks”. I then methodically went through their blogrolls. Some observations:

  • This method is likely to overestimate the number of valid XHTML sites. The Alpha Geeks and their friends are among the most tech-savvy people publishing on the web — and furthermore, they have the enormous freedom to tailor their site so that it validates. (Large corporations are for various reasons much more sluggish.)

  • The blogrolls of the Alpha Geeks consisted primarily of fellow Alpha Geeks. There were other sites, of course — news sites, journalist-bloggers, music aficionado-bloggers, bloggers who drive traffic by posting pictures of themselves in their underwear, and so on. But the majority of the links were web standards advocates, web designers, and programmers.

  • Even in this elite crowd, a large percentage of people either didn’t bother with DOCTYPEs or were using HTML DOCTYPEs. I didn’t spend time validating the latter, although it would have been an interesting exercise.

  • A significant fraction of the Alpha Geeks were the so-called “Microsoft bloggers”. Microsoft is doing a pretty good job of getting its employees out there in the Alpha Geek community. Interestingly, nearly all the Microsoft bloggers are using HTML DOCTYPEs. Do they know something the rest of us don’t?

  • One of the more popular blogging tools of the Alpha Geeks was Movable Type. The majority of Alpha Geek MT sites were not using MT’s default templates — usually their MT installation was highly customized. Radio was also a popular choice, although Radio blogs did not contribute significantly to the number of XHTML sites. A few of the Alphas “roll their own” systems (more power to them). Blogger was surprisingly rare, considering its popularity in general — perhaps because it isn’t as customizable as Movable Type. The ground-breaking (but now unsupported) Greymatter was even rarer.

  • Of the XHTML sites, XHTML 1.0 Transitional was the most popular choice by a wide margin. This isn’t too surprising. XHTML 1.0 Transitional is the default DOCTYPE for Movable Type, and it has the added benefit of allowing you to use all sorts of wonderfully semantic tags and attributes such as the <center> tag and the border attribute for images.

Many Alpha Geeks (including some vociferous standards advocates) failed validation very badly, with dozens and dozens of errors of varying types. On the other hand, a few Alpha Geeks came tantalizingly, frustratingly close to validation. Typically this sort of failure would arise on the last page, where the author would make a tiny error such as forgetting to escape a few entities or inserting naked text inside a blockquote. I can certainly understand how these kinds of errors can creep in, no matter how diligently you try to avoid them. (And I can sympathize — the blockquote validation error is a personal bugbear of mine.)

But it doesn’t matter whether I feel bad or not. It doesn’t matter if I think the errors are “small” or “forgivable”. That has absolutely nothing to do with the specs, or the validator…

“Listen! And understand! That Validator is out there. It can’t be bargained with! It can’t be reasoned with! It doesn’t feel pity, or remorse, or fear. And it absolutely will not stop, EVER… until you are validated!”

And, umm, on that note, let’s get to the results.

Results

Of the 119 XHTML sites tested:

  • 88 sites (74%) failed Test 1 (“Simple Site Validation”).
  • 18 sites (15%) passed Test 1, but failed Test 2 (“The Laziness Test”).
  • 12 sites (10%) passed Test 2, but failed Test 3 (“The MIME-type Test”).
  • Leaving us with one site (1%) that managed to pass all three tests.

I know I promised not to name names, but I must make an exception. For the one man in the entire set who passed all three tests, let’s hear it for… beandizzy! Yay beandizzy! At the time of this writing, beandizzy is reformulating his design — but as of a week ago, his site validated perfectly and served up the right MIME-type. So congratulations, beandizzy. You have beaten the elite of the elite. You stand alone on the mountain top. (Well, there might be the occasional string theorist standing alongside you — but really, physicists are best ignored.)

As for the rest, the results speak for themselves. Even among the elite of the elite, the savviest of the savvy, adherence to standards is pretty low. Note that this survey most likely overestimates adherence to XHTML standards, since you would expect the Alpha Geeks to rate high on XHTML standards comprehension.

Also, I have to admit that I grew rather emotionally invested in the test process. I figured twenty sites would be enough to get at least one compliant site. When that failed, I went on to 40, 60, … amazed that not one site had passed. By the time I reached beandizzy’s site (#98) I was pretty drained. I surveyed the rest of the blogroll I was on and then gave up. So again, this survey most likely overestimates XHTML standards adherence, because I quit soon after I got one success.

Conclusions are forthcoming. But there’s one thing that’s clear right off the bat: XHTML is pretty damn hard. If the Alpha Geeks can’t get it right, who can?

Reload Assiduously

No, no, no. The plan all along was to secede from Southern California, not to secede from the entire country. Sheesh. Of course, some folks seem to think they’ve already seceded…

So I don’t read the Volokh Conspiracy, but my old friend and personal attorney1 Eric Stenberg writes to inform me about an interesting article on free speech and violent video games by Prof. Eugene Volokh. The article includes a lengthy excerpt of an opinion by Judge Posner of the Seventh Circuit.

Eric says that in law school he and his classmates read a great deal of Posner, and that he (Judge Posner, not Eric) is considered one of the most important legal scholars in the United States today. Judge2 for yourself:

Zombies are supernatural beings, therefore difficult to kill. Repeated shots are necessary to stop them as they rush headlong toward the player. He must not only be alert to the appearance of zombies from any quarter; he must be assiduous about reloading his gun periodically, lest he be overwhelmed by the rush of the zombies when his gun is empty.

And all this time I thought law school would be a drag. Eric baby, where do I sign up?

Finally: I want to apologize to my friends and family for all the geek stuff in the last couple of posts. I’ve recently been quite curious about the status of XHTML on the current World Wide Web. In fact, I spent a fair chunk of Sunday and Monday evening scouring the web for XHTML websites and subjecting them to the validation tests described earlier.

The results were, shall we say, not pretty. But I’m still collecting my thoughts on the matter. Assiduously reloading my shotgun, as it were. So just fair warning: there’s going to be more on this forthcoming. Probably a lot more. I know that you, my loved ones, couldn’t care less about web standards, so please bear with me for now. It seems we’re all going to have to suffer together.

1. Well, the guy I tend to pester with legal questions, anyway.

2. Tee-hee!

We Have a Winner!

So the XHTML2 people have judged poor little old HTML and found it wanting. XHTML1 isn’t good enough for them either, because it’s really just a reformulation of HTML, and neither language is “semantic enough”. The Semantic Web is not here yet, but through the technological solution of XHTML2, we’re going to make it happen.

Given that the plan is to make the web a better place by pushing out a “better” markup language, it behooves us to ask: how are we doing on current standards compliance? No, really, stop chuckling, it’s a serious question. XHTML 1.0 has been out for over three years, and the percentage of sites that conform to the XHTML specification is not noticeably different from zero.1 A naive person might wonder whether this failure of XHTML 1 to gain traction could pose a problem.

Then — a revelation. Through rather mysterious circumstances, I found myself at the website of Jacques Distler. After browsing his site for a while, I have come to the inescapable conclusion that Jacques Distler may well be the only person on the planet who understands the XHTML 1 specification and uses it properly.

That is, Distler’s site:

  1. Passes the “Simple Validation Test”: The home page validates as XHTML. (Most sites with a DOCTYPE of XHTML or HTML fail this test.)

  2. Passes the “Laziness Test”: In this test, we try validating two or three random pages deeper in the site. (This test is meant to filter out people who lazily slap a DOCTYPE on all their pages but only bother to actually validate their home page.)

  3. Passes the “MIME-type Test”: The server is serving up the pages with a MIME-type of application/xhtml+xml, in accordance with the specification.2 (All together, now: “Serving up XHTML as text/html is evil.” Good.)

  4. Passes the (somewhat ad-hoc) “Cluefulness Test”: Distler is actually using XHTML to do something that you can’t do using plain old HTML. Namely, he’s embedding valid MathML markup in order to display equations directly in his blog. (If you’re going to go to all the trouble of getting your site to validate as XHTML, you really ought to do something with it, yes?)

So there you have it folks. One man is knowledgeable enough to implement our latest web standards correctly. The fact that this particular man is a theoretical physicist specializing in string theory and quantum field theory should not dissuade us in the slightest that the technological solution that the W3C has chosen will ultimately prevail.

1. In fact, after some preliminary investigation on my own, I began to worry that the absolute number of sites that conform to the XHTML specification is, in fact, exactly zero. More on this later.

2. Note that Distler’s site comes awfully close to failing the MIME-type test. Distler is serving up his pages as application/xhtml+xml to all browsers that support it (such as Mozilla) and text/html to browsers that don’t, such as Internet Explorer. According to the specification, XHTML 1.0 documents “MAY” under certain circumstances be served up as text/html, but XHTML 1.1 documents “SHOULD NOT”. To wit: “In particular, ‘text/html’ is NOT suitable for XHTML Family document types that adds elements and attributes from foreign namespaces, such as XHTML+MathML.” But we’ll give him a pass on this one.

3. Note that serving application/xhtml+xml to some browsers and text/html to other browsers is the only realistic way to follow the specification as of this date. Serving text/html to all user agents is plainly wrong. How about serving up application/xhtml+xml in all cases? While admirably purist, this approach is suicidal, because IE6 chokes on that MIME-type.

Well, not ALL standards are crap

I realize that in my last entry, I never really answered the opening question, “Are Standards Crap?” If you couple my failure to answer the question with my disparaging remarks about forward compatibility, standards, and XHTML 2… you might come to the conclusion that I think standards are crap.

Well, I won’t offer a non-apology apology — I take full responsibility for my lack of clarity. So here’s what I really think. Some standards are crap, and some are fine, or even good. HTML 4.01 has its quirks, but on balance it’s a fine standard. CSS 1 and 2 are excellent, and I’m eagerly looking forward to version 3. XHTML 1, I’m less thrilled about. Still debating this one internally.1

And as for XHTML 2? Well, let’s have no ambiguity here. Check out this gem from the public www-html lists:

From: “Ernest Cline”
To: www-html@w3.org, www-html-editor@w3.org
Date: Wed, 09 Apr 2003 20:31:25 -0400
Subject: [XHTML2] Poor little old <a>

Let’s face it. There is very little purpose that <a> serves in the
current working draft. About the only thing it still has going for it
is that links specified by <a> are still supposed to look like links…

The thread continues with a chorus of yeas.2 Not one dissenting voice. Heck, why keep the poor little old <a> tag, anyway? Oh, I dunno. How about because the <a> tag has been the primary linking mechanism in HTML ever since the very first version of the language, you blithering…

Arrrrgh. Sorry about that. It just irks me to no end that the XHTML 2 folks are not only purposefully breaking backward compatibility (which is bad enough)… but they seem to be taking a gleeful pleasure in the destruction of the old. What gives? Who let these people in, anyway? Who gave these people the keys to the vault, and why aren’t the soldiers guarding the entrance?

This would all be less troublesome if only the XHTML 2 people would listen to Zeldman’s comments from a few months ago and name their specification something else. “Advanced Markup Language”, or “Semantic Markup Language”, or something like that.3 Then we could proceed with good conscience.

But instead, here’s what’s going to happen. In less than a year, the XHTML 2 people are going to push their vision of the Semantic Web out on us. Thousands of web designers are going to jump on the bandwagon and unthinkingly start slapping XHTML 2 DOCTYPEs on their websites. The browser makers will ignore the new standard. Or mis-implement it. Or divert valuable resources that should be used to improve the current standards. And thus, in five or six years we’ll have a mishmash of incompatible browsers and tag-soup pages pretending to be XHTML 2. By that time, the designers of XHTML 2 (by then XHTML 2.2) will be bored and champing at the bit to unleash XHTML 3 on the world. Wheeee.

What a train wreck.

1. Right now I’m thinking XHTML 1 qualifies as “Mostly Harmless”.

2. The thread continues with a digression into whether one could keep the <a> tag around if it was used for nesting links like the (currently useless) <object> tag, along with an off-topic discussion of whether to dump the <acronym> tag.

3. Of course those are bad names too, because they imply that HTML isn’t advanced and isn’t semantic. But you get the idea.

Are Standards Crap?

Mark Pilgrim thinks so — or at least if he doesn’t think so, a few months ago he was frustrated enough to say so out loud. More recently, Peter-Paul Koch argues that web standards and forward compatibility are not one and the same thing.

Koch’s argument confuses Jeffrey Zeldman, author of the forthcoming Forward Compatibility: Designing & Building With Standards. This confusion is perhaps understandable, given Koch’s rather contorted English constructions. So here’s my take on it. Forward compatibility means, “will my website of today perform well in the browsers and devices of tomorrow?” Koch is asking, “If I follow web standards, will I achieve forward compatibility?” Koch’s answer is a resounding no.

People who believe the answer is “yes” are assuming that the browser-makers will implement web standards properly. But time and time again, the browser-makers have failed to do this. Ask Mark about the OBJECT element, or XHTML Basic. Better yet, ask Anil Dash. He’ll give you an earful.

Presumably the “yes” people fall into one of two camps:

  1. The strict interpretation: Web Standards = Forward Compatibility. Since the strict interpretation is contraindicated by vast reams of physical evidence from the real world, this one is safely ignored. We can only marvel at the stubborn persistence of this belief, which serves as definitive proof of the power of articles in glossy magazines and hip web publications to bend and warp reality.

  2. The loose interpretation (the “faithful” interpretation): Web Standards != Forward Compatibility… but they will be, at some unspecified point in the future. This belief at least takes the current situation into account, and couples it with a sweet, childlike faith that things will be better in the future.

    That is to say, despite all historical evidence to the contrary, and despite the almost total absence of market forces that would push things in this direction, all the browser-makers will eventually realize the error of their ways and repent. And yea, they shall smite the wicked users of table-based layouts with brimstone and fiery ash, and the righteous shall be redeemed. By their XHTML2 DOCTYPEs shall ye know them.

Meanwhile, Zeldman has a sensible, practical take on XHTML2:

Regardless of the 2 in its name, XHTML 2 will not make XHTML 1 obsolete. Browsers will not stop supporting XHTML 1. Designers will never have to use XHTML 2. Those who find it beneficial will adopt it. Those who don’t, won’t.

That many designers might never use the emerging specification does not seem to bother most of the framers of XHTML 2, nor does it seem to make them question the value or practicality of what they are creating.

If one of the driving forces behind the Web Standards Project can say such things about the latest, greatest version of HTML, then it behooves us all to be wary. So I’ve started reading the public email lists for XHTML2. I’m still formulating my thoughts on XHTML2, but my initial take is that it is a real chamber of horrors. I’d like to write more, but I must cool my boiling blood and be off for Passover Seder.

More to come. Happy Pesach.

Mythology

This year I decided to keep my birthday low-key.

Oh sure, it would have been fun to throw yet another orgiastic, decadent, cocaine-fuelled party that would have put Studio 54 in its heyday to shame. But at some point, you have to look back on the trail of broken hearts, the trashed hotel room suites, the days-long drug-induced blackouts, and the five illegitimate children scattered around the Western hemisphere (with a sixth on the way) and say damnit, enough is enough!

So low-key it was. I even tried to discourage gifts. “I’m only accepting socks this year,” I said. I thought that was pretty clever, but I failed to scare off Shauna, who went right out and got me some really nice socks. They’re black, non-dressy, and very comfortable — neither scratchy nor sweat-inducing. And of course, my teenaged sister Sarah rebelled immediately. “You sound just like Dad!” she said, exasperated. Sarah got me six wine glasses, which is excellent, because now I don’t have to serve wine to my guests in coffee mugs. The House of Goer is class all the way, baby! Finally, Mom got me Joe Clark’s Building Accessible Websites. I’ve already read most of it, and let me tell you, it is a fabulous book, probably up there with Chuck Musciano and Bill Kennedy’s opus. Maybe better.

I like Clark’s book for two reasons. First, he is pleasantly direct:

If you use the HTML Tidy authoring tool-cum-validator from the W3C, you’ll be stuck with error messages for every layout table you write that lacks a summary attribute, whose majesty will be fully revealed in mere moments. If that happens to you, adding summary="" to your table is legal and will shut the validator up.
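In other words, a minimal sketch, assuming a purely presentational two-column layout table:

    <!-- An empty summary attribute marks this table as layout rather than data,
         and it keeps HTML Tidy from complaining about a missing summary. -->
    <table summary="">
      <tr>
        <td>navigation</td>
        <td>main content</td>
      </tr>
    </table>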

Second, there are numerous myths in the web design community regarding web accessibility, and Clark wastes no time in puncturing them. For example, the difference between Clark and those who blithely assert that “table-based layouts are useless for blind people using screen readers” is that Clark has actually tried using screen readers. (And surprise! They handle most table-based layouts just fine.)

Of course that particular myth never made much sense anyway. After all, it’s almost impossible to find a major commercial site that doesn’t use table-based layouts. So what kind of silly software company would try selling a “voice browser” that chokes on nearly every single commercial site on the web? I think this myth is so popular because of its “clubbability”. If you’re having an argument with someone over whether to use a CSS-based layout or a table-based layout, the screen reader myth is great for clubbing your opponent over the head. “Your table-based layout will make it impossible for those poor, poor blind people to read your website! You must hate blind people! Jerk!”

As Grandma Harman used to say, “whether you’re wrong or right, it’s always useful to hold the moral high ground.” Well, okay, actually she said, “Never draw to an inside straight” — advice I have foolishly ignored for years.
