Matthew Mullenweg writes that aside from his Photolog, his site passes all three tests in the infamous XHTML 100 survey. It does indeed. I now count one, two, three sites that pass. Four, if you count the W3C’s own XHTML pages. Goodness, the results are getting less horrific by the minute! So, Matt — like Jacques Distler and Beandizzy before you, I salute your diligence. Matt also jokingly suggests that I start up a webring of invalid sites. To that I respond: a webring that constitutes nearly the entire World Wide Web would be overkill, to say the least.
In the meantime, Phil Ringnalda has posted a nice summary of recent XHTML-related events. Phil’s position on XHTML MIME-types boils down to this: he knows quite well that text/html has been given the W3C’s official stamp of glaring disapproval (a “SHOULD NOT”). However, he still chooses to serve up text/html to all browsers for a couple of reasons, which I will summarize and hopefully not mangle:
- He’s scraping his own site, and the parsing is easier if he uses valid XML.
- If he or someone else posts anything to his site that is invalid, XML-aware browsers will choke.
Therefore, it makes no sense for him to use application/xhtml+xml, because that MIME-type provides no benefit given what he’s actually choosing to do with his XHTML.1
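For sites that do want to go the other way, the usual trick is to negotiate on the HTTP Accept header: send application/xhtml+xml only to browsers that explicitly claim to understand it, and text/html to everything else. A minimal sketch in Python — the function name is mine, not anything Phil actually runs:

```python
def pick_mime_type(accept_header):
    """Choose a Content-Type for an XHTML page.

    Serve application/xhtml+xml only to clients that explicitly
    advertise it in their Accept header; fall back to text/html
    for everyone else, including old browsers that send no
    Accept header at all.
    """
    if "application/xhtml+xml" in (accept_header or ""):
        return "application/xhtml+xml"
    return "text/html"
```

A Mozilla-era browser sending `Accept: text/html,application/xhtml+xml,*/*` would get the strict MIME-type; IE6, which never advertised it, would get plain text/html.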
The first part is interesting — I hadn’t considered this possibility. Clearly the oft-touted concept that “XHTML will allow us to use off-the-shelf XML parsers on the web” is an idea straight out of Wolkenkuckucksheim. But Phil is restricting his parsing to a known subset of valid XHTML — his own site. Hard to argue with that.
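To make the trade-off concrete, here is a sketch with Python’s standard-library XML parser (the markup is invented for illustration). Valid XHTML really is trivially machine-readable; one stray unclosed tag and the same parser rejects the entire document:

```python
import xml.etree.ElementTree as ET

# Valid XHTML: an off-the-shelf XML parser handles it happily.
valid = ('<html xmlns="http://www.w3.org/1999/xhtml">'
         '<body><p>Hello</p></body></html>')
root = ET.fromstring(valid)
ns = {"x": "http://www.w3.org/1999/xhtml"}
paragraphs = [p.text for p in root.findall(".//x:p", ns)]

# One HTML-ism -- an unclosed <br> -- and the parser refuses
# the whole document, not just the offending element.
invalid = '<html><body><p>Oops<br></p></body></html>'
try:
    ET.fromstring(invalid)
    problem = None
except ET.ParseError as err:
    problem = str(err)
```

That all-or-nothing behavior is exactly why validity matters so much more once real XML tools enter the picture.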
As for the second part, the “someone else” is the key bit. It’s hard enough to keep your own XHTML pages valid, but when you have to clean up after other people too… ugh. I don’t have a great answer on this, although Jacques Distler might — he’s been struggling with this issue for weeks now. He’s finally compiled a big piece of the solution all in one place. If you’re an XHTML blogger, you need to read this article.
But Jacques is the type who is willing to hack into the MTValidate plugin to make sure that it validates properly. Not I. All I can do is reiterate what I’ve been mumbling for a while now: XHTML is hard. XHTML was designed to produce code that is stricter and more machine-friendly than its humbler predecessor, HTML. The machines don’t care if the error came from you or from some random person who sent you a Trackback. Either way, the parser will choke.
I’m still “thinking out loud” on this whole XHTML issue, but one thing is becoming clear. Reformulating your site as XHTML requires a major rethinking of everything you are doing. Proceed with caution. Ask yourself why. Phil Ringnalda has a reason: self-scraping. Jacques Distler has a reason: embedding MathML. Put your shades away for a minute: what’s your rationale?
1. I’d argue that Phil is therefore using the correct MIME-type. A) He does have a reason for marking his site up as valid XHTML, but B) he doesn’t really care whether you perceive his site to have any “XML-ness”.