Orthodoxies

I keep my RSS subscriptions fixed at a manageable 20; Jeffrey Zeldman’s got a permanent spot on my list. Zeldman is a superb designer and a hell of a writer. I love Zeldman bunches.

And yet as often as I find his comments on web standards illuminating, occasionally I’m forcibly reminded that deep down Zeldman is a Technology Evangelist. Evangelists are great at efficiently spreading new ideas throughout the community… but the flip side is that they often have trouble departing from certain orthodoxies.

For a prime example of this, see Zeldman’s Web Design World keynote. There’s plenty of good stuff in the keynote overall, but this particular slide needs addressing:

  1. “XHTML is XML that acts like HTML in browsers.”

    Better to replace “acts like” with “is”, given that nearly all “XHTML” websites are actually parsed as HTML by all currently available browsers. (I suppose there are a rarefied few sites that actually can be parsed as XML by certain highly advanced browsers, but this hardly counts as statistically significant.)

  2. “It also works as expected in most Internet devices (Palm Pilots, web phones, screen readers).”

    There are some who would dispute that. Apparently many real-world mobile devices happily support old-skool HTML cruft (table-based layouts, the <font> tag) while ignoring the products of our more enlightened era (XHTML Basic, CSS). Weird but true.

  3. “It’s as easy to learn and use as HTML.”

    Except you have to teach people about closing tags, proper nesting, encoding all your ampersands, and so on. Meanwhile, people who feel like writing tag soup HTML can just whip out a text editor and go. (Then again, considering the tremendous popularity of tag-soup XHTML on the web today, maybe this is a distinction without a difference.)

  4. “It’s the current recommended markup standard (replaces HTML 4).”

    The closest the spec comes to saying this is, “The XHTML family is the next step in the evolution of the Internet. By migrating to XHTML today, content developers can enter the XML world with all of its attendant benefits, while still remaining confident in their content’s backward and future compatibility.” That’s a fine marketing blurb, but it’s not an official announcement that HTML 4.01 is deprecated.

  5. “Because it’s XML, it works better with existing XML languages and XML-based tools (SVG, SOAP, RSS, SMIL, XSL, XSLT, databases, etc.).”

    This is pretty hand-wavy. On the server side, you can easily transform your backend data into valid XHTML, but you can also transform it into valid HTML 4.01 Strict just as easily. Heck, some people do both. On the client side, there is a tiny, tiny fraction of super-geeks who embed SVG or MathML directly in their valid and properly MIME-typed XHTML pages. These super-geeks are the only people in the world who have to present outward-facing XHTML; the rest of us are just fooling around.

  6. “It brings consistency to web documents that was missing in HTML.”

    I assume this means “more consistency in coding style.” XHTML theoretically enforces more consistency due to its more rigorous syntax… but since the vast majority of people can’t be bothered to produce valid XHTML code, this benefit is somewhat obviated.

  7. “It’s a bridge to more advanced (future) XML languages and applications and perhaps to more advanced versions of itself (XHTML 2?).”

    Heh.
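To make the well-formedness rules from point 3 concrete, here is a minimal sketch in Python. It assumes the standard library's `xml.etree.ElementTree` as a stand-in for a strict XML parser; the helper name `well_formedness_error` and the sample markup are made up for illustration. The idea is simply that an XML parser refuses markup that a forgiving HTML parser would quietly accept:

```python
import xml.etree.ElementTree as ET


def well_formedness_error(markup: str):
    """Return None if the markup parses as XML, else the parser's error text.

    Hypothetical helper: it checks well-formedness only, not validity
    against any XHTML DTD.
    """
    try:
        ET.fromstring(markup)
    except ET.ParseError as err:
        return str(err)
    return None


# Tag soup: bare ampersand, unclosed <br>, and badly nested <b>/<i>.
tag_soup = "<p>Fish & chips<br><b><i>bold italic</b></i></p>"

# The same content with the ampersand escaped, <br/> closed, nesting fixed.
clean = "<p>Fish &amp; chips<br/><b><i>bold italic</i></b></p>"

print("tag soup:", well_formedness_error(tag_soup))
print("clean:   ", well_formedness_error(clean))
```

The parser stops at the first problem it meets, so fixing tag soup tends to be an iterative job: repair one error, re-run, find the next.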

The presentation then goes on to cite nine websites that are using structured XHTML and CSS, including some big names such as ESPN and Wired. None of the sites serve up their pages as application/xhtml+xml to browsers that accept this MIME-type, which means that each site is being treated as… you guessed it, good old-fashioned HTML. And that’s a damn good thing, because four of the sites have invalid home pages, and four others have invalid secondary pages just one click away. The only site diligent enough to pass the “Laziness Test” is the CSS Zen Garden. (To his further credit, the creator of the Zen Garden is considering the MIME-type issues as we speak.)
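Serving application/xhtml+xml only to browsers that accept it comes down to a small piece of content negotiation on the `Accept` request header. The following is a rough Python sketch, not any particular site's implementation: the function name `choose_content_type` is hypothetical, the header parsing is deliberately simplified, and it assumes that a bare `*/*` should not count as accepting XHTML, since a browser that sends only the wildcard may not actually handle it.

```python
def choose_content_type(accept_header: str) -> str:
    """Pick a Content-Type for an XHTML page from the Accept header.

    Simplified sketch: returns application/xhtml+xml only when the header
    names that type explicitly with a non-zero q-value. Wildcards such as
    */* are ignored on purpose (an assumption, see the text above).
    """
    for item in accept_header.split(","):
        parts = [part.strip() for part in item.split(";")]
        media_type = parts[0].lower()
        quality = 1.0  # q defaults to 1 when it is not given
        for param in parts[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    quality = float(value)
                except ValueError:
                    quality = 0.0  # treat a malformed q-value as "not accepted"
        if media_type == "application/xhtml+xml" and quality > 0:
            return "application/xhtml+xml"
    return "text/html"


# Illustrative header values, not captured from real browsers.
explicit = "text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,*/*;q=0.5"
wildcard_only = "image/gif, image/jpeg, */*"

print(choose_content_type(explicit))
print(choose_content_type(wildcard_only))
```

Of course, the moment the first branch is taken, the page had better be well-formed, which is the whole point of the “Laziness Test”.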

Anyway. The point is not that cutting edge designs are bad; they’re not. The point is not that “Evan hates XHTML”; I don’t. XHTML allows you to do some amazing things that you can’t do with HTML. Unfortunately, due to the dreadfully primitive state of XML browsers and tools, there’s really nobody using XHTML for anything that you can’t do with HTML.[1]

The real problem is that XHTML is being touted as a replacement for HTML. It’s not. XHTML is a different technology that suits different purposes. There are a lot of influential people who are blurring this distinction, and I’d like to think that they know better.

1. Except for a few oddball physicists, mathematicians, and chemists, but who’s counting them?

17 thoughts on “Orthodoxies”

  1. Zeldman seems to subconsciously equate HTML4 with Table-based layout. He’s just happy these sites are using CSS. I don’t think he knows or cares whether they are using the correct MIME-type (for instance).

    There’s another — in principle — good reason for generating XHTML: Accessibility.

    If you really were producing valid X(HT)ML, it could be processed (using XSLT, for instance) to make it more useful to blind and disabled surfers.
    Indeed, all the Accessibility sites (including the W3C) have sanctioned MathML as the *de jure* Accessible way to put math on the web.

    In principle, someone could write an XSLT stylesheet to convert MathML into spoken equations, the way ASTER does for LaTeX. In fact, all the projects to do so are moribund, and the world’s most popular screen reader *sits on top of* Internet Explorer.

    Which is to say, it doesn’t really do XHTML at all.

    On the other hand, with XHTML (and the right MIME-type), you would already have noticed the 3 control characters (from cutting and pasting quotations in the browser window) and 12 unescaped ampersands (in URLs) on this page.

  2. Good Lord, I fixed those problems last night. Then I made some subsequent minor edits, but good ol’ MT had refreshed the previous errors into the edit box. Sigh. Must remember: “reload and *then* edit, reload and *then* edit.” Maybe I *should* switch to XHTML with the right MIME type. 🙂

    I’m not exactly sure what Zeldman thinks. All I know is that he’s busily spreading the orthodox meme, “XHTML is newer than HTML, therefore it’s better.” As opposed to “XHTML is different from HTML, and has different advantages and drawbacks.”

    Anyway, your comment on accessibility and MathML is particularly apropos, as this technique will really only work with perfect MathML (optionally embedded inside perfect XHTML). This is useful functionality, but it’s for geeks, not regular mortals. ESPN and Wired won’t be doing it any time soon.

    In the meantime, for the foreseeable future, screenreaders will continue to sit on top of IE and continue to handle crufty old school HTML, tables and all.

  3. re: “XHTML is newer and therefore better”

    This is why I object to calling XHTML 2 “XHTML 2”. The strongest argument for learning XHTML 2 at this point is that, at some point, some stupid HR department will hear about it and think “hey, that must be better than XHTML 1” and start requiring it on incoming resumes.

    Meanwhile, Zeldman has a point, although I’m not sure he’s consciously aware of it. “HTML” (as a brand, not a specific technology) *has* been ruined by table-based layouts, Tag Soup markup, and ultra-lax browsers. “XHTML” (as a brand, not a technology) has not been. Even though he knows it’s *technically* a lie, *market-wise* it makes good sense to push XHTML *and* clean markup (and CSS, and accessibility, and and and) at the same time. If people associate “XHTML” with all that stuff, that will outweigh the technical fact that you’re not doing anything but adding some extra slashes and angle brackets and relying on lax browsers to ignore them.

  4. And therein lies the problem.

    If you are going to sell XHTML “the brand” like laundry soap, it is *inevitable* that web designers will be left with the impression that XHTML2 is the new, improved version which will get their clothes whiter.

    It’s not, and it won’t. But if you’ve been busy obfuscating the “XHTML is XML” part of the message, you will be at a loss to explain to them why.

  5. So here’s the thing, then. If we take the high moral ground and wait until browsers start parsing XHTML as XML, we’re going to be waiting a long time. The last two months have done nothing but prove that user agents are at a stand-still, at least when we’re considering them on a larger scale.

    Much the same as the Browser Upgrade Campaign was a covert effort to allow developers the freedom to explore CSS/XHTML/DOM in their work (Zeldman’s own words), XHTML could do the same for developers wishing to transition to XML. Waiting for every last user agent to catch up is a surefire way of ensuring obscurity and irrelevance.

    So getting to point C from point A is the trick. There has to be a point B.

    Yes, XHTML 2.0 is poorly named, and yes, the mindset inevitably involves confusion, but if we’re afraid to muddy the waters a bit right now then when can you actually see this stuff becoming practical?

    I’ve been reading Zeldman long enough to know that he gets this stuff, better than I do. DWWS touches on each of the issues you’re raising. A good keynote requires sound bites, and that’s precisely what those slides illustrate. Picking them apart point by point will inevitably raise questions, but I’m sure he’d be more than capable of answering them. Who knows, maybe the Q&A session afterwards did precisely that.

  6. Hiya Dave and Mark — nice to see you.

    First — I don’t mean for this post to be one of those obnoxious “fiskings” that are so popular in the blog world. Unfortunately the “cite-and-rebut” format often looks a lot harsher than it’s supposed to be. I like and respect Zeldman, and perhaps his presentation (or the follow-up) was indeed more nuanced. I don’t know.

    Second — I used to think XHTML1 would serve as training wheels. But the more I think about this, the less enthused I am.

    Consider what happened with this very entry. I wrote my entry, validated, corrected some validation errors. I then made a couple more edits, inadvertently bringing the validation errors back. I then went to bed. In the morning Jacques alerted me that the validation errors were back. Red-faced, I corrected them. Even though the entry was invalid for about eight hours, there was no real harm done.

    But in XML-land, I’d have been totally screwed.

    That’s the problem with XHTML-as-training-wheels: most of these early adopter sites *think* they’re “future-proof”, but they’re not. An XML website has to be 100% airtight. It’s not about adding a few angle brackets. It’s nothing short of *total revolution*. No more dashing off arbitrary markup, no more letting in comments and trackbacks and advertisements without running them through the cleanser. You need to put together an entire coherent package, and it isn’t easy — just ask Jacques. Or ask Wired and ESPN; they can’t seem to pull it off, and their budget is considerably larger than Jacques’s.

    In short, I don’t think there is a point B, and we’re leading people off the cliff if we imply otherwise.

  7. “But in XML-land, I’d have been totally screwed.”

    Au contraire! In XML-land, you would have been instantly alerted to the problem the moment you hit the refresh button in your browser. Of course, your site would have been broken for a few minutes (for those with “real” browsers), but that’s the price of not having an off-line proofing system — which Wired and ESPN can surely afford.

    “I used to think XHTML1 would serve as training wheels.”

    Maybe not training wheels, but at least a testbed. Producing valid (X)HTML really does require different techniques than are required to produce the traditional tag-soup. (Anyone who thinks otherwise hasn’t run their web site through the validator.) So, for me, the big question was: how can I bullet-proof the process?

    I learned a lot about how to do that in the process of getting my blog working. So, for instance, unlike most weblogs, I have a comment section that actually validates.

    The junk XHTML you see out there (i.e., 99% of all the XHTML being produced) could have been avoided if the authors had taken the opportunity to build in some bullet-proofing in their content-creation process.

    With the possible exception of Simon Willison, I don’t think this has occurred to any of the “alpha geeks,” let alone to the “unwashed masses” of web designers.

    In other words, this *could* have been a “Point B.” It just hasn’t worked out that way.

  8. Evan – First point: I figured as much.

    Jacques’ comment about a staging server is particularly apt. Yeah, sure your browser pukes if your XML isn’t well-formed, but testing each page before launch is pretty standard practice. Any errors should be caught before it becomes critical.

    Both your and Jacques’ later points boil down to fundamental problems with authoring software, it would seem to me. If the simple act of editing content or a user posting a comment breaks your validation, then your CMS isn’t doing its job. Provided you’ve gone to every reasonable effort to do otherwise.

    There’s gotta be a point B. Pure XML rendering destroys older browsers, and nobody’s going to start using it until there’s a certain critical mass. Even if 90% of today’s browsers supported it, it still wouldn’t get used. Encouraging adoption involves backwards-compatibility, and that’s what makes XHTML important. It works today. Provided you do it properly.

    I explored app/x+x on my site because I want to make that next leap to 1.1. My comments pages are the main problem right now, but I will fix them. If I go to the trouble and make sure that my headers are correct, is early adoption going to be worth it? I don’t know, but I’m willing to find out.

  9. “If the simple act of editing content or a user posting a comment breaks your validation, then your CMS isn’t doing its job.”

    Well, there you go. All three of us are using the same CMS on our blogs: MovableType. Is your copy of MT “doing its job?”

    I didn’t think so. And MT is probably “best-of-breed” when it comes to producing valid XHTML.

    I’ve done some heavy hacking to get MT to the point where I can say with reasonable confidence that every page on my blog is valid XHTML. But heavy hacking is probably what it takes, given the current primitive state of the technology.

    Evan’s contention, I think, is that most people don’t see the necessity of more sophisticated technology to really do XHTML, let alone be willing to jump in and develop it.

    That makes proselytizing XHTML a bit dangerous (“leading people off the cliff”), especially if you soft-pedal the difficulties and present it as just a shiny-new version of HTML.

  10. Point taken.

    Will it be true in a year though? We all caught ALA’s article on TypePad’s standards compliance. So if MT 2.8 or 3.0 ships generating XHTML 1.0 Strict out of the box, then are we allowed to evangelize XHTML?

    I just think hey, I know better, why wait? You know where I’m coming from on this if you’ve bent over backwards to rig MT for well-formedness.

    And if I know better, other developers can be convinced of the same.

  11. Jacques — yep, that’s what I’m saying. The percentage of *serious markup geeks* who get this seems to be vanishingly small.

    Dave, I have no doubt that you, like Jacques, will be able to hack together a bulletproof XHTML system using MT. I heartily encourage you to do so. The trick is that you and Jacques are both “edge cases” — there won’t be any serious adoption until we get MT 3.0, which generates Bulletproof XHTML.

    And of course, MT 3.0 won’t ever generate Bulletproof XHTML unless the community *understands* that this feature is needed — that you can’t just slap an XHTML doctype at the top of the page and call it a day. So I’m hoping that once you complete your project, you’ll use your blog to educate people on this issue, and explain what works and what doesn’t. I’m counting on you, ’cause your soapbox is a lot louder than mine :).

  12. “I just think hey, I know better, why wait?”

    Please don’t construe my comments as intended to discourage you. Quite the contrary! If I’m going to live on the bleeding edge, there’s nothing I’d like more than some company. I’m just pointing out that it *is* bleeding edge. I hope you enjoy the challenge.

    I’ll be looking forward to hearing of your progress. I’m sure your write-up will be more lucid than my feeble attempts to document mine.

  13. Well dammitall, I guess I’m committed now. 😉

    I’ll be heavily perusing your write-ups, Jacques. I’m really not a programmer in any usage of the term, so this will be a challenge for me. I suspect I’ll take the easy way out and disallow HTML formatting in the comments, but we shall see.

    I don’t know that I should be considered an edge case, Evan. Technically I’m a graphic designer, I just happen to know my code. If a designer can do it, well, hell.

    Maybe that’s our hook: shame. 😉

  14. “I suspect I’ll take the easy way out and disallow HTML formatting in the comments, but we shall see.”

    You could do that.

    Turning off comments (entirely) and trackbacks makes the whole bullet-proofing process much simpler.

    But what have you proven thereby? “Hey, look! I can serve up this much less capable web site as application/xhtml+xml instead of my older, more fully-featured ‘tag-soup’ site.”

    That’s not exactly a ringing endorsement of the technology, is it? If anything, one would like to be able to point to the *benefits* of doing XHTML properly. Otherwise, why go to the trouble at all?

  15. Arguably, forcing commenters to validate (and to know how to achieve validation!) is just as much of a usability problem as no HTML formatting. It’s a trade-off either way, at the very least.

  16. This is one of the things that makes me wonder if MT 3.0 will ever be bulletproof.

    Let’s say that MT 3.0 builds in comment validation in their default templates. Some people will like this feature. But *a lot* of people will be pissed. “What’s with this validation nonsense? I just want people to enter comments on my blog!”

    So as a business decision, maybe it doesn’t make sense for Ben and Mena to make a tool that produces bulletproof XHTML. If that’s the case, I’d like them to switch their default templates back to HTML 4.01… but of course the howling mob would have their hides for a stunt like that.

  17. You can always hide the results of validation if it is successful. Commenters who don’t embed HTML (and don’t throw in unescaped ampersands or control characters or …) will not even notice that you’ve validated their comments.

    The issue only arises if they throw invalid stuff at you. Sometimes this is reparable with some easy fixes (and MT actually does repair some of the simplest errors, like unclosed tags). But sometimes, you have to ask them to correct their comment manually.

    I don’t know how to get around that.
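The silent-validation idea from the last few comments can be sketched roughly as follows. This is an illustration only, written in Python rather than in anything MovableType actually uses: the function name `check_comment` is hypothetical, the check covers well-formedness and stray control characters but not validity against an XHTML DTD, and it makes no attempt to restrict which tags a commenter may use.

```python
import xml.etree.ElementTree as ET


def check_comment(comment_body: str):
    """Return (ok, message); message is empty when the comment passes.

    Hypothetical helper. A passing comment produces no visible output, so
    commenters who type plain text never notice the check at all.
    """
    # Control characters (other than tab, newline, carriage return) are not
    # allowed in XML 1.0; they often sneak in via copy and paste.
    for ch in comment_body:
        if ord(ch) < 0x20 and ch not in "\t\n\r":
            return False, "Your comment contains a control character; please retype that passage."

    # Wrap the comment in a single root element so a fragment can be parsed.
    try:
        ET.fromstring("<div>" + comment_body + "</div>")
    except ET.ParseError as err:
        # Note: the column the parser reports includes the "<div>" wrapper.
        return False, "Your comment is not well-formed: " + str(err)

    return True, ""


ok, message = check_comment("See <a href='http://example.com/?a=1&b=2'>this</a>")
print(ok, message)

ok, message = check_comment("See <a href='http://example.com/?a=1&amp;b=2'>this</a>")
print(ok, message)
```

The first call fails on the unescaped ampersand in the URL, which is exactly the kind of error a human has to be asked to fix; the second passes without the commenter seeing anything.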
