## May 2, 2005

It sounded like a simple request: serve the MovableType Administrative Interface with the correct XHTML MIME-type, so that authors at the String Coffee Table could see the MathML when they preview their posts.

Shouldn’t be too hard, eh? After all, MovableType is supposed to be pretty hip with Standards-compliance, and we’re already serving all of the public-facing pages with the correct MIME-type.

One of the abiding illusions of web design is that, on a well-constructed XHTML website (and, if MovableType isn’t one, what is?), changing from text/html to the correct application/xhtml+xml should be an easy transition. Well, here was a good test case: a web application, written by hip, Standards-aware people, but which had (evidently) never been tested with an XHTML MIME-type.

How did it fare?

Almost every single screen was ill-formed. Those which weren’t lacked the required xmlns="http://www.w3.org/1999/xhtml" attribute on the <html> element, and so were rendered as an XML document tree in Mozilla.
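
Both failure modes can be caught before a browser sees the page by running the generated markup through a strict XML parser. What follows is only a minimal sketch in Python, not anything taken from MovableType or from my patch; the function name `check_xhtml` and the sample pages are made up for illustration.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"


def check_xhtml(page: str) -> str:
    """Roughly classify a page the way an XML-aware browser would."""
    try:
        root = ET.fromstring(page)
    except ET.ParseError as err:
        # This is the case that produces a Yellow-Screen-of-Death.
        return f"ill-formed: {err}"
    # ElementTree spells a namespaced tag as "{namespace}localname".
    if root.tag != f"{{{XHTML_NS}}}html":
        return "well-formed, but the root is not <html> in the XHTML namespace"
    return "ok"


# Hypothetical sample pages.
print(check_xhtml("<html><body><p>Hi<br></p></body></html>"))
print(check_xhtml("<html><body><p>Hi<br /></p></body></html>"))
print(check_xhtml(
    '<html xmlns="http://www.w3.org/1999/xhtml"><body><p>Hi<br /></p></body></html>'
))
```

One caveat: this parser does not fetch the XHTML DTD, so a page using named entities such as `&nbsp;` will be reported as ill-formed here even though a browser might accept it.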

With all of that corrected, the Javascript which provides much of the functionality in the MT Admin Interface (all of it!) was still broken:

1. Hidden away, unseen, in XML comments.
The content of the <script> element is CDATA in HTML, but PCDATA in XHTML. You can’t hide scripts in
<script>
<!--
...
-->
</script>
in XHTML. If you need to hide scripts in XHTML, and want them to work when served as HTML as well, you can go to extraordinary lengths and wrap them in
<script>
<!--//--><![CDATA[//><!--
...
//--><!]]>
</script>
I did that for those scripts that really needed it.
2. Used document.write.
It does not work when the page is parsed as XML. You need to use DOM-scripting (createElement, appendChild and friends) instead.
3. Triggered by onClick="..." or onChange="..." event handlers.
Attribute names are case-sensitive in XHTML, and the event-handler attributes are all lowercase. Those need to be onclick="..." and onchange="...".
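
Items 1 and 3 are easy to see from the point of view of an XML parser. The sketch below (Python, with hypothetical sample markup and a helper name, `script_text`, of my own invention) asks the parser what character data it finds inside a `<script>` element, and whether an attribute lookup matches regardless of case.

```python
import xml.etree.ElementTree as ET


def script_text(markup: str) -> str:
    """Return the character data an XML parser finds inside an element."""
    # ElementTree's default builder discards comments, which mirrors the
    # problem: a commented-out script is simply not part of the content.
    return "".join(ET.fromstring(markup).itertext())


# Hypothetical scripts, hidden the HTML way and the XHTML-safe way.
HIDDEN_IN_COMMENT = "<script>\n<!--\nalert(1);\n-->\n</script>"
HIDDEN_IN_CDATA = (
    "<script>\n<!--//--><![CDATA[//><!--\nalert(1);\n//--><!]]>\n</script>"
)

print("alert(1);" in script_text(HIDDEN_IN_COMMENT))  # False: the code is gone
print("alert(1);" in script_text(HIDDEN_IN_CDATA))    # True: the code survives

# Attribute names are matched exactly, so onClick is not onclick.
button = ET.fromstring('<input type="button" onClick="go()" />')
print(button.get("onclick"))  # None
print(button.get("onClick"))  # go()
```

In the CDATA version, the leftover `//><!--` and `//--><!` fragments end up inside the script text, where each sits on its own line behind a Javascript `//` comment.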

What I thought would be a few hours’ work proved to be a major undertaking. My accumulated changes eventually ran to a 1340-line unified-diff file. Thirteen hundred and forty lines!¹

I made no attempt to produce valid XHTML, nor did I attempt to add any new functionality. I merely wanted something which would not produce a Yellow-Screen-of-Death and which had functional Javascript. Doing that required a patch file more than twice as long as my existing set of patches to MovableType, which add all sorts of cool new functionality.

Now, someone will surely pipe up with the fact that the WordPress Administrative Interface works just fine under application/xhtml+xml. Indeed, it does. But, then, its lead developer actually uses the correct MIME-type to run his own site.

I think the moral is clear …

#### Update (5/5/2005): It’s not a bug, it’s a feature!

Urs reported a disastrously bad bug when serving the MT Administrative Interface as application/xhtml+xml: compose and preview an entry. Then click on Re-Edit this entry. Poof! … all the line breaks in your text are turned into spaces. The same operation continues to work flawlessly when the page is served as text/html.

This drove me up a wall, till I discovered the reason for the behaviour. The text of your entry is passed as a hidden form field

<input type="hidden" name="text" value="..." />

where “…” is the text of your entry. And what does the XML Specification say should be done with linebreaks in attribute values? Why, golly, they should be turned into spaces.
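
The rule in question is attribute-value normalization (section 3.3.3 of the XML 1.0 Specification). Here is a small sketch of it in action, using Python's bundled XML parser as a stand-in for the browser's; the entry text is a made-up sample.

```python
import xml.etree.ElementTree as ET

# Sample entry text. It contains no <, & or " characters, so it can be
# dropped into the attribute below without further escaping.
entry = "First paragraph.\n\nSecond paragraph."

# A hidden field carrying the entry text in an attribute value.
field = ET.fromstring(f'<input type="hidden" name="text" value="{entry}" />')

# Each literal line break in the attribute value comes back as one space.
print(repr(field.get("value")))  # 'First paragraph.  Second paragraph.'
```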

As The Church Lady would say, “Isn’t that special!” Do any of the XML gurus out there know how one is supposed to submit a form with hidden textual data, in which white spaces are preserved?

#### Isn’t that semantic:

Sam Ruby comes through with the solution. Replace

<input type="hidden" name="text" value="..." />

with

<textarea style="display:none" readonly="readonly" name="text">...</textarea>

for any hidden form-input that needs to be white-space-preserving. Tell me again about how you thought XHTML was “more semantic”?
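
The reason this works is that white space in element content, unlike white space in attribute values, is passed through by the parser. A minimal sketch, again in Python with a made-up entry, and with `escape` from the standard library standing in for whatever escaping the template does:

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# Sample entry text, this time with markup that needs escaping.
entry = "First paragraph.\n\nSecond <b>paragraph</b>."

markup = (
    '<textarea style="display:none" readonly="readonly" name="text">'
    f"{escape(entry)}</textarea>"
)
field = ET.fromstring(markup)

# Element content is not normalized, so the line breaks are still there.
print(field.text == entry)  # True
```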

I guess that, now, I need to hunt through the MT Administrative interface for (other) instances of this. But, at least, Urs’s problem is solved.

#### Update (5/10/2005): 1399 Bottles of Beer

Well, it seems to have stabilized at 1624 (originally 1399) lines of unified diffs. For the curious, or masochistic, herewith is my patch to turn the MT Admin interface into well-formed XHTML. Don’t bother complaining that it’s still not valid XHTML. I ain’t listening. If, however, you find further instances where the MT Admin interface is not well-formed, please let me know.

These patches were extracted from my only slightly larger (2200-line, originally 1975-line) patch file for MT 3.16. The latter implements all kinds of additional useful stuff, like (entry- and) comment-validation, forced comment previews, threaded comments, internationalized trackbacks, comment text filters, …

¹ Even discounting the ‘context’ lines in a unified-diff, that was still 510 added/subtracted lines of code.

Posted by distler at May 2, 2005 12:09 PM


It does because I reported all the bugs I encountered with the admin interface of WordPress some time ago. (When I still used it.) I assume someone continued to check all those things after that a bit closer or so, but before that every page was ill-formed. (Matt is actually using text/html for his weblog at the moment. He gets invalid characters in his comments occasionally, et cetera.)

Posted by: Anne on May 3, 2005 3:51 AM

Just an insignificant point, but Matt’s site appears quite a lot like text/html to me, or are you talking about his admin interface?

Posted by: Frenzie on May 3, 2005 3:36 PM

I’m stupid. Say it. *bangs head against wall smilie*

Posted by: Frenzie on May 3, 2005 3:37 PM

> But, then, its lead developer actually uses the correct MIME-type to run his own site.

As far as I know, it doesn’t, at all.

Although it sends an application/xhtml+xml Content-type to Firefox, his Apache server seems to be misconfigured and really sends the file with the text/html MIME type, as you can see here. The same behavior is observed when browsing Photomatt in Opera: text/html, no XHTML MIME types anywhere.

### Photomatt

Sigh.

Matt used to send application/xhtml+xml. I know that for a fact (having examined the HTTP headers). As you, and Anne, and Frenzie, and doubtless others have noticed, he no longer does so.

I might guess (as Anne suggests) that this is because he is tired of fighting with invalid comments (and unable to prevent them programmatically). But Matt will have to speak for himself, if you want the real reason.

In any case, since the latest WP 1.5 continues to emit well-formed XHTML in its Administrative interface (or did, when I tested it), I have reason to assume that someone over there is still testing WP with the correct MIME-type.

Posted by: Jacques Distler on May 4, 2005 9:10 AM

Many thanks for going through this trouble! Being able to check the MathML output while editing an entry is extremely helpful.

One little detail: When I now use the filter ‘itex to MML with parbreaks’ while creating entries the formatting of my source code (in particular blank lines that I insert for readability) is completely undone whenever I preview the entry and then go back to editing it. So each time after I preview the comment I find my ASCII source to be rather unreadable. And in addition I have to insert <p> and </p> by hand.

I almost don’t dare to ask, given the work you already went through, but would it be hard to change that to the way it was before?

Posted by: Urs Schreiber on May 4, 2005 9:53 AM

That’s very strange.

It’s not something I broke. Switching back to text/html fixes the problem, but as soon as you send the form as application/xhtml+xml, the carriage-returns get stripped (actually, turned into spaces) from the <textarea>.

And it has nothing to do with the choice of text-filter, apparently.

Could this be a bug in HTML::Template? The code that’s supposed to produce the text in the <textarea> for re-editing is

<textarea onkeypress="mtShortCuts(event)" class="full-width" name="text" id="text" tabindex="3" rows="<TMPL_IF NAME=DISP_PREFS_SHOW_EXTENDED>10<TMPL_ELSE>20</TMPL_IF>"><TMPL_VAR NAME=TEXT ESCAPE=HTML></textarea>

The ESCAPE=HTML doesn’t seem to be doing what it’s supposed to be doing.

I’m a bit perplexed. Shall we go back to text/html until we get this sorted out?

Posted by: Jacques Distler on May 4, 2005 11:58 AM

> Shall we go back to text/html until we get this sorted out?

I can deal with both ‘bugs’ (no visible MML on one hand or stripped CRs on the other) in one way or another.

But if you ask me I would decide to stay with serving application/xhtml+xml even in its present form. I can compile the source in a separate editor and copy & paste it to the MT Admin interface.

Posted by: Urs Schreiber on May 4, 2005 12:48 PM

Nice work, Jacques. Of course, this is a fairly unusual requirement, but it’s an important thing for us to be aware of. (Admittedly, we focus more on the output than the application pages themselves.)

I’ve made sure the team has seen your post, and we’ll see if we can’t improve the story in the future. Sorry for the difficulty, and thanks for taking the time to document it.

Posted by: Anil on May 4, 2005 5:42 PM

Oh! I’d almost failed to mention. As of version 3.16, you can redistribute the application templates for Movable Type. So if you provide any of the files under the /tmpl directory in modified versions, others can place them into their own installs and make use of them.

MT also supports an additional template directory, so that if you upgrade, your changes to the admin UI don’t get blown away. Good stuff, and while it’s not as nice as if the problem weren’t there in the first place, at least there’s a good way to fix it.

Posted by: Anil on May 4, 2005 5:48 PM

Thanks, Anil.

When I’m confident I’ve got all the bugs wrung out, I will post my patch file here. In addition to the templates, there are patches to lib/MT/App/CMS.pm and mt.js. So it’s not enough to simply redistribute the templates (alas).

Posted by: Jacques Distler on May 4, 2005 6:06 PM

### Hidden text area?

Have you tried a textarea with a style set to display:none?

Posted by: Sam Ruby on May 5, 2005 3:51 AM

First, let’s acknowledge that this aspect of the XML spec is B.A.D.

Second, if a real actual XML processor is being used, if you escape your newlines as ampersand#xa; they should survive.

Posted by: Tim Bray on May 5, 2005 12:13 PM

### Line breaks to NCRs

> Second, if a real actual XML processor is being used, if you escape your newlines as &#xa; they should survive.

(You don’t need to be afraid of escaping, Tim, my comment-entry form won’t let you enter anything invalid.)

Alas, the pages are generated using HTML::Template. The original code had value="<TMPL_VAR NAME=DATA_VALUE ESCAPE=HTML>". If I substituted &#xa; for the original linefeeds in the data, it would then get double-escaped. Which is not what we want.

Of course, I could remove the ESCAPE=HTML parameter from the template, and do all the escaping in the CGI script, but that would be even more of a bother.
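
To make the trade-off concrete, here is a rough sketch in Python. It is not the HTML::Template or MT code: the standard library's `escape` stands in for ESCAPE=HTML, and the helper name `attr_value` and the sample entry are invented for the example.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape


def attr_value(value_markup: str) -> str:
    """Parse a hypothetical hidden field and return its value attribute."""
    field = ET.fromstring(f'<input name="text" value="{value_markup}" />')
    return field.get("value")


entry = "line one\nline two"

# Escaping done once, in code: markup characters first, then the line
# breaks as numeric character references.
once = escape(entry, {'"': "&quot;"}).replace("\n", "&#xa;")

# Escaping done twice: what happens if the template escapes it again.
twice = escape(once, {'"': "&quot;"})

print(repr(attr_value(once)))   # 'line one\nline two'
print(repr(attr_value(twice)))  # 'line one&#xa;line two'
```

A line break written as a character reference survives attribute-value normalization, but once the reference itself is escaped, the reader gets a literal `&#xa;` in the text instead.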

P.S.: Thanks for stopping by. It’s nice to know it’s not just me who thinks the Spec is a bit whacked.

Posted by: Jacques Distler on May 5, 2005 12:35 PM

### Tim Bray

It occurs to me that 2 or 3 of my readers might not know who Tim Bray is, nor what it means when the co-editor of the XML Specification says that a part of it is B.A.D.

Posted by: Jacques Distler on May 7, 2005 11:32 PM

Thanks for sharing this info. I’m in the process of completing a complete re-write of our company’s CMS, among other things, to produce valid accessible code. The admin is, of course, xhtml as well.

I had planned, as a final step before launch, on serving everything with the XHTML MIME type.

I am two months behind schedule as it is.

I ain’t gonna bother.

From what I have read, and you have confirmed, doing so is an enormous PITA, without (MathML aside) any real advantage.

Posted by: andrew on May 6, 2005 6:42 AM

### PITA

> I had planned, as a final step before launch, on serving everything with the XHTML MIME type.

The message of this post is that an XHTML MIME type cannot be an afterthought. If you’re not designing (and hopefully, testing, as you go) the site with XHTML in mind, it ain’t gonna work as XHTML.

There’s a long list of stuff that will break. Gez Lemon has a Test Suite of some stuff to look out for. Henri Sivonen maintains a list as part of the Mozilla FAQ. But neither is complete. The hidden form field problem, noted above, just got added to the Mozilla FAQ, yesterday, after I wrote Henri about it.

> From what I have read, and you have confirmed, doing so is an enormous PITA, without (MathML aside) any real advantage.

It’s necessary for what I want to do. So, however big a PITA, it’s worth it for me. Whether it’s worth it for you really depends on your needs and … umh … how far behind schedule you are.

Posted by: Jacques Distler on May 6, 2005 7:57 AM

I stumbled into almost the exact same problem late last night in the process of converting my site over to working as application/xhtml+xml.

I’ll probably post an entry about it in a couple days when I’m finished with my changes…or when I’ve decided to switch to Wordpress (I don’t like their interface so I’m trying to avoid them).

Posted by: Devon on November 22, 2005 4:23 PM

### WordPress

I’m not sure WordPress will actually work better for you.

Once-upon-a-time, Matt Mullenweg was serving up application/xhtml+xml, so you had some assurance that the latest WordPress had been tested with an XML MIME-type. Now he no longer is, and Anne van Kesteren is serving HTML 4.01, so, to my knowledge, there’s no one actually checking that WordPress works when served as “real” XHTML.

I’d also point out that the absence of a Validator plugin for WordPress (someone should, at least, write a wrapper around the W3C’s Validator web service) is a severe drawback, if you allow comments 'n such.

Posted by: Jacques Distler on November 22, 2005 7:34 PM

### Re: WordPress’s kitchen

Much as I would like to defend my new love’s honor, you’re right: it’s just pretend XHTML. I installed 2.0b2, dropped in a content-negotiation plugin for the weblog itself, and had workable output (it would need a little s/body/html/ in the stylesheet, but no big deal), but switching the admin interface left me able to use only seven of the nine main menu options (not including “Write,” a rather important feature, although I think I saw a fix for that particular unencoded ampersand pass through svn after 2.0b2).

Maybe that ought to be my community contribution: obscure, pedantic, but generally simple and mechanical; it’s right up my alley.

Posted by: Phil Ringnalda on November 24, 2005 2:43 PM

### Re: WordPress’s kitchen

Hmmm. That would be weirdly ironic. Back when you were running MovableType, and had comment-validation 'n such built into your setup, you were most of the way there, in terms of being able to safely use “real” XHTML.

But you didn’t.

Now that you’ve switched to WordPress, for which none of that bullet-proofing is currently available, you’re thinking of trying it?

I salute you!

The world (or, a very small corner thereof) could certainly use a bullet-proof, XHTML-safe version of WordPress.

Posted by: Jacques Distler on November 28, 2005 2:15 PM

### Re: WordPress’s kitchen

Well, yes, but with Movable Type there wasn’t any point: anything I could find you could find better, anything I could do you’d already done. Judging by the three well-formedness errors I found right off in WP’s admin interface, just being the person who keeps it from straying for the benefit of hypothetical MathML users would be a suitable reason for me serving application/xhtml+xml.

That, and I can’t have you bragging up how your system for selling strings on the street corner is better at concatenation than mine ;)

Posted by: Phil Ringnalda on November 28, 2005 10:46 PM