## May 22, 2004

### WordPress 1.2, MathML Goodness

#### Update (3/21/2005):

With WordPress 1.5, many of the troubles discussed below have gone away. A new version of the plugin, along with a simplified setup procedure, is detailed here.

WordPress 1.2 has just been released. Congratulations to Matt and his team for numerous improvements and a shiny new plugin architecture!

In celebration of the event, I’m releasing an itexToMML plugin for WordPress 1.2 and above. This brings easy-to-use mathematical authoring to the WordPress platform.

Installation involves a few simple steps.

1. First, download and install the itex2MML binary. There are precompiled binaries for Linux and Windows, and a precompiled MacOSX binary is included with my source distribution.
2. Edit line 22 of the plugin to reflect the location where you installed the binary. By default, it says

   `$itex2MML = '/usr/local/bin/itex2MML';`
3. Install the plugin as wp-content/plugins/itexToMML.php
4. Apply the following patch, which makes sure that the installed text-filtering plugins (wptexturize, Textile 1 and 2, and Markdown) play nice with MathML content. (These changes will, hopefully, be in the next release of WordPress.)
5. Activate the plugin in the administrative interface.
6. Start serving your blog with the correct MIME Type.
7. If you want people to be able to post comments in itex, add the requisite list of MathML tags to your my-hacks.php file.

That’s the good news.

The bad news is that WordPress 1.2 has a serious bug, which renders the plugin nearly useless for serious work. Like its ancestor, b2, WordPress eats backslashes. Type “\\a” in the entry form, and “\a” gets posted to your blog. Re-edit the post, and “\a” gets turned into “a” when re-posted. Since TeX relies heavily on backslashes, this is a pretty debilitating feature. Hopefully, it’ll get fixed soon.

The other thing that is less than ideal is that enabling the plugin is all-or-nothing. When enabled, all your posts and comments get filtered through itexToMML, even those with no math in them. That’s rather wasteful of resources. But, again, I’m pretty sure that this will have to change in subsequent versions of WordPress.

Forget about the people using itexToMML. Consider the choice of text filters for composing posts. Currently, there are four: wptexturize (the default), Textile1, Textile2 and Markdown. Say you have been using Textile for a while and decide one day to switch to Markdown. Guess what? You can’t! If you disable Textile and enable Markdown, this choice applies to all your posts. But the syntaxes of these two markup dialects are incompatible. Your old posts will break horribly if you switch. Once you’ve accumulated a body of posts using one text filter, you are basically stuck, regardless of whether something better comes along, tempting you to switch.

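To make the backslash problem described above concrete, here is a minimal sketch in PHP. It assumes, purely for illustration, that each post or re-edit cycle strips one level of backslash escaping, which is what PHP's `stripslashes()` does. The function name `lossy_round_trip` is hypothetical and is not taken from the WordPress source.

```php
<?php
// Hypothetical model of one post/re-edit cycle that strips one level of
// backslash escaping. This is an assumption about the mechanism, not
// code from WordPress itself.
function lossy_round_trip(string $text): string
{
    return stripslashes($text);
}

// In single-quoted PHP strings, '\\' denotes one literal backslash,
// so $typed is the three-character string: backslash, backslash, a.
$typed    = '\\\\a';
$posted   = lossy_round_trip($typed);    // one backslash left
$reposted = lossy_round_trip($posted);   // no backslashes left

echo $typed, ' -> ', $posted, ' -> ', $reposted, "\n";

// The same loss on a piece of TeX: the commands lose their backslashes.
echo lossy_round_trip('$\\alpha + \\beta$'), "\n";
```

Under that assumption, every TeX command in a post degrades to plain letters after a single edit, which is why the bug is so damaging for mathematical content.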

MovableType lets you assign a choice of text filter to each of your posts individually. If you decide one day to switch from Textile to Markdown, your old posts don’t break, because they still get processed with Textile. I added the ability to assign a choice of text filter to each comment in MT. That way, commenters can compose their comments in their favourite idiom, rather than yours.

It seems to me that, once you start giving people a choice of text filters for formatting their posts, it’s inevitable that you’ll need to allow them to make that selection on a per-post basis.

WordPress actually allows multiple text filters to be applied to (every) post. If you want to use itexToMML with Textile formatting, you just activate both plugins. In MovableType, I had to create a third text filter plugin, whose sole purpose was to daisy-chain the other two together.

It will be cool to see how WordPress eventually handles this. Perhaps there will be a set of checkboxes in the composition window, letting you select which text filters apply to the post you’re composing.

But all that is for the future. Right now, WordPress users have a shiny new toy to play with. I hope they enjoy my small addition to the party.

#### MIME Types for WordPress

Those familiar with this blog will know that to get MathML to render in Gecko-based browsers (Netscape 7, Mozilla, Firefox, …) and in IE/6 with the MathPlayer 2.0 plugin, you need to serve your pages as application/xhtml+xml. My MovableType solution involves using mod_rewrite to set the HTTP Content-Type headers. In WordPress, as in any PHP-based system, it’s probably preferable to set the headers directly in your PHP code.

It would be great if someone wrote up a definitive guide to doing this in WordPress. Unfortunately, most of the existing instructions, like Simon Jessey’s, are written under the misapprehension that the correct thing to do is to set the Content-Type based on the Accept headers sent by the browser. This is wrong.
It may be “morally correct,” but it doesn’t actually work with real-world browsers. Both Camino and Opera 7.5 include application/xhtml+xml in their Accept headers. Both cough up hairballs when served XHTML+MathML content with that MIME type. IE/6, with the MathPlayer 2.0 plugin installed, handles application/xhtml+xml (either straight XHTML or XHTML+MathML) just fine, even though it doesn’t say so in its Accept headers.

The only correct thing to do is to send the MIME type based on the User-Agent string sent by the browser.

Anybody want to take a crack at writing up some instructions for WordPress?

#### Update (5/24/2004):

Josh comes through with the following PHP code,

```php
// Send the XHTML+MathML prologue only to browsers known to handle it:
// Gecko-based browsers, the W3C Validator and IE/6 with MathPlayer,
// excluding Chimera/Camino/KHTML unless Camino reports MathML support.
if ( (preg_match("/Gecko|W3C_Validator|MathPlayer/", $_SERVER["HTTP_USER_AGENT"])
      && !preg_match("/Chimera|Camino|KHTML/", $_SERVER["HTTP_USER_AGENT"]))
     || preg_match("/Camino.*MathML-Enabled/", $_SERVER["HTTP_USER_AGENT"]) ) {
print('<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1 plus MathML 2.0//EN" "http://www.w3.org/Math/DTD/mathml2/xhtml-math11-f.dtd" >
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en">
');
}
else {
// Everyone else gets plain XHTML 1.0 Strict.
print('
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
');
}
```

to be placed in wp-blog-header.php or at the top of whatever pages need to be served with the correct MIME type. Either way, you need to remove the hard-coded DOCTYPE declaration and opening <html> tag in the affected pages.
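The code as shown prints the document prologue; the Content-Type header itself also has to be set before any output is sent. A minimal sketch of one way to do that follows, reusing the same User-Agent tests. The function name `mime_type_for_user_agent` is hypothetical, and the `charset=utf-8` parameter is an assumption based on the encoding named in the XML declaration above.

```php
<?php
// Decide the MIME type from the User-Agent string, using the same
// patterns as the snippet above. The function name is illustrative.
function mime_type_for_user_agent(string $ua): string
{
    $handles_mathml =
        (preg_match('/Gecko|W3C_Validator|MathPlayer/', $ua)
            && !preg_match('/Chimera|Camino|KHTML/', $ua))
        || preg_match('/Camino.*MathML-Enabled/', $ua);

    return $handles_mathml ? 'application/xhtml+xml' : 'text/html';
}

// Send the header only under a web server, and only before any output.
if (PHP_SAPI !== 'cli' && !headers_sent()) {
    $ua = $_SERVER['HTTP_USER_AGENT'] ?? '';
    header('Content-Type: ' . mime_type_for_user_agent($ua) . '; charset=utf-8');
}
```

Keeping the decision in a small function makes it easy to check against sample User-Agent strings before putting it in front of real browsers.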

Posted by distler at May 22, 2004 9:04 AM


### Re: WordPress 1.2, MathML Goodness

Put

```php
if ( stristr($_SERVER["HTTP_ACCEPT"],"application/xhtml+xml") || strstr($_SERVER["HTTP_USER_AGENT"],"W3C_Validator") ) {
print('<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1 plus MathML 2.0//EN" "http://www.w3.org/Math/DTD/mathml2/xhtml-math11-f.dtd" >
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en">
');
}
else {
print('
<!DOCTYPE html
PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
');
}
```

in wp-blog-header.php.

Then remove

```html
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
```

(You may, of course, want to send the MathML doctype to both; this is just an adaptation of what I use to send the XHTML 1.1 doctype.)

Posted by: Josh on May 24, 2004 9:51 AM

### Re: WordPress 1.2, MathML Goodness

I officially loathe your text filters.

http://whatnottodo.org/junk/php-mathml.txt

Posted by: Josh on May 24, 2004 9:56 AM

### Text filter letdown

Sorry. Surely, out of the 7 available, you could find one that didn’t suck.

Maybe next time …

Anyway, I fixed your comment, even though it is, as I emphasized above, the wrong solution. The Accept headers are not a reliable indicator of whether a browser can handle XHTML+MathML, served as application/xhtml+xml.

Posted by: Jacques Distler on May 24, 2004 10:24 AM

### Re: Text filter letdown

I realized my mistake(s) after I submitted it. (Remember, this wasn’t for MathML.)

Corrected at the URL above. (If you’re wondering: yes, there is probably a better way to do it, but I can’t think of it right now.)

Posted by: Josh on May 24, 2004 10:30 AM

### Re: WordPress 1.2, MathML Goodness

Would I be mistaken to guess this means you’re a WP guy and not an MT guy any more?

Posted by: Joe Grossberg on May 24, 2004 3:03 PM

### Loyalties

I’m “my own” guy. I like the idea of having options available to me.

Right now, I could not run this blog on WordPress, or any other platform, even if I wanted to. I’d like to see the technology pushed to the point where I could switch platforms if I so chose.

Also, I’m not just doing this stuff for myself. There are other people using my software to run mathematically-oriented weblogs. A lot of them are thinking about WordPress.

I live to serve …

Posted by: Jacques Distler on May 24, 2004 3:42 PM

### Re: WordPress 1.2, MathML Goodness

I must admit that I haven’t tested your itex2mml tool with my Java-based blog/wiki environment (see URL).
But looking at the PHP source, it seems to me that I could also use it to convert the TeX sections with itex2mml. (This should become a wiki with a syntax very close to Wikipedia’s.)
I have a question regarding performance.
In a wiki document, the system would transform many individual TeX formulas with itex2mml (and store the results in a special internal cache; the wiki engine must do this, because it’s possible to include dynamic macros in the wiki, which can’t be cached).
Must I create a separate process for each “formula-snippet”, or is there a way to “communicate” with a running instance of itex2mml (for example, through sockets on a Linux server :-))?

Posted by: Klaus Hartlage on May 27, 2004 5:10 AM

### pipes

In the blog context, you get one itex2MML instance for each blog entry (and, in most blogging systems, for each comment), rather than for each formula.

I’m unclear why, in the Wiki context, you need to run it once per formula. Why not feed it the whole Wiki entry and cache the page, rather than caching each individual formula? That just seems broken to me.

Still, I think performance is an issue, which can be addressed by turning itexToMML into a PHP Extension, rather than using a bidirectional pipe to an external itex2MML process.

Doing that requires access to the PHP source, which makes the whole thing a little less “portable” than my current solution.

One could equally well do the same thing with the Perl plugin used here, but since MovableType serves static pages, the performance of my plugin is really not an issue.
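For reference, the bidirectional-pipe approach mentioned above might look roughly like the following in PHP. This is a sketch, not the plugin's actual code: the function name `run_filter` is hypothetical, the default path is the one from the plugin's configuration line, and it assumes the binary reads text on standard input and writes the converted text to standard output.

```php
<?php
// Run $input through an external filter over a bidirectional pipe and
// return the filter's output. $command is the path to the binary; it is
// assumed to read stdin and write stdout.
function run_filter(string $input, string $command = '/usr/local/bin/itex2MML'): string
{
    $spec = [
        0 => ['pipe', 'r'],  // child's stdin: we write to it
        1 => ['pipe', 'w'],  // child's stdout: we read from it
    ];
    $process = proc_open($command, $spec, $pipes);
    if (!is_resource($process)) {
        throw new RuntimeException("could not start $command");
    }

    // Caveat: writing everything before reading can block if the text
    // is larger than the OS pipe buffers; fine for typical entries.
    fwrite($pipes[0], $input);
    fclose($pipes[0]);  // signal end of input to the child

    $output = stream_get_contents($pipes[1]);
    fclose($pipes[1]);
    proc_close($process);

    if ($output === false) {
        throw new RuntimeException("could not read output of $command");
    }
    return $output;
}
```

Each call starts one process, so calling it once per entry costs one process per entry, while calling it once per formula costs one process per formula. That is the overhead a PHP Extension would avoid.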

Posted by: Jacques Distler on May 27, 2004 10:38 AM

### Re: pipes

Yes, you’re right: I can run the text through the itex2mml “filter” as a first step, and do any “Wiki transformations” after that.
But if I think of the Wikipedia syntax (which I would like to use in the future), where formulas are also embedded in “math” tags (see MediaWiki User-Guide: Editing mathematical formulae), these are not rendered “out of the box” and currently need some preprocessing.
Is it possible to have a switch in itex2mml to enable converting these math sections?
Another question is:
How compatible is itex2mml with the Wikipedia TeX syntax?

Posted by: Klaus Hartlage on May 28, 2004 3:40 AM

### Playing nice

I haven’t looked at Wikipedia syntax, but I have gotten itex2MML to work well with both Textile and Markdown.

The key, as you surmised, is to run the text through itex2MML first, converting everything between `$...$` and `\[...\]` to its MathML equivalent, `<math>...</math>`.

Then you need to tell Textile or Markdown or whatever to ignore everything inside `<math>...</math>` tags.

That’s what the above patch (for WordPress) was about. I have similar patches for the Textile and Markdown plugins in MT.
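As an illustration of the second step, here is a sketch of one generic way to keep a text filter away from MathML. It is not the patch itself: it uses placeholder substitution, which is a different technique and an assumption of this example. The function name `filter_around_math` and the token format are hypothetical, and `$filter` stands in for whatever Textile or Markdown function is in use.

```php
<?php
// Apply $filter (e.g. a Textile or Markdown function) to $text while
// leaving every <math>...</math> element untouched.
function filter_around_math(string $text, callable $filter): string
{
    $saved = [];

    // Swap each MathML element for an opaque token that the filter is
    // expected to pass through unchanged. The token format is arbitrary.
    $masked = preg_replace_callback(
        '#<math\b.*?</math>#s',
        function (array $m) use (&$saved): string {
            $token = 'MATHML' . count($saved) . 'PLACEHOLDER';
            $saved[$token] = $m[0];
            return $token;
        },
        $text
    );

    // Run the text filter, then put the MathML back.
    return strtr($filter($masked), $saved);
}
```

The approach only works if the filter leaves the tokens alone, so the token should be chosen to contain nothing the markup dialect treats as syntax.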

Posted by: Jacques Distler on May 28, 2004 9:56 AM

Read the post X(HT)ML, MathML, MIME types, PHP...
Weblog: It's equal but it's different
Excerpt: Well folks, as you know, it’s a jungle out there: people write all kinds of code for webpages and the last thing they care about are the ‘web standards’ (more links can be found at Cascading Style Sheets, Web Accessibility...
Tracked: February 9, 2005 10:39 PM

### Re: WordPress 1.2, MathML Goodness

Great, that works fine. Good the US universities have these resources online. Now we can learn interactively. Darlehen says best regards from Germany!

Posted by: Darlehen on January 9, 2006 8:30 AM

### Re: WordPress 1.2, MathML Goodness

I have a question regarding performance.
In a wiki document the system would transform many single tex formulas with itex2mml (and store the results in a special internal cache; the wiki-engine must do this, because it’s possible to include dynamic macros in the wiki, which can’t be cached).

Posted by: kris on May 16, 2007 3:00 AM

### Wiki performance

In my branch of Instiki, itex2MML handles the formula-conversion, and performance is amazingly fast.

In part, this is because itex2MML has native Ruby bindings. In part, it’s because you are wrong about caching.

Posted by: Jacques Distler on May 16, 2007 7:17 AM
