## September 5, 2010

### Figures

My last paper was written almost entirely on an Instiki wiki, transferred to a TeX file (in an SVN repository) only a few days before submission. The 136 figures were prepared using Instiki’s nifty built-in WYSIWYG SVG-editor.

Unfortunately, when it came time to “export” the figures to accompany the TeX file, I realized – to my horror – that there did not exist a tool for converting SVG, with embedded MathML, into PDF. We ended up producing PNG bitmaps (essentially, taking screen shots in the browser). That was time-consuming (thank heavens for a graduate student co-author) and the results were less than completely satisfactory.

So, with the paper squared-away, and the semester well underway, I sat down to put together a better solution.

Inkscape (for instance) has a quite nice ability to export SVG to PDF. But it is completely oblivious to the embedded MathML. Luckily, I found an (apparently inactive) project which specialized in “rendering” MathML to SVG. My attempts to contact the author were in vain, so I launched in, to fix what needed fixing:

• It didn’t support OpenType fonts, which meant that, without some modification, it wouldn’t work with the STIX fonts.
• It didn’t really work right when converting MathML inside a <foreignObject> element in an SVG document. Presumably, this was not a use-case considered by the author, but fixing it was trickier than I thought it would be.
• There was an amusing bug in the Python SAX parser, which needed fixing.

Anyway, there’s now a working version of SVGMath available, with all of the above fixes. (To be honest, it could still use a decent setup.py to be a proper Python package.)

With that in hand, a simple script¹

#!/usr/bin/env ruby
require 'tempfile'

# For each SVG block followed by \includegraphics[...]{name}, write name.pdf
IO.readlines(ARGV[0]).join.scan(/(<svg.*?<\/svg>)\s*\\end\{svg\}\\includegraphics\[.*?\]\{(.*?)\}/m) do |svg, outfile|
  puts "Converting #{outfile}.pdf"
  f = Tempfile.new("convert_svg") << svg
  f.close
  # Render the embedded MathML to SVG, then let Inkscape export the PDF
  `math2svg.py #{f.path} > #{f.path}_fixed.svg; \
  /Applications/Inkscape.app/Contents/Resources/bin/inkscape -z -f #{f.path}_fixed.svg -A #{outfile}.pdf`
  File.delete(f.path, "#{f.path}_fixed.svg")
end

sufficed to convert the wiki source into 136 PDF files, containing the figures.
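To see what the regular expression is picking out, here is a minimal sketch in Python that does only the extraction step (no math2svg.py, no Inkscape). The sample input is an assumption about the shape of the exported TeX, inferred from the pattern itself: an SVG element, then `\end{svg}`, then `\includegraphics[...]{name}`. The names `FIGURE` and `extract_figures` are made up for the illustration.

```python
import re

# Assumed shape of the exported TeX (inferred from the regular expression):
#   \begin{svg} <svg ...>...</svg> \end{svg}\includegraphics[options]{name}
FIGURE = re.compile(
    r'(<svg.*?</svg>)\s*\\end\{svg\}\\includegraphics\[.*?\]\{(.*?)\}',
    re.DOTALL,  # like Ruby's /m: let "." match newlines too
)

def extract_figures(tex_source):
    """Return a list of (svg_markup, output_basename) pairs."""
    return FIGURE.findall(tex_source)

# Hypothetical two-figure input, just to exercise the pattern.
sample = r'''
Some text.
\begin{svg}
<svg xmlns="http://www.w3.org/2000/svg" width="40" height="20">
  <circle cx="10" cy="10" r="5"/>
</svg>
\end{svg}\includegraphics[width=40pt]{figure1}
More text.
\begin{svg}
<svg xmlns="http://www.w3.org/2000/svg"><rect width="3" height="3"/></svg>
\end{svg}\includegraphics[width=3pt]{figure2}
'''

figures = extract_figures(sample)
for svg, outfile in figures:
    print("%s.pdf <- %d characters of SVG" % (outfile, len(svg)))
```

Each pair is what gets handed to the two external commands: the SVG markup goes into a temporary file, and the basename becomes the name of the PDF.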

Now, I can be as proud of the figures as I am of the physics (which is, I must say, quite neat).

¹ As a Python newbie, I’m curious to know how someone better acquainted with the language would write that in Python.

Posted by distler at September 5, 2010 4:34 PM

### Re: Figures

Untested conversion:

import os, re, sys, tempfile

pattern = r'(<svg.*?<\/svg>)\s*\\end\{svg\}\\includegraphics\[.*?\]\{(.*?)\}'
# re.S (not re.M) is the counterpart of Ruby's /m: "." also matches newlines
for svg, outfile in re.findall(pattern, open(sys.argv[1]).read(), re.S):
    print("Converting %s.pdf" % outfile)
    temp_fd, temp_path = tempfile.mkstemp('', 'convert_svg')

    fh = os.fdopen(temp_fd, 'w')
    fh.write(svg)
    fh.close()

    os.system("math2svg.py %(temp_path)s > %(temp_path)s_fixed.svg; \
/Applications/Inkscape.app/Contents/Resources/bin/inkscape -z -f \
%(temp_path)s_fixed.svg -A %(outfile)s.pdf" % locals())

    # Clean up, as the Ruby version does with File.delete
    os.remove(temp_path)
    os.remove(temp_path + "_fixed.svg")

Posted by: Sam Ruby on September 5, 2010 8:53 PM

### Re: Figures

My last paper was written almost entirely on an Instiki wiki, transferred to a TeX file (in an SVN repository) only a few days before submission.

That’s interesting.

So far I had two occasions to include considerable Instiki wiki-code into a LaTeX paper, and both times I found I had to do a lot of re-editing by hand.

At some point in the beginning I remember I tried the automatic conversion to TeX and found that wanting to the degree of not being usable. However, I forget what exactly went wrong, I just remember the feeling of it being a big mess.

One problem I remember was the hyperlinks. At least the way I am using them (extensively) in Instiki (namely like cross-references in a lexicon) they ought to simply disappear in the LaTeX source. But I think they didn’t.

Then next I resorted to doing copy-and-paste from the Instiki source code into a LaTeX file. For simple text and formula code that works fine, but this also does require quite a bit of fiddling when the code is a bit more sophisticated, of course since the syntax say for boldface in text or that for arrays in formulas is different in both cases.

Posted by: Urs Schreiber on September 6, 2010 5:02 AM

### Re: Figures

…but this also does require quite a bit of fiddling when the code is a bit more sophisticated, of course since the syntax say for boldface in text

Huh? **this** is automatically converted to \textbf{this}. Were you expecting something different to happen?

or that for arrays in formulas is different in both cases.

A LaTeX translation for itex’s \array command is not implemented. But I have rarely found any need for it (which couldn’t be satisfied by using the matrix, aligned or gathered environments).

Two things needed changing, when exporting from our wiki. One was that

\overset{\qquad SU(2) \qquad}{\longleftrightarrow}

does not function (in LaTeX) as a synonym for

\xleftrightarrow{\qquad SU(2) \qquad}

which is what I thought it would do. But that was a global-replace in my text editor (which, as any good text editor does, implements Regexp matching).
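For anyone who would rather script that global replace than do it in an editor, here is a minimal sketch in Python. It assumes the argument of `\overset` contains no nested braces (true of the example above, not of every formula), and the names `OVERSET_ARROW` and `fix_arrows` are made up for the illustration.

```python
import re

# Matches \overset{ARG}{\longleftrightarrow}, where ARG contains no braces.
OVERSET_ARROW = re.compile(r'\\overset\{([^{}]*)\}\{\\longleftrightarrow\}')

def fix_arrows(tex):
    """Rewrite the overset-over-arrow construct as an extensible arrow."""
    # A function is used for the replacement so that backslashes in the
    # output are taken literally rather than as regex escapes.
    return OVERSET_ARROW.sub(
        lambda m: r'\xleftrightarrow{' + m.group(1) + '}', tex)

before = r'A \overset{\qquad SU(2) \qquad}{\longleftrightarrow} B'
after = fix_arrows(before)
print(after)
```

Anything that is not an `\overset` over a `\longleftrightarrow` passes through unchanged.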

The next version of itex2MML will contain the \x*arrow{} commands. In itex, they will function synonymously with the \overset{}{} construct, but will export to LaTeX properly.

The other was the tables, which needed tweaking (and breaking across pages, since several of them spanned multiple pages). But, that was something I expected to have to do.

Other than that, the conversion went very smoothly. This (at 61 pages and 136 figures) was the longest paper I’ve written this way, but most of my recent papers (and all of my recent grant proposals) started their lives on an Instiki wiki. I find that a much better collaboration tool than a shared LaTeX document. SVN (which I also use) makes the latter tolerable (as opposed to the practice, among some of my collaborators, of emailing a LaTeX file back and forth, which I find completely intolerable). But I find the wiki more conducive to jotting down results as we find them.

Posted by: Jacques Distler on September 6, 2010 8:28 AM

### Re: Figures

Hi Jacques,

I have been using Instiki for collaboration as you mention, and while it is far better than emailing LaTeX files, I have been wanting it to be faster.

The nLab’s hosting server seems to be extremely fast, so I see Instiki is potentially very fast.

Do you have any tips for speeding up a local installation which is used by only a couple of people? Even when my connection to the server is good (45ms ping), it takes about 5-8 seconds to load even virtually empty pages. I have upgraded from WEBrick to Mongrel, but to no effect.

Thanks

Posted by: marco gualtieri on October 29, 2010 12:24 AM

### Re: Figures

Hard to say, without knowing more about your hardware/software configuration.

Let’s start with the fact that Golem is an iMac, sitting on my desktop at work. In other words, distinctly run-of-the-mill, consumer-grade hardware.

One thing you can do to speed up Instiki is to switch from Ruby 1.8.x to 1.9.2. The latter is a good bit faster, and both the nLab and the Instiki installation on Golem have been running happily on 1.9.2 for a while.

(You can install 1.9.2 alongside 1.8.x, using RVM, or you can follow a more pedestrian approach (which is what I did, ignoring the unnecessary bit about readline). Either way, you’ll have to rerun bundle, because Instiki keeps the installed Gems for Ruby 1.8.x and 1.9.x segregated.)

The other thing to understand is that Mongrel and WEBrick are single-instance servers. So, in its default setup, Instiki can only service one request at a time.

There are various techniques to rectify that, but far-and-away the easiest is to use Passenger, which can run multiple Instiki processes, behind an Apache webserver. Both the nLab and my Instiki installation run on Passenger.

Posted by: Jacques Distler on October 29, 2010 1:19 AM

### Re: Figures

Thanks for the tips. I am trying to implement them now.

1. RVM is great and is very easy, even for me. The first thing I did was to install Ruby 1.9.2, but then installing Instiki fails, apparently on MySQL. I thought you should know; here’s the error:

http://dl.dropbox.com/u/4747602/1.9.2-install-mysqlerror.txt

I did the same for Ruby 1.8.7 but it worked perfectly fine.

2. I noticed MathJax is now involved… No doubt you will soon tell us how to turn it on!

Cheers
Marco

Posted by: marco gualtieri on October 31, 2010 11:34 PM

### Re: Figures

http://dl.dropbox.com/u/4747602/1.9.2-install-mysqlerror.txt

Very odd. Says it can’t find the development tools. That’s surely not correct. From Googling around, it sounds like the real problem is that RVM wants you to install the readline package.

N.B.: I did not encounter this issue (I compiled 1.9.2, myself, instead of using RVM). But if it is a repeatable issue with Instiki+RVM+MySQL+MacOSX, it might be good to document it here.

I noticed MathJax is now involved… No doubt you will soon tell us how to turn it on!

Try accessing your Instiki wiki, using a non-MathML-capable browser (say, an iPhone).

• MathJax is very slow.
• Its rendering is rather dodgy, if you don’t have the STIX fonts installed (which you don’t, on that iPhone), simply because of a lack of glyph coverage.
• It doesn’t support some things that I view as important (say, being able to mix MathML and SVG).

But, hey, it’s a lot better than nothing, and I’ve set it up so that it doesn’t interfere at all with the user-experience on MathML-capable browsers.

Posted by: Jacques Distler on November 1, 2010 12:27 AM

### Extensible arrows

I wrote (apropos of annoyances with Instiki’s LaTeX export):

One was that

\overset{\qquad SU(2) \qquad}{\longleftrightarrow}

does not function (in LaTeX) as a synonym for

\xleftrightarrow{\qquad SU(2) \qquad}

itex2MML 1.4.1 has support for

• \xrightarrow[]{}
• \xleftarrow[]{}
• \xleftrightarrow[]{}
• \xLeftarrow[]{}
• \xRightarrow[]{}
• \xLeftrightarrow[]{}
• \xleftrightharpoons[]{}
• \xrightleftharpoons[]{}
• \xhookleftarrow[]{}
• \xhookrightarrow[]{}
• \xmapsto[]{}

where the first (optional) argument is a subscript, and the second (mandatory, but can be blank) argument is a superscript. These are compatible with the LaTeX output from Instiki.
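On the LaTeX side, a small test document along these lines exercises a few of them. This is a sketch under one assumption: that the preamble loads mathtools, since amsmath on its own defines only \xrightarrow and \xleftarrow.

```latex
\documentclass{article}
\usepackage{mathtools} % assumption: supplies the extended set of \x...arrow commands
\begin{document}
\[
  A \xrightarrow[\text{below}]{\text{above}} B
  \qquad
  G \xleftrightarrow{\qquad SU(2) \qquad} H
  \qquad
  X \xhookrightarrow{} Y
\]
\end{document}
```

The first arrow uses both arguments, the second only the mandatory superscript, and the third leaves the mandatory argument blank.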

Posted by: Jacques Distler on September 7, 2010 10:14 PM

### Re: Figures

One problem with exporting from Instiki to LaTeX now is that the converter includes [[!redirects ...]] and [[!includes ...]] almost literally (dropping only the brackets). It ought to drop the first entirely and actually implement the second. (At least, this is what happens as implemented on the nLab, and Andrew says that the nLab has an updated version of Instiki as of yesterday.)
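As a rough illustration of the behaviour being asked for, here is a sketch in Python of a pre-processing pass over the wiki source, run before any LaTeX conversion. It is not how Instiki does anything: the function `preprocess`, the `pages` dictionary standing in for the wiki's page store, and the exact directive spelling accepted by the patterns are all assumptions made for the example.

```python
import re

# Hypothetical patterns for the two directives mentioned above.
REDIRECT = re.compile(r'\[\[!redirects\s+[^\]]*\]\]\s*')
INCLUDE = re.compile(r'\[\[!includes?\s+([^\]]*?)\s*\]\]')

def preprocess(source, pages):
    """Drop redirect directives and splice in included pages.

    `pages` is a hypothetical dict mapping page names to their source.
    Unknown page names are replaced by nothing; includes are not nested.
    """
    source = REDIRECT.sub('', source)
    return INCLUDE.sub(lambda m: pages.get(m.group(1), ''), source)

pages = {'contents': '* one\n* two'}
text = 'Intro.\n[[!redirects Old Name]]\n[[!include contents]]\nEnd.'
result = preprocess(text, pages)
print(result)
```

The redirect line disappears entirely, and the include line is replaced by the body of the named page.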

Posted by: Toby Bartels on October 5, 2010 12:26 PM

### Re: Figures

More generally, the conversion is a (Markdown+itex)→LaTeX conversion. It knows nothing about “wiki syntax” (of which you gave two examples).

Posted by: Jacques Distler on October 5, 2010 1:55 PM

### Re: Figures

But it does know something about wiki syntax, and it handles most wiki syntax well (or as well as it could without creating links): it drops the brackets. But with these examples, dropping the brackets is not the correct behaviour.

Posted by: Toby Bartels on October 11, 2010 11:22 AM

### Re: Figures

No, sorry, it knows nothing about Wiki Syntax.

Square brackets are significant in Markdown, too, and what you are probably seeing is Maruku’s Markdown error-handling.

LaTeX conversion is a feature of the Markdown-flavoured text-engine(s). Wiki Syntax processing takes place at a different level, independent of the text-engine chosen.

Posted by: Jacques Distler on October 11, 2010 11:31 AM

### Re: Figures

Hey awesome post!

What do you think about incorporating this into CSS for wordpress?

Posted by: Chad on October 21, 2010 2:27 PM
