## May 6, 2007

### So much for Unicode!

Unicode is supposed to cover all of the alphabets in the known universe (Klingon almost made the cut). But, if you look at mathematical scripts, you’ll see a glaring omission. Sure, there are Blackboard Bold (𝔹, 𝕓, 𝟛, ℾ, ℽ) and Fraktur (𝕭, 𝖇, 𝔅, 𝔟) and Calligraphic (𝓑, 𝓫, ℬ, 𝒷) letters.

But how about what is, perhaps, the most commonly used variant in High Energy Physics: slashed letters? As in the Dirac operator $\slash{D} = \slash{\partial} + i \slash{A}$ or the fermion propagator $G(p) = \frac{i}{\slash{p} + m + i\epsilon}$.

Nope. Not there in Unicode. Which means there’s no MathML markup to produce them either.

Shocking.

Well, no solution is perfect, but the above are produced by overstriking two existing Unicode characters. The result is a little crude, but acceptable.

The (new) itex syntax is

\slash{D}

which produces

<mrow><mpadded width="0.125em"><mo>&#xff0f;</mo></mpadded><mi>D</mi></mrow>
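To make the expansion concrete, here is a minimal Python sketch of the substitution. It is not how itex2MML itself is implemented; the names `expand_slash`, `SLASH_RE` and `FULLWIDTH_SOLIDUS` are hypothetical, and the sketch only handles a single Latin letter as the argument.

```python
import re

# Hypothetical helper for illustration only; not itex2MML's actual code.
FULLWIDTH_SOLIDUS = "&#xff0f;"  # U+FF0F, the character used in the output above

# Matches \slash{X} where X is a single Latin letter (a simplifying assumption).
SLASH_RE = re.compile(r"\\slash\{([A-Za-z])\}")


def expand_slash(source: str, slash: str = FULLWIDTH_SOLIDUS) -> str:
    """Replace each \\slash{X} with an overstruck slash followed by the letter."""

    def repl(match: re.Match) -> str:
        letter = match.group(1)
        # The narrow <mpadded> width makes the letter overprint the slash.
        return (
            '<mrow><mpadded width="0.125em"><mo>' + slash + "</mo></mpadded>"
            "<mi>" + letter + "</mi></mrow>"
        )

    return SLASH_RE.sub(repl, source)


print(expand_slash(r"\slash{D}"))
```

Passing a different value for the `slash` parameter is an easy way to try out another candidate character without touching the rest of the markup.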

If the MathML experts out there think there’s a better solution, I’d love to hear it.

In the meantime, users can enjoy itex2MML 1.2.2.

#### Update:

Poking around, it occurs to me that another viable alternative to U+FF0F is U+29F8:

$⧸D=⧸\partial +i⧸A$ and $G\left(p\right)=\frac{i}{⧸p+m+iϵ}$

Anyone have a strong preference?

Posted by distler at May 6, 2007 6:36 PM


### Re: So much for Unicode!

U+0337: c̷o̷m̷b̷i̷n̷i̷n̷g̷ ̷s̷h̷o̷r̷t̷ ̷s̷o̷l̷i̷d̷u̷s̷ ̷o̷v̷e̷r̷l̷a̷y̷

U+0338: c̸o̸m̸b̸i̸n̸i̸n̸g̸ ̸l̸o̸n̸g̸ ̸s̸o̸l̸i̸d̸u̸s̸ ̸o̸v̸e̸r̸l̸a̸y̸

Posted by: Sam Ruby on May 6, 2007 7:56 PM

### Re: So much for Unicode!

U+0337 is too short (definitely for use with capital letters; arguably it’s a little short even for use with lowercase letters, as in your sample).

U+0338 is better. But its slant is all wrong. When paired with upper case letters, it makes them look like poor-man’s double-struck letters (e.g. $̸D$). Same problem with U+002F ($/D$ ).

Moreover, at least in Mozilla/Firefox, the “nonspacing” nature of U+0337,U+0338 is lost in the MathML renderer. So they’re really no better than the other ‘slashes.’

If something better comes along, it’s trivial to change the way itex2MML expands the \slash{} command. In the meantime, the current definition seems to produce the best-looking (though still not great-looking) output.

Posted by: Jacques Distler on May 6, 2007 9:06 PM

### Re: So much for Unicode!

I don’t know if this is just what Sam Ruby is talking about, but on my browser (about says it’s

“Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.8.0.10) Gecko/20070223 Fedora/1.5.0.10-1.fc5 Firefox/1.5.0.10”

), the slash is very small (it doesn’t even come up to the middle bar of the A), and though the slashes look as if they might cut through the letter a tiny bit if they were bigger, they currently all sit completely in front of the letter they refer to. The reason that I mention this is that I’ve got larger font sizes than the defaults set in my configuration, so it might conceivably be that it’s picking the slash from a font at the “default size” for some reason. Or might not; just thought I’d mention it.

Posted by: dave tweed on May 8, 2007 10:18 AM

### Re: So much for Unicode!

Sam was pointing out the existence of Unicode composing accents (U+0300–U+036F). Among them are slashes: one for lower-case (U+0337) and one for upper-case (U+0338) letters.

As far as I can tell, there are two problems with using those composing slashes for the purpose at hand in MathML.

1. At least in Mozilla/Firefox, composing accents don’t “compose” when used in MathML. This can be “fixed” by hand, by using the <mpadded> element.
2. These combining slash accents were designed for use with upright letters. MathML renderers use italic letters for single-letter tokens. As far as I can tell, there aren’t any available italic variants of U+0337,U+0338. Using the upright variants with math italic letters produces horrible results.
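For concreteness, in plain text a combining slash simply follows the character it decorates, which is how Sam's samples were built. A minimal Python sketch, with `strike` as a hypothetical helper name:

```python
import unicodedata

SHORT_OVERLAY = "\u0337"  # COMBINING SHORT SOLIDUS OVERLAY
LONG_OVERLAY = "\u0338"   # COMBINING LONG SOLIDUS OVERLAY


def strike(text: str, overlay: str = LONG_OVERLAY) -> str:
    """Follow every character of `text` with the given combining overlay."""
    return "".join(ch + overlay for ch in text)


for mark in (SHORT_OVERLAY, LONG_OVERLAY):
    # General category "Mn" (nonspacing mark) is what lets a renderer
    # draw the mark on top of the preceding character.
    print(f"U+{ord(mark):04X}", unicodedata.name(mark), unicodedata.category(mark))

print(strike("D"))                 # D followed by U+0338
print(strike("p", SHORT_OVERLAY))  # p followed by U+0337
```

Whether the result actually looks overstruck depends entirely on the renderer and font, which is exactly where the two problems above come in.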

Instead, I decided to avail myself of either U+FF0F (full width solidus) or U+29F8 (big solidus).

1. These have the right slants to work with math italic letters.
2. At least, with the fonts I have available, they are, if anything, a little too high (not too low).

(Ideally, I’d like to use CSS to lower them a trifle. A little more symmetry between the depth below the baseline and the height would vastly improve their appearance. Unfortunately, I haven’t figured out how to do that.)

If you’re seeing them as too short, I’d be very curious. If you could email me a screen-grab and, ideally, your best guess for what font is being used for these glyphs, I’d be most grateful.

Posted by: Jacques Distler on May 8, 2007 12:46 PM

### Combining accents

I wrote:

> Moreover, at least in Mozilla/Firefox, the “nonspacing” nature of U+0337,U+0338 is lost in the MathML renderer. So they’re really no better than the other ‘slashes.’

I should say that the way the Gecko SVG and MathML renderers treat combining accents (U+0300–U+036F) is very, very strange, and deserving of a whole 'nother post (one I’m probably not going to write; I would need to do considerable research to sort out what the heck is going on).

Posted by: Jacques Distler on May 7, 2007 12:28 AM

### Re: Combining accents

To begin with, you can expect combining characters to be b0rked in Gecko 1.8 on Mac.

Better luck with Gecko 1.9 once the MathML patch for Thebes has been checked in.

Posted by: Henri Sivonen on May 11, 2007 3:49 PM

### Re: Combining accents

> Better luck with Gecko 1.9 once the MathML patch for Thebes has been checked in.

You make it sound as if there’s a working patch waiting for review. I would have guessed that we’re still a little far from that, yet.

Posted by: Jacques Distler on May 11, 2007 4:00 PM
