## September 10, 2003

### <abbr>, <acronym>, Accessibility & Automation

Nothing seems to sow more confusion among markup geeks than the proper usage of the <abbr> and <acronym> elements. Many discussions get sidetracked by the arcana of English grammar (“initialisms” versus “contractions” versus …) rather than focussing on the real issues. HTML is not English, and one should not confuse a discussion of the one with a discussion of the other. In HTML, these elements play two important roles:

1. Provide a definition of the term via the title attribute, as in
<acronym title="Friend Of A Friend">FOAF</acronym>
<abbr title="Cold Dark Matter">CDM</abbr>
In addition to helping out readers who may be unfamiliar with the term, this can be useful to Search Engines (by informing them that a page making frequent reference to “CDM” is actually related to pages talking about “Cold Dark Matter”).
2. Provide a cue to screenreaders (or other assistive technologies) as to whether the term should be pronounced or spelled-out. This might be made explicit in an aural stylesheet
abbr {speak:spell-out;}
acronym {speak:normal;}
or it might be implicit in the rules used by the screenreader.

If the only purpose were to provide a definition, there would be no point in having two distinct elements that serve the same function. And, given that there are two elements, it wouldn’t much matter which one you used. On the other hand, if these two elements are to be treated differently by aural browsers, then, on Accessibility grounds, you should be mindful of using the right one in each case.

Which brings us to the next point. For the purpose of providing a definition, it suffices to mark up the first occurrence of an acronym or abbreviation on the page. It’s somewhat annoying to see a page littered with dotted underlines (indicating the presence of a tooltip giving the definition) on each and every occurrence of an abbreviation. Indeed, the WCAG says you only need to provide the definition at the first occurrence on the page.

But that’s not quite the same thing as saying you don’t have to mark up subsequent occurrences at all. Providing cues for screenreaders surely requires marking up all occurrences on the page. (I imagine that this might also help Search Engines, but that’s purely hypothetical.) The best way to achieve these varied goals is, the first time you use the abbreviation, to give its definition, <abbr title="Cold Dark Matter">CDM</abbr>, but on subsequent uses, to merely mark it up as <abbr>CDM</abbr>. This will satisfy the screenreaders, and can be hidden from visual browsers with code like

abbr, acronym {border:none;}
abbr[title], acronym[title] {border-bottom: 1px dotted black;}

Marking up all those abbreviations and acronyms by hand is pretty tedious, which is why I recently installed the acronym plugin for MovableType. It uses a flat-file database, acronym.db, to hold the definitions. It only supports the <acronym> element, but it is trivial to hack it to support <abbr> as well (with two separate databases, acronym.db and abbr.db). If you prefer the old behaviour, you can just put all your definitions in acronym.db. Since it’s reasonably well-written, getting it to add a title to only the first instance of an acronym/abbreviation in each blog entry was also fairly straightforward. Once it’s installed, you activate it by adding the attribute, acronym="1", to any MT container tag (like <MTEntryBody>) and rebuild.
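To make the “title on the first occurrence only” idea concrete, here is a minimal sketch in Python. It is not the plugin’s Perl code: the function name `mark_up` and the two dictionaries (stand-ins for acronym.db and abbr.db) are assumptions made for illustration, and the sketch only handles plain text, not text that already contains markup.

```python
import html
import re

# Hypothetical stand-ins for the two flat-file databases.
ABBR_DB = {"CDM": "Cold Dark Matter"}
ACRONYM_DB = {"FOAF": "Friend Of A Friend"}


def mark_up(text, abbr_db=ABBR_DB, acronym_db=ACRONYM_DB):
    """Wrap known terms in <abbr>/<acronym>; only the first use gets a title."""
    tag_for = {term: "abbr" for term in abbr_db}
    tag_for.update({term: "acronym" for term in acronym_db})
    definitions = {**abbr_db, **acronym_db}
    if not tag_for:
        return text

    # Try longer terms first, so a long term is not shadowed by a shorter prefix.
    terms = sorted(tag_for, key=len, reverse=True)
    pattern = re.compile(r"\b(" + "|".join(re.escape(t) for t in terms) + r")\b")
    seen = set()  # terms that have already been given a title

    def replace(match):
        term = match.group(1)
        tag = tag_for[term]
        if term in seen:
            # Later occurrences: bare element, enough of a cue for screenreaders.
            return f"<{tag}>{term}</{tag}>"
        seen.add(term)
        title = html.escape(definitions[term], quote=True)
        return f'<{tag} title="{title}">{term}</{tag}>'

    # A single pass over the text: substituted output is never scanned again.
    return pattern.sub(replace, text)


print(mark_up("CDM halos form early. FOAF files ignore CDM."))
```

Because `seen` is created afresh on every call, calling `mark_up` once per entry gives one definition per entry, which is the granularity described above.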

Anyway, here’s my patch to the acronym plugin (patch < acronym.pl.diff). You need to install the patched plugin and two data files, acronym.db and abbr.db (the latter can be empty, if you wish, but must exist), in your plugins directory.

One note of caution: the plugin is very fast because it doesn’t retokenize after each acronym/abbreviation substitution. This has an unfortunate side-effect. Don’t put other acronyms/abbreviations as part of the definition text in your database!
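Here is one way such a side-effect can arise, sketched in Python. This is an assumed failure mode for illustration, not a transcription of the plugin’s internals: the function `naive_mark_up` and the sample database are hypothetical. Each term is substituted in turn over the whole string, including the title attributes written by earlier substitutions.

```python
# Hypothetical database in which one definition mentions another term.
DB = {
    "RSS": "RDF Site Syndication",
    "RDF": "Resource Description Framework",
}


def naive_mark_up(text, db=DB):
    """One plain substitution per term, with no re-tokenizing in between."""
    for term, definition in db.items():
        # This also rewrites matches inside title="..." added on earlier rounds.
        text = text.replace(term, f'<acronym title="{definition}">{term}</acronym>')
    return text


result = naive_mark_up("I read RSS feeds.")
print(result)
```

The “RDF” inside the title of “RSS” gets wrapped in its own element, leaving a tag nested inside an attribute value. Keeping other terms out of the definition text sidesteps the problem entirely.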

P.S.: Some people contend that, since Internet Explorer doesn’t understand the <abbr> element, you shouldn’t bother doing the right thing. There are two obvious responses: you could use a Javascript solution, or you could just not worry about it. Since it seemed to me that IE users need all the help they can get, I decided, in this case, to go the Javascript route. Besides, the idea of using crufty Javascript hacks to service a crufty browser seemed … only fitting.

Update (9/12/2003): I should have made an obvious point. If your page is sufficiently complicated and has multiple entry points, a visitor may not enter the page at the top, and hence might miss the definitions found there. In that case, it may not suffice to define each abbreviation once per “page.” It would be more usable to define it once per “section.” (You can see this on my blog’s main page, where each post is a separate section, vis-a-vis marking up abbreviations.) Also, you’ll notice that I haven’t enabled automated abbreviation markup for comments. Some people think that that’s a “service,” but if you leave a comment on one of my posts that says, “Jacques, I think you have succumbed to Steve Jobs’ infamous RDF,” I don’t think you are going to want that to be automatically marked up as “RDF,” with my database’s definition attached.

Update (10/29/2003): The patch has been updated to match version 0.5 of the plugin.

Update (12/20/2003): The patch has been updated to match version 0.6 of the plugin.

Update (2/13/2004): The patch has been updated to match version 0.7 of the plugin.

Update (5/28/2004): The patch has been updated to match version 1.0 of the plugin.

Posted by distler at September 10, 2003 2:26 AM


### Re: <abbr>, <acronym>, Accessibility & Automation

I hope you still read this :). I have a few questions:

1. Will this apply to older entries?
2. How can I patch something? (I’m not such a PERL fanatic.)

It is really great that it only sets the title attribute the first time. I always wanted to do that, but it was impossible, ‘cause I used a macro for replacing everything.

Posted by: Anne van Kesteren on September 13, 2003 1:58 AM

### diff and patch

I hope you still read this :).

Will this apply to older entries?

If you rebuild your whole site, it will apply to everything. If you rebuild only particular entries, it will apply only to those.

How can I patch something? (I’m not such a PERL fanatic.)

PERL, shmerl!

You need to learn about the UNIX commands “diff” and “patch” (the latter of which was, truth be told, created by Larry Wall, the same man who created Perl).
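If you don’t have the UNIX tools to hand, you can get a feel for what a diff file contains with Python’s standard `difflib` module. This is only an illustration of the format, not a replacement for “diff” and “patch”; the file names `before.html` and `after.html` and the two sample lines are made up.

```python
import difflib

# Two versions of a hypothetical two-line file.
original = ["<acronym>FOAF</acronym>\n", "<acronym>CDM</acronym>\n"]
modified = ["<acronym>FOAF</acronym>\n", "<abbr>CDM</abbr>\n"]

# unified_diff yields header lines, a hunk marker, and one line per
# unchanged (" "), removed ("-"), or added ("+") line.
patch_text = "".join(
    difflib.unified_diff(
        original, modified, fromfile="before.html", tofile="after.html"
    )
)
print(patch_text)
```

Lines starting with “-” are removed and lines starting with “+” are added; “patch” reads a file in this format and applies those changes to your copy of the original.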

Posted by: Jacques Distler on September 13, 2003 2:13 AM
Read the post abbr ou acronym, lequel choisir ?
Weblog: Karl & Cow - Le carnet Web
Excerpt: Using the two elements creates accessibility and internationalization problems. It's better to use only one and use CSS for accessibility reasons.
Tracked: September 13, 2003 10:15 AM

Karl Dubost wins the prize for the first trackback with an invalid <MTPingBlogName>. I’d been wondering when that was going to happen. Congratulations, Karl!

Since Karl doesn’t have comments enabled on his blog, I’ll respond to his points here.

He doesn’t like the idea of using <abbr> and <acronym>. Rather, he thinks one should use a single element, <acronym class="spell"> and <acronym class="pronounce">, and rely on Aural CSS styling

.spell {speak: spell-out;}
.pronounce {speak:normal;}

This is, on its face, a horrible idea. You should never use a generic element with made-up CSS classes, when your HTML dialect provides you with native elements which mean the same thing. Down that road lies madness like <div class="paragraph"> and <span class="quote">. You’d better have a damned good reason for not using the native elements.

So what are Karl’s reasons?

Screenreader support for <abbr> is poor.

Sorry, Karl, you’ve got that backwards. Jaws 4.51, among other screenreaders, supports both <abbr> and <acronym>. It is support for Aural Stylesheets that is virtually nonexistent. Only Emacspeak, which has a negligibly small user base, supports Aural CSS.

It’s a barrier to internationalization. In English, we’d write

<abbr title="United Nations">UN</abbr>

whereas in French, we’d mark this up as

<acronym title="Organisation des Nations Unies">ONU</acronym>

Well, OK. We had to change elements there, but keeping the same element while changing classes isn’t any easier. The real answer is that you shouldn’t be marking up these abbreviations in your base document. Rather, that should be part of your (language-specific) post-processing phase. Even for a single language, you might decide one day that “RSS” stands for “Really Simple Syndication,” rather than for “RDF Site Syndication.” Naturally, you’d be much happier if your previous choice wasn’t hard-coded into all your documents.

The distinction between <abbr> and <acronym> is “presentational,” rather than “semantic.”

Oooh! Them’s fightin’ words!

This is a very slippery argument. At some very crude level, almost everything can be construed as presentational. I am strongly reminded of the debate in the W3C mailing list on whether <sub> and <sup> are “presentational.” After all, they are nothing more than instructions to take a block of text and set it in slightly smaller type just below or just above the baseline.

My rule of thumb for deciding such matters is that, if something is really just “presentational,” then nothing will happen to the meaning if we alter the presentation.

Let’s try it: instead of “x²,” let’s try “x2” or “x₂” or some other restyling of the “2.” Did we lose the meaning (“x-squared”) of the original? Yup. So there must have been some meaning in that supposedly “presentational” tag.

OK. How about Karl’s example? Instead of spelling out the abbreviation, “UN mandate” (“yew-en mandate”), let’s try pronouncing it (“unh mandate”). Did we still convey the meaning? Nope.

So much for “purely presentational.”

Finally, Karl points out:

XHTML 2.0 will have only one element, <abbr>. (Karl, inexplicably, wishes it were <acronym> instead.)

This seems to me to be a non-issue, as

1. We’re talking about a Working Draft. Who knows what the final Spec will say?
2. I would venture that Aural CSS support will be commonplace in screenreaders long before any of them support XHTML 2.0.

More than that, I am unwilling to discuss, as XHTML 2.0, as currently formulated, has zero interest for me. I have other fish (or W3C Standards) to fry …

Posted by: Jacques Distler on September 13, 2003 11:03 PM

For the MTBlogPing ;), Sorry I’m not using Movable Type to create TrackBacks when necessary, but http://www.reedmaniac.com/scripts/trackback_form.php, which doesn’t seem to escape the title correctly. I guess you had problems with the & ;)

You have misunderstood some parts of the comments on my weblog.

On the slippery road of not using semantic tags for markup: I completely agree with you on that. My point is about using the appropriate tags for the appropriate semantics, which is completely different. Semantics means that the text you are marking up has meaning on its own.

If you really want to use acronym or abbreviation, in French it’s not a question of spelling out or pronouncing the word; it’s a question of the way the word is built.

Inc = Incorporated, Ltd = Limited, Mlle = Mademoiselle. The rule for these words is always to pronounce them in full. So if you put a word like that in an abbr tag, in French the title would have to be used to pronounce it. It’s not a matter of pronouncing the text inside the tag.

An acronym (in French) is composed of the letters of words and can be pronounced as a word; such acronyms have become common words in French: OVNI = Objet Volant Non Identifié (UFO in English), RADAR, SIDA.

We have a third category in French, called the “sigle.” Sigles are written like O.N.U., and can be pronounced either as a “sigle” (“O”, “N”, “U”) or as a (French) acronym (“onu”). Some “sigles” can’t be pronounced as acronyms, for example C.G.T., but it’s still a sigle.

So, as you can see, the concept of abbr and acronym has a strong internationalization problem, depending on the language you are using. I didn’t check the exact definitions in an English dictionary (I will later, when I have one at hand).

HTML is not a markup language for English; it’s a markup language for people all over the world, and you have to be very careful about how the concepts translate into another culture or language.

The spelling-out behaviour of a word is completely presentational, not semantic. That doesn’t mean it’s not important for accessibility. But it’s orthogonal, in the same way that being valid, being semantically correct, and being accessible are three orthogonal concepts.

I didn’t say screenreader support is poor. I said that people have (unfortunately) made a practice of using acronym in their Web pages, because some visual browsers only presented acronym differently and, since people lack a culture of accessibility, they didn’t care.

The way you are using the elements in your first article is definitely presentational (in the aural sense of presentation). Now, I agree with you that presentation, in our common world, conveys semantics. For example, you are used to italics meaning a quotation from someone else, but semantically, if you use the element “i” or “em” for that, you are using the wrong element. In HTML 4.01, for example, you have to use the element q.

In many browsers the element q doesn’t have a particular rendering, so you will say that the semantic meaning is lost when a person reads it on a web page. That’s why CSS is there: to add the necessary layout for a particular rendering. On the other hand, a good screen reader, reading the HTML and encountering the element q, will certainly say “Citation” and then read the sentence. So there is no need for visual layout.

It’s exactly the same in your case. What you are demonstrating about the importance of spelling out or pronouncing a word belongs to the presentational side, for people with visual disabilities. The sound will convey a meaning, whereas people without this kind of disability will have no problem reading it. It’s not a question of semantics, as I have shown above.

For XHTML 2.0, I said it was a future specification, and that it was not to be used now. It also will not change how you mark up your HTML 4.01 or XHTML 1.0, which will still be valid.

Last but not least: My comment was a personal point of view even if I’m the Conformance Manager of W3C. http://www.w3.org/People/karl/

Cheers.

Posted by: Karl Dubost on September 14, 2003 8:45 AM

### Semantics Sidelined

For the MTBlogPing ;), Sorry I’m not using Movable Type…

That’s OK. As I said in the above-linked post, I deliberately didn’t bullet-proof the <MTPingBlogName>. It took many months before an invalid one came along. It’s bullet-proofed now!

You make a valid point that, in any given language, the definition of terms like “abbreviation” or “acronym” typically involves the mechanism by which they are constructed out of the underlying words of the language, rather than how they are spoken.

English provides a huge range of such mechanisms (“CVS”, “LAN”, “Mr.”, “can’t”, “ain’t”, “HumInt”, “Inc.”, “lb”, “P2P”, “Blvd”, “radar”, “l33t”, …). Having two elements, <abbr> and <acronym>, to describe this huge range of behaviours is woefully inadequate. Moreover, other languages have other mechanisms and, anyways, HTML is not English (or French or …).

Since <abbr> and <acronym> cannot usefully describe how abbreviations are constructed out of the underlying words of English (or any other human language) and, since HTML is, in any case, not about explicating the fine points of the grammar of any particular human language, using those HTML elements for this purpose qualifies as stupid in my book.

The only good reason for marking up certain strings of characters using these tags is to indicate their need for special treatment (by User-Agents, by Search Engines, …).

I’m glad that you agree that “presentation often conveys semantics.” And your example of the <q> element illustrates perfectly the perils of relying on CSS to convey essential semantic information. Many User-Agents either don’t support the <q> element, or don’t provide a default rendering for it.

So, disable CSS and … poof! … there go your semantics. (I don’t need to point out that the WCAG, as well as common sense, require that a document be usable with CSS disabled.)

Last but not least: My comment was a personal point of view even if I’m the Conformance Manager of W3C.

I guess I should be careful what I write. You never know who might be reading.

Posted by: Jacques Distler on September 14, 2003 12:04 PM
Weblog: Weblog about Markup & Style
Excerpt: Yes, that is correct just 14 little bytes (even Google wouldn't notice...) and your site will be easier to navigate. And you won't have to add them on you most frequent visited page. If you homepage is the most visited...
Tracked: September 15, 2003 1:03 PM
Weblog: TwoCentsWorth.com
Excerpt: I've been tossing a number of links aside for the past several weeks now. Each one was something I thought about posting a commentary on, but never got around to. I don't think I...
Tracked: September 17, 2003 10:40 PM

### Re: <abbr>, <acronym>, Accessibility & Automation

I’m the author of Acronym and haven’t seen the patch until now.

Do you mind sharing your acronym.db and abbr.db files?

This way I could incorporate some of the changes into Acronym?

Posted by: Henrik Gemal on October 29, 2003 7:19 AM
Read the post Friday Feast #64: Abbreviations, Acronyms, and Shortened Words
Weblog: Brainstorms and Raves
Excerpt: I've noticed that a growing number of websites are providing tooltips and styles for abbreviations and acronyms within content. Later versions of browsers support the abbr and acronym elements, with the exception of Internet Explorer unfortunately not ...
Tracked: January 3, 2004 3:49 PM

### Re: <abbr>, <acronym>, Accessibility & Automation

2 questions:

firstly, you mention in the text “Since it’s reasonably well-written, getting it to add a title to only the first instance of an acronym/abbreviation in each blog entry was also fairly straightforward.”
Is this implemented in the 0.7 patched version? It does not appear to be on my installation.

secondly, does this add <span class="caps"> round abbreviations? I couldn’t see anything that would do this in the pl script, but it just started happening after I patched the acronym.pl file

Posted by: Donald Noble on April 27, 2004 5:09 PM

### Re: <abbr>, <acronym>, Accessibility & Automation

Is this implemented in the 0.7 patched version?

As I said above, my patch is against the (current) 0.7 version.

secondly, does this add <span class="caps"> round abbreviations?

What the \$%#@ is <span class="caps">?

Posted by: Jacques Distler on April 27, 2004 6:15 PM
Weblog: CEFA::Blog
Excerpt: What a dream of a plugin this is. I used to use a homemade dictionary (built in XML)...
Tracked: December 1, 2004 7:06 PM

### Re: <abbr>, <acronym>, Accessibility & Automation

re: CDM = Cold Dark Matter
No No No
As any Brit would tell you CDM is short for Cadbury’s Dairy Milk as in CDM & bar :-)

Posted by: Mark Pearson on December 20, 2004 11:58 AM

### CheckEngine USA

Accessibility is very important. So important that the federal government is requiring websites to be user-friendly for disabled people (including blind). Business opportunities will arise from this legislation.

Posted by: Brett S. on May 19, 2006 2:03 PM

### Re: <abbr>, <acronym>, Accessibility & Automation

Being a non native English speaker I also share the sentiments of “the concept of abbr and acronym has a strong internationalization problem depending on the language, you are using…”

Since <abbr> and <acronym> don’t normally depict how abbreviations are formed out of the underlying words of the English language

Alexi

Posted by: Provillus on February 15, 2010 4:58 PM
