[Blindmath] Maths on the web (yet again)

Tue Jul 27 08:07:14 UTC 2010

There's a new question and answer site starting up focussing on TeX and LaTeX.
As was fairly predictable, one of the first questions is about putting maths
on the web.  Having lurked here for a while, and taken part in the discussion
on Terry Tao's blog a short while ago, I thought I'd try my hand at answering
this question.  However, as I'm neither blind nor an expert on such matters,
there are no doubt things that I've gotten wrong.

Since the website in question is at the moment in a private mode (anyone can
read the questions and answers via
http://stackmobile.com/site.php?site=tex.stackexchange, but only the original
people who signed up can post stuff), and I've no idea how accessible the site
is, I'm copying my answer below.  I will happily correct any mistakes I may
have made, or emphasis that I've mislaid, or add anything that members of this
list feel should be said.

The markup language for this is Markdown.  I don't know how email readers will
cope with that so here's a quick explanation of the key points.  Firstly, list
numbering is automatic so all my list entries start with a 1.  Secondly,
emphasis and bold are done by surrounding the word or text in underscores or
asterisks.  Thirdly, links are done like this: [link text](url).  Fourthly,
headers are done using hashes.

My answer now follows:

When embedding mathematics into a webpage, there are two primary questions:

1. What format should be used to display it?
1. Where should the conversion be done?

In my opinion, each of these has a definite answer and a different solution should only be used if the optimal solution really cannot be done.

1. **MathML**.  Reasons:
   1. It is the *only* **accessible** way of doing this.  Putting the original LaTeX in an `alt` attribute on an image is not accessible - it relies on the recipient being able to understand raw LaTeX source code (more on this in a moment).  Also, not all of those requiring accessible webpages use screen readers; some simply need to enlarge the page.
   1. It is **styleable** (not sure if that's a word).  Since MathML is part of the XHTML suite, it can be styled in the same fashion as the rest of the document (namely, via CSS), so the resulting display is far more harmonious than any other (try changing the background colour to something easier on the eyes at one of those WordPress blogs and you'll see what I mean).
   1. It is **small**.  A quick test on my system with 515 simple files that I happened to have lying around showed that PNGs weighed in at 175kB whilst the MathML equivalents were a shade under 60kB.  The PNGs were not high resolution; for example, the PNG containing the Zeta symbol was a 9x13 image.

1. **Server-side**.  Reasons:
   1. It is **small**.  Instead of sending both the source _and_ the instructions on how to compile it, you just send the result.
   1. It is **reliable**.  You can easily check that what you want the person to see is what they should see.  In particular, a JavaScript solution relies on two things being correct: the script itself _and_ the implementation of JavaScript in the browser.  MathML just relies on the MathML implementation in the browser.
   1. It is **fast**.  With server-side caching, you only need to process the mathematics once and then it's done.
   1. It is **verifiable** (similar to reliable, I guess).  I don't fully understand the differences between the _types_ of spec that the W3C produces, but MathML is certainly a recommendation.  Even though browser support is variable, the variations are known because they can be measured against the open standard, and thus can be taken into account.

Server-side MathML is the optimal solution.  Of course, it's not always possible and then other solutions are useful.
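
To make the caching point concrete, here is a minimal sketch in Python.  It is only an illustration under assumptions: `MathMLCache` and the `convert` callable are hypothetical names of my own, standing in for whatever LaTeX-to-MathML converter you actually run on the server, and the in-memory dictionary would probably be a file or database cache in practice.

```python
import hashlib


class MathMLCache:
    """Convert each distinct LaTeX snippet once and reuse the result.

    `convert` is a hypothetical callable taking a LaTeX string and
    returning a MathML string; it is not the API of any real converter.
    """

    def __init__(self, convert):
        self._convert = convert
        self._store = {}  # maps a hash of the LaTeX source to its MathML

    def render(self, latex):
        # Key on a hash of the source so identical formulas share one entry.
        key = hashlib.sha256(latex.encode("utf-8")).hexdigest()
        if key not in self._store:
            # First sighting of this formula: do the expensive conversion.
            self._store[key] = self._convert(latex)
        return self._store[key]
```

However many times a page containing the same formula is requested, the converter runs only the first time; every later request is a dictionary lookup.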

There are various standard arguments against using server-side MathML and other myths about mathematics in webpages that are worth taking a minute over.

###Myths###
1. Sending the raw LaTeX code in an `alt` attribute makes images accessible.

   When people say this, they mean that they can read `$a^2 + b^2 = c^2$` and understand it.  Try them on something a little more complicated and you'll soon see that this is complete rubbish.  For example, try having someone **read out** the following to you: `$\begin{array}\ell^0(\mathbb{R})&\;\mapsto&\;\ell^2(\mathbb{R})\\\downarrow&&\uparrow\\L^2(\mathbb{R})&\subseteq\,&L^\infty(\mathbb{R})\end{array}$`.  Of course, there are going to be people who will say, "_I_ can understand that!" but _that's not the point_.  You write a webpage for other people, and the more complicated the LaTeX, the fewer the people who can instantly read it.

1. MathML is badly supported.

   This is the classic chicken-and-egg problem.  MathML support is absolutely fine in Firefox, in IE with the MathPlayer plugin, and in Amaya (what's that, I hear you cry!).  Plus there are groups working on it for Opera and WebKit who just _need a little encouragement_!  Sending them an email saying, "I love your browser but until it has proper MathML support I can't use it" would provide them with a little more motivation.  Of course, there are bugs in the implementations in Firefox and the others, but those are _known_ and so can be worked around.

1. MathML requires documents to be valid XHTML.

   Actually, this isn't a myth.  It's absolutely true.  But surely your pages were valid to begin with!  I'm a mathematician and my ideal document is one that _cannot_ be misunderstood.  That's impossible, so I settle for the lesser goal of ensuring that any misunderstanding can be laid at the door of the person reading it rather than at mine.  MathML, as it's an open standard, allows me to reach that goal on webpages - technically, at least; the contents are more variable!

Finally - on this part - for those that _still_ worry about Joe Bloggs (or Ola Nordmann, to be geographically correct) not being able to read your webpage due to using an old version of IE and refusing to install plugins, it is actually possible to have two versions of the mathematics on your server and send MathML to those that can see it and PNGs to those that can't, thus getting the best of both worlds.
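
As a rough illustration of that last idea, here is a sketch in Python of choosing between the two versions by looking at the browser's `User-Agent` header.  The function name `choose_math_format` is hypothetical, and the detection rules are my assumptions rather than a tested recipe: I'm assuming that Gecko-based browsers such as Firefox identify themselves with a `Gecko/` token, that IE with MathPlayer installed adds a `MathPlayer` token, and that everything else should get the PNG.

```python
def choose_math_format(user_agent):
    """Return "mathml" for browsers assumed to render MathML, else "png".

    The rules below are illustrative assumptions, not a complete or
    authoritative list of MathML-capable browsers.
    """
    ua = user_agent or ""
    # Assumption: IE with the MathPlayer plugin advertises it in the header.
    if "MathPlayer" in ua:
        return "mathml"
    # Assumption: real Gecko browsers send "Gecko/<build date>", whereas
    # WebKit-based browsers only say "like Gecko".
    if "Gecko/" in ua and "like Gecko" not in ua:
        return "mathml"
    # Everyone else gets the image fallback.
    return "png"
```

The server then picks the MathML or the PNG version of each formula according to the returned value.  Sniffing the `User-Agent` string is fragile, so treat this as a starting point only.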

What about implementation?  Well, there you're in luck.  [iTeX](http://golem.ph.utexas.edu/~distler/blog/itex2MML.html) can do it all, and in spades.  iTeX is a fast C++ program that converts a subset of LaTeX mathematical language into MathML.  The original package comes with bindings for Ruby, and I've extended this to PHP, Perl, and Python.  By combining it with other packages, in particular [svgmath](http://grigoriev.ru/svgmath/) or [gtkmathview](http://helm.cs.unibo.it/mml-widget/), it is possible to further convert the MathML to an image for broken browsers.  (Contact me for these extensions; I haven't gotten round to writing them up yet - it's on my TODO list!)

For examples, see the [nlab](http://ncatlab.org) (pure MathML) and the [nforum](http://www.math.ntnu.no/~stacey/Vanilla/nForum) (MathML, SVG, or PNG depending on what browser you are using).