[BlindMath] Current strategies regarding accessible mathematics

Jonathan Godfrey A.J.Godfrey at massey.ac.nz
Mon Mar 14 08:03:38 UTC 2022

Awesome. I knew someone would deliver the goods for us.

I'd note that using images with alt text for math content is a very old way of attempting to provide access. This is what Wikipedia used to do. Fifteen years ago, we thought this was awesome. The content was acceptable if the expression was simple, but anything more complex was insurmountable for mere mortals. Embracing MathJax was a stellar move for our ability to use Wikipedia. N.B. the same could be said for numerous other sites, but Wikipedia is a very well-known example.

While I express a degree of comfort reading raw LaTeX, I'm not trying to complete postgraduate mathematics by manipulating LaTeX expressions. Raw LaTeX might be much more readable with a screen reader if the backslashes and curly braces were silent, but managing multiple sets of brackets, especially nested ones, is something some blind people have really struggled with. I use this example because it shows a massive difference between how we as blind people process content and how sighted people can scan a structure and quickly fill in the details. For the most part, the closest blind people get to the sighted way of digesting non-simple math expressions is limited to the most proficient of braille readers. FWIW, I am not one of those people.

Making use of MathML and MathJax gives a screen reader the chance to explore the structure of the expression. That is crucial.
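To illustrate what "exploring the structure" can mean, here is a minimal sketch in Python that turns a MathML fragment into an indented outline, one node per line. The function name `outline` and the sample expression are made up for this illustration; real screen readers use their own navigation models, so treat this only as a picture of the tree that MathML exposes.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: the fraction 1 / (x + 1) written as presentation MathML.
SAMPLE = (
    '<math xmlns="http://www.w3.org/1998/Math/MathML">'
    "<mfrac><mn>1</mn><mrow><mi>x</mi><mo>+</mo><mn>1</mn></mrow></mfrac>"
    "</math>"
)


def outline(element, depth=0):
    """Return an indented outline of a MathML element, one node per line."""
    name = element.tag.split("}")[-1]  # drop the XML namespace prefix
    text = (element.text or "").strip()
    lines = ["  " * depth + name + (f": {text}" if text else "")]
    for child in element:
        lines.extend(outline(child, depth + 1))
    return lines


if __name__ == "__main__":
    print("\n".join(outline(ET.fromstring(SAMPLE))))
```

The numerator and denominator appear as separate branches under `mfrac`, which is the kind of structure a flat alt text string cannot offer.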

I know efforts are being made to improve the pdf output from LaTeX, but progress is proving very slow. I accept that some people believe that making sure the pdf can have alt text tags means that it is more likely to get those tags in HTML, but to my way of thinking, putting effort into pdf is a poor use of developers' time.

I already used TeX4ht and found a way to change the alt text tags in the generated documents. It was an annoying workaround, but it worked. My processing of several documents into HTML as the primary delivery format, with pdf available on request, offers proof of concept. It seems strange to me that the best the TeX development community wants for the pdf outcome is something we already have in HTML. To me, that means they will always be playing catch-up because the tools that increase our access to HTML content are also moving ahead.

I am willing for people to say I'm biased, but it is twenty years since I was writing my PhD in LaTeX. In the time since, LaTeX has done nothing to improve my access to a pdf. I've had to use the best tools on offer to get HTML out of my LaTeX with quite a lot of success, and I've done that with very limited support. In contrast, when I started using markdown in late 2014, I was getting accessible HTML content almost immediately. OK, I had to do my own experimentation to find out which of the available options were best for getting accessible math content, but I was then able to demonstrate my findings and perhaps to influence the uptake of MathJax in the R implementation of markdown (via a conference presentation in 2016).

At the same time as I was moving to markdown, screen reader developers were improving their software's ability to make use of MathML and MathJax content. I do not need my career compromised by either the ongoing failure of the pdf format or the failure of screen reader developers to deliver the niche tools needed to deal with unusual tools/objects found in STEM software/documents. Those developers have, however, delivered improved access to webpage content because that is so mainstream.

I think the up-and-coming generation will embrace the tools that are delivering what they need now; they'll ignore the tools that are difficult to use in favour of things that work, and the idea that they might have to wait years for a solution is a recipe for them to find a different career option. The time is right to get more blind people into mathematical disciplines, but they need quality resources that show them how easy it can be to read and, more critically, write math content that everyone else is using.

The main difference is that, as blind people, we are at the mercy of authors of LaTeX files, who must learn how to improve the accessibility of their output documents, whereas authors of R markdown almost invariably create accessible content because it is almost impossible for them to do otherwise.

I'm pleased to see several options for generating HTML from LaTeX under development. I chose TeX4ht because it was already installed with my MiKTeX installation and was therefore ready to go. My MiKTeX was already set up to grab the additional tools needed to process my files. I only needed to learn how to subdivide the document into the style of website I wanted and to get the math content presented using MathJax. My biggest headache was that the editor tools being used didn't have an option to run the commands necessary to generate the pages. That meant writing batch files with the command lines in them. That was within my skill set, but it isn't something many of my colleagues would manage.
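As a rough illustration of the kind of batch file described above, here is a minimal sketch in Python rather than a Windows batch script. It assumes the `make4ht` build tool that is distributed with TeX4ht is on the PATH and that passing "mathjax" as the TeX4ht option is the desired setup. The actual commands used for the documents mentioned above are not given in this message, so the function names and the choice of options are hypothetical.

```python
import subprocess
from pathlib import Path


def build_command(tex_name, math_option="mathjax"):
    """Return the command line for converting one LaTeX file to HTML."""
    # Assumption: make4ht is installed; the string after the file name
    # is passed on as an option to TeX4ht.
    return ["make4ht", tex_name, math_option]


def build_all(source_dir, run=subprocess.run):
    """Convert every .tex file in source_dir and return the commands used."""
    tex_files = sorted(Path(source_dir).glob("*.tex"))
    commands = [build_command(path.name) for path in tex_files]
    for command in commands:
        # check=True stops the batch at the first failed conversion.
        run(command, check=True, cwd=source_dir)
    return commands


if __name__ == "__main__":
    for command in build_all("."):
        print("ran:", " ".join(command))
```

The `run` parameter exists only so the script can be tried without TeX installed, for example by passing a function that just records the commands.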

My request to anyone developing LaTeX to HTML solutions is to make it easy for all authors to use their tools. Offer the authors the chance to make HTML and maybe they'll choose to make HTML. After all, making a pdf from LaTeX is a 21st-century activity. We used to make PostScript files and convert those to pdf before we went straight from LaTeX to pdf.

All the best,

-----Original Message-----
From: BlindMath <blindmath-bounces at nfbnet.org> On Behalf Of Brian Dunn via BlindMath
Sent: Monday, 14 March 2022 7:23 pm
To: blindmath at nfbnet.org
Cc: Brian Dunn <bd at bdtechconcepts.com>
Subject: Re: [BlindMath] Current strategies regarding accessible mathematics

On Sun, 13 Mar 2022 19:51:27 +0000
Jonathan Godfrey via BlindMath <blindmath at nfbnet.org> wrote:

> There are a variety of tools to take LaTeX source and create HTML. 
> I've had good use from TeX4HT but others can point you to other 
> solutions.

The following are still being developed, and making progress in ease of use and the support of more and more LaTeX packages:

TeX4ht: LaTeX to HTML or EPUB with MathML or MathJax.  I think there is also some support for creating word processor files.  Around 450 packages and classes have programmed support, and it probably supports a large number of other packages as-is.

Lwarp: LaTeX to HTML with MathJax or SVG math images.  Can be used with Calibre to create EPUBs as well, but it isn't as automatic as TeX4ht would be.  Lwarp can also create a simplified HTML ready for copy/paste into a word processor.  Around 580 packages have programmed support, plus 10 or so common classes and a decent number of world language classes.  Has additional MathJax emulation for 90 packages.  Also supports a large number of other packages as-is.

LaTeXML: LaTeX to HTML with MathML or PNG math images.  Around 40 classes and 350 packages are supported.

In the above, "programmed support" means the document will compile with that LaTeX package or class, and the conversion either ignores the package as being irrelevant for HTML output, adapts it for HTML or MathJax or MathML, or perhaps produces an image of how the piece of code compiles and displays the result.  But this output may not be optimized for a blind reader.

For many objects, such as a TikZ or chemistry image, the result will be an image with an ALT text of some sort, which in many cases will only be an almost useless generic name like "image". It is up to the author to provide a meaningful ALT tag in this case.
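One way an author might check converted output for this problem is sketched below in Python, using only the standard library. It lists `img` elements whose alt text is missing or generic. The set of "generic" values and the function name `find_weak_alt_text` are assumptions made for this illustration, not something taken from TeX4ht, Lwarp or LaTeXML.

```python
from html.parser import HTMLParser

# Assumption: an illustrative list of placeholder values, not one used by any converter.
GENERIC_ALT = {"", "image", "img", "figure", "picture", "graphic"}


class AltTextAudit(HTMLParser):
    """Collect <img> tags whose alt text is missing or generic."""

    def __init__(self):
        super().__init__()
        self.problems = []  # (src, alt) pairs that need the author's attention

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        attributes = dict(attrs)
        alt = attributes.get("alt")
        if alt is None or alt.strip().lower() in GENERIC_ALT:
            self.problems.append((attributes.get("src"), alt))


def find_weak_alt_text(html):
    """Return (src, alt) for every image with missing or generic alt text."""
    audit = AltTextAudit()
    audit.feed(html)
    audit.close()
    return audit.problems
```

A check like this can only flag placeholders; whether the remaining alt text is actually meaningful still has to be judged by a person.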

For simple inline math, Lwarp and perhaps the others can produce an SVG image with an ALT tag containing the LaTeX code for the math.  For complicated math, the LaTeX code can become unreasonable or impossible to include as an ALT tag, and the user should provide an override description.

For other objects, such as a chemistry macro, Lwarp includes the source in the ALT tag if possible, but it is not always possible.  For example, a plain text copy/paste of parts of a document using the chemmacros package contains lines like:

\ox{.5,Br2} \ch{"\ox {1/3,I}" {}3+}
13C-NMR (100 MHz)
(math image)

The above is what the text to speech reader would say.  Someone who knows chemistry and the macros for the package may understand much of this, while the rest of it would have to be described by the author.

LaTeX has recently added support for ALT tags for images in PDF documents.  As authors start using these, they will automatically be included in the HTML conversions as well.  There is a lot of other additional work being done for improving the accessibility of PDF documents.  Hopefully much of this work will also be useful for HTML conversions.

And if anyone tries the Lwarp package and has suggestions, let me know and I'll try to improve it further.  Likewise for TeX4ht or LaTeXML.  Each of those teams would probably like to hear from you.


BlindMath mailing list
BlindMath at nfbnet.org
To unsubscribe, change your list options or get your account info for BlindMath:
BlindMath Gems can be found at <http://www.blindscience.org/blindmath-gems-home>
