[BlindMath] Accessibility of Latex to PDF

Neil Soiffer soiffer at alum.mit.edu
Mon Jan 11 21:38:17 UTC 2021


To maybe clear up some confusion, let me add a little bit on the technical
side about PDF that may help explain why it is often not accessible...

A PDF document is actually structured as a tree, with the main nodes being
nodes for each page. The design allows PDF renderers such as Adobe Reader to
open a 200-page book quickly and go to any page quickly. Each page contains
a set of very low-level commands that draw graphics and text on the page.
Those commands are often not in the reading order for the document.

Another node in the PDF tree is one that itself represents a tree and is
referred to as the "structure tree". If it is used, it essentially imposes
order on the document and is the key to accessibility. It maps mostly to
HTML in a straightforward way and points into the various pages to the
actual text of the document. As an example, the structure tree has nodes for
headings, lists, paragraphs, etc. Assistive technology (AT) uses the
structure tree to read the document. If there is no structure tree, the PDF
is likely to be gibberish if read with an AT. A PDF document with a
structure tree is often referred to as "tagged PDF". Well-tagged PDF should
be as accessible as well-tagged HTML. At the moment, well-tagged PDF is not
the norm.
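
To make the difference between page content and the structure tree concrete,
here is a toy model in Python. It is only a sketch and not how a real PDF is
stored: the names TextRun, StructNode, draw_order and reading_order are made
up for illustration, and the integer IDs merely stand in for the way
structure nodes point at chunks of page content.

```python
from dataclasses import dataclass, field


@dataclass
class TextRun:
    """One chunk of text drawn on a page, labelled with an ID (hypothetical)."""
    mcid: int
    text: str


@dataclass
class StructNode:
    """A structure-tree node: a tag such as H1 or P, plus pointers into pages."""
    tag: str
    refs: list = field(default_factory=list)  # (page_index, mcid) pairs
    kids: list = field(default_factory=list)  # child StructNode objects


# One page whose text was drawn out of reading order.
pages = [
    [
        TextRun(2, "Second paragraph."),
        TextRun(0, "Introduction"),
        TextRun(1, "First paragraph."),
    ],
]

# The structure tree lists the same content in reading order.
tree = StructNode("Document", kids=[
    StructNode("H1", refs=[(0, 0)]),
    StructNode("P", refs=[(0, 1)]),
    StructNode("P", refs=[(0, 2)]),
])


def draw_order(pages):
    """What a reader gets with no structure tree: text in drawing order."""
    return [run.text for page in pages for run in page]


def reading_order(node, pages):
    """Walk the structure tree depth-first and follow its pointers into pages."""
    out = []
    for page_index, mcid in node.refs:
        out.extend(run.text for run in pages[page_index] if run.mcid == mcid)
    for kid in node.kids:
        out.extend(reading_order(kid, pages))
    return out


print(draw_order(pages))
print(reading_order(tree, pages))
```

The first print shows the text in the order it was drawn; the second shows
the order an AT would get by walking the tree.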

PDF continues to evolve and in 2017, PDF 2.0 came out. That specifically
included the MathML namespace as valid in PDF. The PDF/UA committee (the
one that says what an accessible PDF document is) is working on updating
their spec (ISO 14289) to PDF 2.0. That update includes what to do about
math. PDF/UA 1.0 says "add alt text" -- that requires the user to provide
the alt text. The update draft says "use MathML", which is something that
doesn't require user intervention. It will be a few years before the PDF/UA
final version comes out (ISO has a lot of rules about process).
NVDA+MathPlayer has had the ability to read PDF tagged with MathML ever
since it was released, but other than a few sample documents, no tools
generate it at the moment.
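
As a rough illustration of the difference between the two approaches, the
Python sketch below builds both an alt text string and a MathML tree for the
expression "x squared". The helper name superscript and the alt text wording
are assumptions made for this example; how the MathML is actually attached
to the structure tree inside a PDF is not shown here.

```python
import xml.etree.ElementTree as ET

MATHML_NS = "http://www.w3.org/1998/Math/MathML"


def superscript(base, exponent):
    """Build MathML for base^exponent (hypothetical helper for illustration)."""
    math = ET.Element("math", {"xmlns": MATHML_NS})
    msup = ET.SubElement(math, "msup")
    ET.SubElement(msup, "mi").text = base      # identifier
    ET.SubElement(msup, "mn").text = exponent  # number
    return math


# Alt text: a fixed string that someone has to write by hand.
alt_text = "x squared"

# MathML: the structure of the expression itself, which software can
# generate and which a screen reader can speak or navigate.
mathml = superscript("x", "2")
print(ET.tostring(mathml, encoding="unicode"))
```

With alt text the reader gets only the words the author chose; with MathML
the AT decides how to speak the expression and can let the reader explore
its parts.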

This discussion thread has been about TeX. Unfortunately, the tools that
generate PDF from TeX do not generate a structure tree and so the resulting
PDF is not accessible. The good news is that a number of people are working
on making the PDF generated from TeX accessible. There are some
experimental packages that create a structure tree. I recently tried a few
and unfortunately NVDA didn't like them -- apparently they didn't get all
the details down right. I'm sure that will improve in the (hopefully near)
future. None of the packages do anything about making the math accessible,
although I think one may have a way to add alt text.

I hope that clears up some questions about PDF accessibility. Accessible
math on the web has become the norm in the last few years. I hope that in a
few years, the same can be said for PDF. In the meantime, I agree with those
who have said to get the original TeX, get an HTML version, or get a Word
version.

Neil Soiffer

