[Blindmath] PDF to xhtml

White, Jason J jjwhite at ets.org
Wed Nov 25 14:17:23 UTC 2015


> On Nov 25, 2015, at 07:14, Abi James via Blindmath <blindmath at nfbnet.org> wrote:
>
> As suggested, getting hold of the LaTeX source files is much better, as PDFs are
> generally difficult to convert back to an accessible format. If you do try
> to use a PDF-to-HTML converter, check how it deals with the equations.
> If the math is converted to MathML or LaTeX, it will be editable and probably
> accessible (depending on the browser and A.T. you are using), but many
> converters default to image output for equations.
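
One rough way to check how a converter handled the equations is to count the MathML and image elements in its HTML output. The following is only a sketch using Python's standard `html.parser` module; the class name, function name, and sample markup are made up for illustration, and an `<img>` element is not necessarily an equation, so the counts are a hint rather than a verdict.

```python
from html.parser import HTMLParser


class EquationCounter(HTMLParser):
    """Count <math> and <img> start tags in a piece of HTML."""

    def __init__(self):
        super().__init__()
        self.mathml = 0
        self.images = 0

    def handle_starttag(self, tag, attrs):
        # html.parser lowercases tag names before calling this method.
        if tag == "math":
            self.mathml += 1
        elif tag == "img":
            self.images += 1


def count_equation_markup(html_text):
    """Return (number of <math> elements, number of <img> elements)."""
    counter = EquationCounter()
    counter.feed(html_text)
    counter.close()
    return counter.mathml, counter.images


# Hypothetical converter output: one MathML equation, one image.
sample = (
    "<p>Area: <math><mi>r</mi></math></p>"
    '<p><img src="eq1.png" alt=""></p>'
)
print(count_equation_markup(sample))
```

Many images and no `<math>` elements would suggest that the converter fell back to image output for the equations.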

In my experience, PDF files generated by LaTeX don’t preserve the Unicode values of the characters used, including mathematical symbols. Instead, what you get upon converting to text reflects the position of the character glyph in the font rather than the Unicode code point.

The result is that non-ASCII characters such as mathematical symbols are unreadable upon conversion.
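
The effect can be illustrated with a small Python sketch. It assumes a few slots from the OML encoding used by the Computer Modern math italic font (alpha, beta, gamma, and pi); the table is a hand-picked subset for illustration, and the function names are hypothetical. A naive extractor treats each slot number as if it were a code point, whereas a slot-to-Unicode table (the role that a ToUnicode map plays in a PDF, when one is present) recovers the intended symbols.

```python
# Illustrative subset of the OML font encoding (assumed for this sketch):
# slot number in the font -> the character the glyph actually depicts.
OML_SLOTS = {
    0x0B: "\u03b1",  # Greek small letter alpha
    0x0C: "\u03b2",  # Greek small letter beta
    0x0D: "\u03b3",  # Greek small letter gamma
    0x19: "\u03c0",  # Greek small letter pi
}


def naive_extract(slots):
    """Treat each font slot number as if it were a Unicode code point."""
    return "".join(chr(slot) for slot in slots)


def mapped_extract(slots, table):
    """Translate slots through a slot-to-Unicode table.

    Slots missing from the table become U+FFFD (replacement character).
    """
    return "".join(table.get(slot, "\ufffd") for slot in slots)


slots = [0x0B, 0x0C, 0x19]  # the glyphs for alpha, beta, pi
print(repr(naive_extract(slots)))        # ASCII control characters
print(mapped_extract(slots, OML_SLOTS))  # the intended Greek letters
```

Without such a table, the extracted text consists of control characters that neither a reader nor a screen reader can interpret.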

LaTeX does not create tagged PDF, though I understand that ConTeXt can. Tagged PDF is much more accessible, as it preserves document structures. Adobe tools under Microsoft Windows can interpret the structure, and there was a project underway to enable Evince under Linux to do likewise, but I don’t know how far this software development effort has progressed. The Preview PDF reader under Mac OS X cannot recognize the structural tags.

The best solution is always to obtain the LaTeX source file and work from there, rather than to attempt to convert a PDF file.

I find LaTeX to be a superb format for writing and editing documents. If conversion to a greater variety of formats (HTML, EPUB 3, Microsoft Word, etc., in addition to PDF) is desired, I usually write the document in Markdown format and use Pandoc (http://www.pandoc.org/) to convert it. Pandoc supports a subset of LaTeX mathematics, but I haven’t had any experience with it as I am not writing mathematical material.
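
For anyone who wants to script this workflow, the following is a minimal Python sketch that builds and runs a Pandoc command. The file names `notes.md` and `notes.html` and both function names are hypothetical; `--standalone`, `--output`, and `--mathml` are Pandoc command-line options, and Pandoc is assumed to be installed and on the PATH. The `--mathml` option asks for TeX math to be rendered as MathML in HTML output, which has not been tested here with mathematical material.

```python
import subprocess


def build_pandoc_command(source, target, mathml=False):
    """Build the argument list for a Pandoc conversion.

    Pandoc infers the output format from the target file extension.
    """
    command = ["pandoc", source, "--standalone", "--output", target]
    if mathml:
        # Render TeX math as MathML in HTML output.
        command.append("--mathml")
    return command


def convert(source, target, mathml=False):
    """Run Pandoc; raises CalledProcessError if the conversion fails."""
    subprocess.run(build_pandoc_command(source, target, mathml), check=True)


# Show the command that would be run for a hypothetical Markdown file.
print(" ".join(build_pandoc_command("notes.md", "notes.html", mathml=True)))
```

Calling `convert("notes.md", "notes.html", mathml=True)` would then perform the conversion, provided the input file exists.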



