[Blindmath] Automated production of braille math -- Step One: Digitization

Susan Jolly easjolly at ix.netcom.com
Sat Jul 16 19:07:14 UTC 2011


Several people have recently pointed out that producing braille math
textbooks can be very expensive and time-consuming. I'd like to explore
some of the reasons.

There can be two steps to automated production of braille.  Unless the
original document is "born digital," the first step is converting a paper
document to an electronic format.  Once an electronic document is
available, it is potentially possible to automatically transcribe that
document to braille. I'm going to discuss digitization in this email and
address automated transcription later. I suggest that any feedback
recognize this distinction as well.

(Note that my impression is that many sighted braille transcribers still 
find it easier to mentally transcribe and directly enter braille math rather 
than using automated tools. I'm guessing this approach accounts for part of 
the current cost.)

There is no getting around the fact that it does cost something to convert a
technical document available only in printed form to an accurate electronic
format, be it LaTeX or MathML or some other format.

However, there are many commercial organizations which are experts in this
process and there are many places where one can outsource the conversion of
paper documents to electronic format. One good company is River Valley
Technologies, located in Kerala, India, which uses tex4ht in its workflow.
(By the way, this is the same software for which Michael W. recently posted
instructions, and River Valley is probably the world expert in tex4ht.)
http://river-valley.com/
Another good company, which is here in the US, is Data Conversion
Laboratory.
http://www.dclab.com/

By the way, I've named particular commercial organizations for two reasons.
First, I've gotten the impression that the poor quality of some NIMAS files
may have led people to believe that producing accurate electronic documents
isn't feasible, and I don't think that is true. Second, the names could be a
useful starting point for research by producers of accessible materials.

An alternative to outsourcing is to do one's own scanning and to use
InftyReader for math OCR to either LaTeX or MathML. I don't have any
experience with InftyReader, so I don't have any estimates as to how long it
would take a properly trained person to prepare an acceptable electronic
document using this software.  (I realize that different content would take
different times.)

I should point out here that the efficiency of digitization has recently
been increasing dramatically because so many libraries are digitizing their 
entire collections.  There are now scanners that can automatically scan 
entire books either without damaging valuable fragile books or by first 
chopping a paper copy that has no intrinsic value.  There is also commercial 
software such as the oXygen XML editor that makes it easy to add and edit 
markup as well as compare marked-up files. (A copy of this software for 
academic or non-commercial use can be obtained for $64.)

To summarize, there is a growing amount of expertise in converting printed
technical documents to electronic format that is likely not being leveraged
in the production of braille materials. This is the issue that needs to be
addressed. I don't see much value in addressing costs that are artifacts of
less-than-optimal solutions.  For example, someone mentioned that a braille
version of a book might be an older edition.  However, if there were an
accurate electronic source document for this older edition, it might be
possible to update that electronic document more cheaply than to digitize
the entire new edition of the printed book.
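
To make the older-edition point a little more concrete, here is a minimal
sketch, in Python, of how one might estimate how much of a new edition
actually differs from an existing electronic source. It is only an
illustration under assumptions of mine: both editions are taken to be
available as lists of LaTeX source lines, the sample lines are made up, and
the function name changed_fraction is hypothetical. It uses Python's
standard difflib module.

```python
import difflib


def changed_fraction(old_lines, new_lines):
    """Return the fraction of new-edition lines with no match in the old edition."""
    if not new_lines:
        return 0.0
    # autojunk=False keeps difflib from ignoring frequently repeated lines.
    matcher = difflib.SequenceMatcher(None, old_lines, new_lines, autojunk=False)
    unchanged = sum(block.size for block in matcher.get_matching_blocks())
    return 1.0 - unchanged / len(new_lines)


# Made-up sample content standing in for two editions of the same section.
old_edition = [
    r"\section{Quadratic Equations}",
    r"Solve $x^2 - 5x + 6 = 0$.",
    r"The roots are $x = 2$ and $x = 3$.",
]
new_edition = [
    r"\section{Quadratic Equations}",
    r"Solve $x^2 - 7x + 12 = 0$.",
    r"The roots are $x = 3$ and $x = 4$.",
    r"Check each root by substitution.",
]

fraction = changed_fraction(old_edition, new_edition)
print(f"{fraction:.0%} of the new edition's lines would need attention")
```

A line-level comparison like this is crude, since a one-character change
marks the whole line as different, but it gives a rough idea of whether
updating the old source is likely to be cheaper than starting over.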

Note that there should be opportunities for cost-sharing among entities that
might want to use a given electronic format for purposes other than braille
production. If so, this has the potential to significantly reduce
digitization's share of the cost of braille production.

Susan







