[Blindmath] Question about converting Math Book Content
r_akshi_tgk at yahoo.com
Wed Jan 27 11:30:08 CST 2010
If I remember correctly, the object in that particular case was to get InftyReader to convert the PDF to HR-TeX, as opposed to it placidly sitting on the computer. While there were many errors, the alternative was getting nothing at all.
Everyone has their own point beyond which they can’t deal with badly OCRed documents. In many cases, I have received what I like to call PIGO (PDF In, Garbage Out).
I think there are two conditions that need to be fulfilled if one plans to edit one’s own books and get reasonably clean results.
1. Know Thy Material: If you are following your class, or if you are generally aware of the mathematics involved in the text that needs to be edited, you have a better chance of cleaning up the mess. For example, if I know that summation is represented as \sum_i=1^n, and not as \sum_l=1^n, as InftyReader wants me to believe, I will edit the text according to my understanding. I may even use the Find and Replace function to change all instances of the wrong symbol. Similarly, if I know that the presence of \iota in the text is wrong, since it is primarily used to represent complex numbers, which do not form a part of my study material, I’m going to replace it with i.
2. Know Thy Symbols: Well, this mainly follows from the condition above. InftyReader quite often confuses similar-looking symbols. So \theta becomes 0, \beta becomes B, \omega becomes W, or vice versa.
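The Find and Replace step from condition 1 could also be scripted. What follows is only a minimal sketch in Python, not anything InftyReader itself provides: the `CORRECTIONS` table and the `clean_ocr_tex` function are hypothetical names, and the patterns are written the way this message spells them, so they would need adjusting to match the actual converted file.

```python
import re

# Hypothetical correction table. Each entry pairs a pattern the OCR tends to
# produce with the text the reader knows should be there. These two entries
# come from the examples in this message; add your own as you find them.
CORRECTIONS = [
    # Summation index read as the letter l instead of i.
    (re.compile(r"\\sum_l=1\^n"), r"\\sum_i=1^n"),
    # \iota where a plain i was printed. The lookahead keeps the rule from
    # touching any longer command name that merely starts with \iota.
    (re.compile(r"\\iota(?![A-Za-z])"), "i"),
]


def clean_ocr_tex(text):
    """Apply every correction in CORRECTIONS, in order, to the whole text."""
    for pattern, replacement in CORRECTIONS:
        text = pattern.sub(replacement, text)
    return text


if __name__ == "__main__":
    sample = r"\sum_l=1^n x_\iota"
    print(clean_ocr_tex(sample))
```

A blanket replacement like this is only safe when you know the material, as in condition 1: if a symbol can legitimately occur in your text, leave it out of the table and fix those cases by hand.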
Besides, if you have the OCRed PDF, and all you are doing is conversion, you could go back to the PDF to check the symbols. If it is an A, your screen reader will read it out; if it is an alpha, your screen reader won’t.
But all this isn’t really necessary if you aren’t in a desperate situation like mine.
"Faced with the choice between changing one's mind and proving that there is no need to do so, almost everyone gets busy on the proof."
~ John Kenneth Galbraith
--- On Wed, 1/27/10, John Gardner <john.gardner at orst.edu> wrote:
> From: John Gardner <john.gardner at orst.edu>
> Subject: Re: [Blindmath] Question about converting Math Book Content
> To: "Blind Math list for those interested in mathematics" <blindmath at nfbnet.org>
> Date: Wednesday, January 27, 2010, 10:47 AM
> I will be very surprised if this
> works. The reason that Infty Reader requires higher
> resolution than other OCR is that it cannot depend on things
> like characters all being on one line. In math,
> characters can be distributed over a wide area. There
> are also quite subtle characteristics that distinguish, for
> example, a lower case alpha from a lower case a or italic
> a. And there are a number of big characters that might
> look like a collection of alphabetic characters.
> Recognition accuracy degrades rapidly when parts of
> characters begin to touch when they are, in principle,
> separated by white space. So artificially increasing
> resolution will probably not correct those "touches".
> On 1/26/2010 4:17 PM, Pranav Lal wrote:
> > Hi all,
> > ABBYY Fine Reader can increase the resolution of scanned images. This has
> > not worked for me but it is a long time since I tried it with Infty Reader.
> > Use ABBYY Fine Reader to increase the image's resolution to 600 DPI and then
> > save it as an image. Send the saved image to Infty
> > Roopakshi, I remember us discussing this technique. Have you been able to
> > get it to work or is my organic ram playing tricks
> > Pranav
> > _______________________________________________
> > Blindmath mailing list
> > Blindmath at nfbnet.org
> > http://www.nfbnet.org/mailman/listinfo/blindmath_nfbnet.org
> > To unsubscribe, change your list options or get your
> account info for Blindmath:
> > http://www.nfbnet.org/mailman/options/blindmath_nfbnet.org/john.gardner%40orst.edu