[blindLaw] ABBYY FineReader Stability with Jaws?

Gerard Sadlier gerard.sadlier at gmail.com
Mon Aug 5 22:46:22 UTC 2019


I have found paragraph numbers do work reliably in documents that have
been scanned in.

As noted, there are good reasons that people send their documents as
scanned pdfs; I do so myself. That is by design; it is not that I am
creating the document incorrectly.

G

On 8/5/19, Laura Wolk <laura.wolk at gmail.com> wrote:
> Rahul has hit it on the head, as usual, especially the part about
> decoupling the issues with image-based versus text-based files.  In a
> searchable pdf, you can copy and paste directly as well, with the
> added benefit that formatting comes with it, so you don't have to go
> back and reitalicize things, etc.  And we need to preserve page
> numbers, as that is how things are cited in the U.S.; in my
> understanding, they are not preserved in text documents.
>
> On 8/5/19, Rahul Bajaj <rahul.bajaj1038 at gmail.com> wrote:
>> Relying on para numbers is also often not a viable solution. When
>> documents are converted from pdf to Word, para numbers often get lost
>> in translation, such that they do not appear in the Word version or
>> appear irregularly. This also happens when you run them through JAWS’
>> native OCR functionality in PDF documents. I often had to sit with a
>> sighted colleague and identify what para number corresponded to what
>> piece of information and then manually insert this in the converted
>> document, to be able to cite them correctly in my own written
>> submissions.
>>
>> It would be instructive, imo, to decouple the issues with image-based
>> pdfs and those with searchable pdfs. The former are caused by the pdf
>> maker not saving the document the correct way, as I understand it. The
>> latter flow from a failure on the part of Adobe and Vispero to ensure
>> that their apps work well with each other.
>>
>> When one has a meeting with a partner or a client in, say, 3 hours and
>> a 250-300 page file to negotiate, converting it into Word documents is
>> not a viable solution. I mostly use Open with > Word 2016 these days
>> for most pdfs. But it sometimes does not work, inasmuch as the end
>> product is inaccessible. When this happens, if the doc is under 100
>> pages or so, I email it to myself and view it as an html document.
>>
>>
>>
>>
>> Best,
>> Rahul
>>
>> Sent from my iPhone
>>
>>> On Aug 6, 2019, at 3:17 AM, Gerard Sadlier via BlindLaw
>>> <blindlaw at nfbnet.org> wrote:
>>>
>>> Laura
>>>
>>> Briefly, as noted in my previous emails, I convert to text files, not
>>> to pdfs. I do this as follows:
>>> 1. I get the pdfs in a folder.
>>> 2. I select them, go to OmniPage on the file menu and click convert
>>> to text.
>>> 3. My text files appear.
>>>
>>> It is a very simple process. Usually this takes a matter of a minute
>>> or 2. If you had 20 substantial files, it might take 5 minutes, maybe
>>> 10 minutes for 20 very large files.
>>>
>>> I've been doing this for years and I have found it effective.
>>>
>>> Converting to text files has the added benefit that you can copy and
>>> paste from the other side's documents.
>>>
>>> In this jurisdiction, legal documents such as submissions (what you
>>> would call briefs), affidavits etc. are generally numbered by
>>> paragraph and that helps a lot. However, I am highlighting that on
>>> occasion you won't get the page numbers and that is a problem when it
>>> arises.
>>>
>>> I'm sorry if that's not helpful.
>>>
>>>
>>>> On 8/5/19, Laura Wolk <laura.wolk at gmail.com> wrote:
>>>> Why shouldn't we insist on Adobe creating a better OCR solution?
>>>> Also, the issues being raised have to do with Jaws losing focus,
>>>> inexplicably jumping around the document, and not navigating properly
>>>> as you scan by paragraph or line.  Unless I'm mistaken, that has
>>>> nothing to do with OCR capability.  And if it does, then it only
>>>> highlights that the
>>>> solutions provided by Jaws and Adobe are inadequate for the task,
>>>> since people are using these functions and still getting rather paltry
>>>> results.  It also doesn't make sense, given that people have reported
>>>> a marked downturn in Adobe's performance as of late with Jaws.  So
>>>> perhaps the issue is we're simply discussing two different things.
>>>>
>>>> As for my previous email, thank you for bringing your interpretation
>>>> of it to my attention.  However, I've reread it, and I don't think
>>>> there is anything disrespectful about its tone, except perhaps that I
>>>> didn't give exact concrete data about how many files I'm dealing with
>>>> and how frequently assignments are given and expected back the same
>>>> day.  The suggested overnight solution is just not viable.  And I
>>>> don't think it would be viable in many competitive high-paced working
>>>> environments.   (Again, I am in no way commenting on your own personal
>>>> working environment.  I'm making a general statement based on the
>>>> general proposition that, oftentimes in high-paced environments,
>>>> associates are given hours not a full business day or overnight to
>>>> complete a task, and these minutes add up).  I hope that clarifies my
>>>> point.
>>>>
>>>> Angie, I do get your point, and thank you.  My point is, if you take
>>>> those minutes, let's just say even five minutes per document, or even
>>>> 3, and then say you have to review twenty files.  Those minutes add up
>>>> quite quickly.  Just like the track change issue.  Sure, if you've got
>>>> ten to review, no big deal.  When you're talking about 600 changes,
>>>> it matters a whole lot more.  So I completely hear you, but just like
>>>> at some
>>>> point the "a little extra time" approach to track changes veers into
>>>> "I can't actually or I am severely struggling to complete my duties"
>>>> territory, so too with converting pdfs.  And again, my main point
>>>> doesn't even have to do with converting.  It has to do with the fact
>>>> that I have a pdf that Adobe is reading just fine.  It's not
>>>> presenting as blank or empty.  But in the course of reading, Jaws
>>>> becomes frenetic and starts jumping all over the document.  This
>>>> happens with multiple documents, and others have stated they
>>>> experience the same.  It seems like there must be a solution to this
>>>> that does not involve saving to another doc type (since Adobe seems
>>>> able to read it), but only involves stabilizing Adobe and Jaws.
>>>>
>>>> For that reason, I'm still interested in your and Ger's experience
>>>> using Nuance to batch convert.  Did you or do you have the issue of
>>>> Jaws jumping around when you do not use Nuance to first convert the
>>>> pdf to another pdf?  Does Adobe appear more stable when you're using
>>>> a Nuance-converted pdf?  Ger, it sounds like you are not having the
>>>> issues that I am describing.  So if you are using Nuance to OCR and
>>>> tag documents and those documents are then stable in Adobe, that
>>>> would be a wonderful and welcome development!
>>>>
>>>> Thanks,
>>>> Laura
>>>>
>>>>
>>>>
>>>>> On 8/5/19, Gerard Sadlier via BlindLaw <blindlaw at nfbnet.org> wrote:
>>>>> Angie’s comments are well made. I would only add that if the document
>>>>> contains paragraph numbers, those are a good and reliable reference.
>>>>>
>>>>> On Mon 5 Aug 2019 at 20:52 Angela Matney via BlindLaw
>>>>> <blindlaw at nfbnet.org>
>>>>> wrote:
>>>>>
>>>>>> Laura, one quick observation: Without addressing the need for
>>>>>> precision when, say, referring to line numbers or making other
>>>>>> references to the PDFs that are required for your job, I just want
>>>>>> to add that converting files to another format may not take the
>>>>>> amount of time one might expect, particularly if you use a
>>>>>> mainstream solution. I don’t know how long it takes Kurzweil to
>>>>>> convert a PDF because I don’t use that product. In the past, I’ve
>>>>>> used ABBYY FineReader; nowadays, I use Nuance Power PDF. Each of
>>>>>> these can OCR a file of several hundred pages in a matter of
>>>>>> minutes and give good results, barring handwriting or other unusual
>>>>>> attributes of the file. I’m in no way saying that we don’t need a
>>>>>> better way to access PDF files directly; I’m just suggesting that
>>>>>> converting a PDF to Word or another format of choice isn’t
>>>>>> necessarily a process that takes hours and hours.
>>>>>>
>>>>>> Having said this, it’s certainly true that if you have many, many
>>>>>> files, there will be a good amount of time involved. This has
>>>>>> happened to me before (involving due diligence for a merger and
>>>>>> documents from the other side’s data room). But I’m not sure that
>>>>>> the method I used ultimately took more time than I would have spent
>>>>>> performing OCR on the PDFs, which would have had to be done in any
>>>>>> case because they were images.
>>>>>>
>>>>>> I can’t comment on using any of these applications to read PDFs
>>>>>> directly. In my mind, converting files to another format is
>>>>>> analogous to printing vs reading the PDF on-screen, and while it’s
>>>>>> not perfect, it generally works for my purposes.
>>>>>>
>>>>>> I hope you’re able to find a solution quickly.
>>>>>>
>>>>>> Best,
>>>>>>
>>>>>> Angie
>>>>>>
>>>>>> Angela Matney, CIPP/US
>>>>>> Attorney at Law
>>>>>> Loeb & Loeb LLP
>>>>>> 901 New York Avenue NW, Suite 300 East | Washington, DC 20001
>>>>>> Direct Dial: 202.618.5038 | Fax: 202.403.3407 |
>>>>>> E-mail: amatney at loeb.com
>>>>>> Los Angeles | New York | Chicago | Nashville | Washington, DC |
>>>>>> San Francisco | Beijing | Hong Kong | www.loeb.com
>>>>>>
>>>>>> _______________________________________________
>>>>>> BlindLaw mailing list
>>>>>> BlindLaw at nfbnet.org
>>>>>> http://nfbnet.org/mailman/listinfo/blindlaw_nfbnet.org
>>>>>> To unsubscribe, change your list options or get your account info for
>>>>>> BlindLaw:
>>>>>>
>>>>>> http://nfbnet.org/mailman/options/blindlaw_nfbnet.org/gerard.sadlier%40gmail.com
>>>>>>
>>>>>
>>>>
>>>
>>
>
