[blindlaw] Batch Recognition of OCR

Tai Tomasi ttomasi at driowa.org
Tue Sep 5 14:31:10 UTC 2017


This is similar to what I am looking to do. I am hoping that, as documents come into my office in inaccessible formats, they can be scanned to a common folder from which ABBYY FineReader can create accessible copies automatically. I will check out FineReader Corporate, as suggested in a previous message.
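
For anyone who wants to script the "common folder" idea rather than rely on a built-in feature, here is a minimal Python sketch of a polling watcher. It is only an illustration under assumptions: the names new_pdfs, watch, interval, and max_polls are made up for this example, and convert is a hypothetical callable you would supply yourself to hand each file to whatever OCR tool you use. Nothing here is FineReader's or OmniPage's own interface.

```python
import time
from pathlib import Path


def new_pdfs(folder, seen):
    """Return PDFs in `folder` that are not yet in `seen`, and record them."""
    found = sorted(
        p for p in Path(folder).iterdir()
        if p.is_file() and p.suffix.lower() == ".pdf" and p not in seen
    )
    seen.update(found)
    return found


def watch(folder, convert, interval=30.0, max_polls=None):
    """Poll `folder` and call `convert(path)` once for each new PDF.

    `convert` is a placeholder supplied by the caller (an assumption of
    this sketch). `max_polls=None` means poll until interrupted.
    """
    seen = set()
    polls = 0
    while max_polls is None or polls < max_polls:
        for pdf in new_pdfs(folder, seen):
            convert(pdf)
        polls += 1
        if max_polls is None or polls < max_polls:
            time.sleep(interval)
```

One caveat with this approach: a scanner may still be writing a file when the watcher first sees it, so a real setup would probably want to wait until the file size stops changing before converting.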

Ms. Tai Tomasi, J.D.
Pronouns: she/her/hers
Staff Attorney



400 East Court Ave., Ste. 300
Des Moines, Iowa 50309
Tel: 515-278-2502; Toll Free: 1-800-779-2502
FAX: 515-278-0539; Relay 711
E-mail: ttomasi at driowa.org
www.driowa.org

Our Mission:  To defend and promote the human and legal rights of Iowans with disabilities

CONFIDENTIALITY NOTICE

This e-mail and any attachments contain information from the law firm of Disability Rights Iowa and are intended solely for the use of the named recipient(s). This e-mail may contain privileged attorney-client communications or work product. Any dissemination by anyone other than an intended recipient is prohibited. If you are not a named recipient, you are prohibited from any further viewing of the e-mail or any attachments or from making any use of the e-mail or attachments. If you have received this e-mail in error, notify the sender immediately and delete the e-mail, any attachments, and all copies from any drives or storage media and destroy any printouts.


-----Original Message-----
From: BlindLaw [mailto:blindlaw-bounces at nfbnet.org] On Behalf Of Gerard Sadlier via BlindLaw
Sent: Sunday, September 03, 2017 12:53 AM
To: Blind Law Mailing List <blindlaw at nfbnet.org>
Cc: Gerard Sadlier <gerard.sadlier at gmail.com>
Subject: Re: [blindlaw] Batch Recognition of OCR

Hi all

I'd like to set up OmniPage so that I could:
1. Paste, say, 100 PDFs into a folder;
2. Go away to allow OmniPage to process the documents; and
3. Return to find the 100 PDFs each processed into a text file, Word document, etc.

If this is possible could someone tell me how?
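
In case it helps to see the shape of such a job, the three steps above can be sketched as a small Python loop. This is a sketch under assumptions, not OmniPage's own mechanism: batch_convert is a made-up name, and convert is a hypothetical callable standing in for whatever actually performs the OCR. Each output keeps the name of its source PDF with a new extension, and files that already have an output are skipped so the job can be rerun safely.

```python
from pathlib import Path


def batch_convert(folder, convert, extension=".txt"):
    """Convert every PDF in `folder`, skipping ones already done.

    `convert(source, target)` is a caller-supplied placeholder
    (an assumption of this sketch). Returns the list of files created.
    """
    created = []
    pdfs = sorted(
        p for p in Path(folder).iterdir()
        if p.is_file() and p.suffix.lower() == ".pdf"
    )
    for pdf in pdfs:
        # report.pdf -> report.txt (or .docx, etc.)
        target = pdf.with_suffix(extension)
        if target.exists():
            continue  # already processed on an earlier run
        convert(pdf, target)
        created.append(target)
    return created
```

Passing extension=".docx" would give each output the same file name as the original PDF with a .docx extension, which avoids renaming files by hand.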

Thanks

Ger

On 9/3/17, Aser Tolentino via BlindLaw <blindlaw at nfbnet.org> wrote:
> Hello,
>
> Unfortunately, it looks like OmniPage is no longer packaged with K1000 
> after 14.09. Batch scanning is explained in the manual on Page 63. 
> I've pasted the relevant sections below.
>
> Batch Scanning.
>
> Batch scanning lets you scan a number of pages at once without 
> recognizing or reading them. Instead, you can store them as image 
> files, which are like snapshots of the original document, then perform 
> the recognition process later.
>
> Batch scanning saves time during the actual scanning process, as the 
> system does not recognize each page as it is scanned. Since the 
> recognition process is completely automated, Kurzweil 1000 can perform 
> this step while the system is unattended.
>
> To perform a batch scan using menus:
>
> 1. Open the Settings menu and choose Scanning. In the tab page that 
> appears, press TAB to go to the Mode option. Use the arrow keys to 
> choose Image Scanning Only, then press ENTER.
> 2. Place your document on the scanner. Open the Scan menu and choose 
> Start New Scan, or press the F9 key. Instead of scanning to documents 
> that you can read, the system scans to image files.
>
> 3. When you have finished scanning to image files, you can perform any 
> other tasks that you wish. You can change settings, read documents, 
> and even leave the Kurzweil 1000 altogether. However, you cannot read 
> the image files until the system has recognized them, as described in 
> the next step.
>
> 4. Open the Settings menu and choose Scanning. On the Scanning tab 
> page, press TAB to go to the Mode option. Use the arrow keys to choose 
> Recognize Image Files, then press ENTER.
>
> 5. Open the Scan menu and choose Start New Scan, or press the F9 key. 
> The system starts recognizing the image files in the Images folder, 
> one after another. As each image file is recognized, it is deleted. 
> Choose Start New Scan or press F9 again at any time to stop recognition.
>
> If you stop recognizing at any point, you should save the current 
> file. You can later reopen the saved file, return to Recognize Image 
> File mode, and begin recognizing again from that point.
>
> KOCRUtil for Automatic File Recognition.
>
> If you have a multi-core processor on your machine, you can use 
> KOCRUtil to recognize files and files in folders automatically and silently.
>
> If the OCR engine can keep the recognized data for the entire document 
> and then convert it, it will attempt to unify its formatting decisions 
> so that the final document is more consistent.
>
> Before using KOCRUtil, however, consider the tradeoffs. Corrections, 
> for example, will not be applied to each page. You won't be able to 
> edit or read the document as it is recognized. Bookmarks will not be 
> captured for PDF files. And the resulting document won't be in KES 
> format (though most of the choices will produce output that can be 
> converted to KES by opening them within Kurzweil 1000).
>
> To run KOCRUtil:
>
> 1. Select a file, a list of files, or a folder in Windows Explorer.
>
> 2. Bring up the context menu, and select the appropriate menu item. 
> For a folder, that menu item would be "Recognize Images with 
> Kurzweil." For image files (either TIFF, PDF, JPEG, or PNG), you can 
> pick either "Recognize Images with Kurzweil Automatically", or 
> "Recognize Images with Kurzweil Interactively".
>
> When you recognize images automatically, KOCRUtil.exe will run without 
> bringing up a window. It will use current default settings to 
> recognize the selected image files, or to recognize all of the image 
> files found within the selected folder, and then exit. When it exits, 
> you will hear a wave file "KOCRUtil.wav," if that file exists.
>
> If you selected a folder and then activated the "Recognize Images with 
> Kurzweil" menu item, KOCRUtil would look for all image files within 
> that folder (but not, note, within subfolders). These files would be 
> organized into one or more groups. Files are in the same group if their 
> file names are identical except for digits. So, for example, 
> "Image001.tif", "Image2.tif", and "Image43.tif" are all in the same 
> group, but "Imagea3" is not. Groups of image files are sorted by their 
> name, recognized together, and output into one resulting document.
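
To make the grouping rule concrete, here is a short Python sketch of how file names that differ only in their digits could be grouped. It illustrates the rule as the manual states it and is not KOCRUtil's actual code. The names group_key and group_images are made up for this example, and two details are assumptions: the comparison is case-sensitive, and "sorted by their name" is taken to mean an ordinary alphabetical sort.

```python
import re
from collections import defaultdict


def group_key(filename):
    """Return the file name with every digit removed."""
    return re.sub(r"\d", "", filename)


def group_images(filenames):
    """Group file names that are identical once digits are ignored.

    Each group is sorted alphabetically (an assumption about what the
    manual means by "sorted by their name").
    """
    groups = defaultdict(list)
    for name in filenames:
        groups[group_key(name)].append(name)
    return {key: sorted(names) for key, names in groups.items()}


groups = group_images(
    ["Image43.tif", "Image001.tif", "Imagea3.tif", "Image2.tif"]
)
# "Image001.tif", "Image2.tif" and "Image43.tif" share the key "Image.tif";
# "Imagea3.tif" has the key "Imagea.tif" and so forms its own group.
```

Note that an alphabetical sort would put "Image10.tif" before "Image2.tif", so zero-padded numbers in file names are the safer choice if page order matters.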
>
> Output file names are based on the name of the first image file in a 
> group of image files, along with an extension that is appropriate for 
> the output format. Depending upon settings, the output files can be in 
> the same folder as the image files, or can be sent to a specified folder.
>
> The default is for KOCRUtil to use FineReader Engine with English as 
> the only recognition language, creating an RTF file that will be 
> placed in the same folder as the image file or files.
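
The output naming described above can be sketched the same way. This is a sketch only, with a made-up function name (output_path); it assumes the "first" file in a group is the first one in alphabetical order, and it uses the defaults quoted from the manual (RTF output, same folder as the images unless a destination folder is given).

```python
from pathlib import PureWindowsPath


def output_path(group, extension=".rtf", destination=None):
    """Build the output file path for one group of image files.

    The name comes from the first image in the group (alphabetical order
    is an assumption); the folder is `destination` if given, otherwise
    the folder that holds the images.
    """
    first = PureWindowsPath(sorted(group)[0])
    folder = PureWindowsPath(destination) if destination else first.parent
    return folder / (first.stem + extension)


scans = [r"C:\Scans\Image2.tif", r"C:\Scans\Image001.tif"]
same_folder = output_path(scans)                      # C:\Scans\Image001.rtf
other_folder = output_path(scans, destination=r"D:\Out")  # D:\Out\Image001.rtf
```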
>
> To exit KOCRUtil:
>
> Press Escape or TAB to the Exit button and press Enter.
>
> To change KOCRUtil settings:
>
> Either run KOCRUtil.exe without command line arguments, or use the 
> "Recognize Images with Kurzweil Interactively" context menu. This will 
> bring up KOCRUtil, which has a single dialog.
>
> The dialog controls are described below in tab order. Where 
> applicable, the mnemonic follows.
>
> Image Files group has a text box, ALT+I, where you can specify one or 
> more image files, separated by semicolons. There is also a Browse 
> button which brings up a file open dialog so you can select the 
> desired image files from your system.
>
> Output File group has a text box, ALT+O and a Browse button. In the 
> text box you specify the output file. Note that it can be blank, in 
> which case the output file name is constructed using the first image 
> file name. If no path is specified, the source folder will be used, or 
> the default destination folder will be used, depending on that setting 
> (see below). You can also click the Browse button to bring up a file 
> save dialog in which you can specify the output file.
>
> Format list box, ALT+O, lets you choose the format of the output file. 
> The list of possible formats changes depending on the recognition engine used.
>
> Note: As of October 2016, with a full install of K1000 V14.09 and 
> above the FineReader will be used.
>
>
> Details button, Alt+D brings up the Format Details dialog in which you 
> can change format settings. The dialog contains: a Layout list (Alt+L) 
> where you can opt to Retain Layout, Formatted Text, or Plain Text. 
> Next is the Paper Size list (Alt+P); choose Automatic, A3, A4, A5, 
> Letter, or Legal. The third list is labeled Pictures (Alt+C); choose 
> to Remove Pictures, Low Resolution (for Web), Medium Resolution (for 
> screen), High Resolution (for printing).
> Four check boxes follow the lists. You can opt to Keep Page Breaks 
> (Alt+G), Keep Line Breaks (Alt+N), Keep Text Color (Alt+T), and Keep 
> Headers and Footers (Alt+H). By default, Kurzweil 1000 keeps Formatted 
> Text for the layout, uses Automatic paper size selection, Removes 
> Pictures, Keeps Page Breaks, Text Color, and Headers and Footers, but does not Keep Line Breaks.
> These Format Details settings are retained for future sessions until 
> you change them again.
>
> Recognition Engine list box, ALT+R. Choose the recognition engine, 
> FineReader Engine or OmniPage Engine.
> Note: As of October 2016, the OmniPage Engine will no longer be 
> available with a full install of Kurzweil 1000 V14.09 and above.
>
> Recognition Languages list view, ALT+L. Check one or more of the 
> recognition languages. The list changes depending on the recognition 
> engine.
> Note: As of October 2016, with a full install of K1000 V14.09 and 
> above, the FineReader language list will be used.
>
>
> Start Recognition button, ALT+S. Use it to start recognition if 
> everything else is set up properly.
>
> The next three controls are in a group box labeled Default Destination.
>
> Use Source Folder check box, ALT+U. If set, the folder of the image 
> file will be used to specify the default destination folder (i.e., the 
> folder used if none is specified explicitly along with the output file name).
>
> Unlabeled text box. This is disabled if Use Source Folder is checked.
> Otherwise, it allows you to specify a default destination folder.
> Browse button which will bring up a dialog that allows you to select a 
> default destination folder.
>
> Save Defaults button, ALT+V. Use it to save your current settings as 
> default settings. Once you have done this, these are the settings that 
> will be used when you choose to recognize a file or folder 
> automatically.
> Status, ALT+S is a read-only text box that tells you when recognition 
> of a page is completed and will include recognition hints if you are 
> using FineReader.
>
> -----Original Message-----
> From: BlindLaw [mailto:blindlaw-bounces at nfbnet.org] On Behalf Of Steve 
> Jacobson via BlindLaw
> Sent: Saturday, September 2, 2017 8:10 PM
> To: 'Blind Law Mailing List' <blindlaw at nfbnet.org>
> Cc: Steve Jacobson <steve.jacobson at visi.com>
> Subject: Re: [blindlaw] Batch Recognition of OCR
>
> Actually, Kurzweil 1000 gives one the choice of using the FineReader 
> engine or the OmniPage engine.  The version I have lets one choose 
> between FineReader 11.0 and OmniPage 19.0.
>
> Best regards,
>
> Steve Jacobson
>
> -----Original Message-----
> From: BlindLaw [mailto:blindlaw-bounces at nfbnet.org] On Behalf Of Aaron 
> Cannon via BlindLaw
> Sent: Saturday, September 02, 2017 3:56 PM
> To: Andrew Webb <awebb2168 at gmail.com>
> Cc: Aaron Cannon <cannona at fireantproductions.com>; Blind Law Mailing 
> List <blindlaw at nfbnet.org>
> Subject: Re: [blindlaw] Batch Recognition of OCR
>
> It looks like K1000 uses the FineReader Engine under the covers, so it 
> should still be pretty good.
>
> Aaron
>
> --
> This message was sent from a mobile device
>
>
>> On Sep 2, 2017, at 15:50, Andrew Webb <awebb2168 at gmail.com> wrote:
>>
>> How does Kurzweil 1000 stack up against these other programs? Is it 
>> considered obsolete by this point?
>>
>> -----Original Message-----
>> From: BlindLaw [mailto:blindlaw-bounces at nfbnet.org] On Behalf Of 
>> Aaron Cannon via BlindLaw
>> Sent: Friday, September 01, 2017 4:50 PM
>> To: Blind Law Mailing List
>> Cc: Aaron Cannon
>> Subject: Re: [blindlaw] Batch Recognition of OCR
>>
>> I believe ABBYY FineReader Corporate (not Standard) version has this 
>> capability. FineReader also tends to win in accuracy against OmniPage 
>> in head-to-head tests.
>>
>> Aaron
>>
>> --
>> This message was sent from a mobile device
>>
>>
>>> On Sep 1, 2017, at 16:28, Singh, Nandini via BlindLaw <blindlaw at nfbnet.org> wrote:
>>>
>>> I am not sure what program you have now, but I use OmniPage by
>>> Nuance, and I can convert 10-30 documents from PDF to Word or text,
>>> depending on the size, all in one go. I have tried to do more
>>> documents, but that really slows things down. While the conversion is
>>> running in the background, I can still check email, review other
>>> documents, etc.
>>>
>>> -----Original Message-----
>>> From: BlindLaw [mailto:blindlaw-bounces at nfbnet.org] On Behalf Of Tai
>> Tomasi via BlindLaw
>>> Sent: Friday, September 1, 2017 5:21 PM
>>> To: Blind Law Mailing List
>>> Cc: Tai Tomasi
>>> Subject: [blindlaw] Batch Recognition of OCR
>>>
>>> Hello all. I am looking for a program that will monitor a given
>>> folder for new PDF files and convert inaccessible PDF files to
>>> accessible PDF (PDF/A) or Microsoft Word files. Does anyone know of a
>>> program that can do this type of automated batch OCR conversion? Right
>>> now, I have to initiate the OCR process with a command for each
>>> document and rename the new document to the same filename as the
>>> original PDF with a .docx extension. This is not an efficient use of
>>> my time. Thanks.
>>>
>>>
>>> Ms. Tai Tomasi, J.D.
>>> Pronouns: she/her/hers
>>> Staff Attorney
>>>
>>>
>>> 400 East Court Ave., Ste. 300
>>> Des Moines, Iowa 50309
>>> Tel: 515-278-2502; Toll Free: 1-800-779-2502
>>> FAX: 515-278-0539; Relay 711
>>> E-mail: ttomasi at driowa.org
>>> www.driowa.org
>>>
>>> Our Mission:  To defend and promote the human and legal rights of
>>> Iowans with disabilities
>>>
>>>
>>>
>>> _______________________________________________
>>> BlindLaw mailing list
>>> BlindLaw at nfbnet.org
>>> http://nfbnet.org/mailman/listinfo/blindlaw_nfbnet.org
>>> To unsubscribe, change your list options or get your account info for BlindLaw:
>>> http://nfbnet.org/mailman/options/blindlaw_nfbnet.org/cannona%40fireantproductions.com
>>
>

_______________________________________________
BlindLaw mailing list
BlindLaw at nfbnet.org
http://nfbnet.org/mailman/listinfo/blindlaw_nfbnet.org
To unsubscribe, change your list options or get your account info for BlindLaw:
http://nfbnet.org/mailman/options/blindlaw_nfbnet.org/ttomasi%40driowa.org



