[blindlaw] Batch Document Recognition in Kurzweil 1000

Tai Tomasi ttomasi at driowa.org
Wed May 9 15:03:43 UTC 2018


Laura and all: 

Below, I have pasted the instructions for batch scanning using the KOCRUtil utility in Kurzweil 1000 version 14. These instructions should be similar for other recent versions of Kurzweil 1000. I am also attaching PDF and Word versions of the Kurzweil 1000 user's guide.

KOCRUtil for Automatic File Recognition.
If you have a multi-core processor on your machine, you can use KOCRUtil to recognize files and files in folders automatically and silently.
If the OCR engine can keep the recognized data for the entire document and then convert it, it will attempt to unify its formatting decisions so that the final document is more consistent.
Before using KOCRUtil, however, consider the tradeoffs. Corrections, for example, will not be applied to each page. You won't be able to edit or read the document as it is recognized. Bookmarks will not be captured for PDF files.
And the resulting document won't be in KES format (though most of the choices will produce output that can be converted to KES by opening them within Kurzweil 1000).
To run KOCRUtil:
1.	Select a file, a list of files, or a folder in Windows Explorer.
2.	Bring up the context menu, and select the appropriate menu item. For a folder, that menu item would be "Recognize Images with Kurzweil." For image files (TIFF, PDF, JPEG, or PNG), you can pick either "Recognize Images with Kurzweil Automatically" or "Recognize Images with Kurzweil Interactively".
When you recognize images automatically, KOCRUtil.exe will run without bringing up a window. It will use current default settings to recognize the selected image files, or to recognize all of the image files found within the selected folder, and then exit. When it exits, you will hear the wave file "KOCRUtil.wav" if that file exists.
If you select a folder and then activate the "Recognize Images with Kurzweil" menu item, KOCRUtil looks for all image files within that folder (but not, note, within subfolders). These files are organized into one or more groups. Files are in the same group if their file names are identical except for digits. So, for example, "Image001.tif", "Image2.tif", and "Image43.tif" are all in the same group, but "Imagea3" is not. Each group of image files is sorted by name, recognized together, and output into one resulting document.
Output file names are based on the name of the first image file in a group of image files, along with an extension that is appropriate for the output format. Depending upon settings, the output files can be in the same folder as the image files, or can be sent to a specified folder.
The default is for KOCRUtil to use FineReader Engine with English as the only recognition language, creating an RTF file that will be placed in the same folder as the image file or files.
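The grouping and naming rules above can be illustrated with a short Python sketch. This is not KOCRUtil's actual code; it only mirrors the rules as described. The function names, the exact list of file extensions, the case-insensitive comparison, and the plain alphabetical sort are all assumptions made for illustration.

```python
import re
from collections import defaultdict
from pathlib import PureWindowsPath

# Assumed extensions for the image types named in the text (TIFF, PDF, JPEG, PNG).
IMAGE_EXTENSIONS = {".tif", ".tiff", ".pdf", ".jpg", ".jpeg", ".png"}


def group_key(file_name):
    """Return the file name with every digit removed (hypothetical helper).

    Two files land in the same group when their names are identical
    except for digits. Lower-casing is an assumption, not stated in the text.
    """
    return re.sub(r"\d", "", file_name.lower())


def group_images(file_names):
    """Split the image files of one folder into groups, each sorted by name."""
    groups = defaultdict(list)
    for name in file_names:
        if PureWindowsPath(name).suffix.lower() in IMAGE_EXTENSIONS:
            groups[group_key(name)].append(name)
    # Plain alphabetical ordering is assumed; the text only says "sorted by name".
    return [sorted(members, key=str.lower) for _, members in sorted(groups.items())]


def output_name(group, extension=".rtf"):
    """Build the output name from the first image in the group.

    RTF is the default output format according to the text.
    """
    return PureWindowsPath(group[0]).with_suffix(extension).name


if __name__ == "__main__":
    folder = ["Image43.tif", "Imagea3.tif", "Image001.tif", "Image2.tif", "notes.txt"]
    for group in group_images(folder):
        print(output_name(group), "<-", group)
```

With the sample folder in the sketch, the three "Image" files form one group and "Imagea3.tif" forms its own group, so two output documents would result.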
To exit KOCRUtil:
Press Escape, or TAB to the Exit button and press Enter.
To change KOCRUtil settings:
Either run KOCRUtil.exe without command line arguments, or use the "Recognize Images with Kurzweil Interactively" context menu. This will bring up KOCRUtil, which has a single dialog.
The dialog controls are described below in tab order. Where applicable, the mnemonic follows.
Image Files group has a text box, ALT+I, where you can specify one or more image files, separated by semicolons. There is also a Browse button which brings up a file open dialog so you can select the desired image files from your system.
Output File group has a text box, ALT+O and a Browse button. In the text box you specify the output file. Note that it can be blank, in which case the output file name is constructed using the first image file name. If no path is specified, the source folder will be used, or the default destination folder will be used, depending on that setting (see below). You can also click the Browse button to bring up a file save dialog in which you can specify the output file.
Format list box, ALT+O, lets you choose the format of the output file. The list of possible formats changes depending on the recognition engine used.
Note: As of October 2016, with a full install of K1000 V14.09 and above, FineReader will be used.
Details button, Alt+D, brings up the Format Details dialog in which you can change format settings. The dialog contains a Layout list (Alt+L) where you can opt to Retain Layout, Formatted Text, or Plain Text. Next is the Paper Size list (Alt+P); choose Automatic, A3, A4, A5, Letter, or Legal. The third list is labeled Pictures (Alt+C); choose to Remove Pictures, Low Resolution (for Web), Medium Resolution (for screen), or High Resolution (for printing). Four check boxes follow the lists. You can opt to Keep Page Breaks (Alt+G), Keep Line Breaks (Alt+N), Keep Text Color (Alt+T), and Keep Headers and Footers (Alt+H). By default, Kurzweil 1000 keeps Formatted Text for the layout, uses Automatic paper size selection, Removes Pictures, Keeps Page Breaks, Text Color, and Headers and Footers, but does not Keep Line Breaks. These Format Details settings are retained for future sessions until you change them again.
Recognition Engine list box, ALT+R. Choose the recognition engine, FineReader Engine or OmniPage Engine.
Note: As of October 2016, the OmniPage Engine will no longer be available with a full install of Kurzweil 1000 V14.09 and above.
Recognition Languages list view, ALT+L. Check one or more of the recognition languages. The list changes depending on the recognition engine.
Note: As of October 2016, with a full install of K1000 V14.09 and above, the FineReader language list will be used.
Start Recognition button, ALT+S. Use it to start recognition if everything else is set up properly.
The next three controls are in a group box labeled Default Destination.
Use Source Folder check box, ALT+U. If set, the folder of the image file will be used to specify the default destination folder (i.e., the folder used if none is specified explicitly along with the output file name).
Unlabeled text box. This is disabled if Use Source Folder is checked. Otherwise, it allows you to specify a default destination folder.
Browse button which will bring up a dialog that allows you to select a default destination folder.
Save Defaults button, ALT+V. Use it to save your current settings as default settings. Once you have done this, these are the settings that will be used when you choose to recognize a file or folder automatically.
Status, ALT+S is a read-only text box that tells you when recognition of a page is completed and will include recognition hints if you are using FineReader.
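
How the Output File box and the Default Destination controls described above combine can be sketched in Python. This is an illustration of the rules as written, not KOCRUtil's real logic. The function name and parameters are hypothetical, and the fallback to the source folder when no default folder has been entered is an assumption.

```python
from pathlib import PureWindowsPath


def resolve_output(output_field, first_image, use_source_folder=True,
                   default_folder="", extension=".rtf"):
    """Work out where the recognized document would be written (hypothetical helper).

    output_field      -- contents of the Output File text box (may be blank)
    first_image       -- full path of the first image file
    use_source_folder -- state of the Use Source Folder check box
    default_folder    -- contents of the default destination text box
    """
    image = PureWindowsPath(first_image)

    if output_field.strip():
        target = PureWindowsPath(output_field.strip())
    else:
        # Blank box: build the name from the first image file name.
        target = PureWindowsPath(image.with_suffix(extension).name)

    # A path given explicitly with the output file name always wins.
    if target.parent != PureWindowsPath("."):
        return target

    # Otherwise use the source folder or the default destination folder.
    # Falling back to the source folder when no default is set is an assumption.
    if use_source_folder or not default_folder:
        folder = image.parent
    else:
        folder = PureWindowsPath(default_folder)
    return folder / target.name


if __name__ == "__main__":
    print(resolve_output("", r"C:\scans\Image001.tif"))
    print(resolve_output("report.rtf", r"C:\scans\Image001.tif",
                         use_source_folder=False, default_folder=r"D:\ocr"))
```

The folder names in the usage lines are made-up examples.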

Ms. Tai Tomasi, J.D.
Pronouns: she/her/hers
Staff Attorney



400 East Court Ave., Ste. 300
Des Moines, Iowa 50309
Tel: 515-278-2502; Toll Free: 1-800-779-2502
FAX: 515-278-0539; Relay 711
E-mail: ttomasi at driowa.org
www.driowa.org

Our Mission:  To defend and promote the human and legal rights of Iowans with disabilities

CONFIDENTIALITY NOTICE

This e-mail and any attachments contain information from the law firm of Disability Rights Iowa and are intended solely for the use of the named recipient(s). This e-mail may contain privileged attorney-client communications or work product. Any dissemination by anyone other than an intended recipient is prohibited. If you are not a named recipient, you are prohibited from any further viewing of the e-mail or any attachments or from making any use of the e-mail or attachments. If you have received this e-mail in error, notify the sender immediately and delete the e-mail, any attachments, and all copies from any drives or storage media and destroy any printouts.



-----Original Message-----
From: BlindLaw <blindlaw-bounces at nfbnet.org> On Behalf Of Laura Wolk via BlindLaw
Sent: Wednesday, May 09, 2018 9:37 AM
To: Blind Law Mailing List <blindlaw at nfbnet.org>
Cc: Laura Wolk <laura.wolk at gmail.com>
Subject: Re: [blindlaw] Batch OCR

I must have missed those instructions, but I would absolutely love to have them again.

On 5/9/18, Sybren Hoekstra via BlindLaw <blindlaw at nfbnet.org> wrote:
> It can, and someone on this listserv has given instructions on how to 
> do it before. But I don’t know how
>
> Sent from my iPhone
>
>> On May 9, 2018, at 09:14, Andrew Webb via BlindLaw 
>> <blindlaw at nfbnet.org>
>> wrote:
>>
>> Can this also be done using Kurzweil 1000?
>>
>> -----Original Message-----
>> From: BlindLaw [mailto:blindlaw-bounces at nfbnet.org] On Behalf Of Tai 
>> Tomasi via BlindLaw
>> Sent: Wednesday, May 9, 2018 7:01 AM
>> To: Blind Law Mailing List
>> Cc: Tai Tomasi
>> Subject: Re: [blindlaw] Batch OCR
>>
>> ABBYY FineReader pro 14 can do this. I have never used the feature so 
>> can’t speak to its accuracy.
>>
>> Tai Tomasi, J.D., M.P.A.
>> Email: ttomasi at driowa.org
>> Sent from my iPhone. Please excuse my brevity and any grammatical errors.
>>
>> On May 9, 2018, at 6:57 AM, Rahul Bajaj via BlindLaw 
>> <blindlaw at nfbnet.org> wrote:
>>
>> I am assuming you have tried ABBYY Fine Reader and it does not have 
>> the folder conversion functionality?
>>
>> On 09/05/2018, Angie Matney via BlindLaw 
>> <blindlaw at nfbnet.org> wrote:
>> Hello. I'd be interested to know if anyone can recommend a good 
>> program that will let me convert a lot of pdf files to Word or text 
>> with minimal interaction. Ideally, I'd be able to point the software 
>> to a particular folder and have it go to town with the processing.
>>
>> Thanks for any recommendations you can provide.
>>
>> Angie
>>
>>
>> Sent from my iPhone
>> _______________________________________________
>> BlindLaw mailing list
>> BlindLaw at nfbnet.org
>> http://nfbnet.org/mailman/listinfo/blindlaw_nfbnet.org
>> To unsubscribe, change your list options or get your account info for
>> BlindLaw:
>> http://nfbnet.org/mailman/options/blindlaw_nfbnet.org/rahul.bajaj1038
>> %40gmail.com

-------------- next part --------------
A non-text attachment was scrubbed...
Name: kurzweil-1000-manual-160927.pdf
Type: application/pdf
Size: 2166191 bytes
Desc: kurzweil-1000-manual-160927.pdf
URL: <http://nfbnet.org/pipermail/blindlaw_nfbnet.org/attachments/20180509/e38e71e2/attachment.pdf>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: kurzweil-1000-manual-160927.docx
Type: application/vnd.openxmlformats-officedocument.wordprocessingml.document
Size: 305024 bytes
Desc: kurzweil-1000-manual-160927.docx
URL: <http://nfbnet.org/pipermail/blindlaw_nfbnet.org/attachments/20180509/e38e71e2/attachment.docx>

