[nabs-l] question about Jaws and PDFs

Rania raniaismail04 at gmail.com
Sun Jan 24 21:00:52 UTC 2010


Thank you! I will have to do that!
Rania,
"For everyone who thought I couldn't do it.
For everyone who thought I shouldn't do it.
For everyone who said, 'It's impossible."
See you at the finish line."
~Christopher Reeve

----- Original Message ----- 
From: "Darrell Shandrow" <darrell.shandrow at gmail.com>
To: "'National Association of Blind Students mailing list'" 
<nabs-l at nfbnet.org>
Sent: Sunday, January 24, 2010 2:21 PM
Subject: Re: [nabs-l] question about Jaws and PDFs


> Hello Rania,
>
> My experience is that PDF2TXT is quite a bit easier to use than Adobe
> Reader, but it's free, so you may try it for yourself. :-)
>
>
> -----Original Message-----
> From: nabs-l-bounces at nfbnet.org [mailto:nabs-l-bounces at nfbnet.org] On 
> Behalf
> Of Rania
> Sent: Sunday, January 24, 2010 9:14 AM
> To: National Association of Blind Students mailing list
> Subject: Re: [nabs-l] question about Jaws and PDFs
>
> Is it easier to use than Adobe? Just asking because I don't like Adobe all
> that much.
> Rania,
> "For everyone who thought I couldn't do it.
> For everyone who thought I shouldn't do it.
> For everyone who said, 'It's impossible."
> See you at the finish line."
> ~Christopher Reeve
>
> ----- Original Message -----
> From: "Darrell Shandrow" <darrell.shandrow at gmail.com>
> To: "'National Association of Blind Students mailing list'"
> <nabs-l at nfbnet.org>
> Sent: Sunday, January 24, 2010 9:50 AM
> Subject: Re: [nabs-l] question about Jaws and PDFs
>
>
>> Hello Rania,
>>
>> I assume you're asking about PDF2TXT? The documentation is below:
>>
>> Version 3.3
>> April 6, 2009
>> Copyright 2005 - 2009 by Jamal Mazrui
>> LGPL license
>> Contents
>> Description
>> Installation
>> Choosing PDF Source and TXT Target
>> Text Extraction Settings
>> Viewing Area
>> Toggling between a File and Folder List
>> Configuration Check Boxes
>> Action Buttons
>> URL Source
>> Hot Keys
>> The Log File
>> Command Line Operation
>> File Association
>> Development Notes
>> Description
>>
>> PDF to TXT (also written PDF2TXT) is a free program for converting files in
>> Portable Document Format (.pdf extension) to plain text (.txt extension).
>> The program lets you convert multiple files in a single batch operation,
>> either from a GUI dialog or a console-mode command line. The resulting text
>> files can be read in almost any editing or viewing program. PDF2TXT itself
>> also includes a plain text view for reading PDF files. The program should
>> work on any version of Windows.
>>
>> Installation
>>
>> The installation program for PDF2TXT is called p2tsetup.exe. When
>> executed,
>> it prompts for an installation folder for the program. The default folder
>> is
>> c:\pdf2txt. Although this is not a standard location for programs on a
>> Windows computer, a benefit is fewer keystrokes to type whenever you
>> manually enter the path to a PDF file or folder. If you want a standard
>> installation folder, however, respond to the prompt by entering
>> C:\Program Files\PDF to Text
>> The installation process creates a program group for PDF2TXT on the
>> Windows
>> start menu, containing choices to launch PDF2TXT, read Documentation for
>> PDF2TXT, and uninstall PDF2TXT. Also created is a desktop shortcut with 
>> an
>> associated hot key, enabling PDF2TXT to be conveniently launched by
>> pressing
>> Control+Alt+Shift+P. Another shortcut is placed in the Send To folder so
>> that a PDF may be viewed in PDF2TXT via the context menu in Windows
>> Explorer.
>>
>>
>> Choosing PDF Source and TXT Target
>>
>> After PDF2TXT is installed, launching it activates a main dialog with
>> several capabilities and settings. First, it prompts you to select a PDF
>> source. This can be either a single PDF file or a folder containing
>> multiple
>> PDF files (another section explains how it can also be an Internet URL).
>> In
>> the initial edit box, you can type the full path to the file or folder
>> desired. Alternatively, you can tab to buttons that invoke different sub
>> dialogs depending on whether you want to choose a file or folder as the
>> PDF
>> source. (Yet another option, described later, is to pass the path to the
>> PDF
>> source as a parameter on the command line when pdf2txt.exe is launched.)
>> By default, the PDF source is the folder c:\pdf2TXT\pdf. Any source may 
>> be
>> chosen, however, and the program remembers the last one used.
>>
>> Similarly, an edit box and associated button let you specify the target
>> folder for converted files. These will have the same base name, but an
>> extension of .txt instead of .pdf. The default target folder is
>> c:\pdf2TXT\txt. Note that the PDF source may be either a file or folder,
>> but
>> the TXT target is always a folder.
>>
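The naming rule above can be sketched in a few lines of Python. This is only an
illustration of the documented behavior, not PDF2TXT's own code (which is
written in PowerBASIC); the function name target_path is made up for the
example.

```python
from pathlib import PureWindowsPath

def target_path(source_pdf, target_folder):
    """Return where the converted text would land: same base name, .txt extension."""
    # PureWindowsPath parses Windows-style paths on any operating system.
    txt_name = PureWindowsPath(source_pdf).with_suffix(".txt").name
    # The target is always a folder, so the file name is appended to it.
    return PureWindowsPath(target_folder) / txt_name

print(target_path(r"c:\pdf2TXT\pdf\report.pdf", r"c:\pdf2TXT\txt"))
```

With the default folders, report.pdf in the source folder would map to
report.txt in the target folder.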
>>
>> Text Extraction Settings
>>
>> Two settings fundamentally affect how text is extracted from a PDF. If 
>> the
>> PDF requires a password to unlock its content, type it in the edit box
>> provided. If the PDF is an image format without textual characters --
>> e.g.,
>> the result of a scan -- mark the checkbox so that optical character
>> recognition (OCR) is performed instead of the usual techniques of
>> extracting
>> text. This OCR technique was originally posted at
>> http://EmpowermentZone.com/pdf2ocr.zip
>> OCR is a much slower and more error-prone process, but it may be the best
>> option when the usual methods do not work. This technique uses Google
>> Tesseract, the best open source OCR available, which is not as good as
>> commercial OCR packages. Due to technical issues, there is not a simple
>> way
>> of aborting an OCR process that has already started. It is possible,
>> however, by launching another copy of PDF2TXT, which clears the deck
>> during
>> its startup phase.
>>
>> Another checkbox lets you additionally produce a .htm target file in HTML
>> format. This uses a different conversion technique, originally posted at
>> http://EmpowermentZone.com/pdf2htm.zip
>>
>> This may be worth trying if the .txt result is unsatisfactory. It may also
>> be useful for webmasters who want to post an HTML alternative to a PDF.
>> Unfortunately, the conversion translates visual aspects of the PDF such as
>> fonts, but not structural elements such as headings. To further increase
>> conversion options, checking this box also produces the .txt file with a
>> different technique, using the PDFToText.exe utility that is separately
>> available at
>> http://www.foolabs.com/xpdf/home.html
>>
>>
>> Viewing Area
>>
>> Within the main dialog, a read-only, multi-line edit control serves as a
>> viewing area between the source and target controls just discussed. This
>> scrollable view can show one of three kinds of information: (1) the text of
>> a PDF, (2) a list of PDF files, or (3) the results of a batch conversion.
>> The label for the viewing area changes to indicate the kind of information
>> being shown: "View file," "View folder," or "View results."
>>
>> You can navigate the viewing area with standard Windows keystrokes, e.g.,
>> Control+Home or Control+End to go to the top or bottom of text. Control+F
>> lets you search forward for a string of characters, and Control+Shift+F
>> lets you search backward. F3 searches for the same string again in the
>> forward direction, and Shift+F3 searches again backward. Control+G lets you
>> go to a percent completion point through the file being viewed. Control+K
>> sets a bookmark for the file, Control+Shift+K clears it, and Alt+K goes to
>> it.
>>
>> You can press Shift with arrow keys to select text or Control+A to select
>> all. Alternatively, you can press F8 to set the starting point of a
>> selection, navigate to the ending point desired, and then press Shift+F8
>> to
>> select the text between these points.
>>
>> Press Control+C to copy selected text to the clipboard. Alternatively,
>> press Control+Shift+C, or Alt+F8, to copy and append to the clipboard,
>> adding to rather than replacing its existing text. A form feed or page
>> break character (ANSI code 12) will separate each clip copied there.
>> Control+F8 is a shortcut that copies all text in the viewing area without
>> having to select it first, equivalent to Control+A followed by Control+C.
>>
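If you later paste the accumulated clips into a file and want to pull them
apart again, the form feed separator makes that easy. Here is a small Python
sketch of the append-and-split idea. The function names are invented for the
example, and it assumes no separator is added when the clipboard starts out
empty, which the documentation does not state.

```python
FORM_FEED = chr(12)  # ANSI code 12, the form feed / page break character

def append_clip(clipboard_text, new_clip):
    """Add new_clip after the existing text, separated by a form feed."""
    if not clipboard_text:
        # Assumption: an empty clipboard gets no leading separator.
        return new_clip
    return clipboard_text + FORM_FEED + new_clip

def split_clips(clipboard_text):
    """Recover the individual clips from text built by append_clip."""
    return clipboard_text.split(FORM_FEED)

combined = append_clip(append_clip("", "first clip"), "second clip")
print(split_clips(combined))
```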
>> If you invoke the Open button and choose a PDF from its sub dialog, the
>> text
>> of the PDF will be placed in the viewing area, and keyboard focus will go
>> there. If you invoke the Select button to choose a PDF folder instead of 
>> a
>> file, its list of PDFs will be shown. A status bar at the bottom of the
>> dialog indicates the current position in the viewing area.
>>
>>
>> Toggling between a File and Folder List
>>
>> The Look button behaves in a special way when the viewing area has focus.
>> If
>> you press Alt+L when in the viewing area, PDF2TXT will toggle between a
>> folder and file view. If viewing a folder, PDF2TXT will switch to a view
>> of
>> the file that was on the line containing the caret. If viewing a file,
>> PDF2TXT will switch to a view of the folder that contained the file. In
>> addition, PDF2TXT will automatically search for the name of the file last
>> viewed and place the caret just after it if found.
>> This feature lets you easily explore the PDFs in a folder, one after
>> another. Initially, you might display a list of files by pressing Alt+L
>> when the PDF source is a folder. You can then arrow down through the list
>> until you find a PDF you want to view. At that point, press Alt+L to view
>> the file. When you want to continue exploring the folder list, press Alt+L
>> again to return to it at the position of the file you last viewed.
>>
>>
>> Configuration Check Boxes
>>
>> Four check boxes let you configure PDF2TXT. The one labeled "Include
>> subfolders," will look for PDF files not only in the specified folder, 
>> but
>> in subfolders under it. For example, you could probably convert many PDF
>> files on your computer by checking this setting and specifying the c:\
>> root
>> folder as the PDF source! This setting is unchecked by default.
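To make the difference concrete, here is a rough Python sketch of what the
"Include subfolders" setting changes when a list of PDFs is gathered. It is
an illustration only, not the program's actual code; the function name
find_pdfs is hypothetical.

```python
from pathlib import Path

def find_pdfs(folder, include_subfolders=False):
    """List PDF files in folder, optionally descending into subfolders."""
    root = Path(folder)
    # rglob walks the whole tree; glob looks only at the top level.
    candidates = root.rglob("*") if include_subfolders else root.glob("*")
    # Compare the extension case-insensitively so REPORT.PDF also matches.
    return sorted(p for p in candidates
                  if p.is_file() and p.suffix.lower() == ".pdf")
```

Calling find_pdfs on a folder with the setting off returns only the PDFs
directly inside it; with the setting on, PDFs in every subfolder are
included as well.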
>> The check box labeled "Move PDF when done" will transfer a PDF to a
>> subfolder called "Done" after a successful conversion. This is a 
>> subfolder
>> of the PDF2TXT program folder, with a default location of 
>> c:\pdf2TXT\done.
>> The benefit of this check box is that PDF files are stored away for 
>> backup
>> after they have been converted to text. This setting is unchecked by
>> default.
>>
>> The checkbox labeled "Replace TXT if found" determines whether to skip a
>> conversion if a corresponding target file already exists. If you do not
>> check the setting to move source files when done, you may want to check
>> this
>> setting so that unnecessary time is not spent on repeatedly converting 
>> PDF
>> files left in the source folder, since they then will be skipped if
>> corresponding target files already exist. This setting is checked by
>> default.
>>
>> The Append check box determines whether a detailed conversion log file is
>> newly created each time a conversion is run. This setting is checked by
>> default so that previous information is not lost. A section below further
>> describes the log file.
>>
>>
>> Action Buttons
>>
>> The remaining controls of the main dialog are buttons that perform 
>> various
>> actions. The Convert button is the default: the one that will be 
>> activated
>> by pressing Enter on any control except another button. The viewing area
>> will show the results of a batch conversion. This information includes 
>> the
>> number of pages in each PDF converted. It also indicates when a 
>> conversion
>> was either not possible or was skipped because the target file already
>> existed and you chose not to replace files.
>> Press Escape if you need to abort a batch conversion of many files that 
>> is
>> taking too long! Note that this program is relatively quick, however,
>> compared to other available methods of converting PDF files to text.
>> Moreover, its batch mode feature lets you run conversions unattended.
>>
>> The source for a conversion is treated differently if the viewing area 
>> has
>> focus. If viewing a list of PDFs in a folder or on a web page, then
>> PDF2TXT
>> regards the source as the file name on the current line (the one
>> containing
>> the caret). Thus, you can cursor to a PDF of interest and press Enter to
>> convert it to text. If successfully converted, PDF2TXT assumes you may
>> also
>> want to examine its content in the viewing area, so a Look command is
>> automatically performed as well (see below). If there is a conversion
>> error,
>> however, PDF2TXT leaves the error message in the viewing area. If you 
>> have
>> been examining a list of PDFs and decide you want to convert them all
>> rather
>> than a single file, navigate to the top line of the viewing area that
>> lists
>> the number of PDFs in the list, and then press Enter.
>>
>> If the source edit box already specifies what you want to view, or a path
>> is
>> easy to type into it, then the Look button is quicker to use than the 
>> Open
>> or Select sub dialog. Activating the Look button takes the current source
>> specification and goes to a view of either the text of a source file or
>> the
>> list of a source folder, putting focus in the view area so you can read
>> the
>> information.
>>
>> The Defaults button restores the default configuration settings of
>> PDF2TXT.
>> You can use it to return to the initial folders and checkbox settings.
>>
>> The Explorer button lets you browse the source, target, or done folder
>> with
>> Windows Explorer. It allows you to examine files that either have been
>> converted or would not convert--thus needing other approaches to access
>> their content.
>>
>> The Quit button closes PDF2TXT. Alt+F4 does the same thing.
>>
>> The Help button displays this complete documentation in the default web
>> browser. For context-sensitive help on a particular control, press F1 
>> when
>> it has focus. Hence, you can tab through the dialog and press F1 on each
>> control to learn how to use it.
>>
>>
>> URL Source
>>
>> If you are connected to the Internet, you can specify a URL as a PDF
>> source
>> instead of a file or folder on your local computer. The URL can be the
>> complete download path to a PDF on the Internet. Alternatively, the URL
>> can
>> be the path to a web page containing one or more links to PDF files. You
>> can
>> use Internet Explorer to navigate to such a web page and then invoke the
>> "Grab URL" button to put its URL into the source edit box of PDF2TXT.
>> The Look button works with a URL source similarly to a local file or
>> folder.
>> For example, you can press Alt+L to view a list of PDFs on a web page. 
>> The
>> toggling feature, described above, is also supported, allowing you to
>> consecutively examine the PDFs linked to a web page. If you view a PDF on
>> the Internet, PDF2TXT will automatically download a copy to the PDF
>> subfolder of the program folder, e.g., to
>> c:\pdf2txt\pdf
>>
>> The Convert button also works with a URL source. Thus, you can easily
>> convert all PDFs on a web page with a single batch operation!
>>
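For the curious, listing the PDFs linked from a web page can be sketched with
Python's standard library as below. How PDF2TXT itself finds the links is not
documented here, so treat this as one plausible approach; the class and
function names and the example.com address are made up.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse

class PdfLinkCollector(HTMLParser):
    """Collect absolute URLs of links whose path ends in .pdf."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        # Look only at the path so a query string after .pdf is tolerated.
        if href and urlparse(href).path.lower().endswith(".pdf"):
            # Resolve relative links against the page's own address.
            self.links.append(urljoin(self.base_url, href))

def list_pdf_links(html, base_url):
    collector = PdfLinkCollector(base_url)
    collector.feed(html)
    return collector.links

page = '<a href="docs/a.pdf">A</a> <a href="c.html">C</a>'
print(list_pdf_links(page, "http://example.com/page/index.html"))
```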
>>
>> Hot Keys
>>
>> Almost all controls of PDF2TXT are directly usable with unique, mnemonic
>> Alt key combinations based on the initial letter of the control's label.
>> Thus, as you become familiar with the controls, you can operate them more
>> quickly with hot keys rather than navigating to them with the Tab key or
>> mouse. For example, press Alt+P to go to the edit box for typing a PDF
>> source, or Alt+S to select a source folder from a tree view of your
>> computer. Press Alt+L to look at a file or folder, or Alt+V to read what is
>> already in the viewing area. Press Alt+I to toggle the "Include subfolders"
>> setting, or Alt+D to restore all defaults. The text extraction settings in
>> the second row of controls use a letter corresponding to the second
>> syllable or word, i.e., Alt+W for the Password edit box and Alt+F for the
>> Image Format checkbox.
>>
>> The Log File
>>
>> The conversion log file is named log.txt and located in the Done 
>> subfolder
>> of the PDF2TXT program folder. It records information about each attempt
>> to
>> convert a PDF to TXT file. It indicates whether the conversion succeeded
>> (meaning any resulting text), and then lists many attributes of the PDF,
>> including security settings that could explain a failed conversion.
>> There is a choice to view the log file in the PDF2TXT program group off
>> the
>> Start Menu. You can also get to the file via the Explore button of the
>> PDF2TXT program, choosing the Done folder to navigate with Windows
>> Explorer.
>> Additionally, you can open the file in another application through its
>> direct path (default settings):
>> c:\pdf2txt\done\log.txt
>>
>> If the log file grows larger than you want, simply delete it or uncheck
>> the
>> setting that configures PDF2TXT to append to an existing log file. Each
>> use
>> of the Convert button would then generate a new log file. This 
>> information
>> is more detailed than the results placed in the viewing area.
>>
>>
>> Command Line Operation
>>
>> The pdf2txt.exe executable may be run with various command line
>> parameters. The parameters can set values for controls in the main dialog.
>> Parameters can also cause PDF2TXT to run in an automatic, console mode,
>> without a dialog box or further user intervention involved.
>>
>> When the .pdf extension is associated with the PDF2TXT program (explained
>> in another section), Windows Explorer or Internet Explorer will open a PDF
>> file by launching PDF2TXT with the name of the PDF passed as a parameter on
>> the command line. If PDF2TXT is launched with more than one command line
>> parameter, however, the program will assume you want to run it in console
>> rather than GUI mode. The syntax for parameters is described as follows. If
>> a PDF source file, folder, or URL is specified, it must be the first
>> parameter. If a TXT target folder is specified, it must be the second
>> parameter. The source or target must be enclosed in quotes if its name
>> contains spaces.
>>
>> All parameters besides source and target names begin with a space and
>> forward slash (/), followed by the hot key letter in the dialog
>> corresponding to the setting affected. A trailing plus (+) sign in the
>> parameter indicates a status of On, and a minus (-) sign indicates Off. The
>> plus sign can also be omitted to indicate On. Capitalization does not
>> matter. Here is a list of parameters:
>>
>> a = Automatic, console mode (use /a- to force GUI mode with multiple
>> parameters)
>> i = Include subfolders
>> m = Move PDF when done
>> r = Replace TXT if found
>> d = Default settings (no /d- is defined)
>> g = Grab URL as source from Internet Explorer (no /g- is defined)
>>
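The syntax rules above can be summarized as a short Python sketch. It mirrors
only what the documentation says (source first, target second, slash switches
with an optional + or -, capitalization ignored); it is not how PDF2TXT
actually parses its command line, and the function name parse_args is
invented for the example.

```python
def parse_args(args):
    """Split an argument list into source, target, and on/off switches."""
    positional = []
    switches = {}
    for arg in args:
        if arg.startswith("/"):
            body = arg[1:].lower()              # capitalization does not matter
            if body.endswith("-"):
                switches[body[:-1]] = False     # trailing minus means Off
            else:
                switches[body.rstrip("+")] = True   # plus, or nothing, means On
        else:
            positional.append(arg)
    # Source must come first and target second, when they are given at all.
    source = positional[0] if len(positional) > 0 else None
    target = positional[1] if len(positional) > 1 else None
    return source, target, switches

# The shell has already removed the quotes around "c:\temp files".
print(parse_args([r"c:\temp files", ".", "/a-"]))
```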
>> For example, to convert all files using default settings except for the
>> Move
>> setting, you could enter:
>> pdf2txt /d /m
>>
>> To use current settings except grab a URL as source, enter:
>> pdf2txt /a /g
>>
>> To convert files from a temporary folder to the current folder, enter:
>> pdf2txt "c:\temp files" .
>>
>> To do the same, but in GUI rather than console mode, enter:
>> pdf2txt "c:\temp files" . /a-
>>
>> For greater console mode convenience, another version of PDF2TXT, having
>> the abbreviated name p2t.exe, is also available in the program folder. This
>> version only runs in console mode, whether zero, one, or more parameters
>> are specified. It uses "standard output" to display conversion results. The
>> shorter executable name means fewer characters to type on the command line.
>> For example, to run a batch conversion in console mode using the current
>> settings of PDF2TXT, you could simply enter
>> p2t
>>
>> Like DOS commands generally, the above assumes that you have either made
>> c:\pdf2txt the current directory or included it in a PATH statement.
>>
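If you would rather drive conversions from a script, something along these
lines might work. It is a sketch under assumptions: that p2t.exe accepts the
same source, target, and slash switches described for pdf2txt.exe, and that
it can be found on your PATH. The function name build_p2t_command is
hypothetical.

```python
import subprocess

def build_p2t_command(source, target, include_subfolders=False,
                      move_when_done=False):
    """Assemble an argument list for p2t using the documented switches."""
    command = ["p2t", source, target]
    if include_subfolders:
        command.append("/i")   # i = Include subfolders
    if move_when_done:
        command.append("/m")   # m = Move PDF when done
    return command

command = build_p2t_command(r"c:\temp files", ".", include_subfolders=True)
print(command)

# On a Windows machine with PDF2TXT installed, the conversion could be run
# and its standard output captured like this:
# result = subprocess.run(command, capture_output=True, text=True)
# print(result.stdout)
```

Passing the arguments as a list lets Python take care of quoting a path that
contains spaces.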
>>
>> File Association
>>
>> The PDF2TXT group on the Start Menu contains shortcuts for changing what
>> program automatically opens a file with a .pdf extension in Windows
>> Explorer. If you decide that you like the interface of PDF2TXT enough to
>> make it the default program for PDF files, you can set the file
>> association
>> accordingly. Later, if you decide you want to return to the conventional
>> association, you can do that, too.
>> When the .pdf extension is associated with PDF2TXT, an application such 
>> as
>> Windows Explorer when opening a file, or Internet Explorer after
>> downloading
>> a file, will pass the name of the PDF as a command-line parameter to
>> pdf2txt.exe. When the program is launched in this way, it automatically
>> invokes the Look button, placing text of the PDF in the viewing area and
>> putting keyboard focus there.
>>
>>
>> Development Notes
>>
>> I welcome comments and suggestions on PDF to TXT. For the technically
>> curious, I developed it with the PowerBASIC programming language from
>> http://PowerBASIC.com
>> and a couple of third party libraries: EZGUI from
>> http://EZGUI.com
>> and QuickPDF from
>> http://QuickPDF.com
>> An alternate text extraction technique is tried if the first one fails,
>> using the GetText.exe utility that is also available separately at
>> http://www.kryltech.com
>> The file GetText.txt in the PDF2TXT program folder contains the license for
>> this utility.
>>
>> The OCR is done by incorporating the open source PDF2OCR package,
>> available
>> at
>> http://EmpowermentZone.com/pdf2ocr.zip
>>
>> Some status messages are spoken with the JAWS, System Access, or
>> Window-Eyes screen reader if currently active. These direct speech messages
>> are produced with APIs via a component of the SayTools library, which is
>> also available separately at
>> http://EmpowermentZone.com/saysetup.exe
>>
>> The PowerBASIC code to PDF2TXT, itself (but not commercial libraries
>> used),
>> is open source under the Lesser General Public License (LGPL), documented
>> at
>> http://gnu.org
>>
>> This Windows program is the successor to my first version of PDF2TXT,
>> developed several years ago as a DOS-based, command-line only utility.
>> Ideas and feedback from the discussion list
>> ProgrammingBlind at FreeLists.org
>> have aided the design and testing of PDF2TXT. The latest version is
>> available at the same address,
>> http://EmpowermentZone.com/p2tsetup.exe
>>
>> You can download it with the Elevate Version hotkey, F11. This checks
>> whether a newer version is available, and offers to install it.
>>
>> Jamal Mazrui
>> jamal at EmpowermentZone.com
>>
>> -----Original Message-----
>> From: nabs-l-bounces at nfbnet.org [mailto:nabs-l-bounces at nfbnet.org] On
>> Behalf
>> Of Rania
>> Sent: Sunday, January 24, 2010 7:21 AM
>> To: National Association of Blind Students mailing list
>> Subject: Re: [nabs-l] question about Jaws and PDFs
>>
>> I have never heard about this!
>> Can you give me more information?
>> Rania,
>> "For everyone who thought I couldn't do it.
>> For everyone who thought I shouldn't do it.
>> For everyone who said, 'It's impossible."
>> See you at the finish line."
>> ~Christopher Reeve
>>
>> ----- Original Message -----
>> From: "Darrell Shandrow" <darrell.shandrow at gmail.com>
>> To: "'National Association of Blind Students mailing list'"
>> <nabs-l at nfbnet.org>
>> Sent: Sunday, January 24, 2010 6:31 AM
>> Subject: Re: [nabs-l] question about Jaws and PDFs
>>
>>
>>> Hello Rachel,
>>>
>>> In order to help you in the most effective way possible, let me start by
>>> asking you some questions. Don't worry if you can't answer them all. 
>>> Just
>>> do
>>> your best and I'll guide you to the rest of the needed answers.
>>>
>>> What version of JAWS are you running? Do you have Adobe Reader on your
>>> computer? If so, which version? Do you have any scanning and reading
>>> products like Kurzweil K1000 or OpenBook installed on your computer? If
>>> so,
>>> what versions?
>>>
>>> There are a number of ways to read PDF documents. Some PDFs are fully
>>> accessible, many can be read with some difficulty and far too many 
>>> remain
>>> completely out of our reach without a significant amount of expensive
>>> assistive technology.
>>>
>>> There is one free solution that can read many PDF documents. It is 
>>> called
>>> PDF2TXT, and it has been developed by a blind computer programmer. Visit
>>> http://www.empowermentzone.com/p2tsetup.exe to install the program.
>>>
>>> Regards,
>>>
>>> Darrell
>>>
>>>
>>>
>>> _______________________________________________
>>> nabs-l mailing list
>>> nabs-l at nfbnet.org
>>> http://www.nfbnet.org/mailman/listinfo/nabs-l_nfbnet.org
>>> To unsubscribe, change your list options or get your account info for
>>> nabs-l:
>>> http://www.nfbnet.org/mailman/options/nabs-l_nfbnet.org/raniaismail04%40gmail.com
>>
>>




