[Blindmath] Extracting bitmap images from pdf files

Amanda Lacy lacy925 at gmail.com
Fri Jan 27 18:03:16 UTC 2012


I'd think it would be better to put that amount of money toward an IVEO 
system.

What about the publisher? Do you think McGraw Hill has copies of the images 
separate from the text?

Amanda
----- Original Message ----- 
From: "Richard Baldwin" <baldwin at dickbaldwin.com>
To: "Blind Math list for those interested in mathematics" 
<blindmath at nfbnet.org>
Sent: Friday, January 27, 2012 11:48 AM
Subject: Re: [Blindmath] Extracting bitmap images from pdf files


> Amanda and others,
>
> I have contacted Adobe technical support. Their solution to the problem is
> to purchase Acrobat Pro for $445.00. The tech support rep told me that
> their program will extract the pictures intact as separate bitmap files.
>
> Dick Baldwin
>
> On Fri, Jan 27, 2012 at 10:44 AM, Michael Whapples
> <mwhapples at aim.com> wrote:
>
>> Hello,
>> From what you are describing, my feeling is that the diagrams/images in
>> the PDF in question are created from a number of drawing elements rather
>> than a single image object. I'm not an expert on PDF, but I think you
>> could think of it like the difference between a bitmap, which is a single
>> element (I think PDF has a way to specify the start of a stream object
>> like a bitmap), and an SVG, which is formed from lots of elements like
>> lines and shapes (I think in PDF the lines and suchlike can be created
>> with basic PDF drawing facilities, so they are not in a separate object).
>> When the image is formed from lots of elements, it may be hard for the
>> software to know what makes up a given diagram in the book/document; it
>> just lays it out as specified and you work out what's related. I think
>> one way to tell whether you have this sort of image is to see if NVDA
>> will read some of the text labels of the image; if it does, then it's not
>> a pure bitmap (you could probably use the read out loud function of Adobe
>> Reader as well). Therefore I imagine that without clever recognition
>> algorithms you are unlikely to get something which will extract it as you
>> want.
>>
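
A rough way to check the point above about drawing elements versus a single image object is to look at which operators a page's content stream uses. The sketch below is not from the thread. It assumes the content stream has already been decompressed to text by some other tool, and the names count_operators and describe_page and the two sample streams are made up for illustration. It only tallies operators defined by the PDF format (path construction such as m, l, re; painting such as S, f; text showing such as Tj; and Do/BI for XObjects and inline images), so treat the result as a hint rather than a verdict.

```python
import re
from collections import Counter

# Operator groups from the PDF content-stream syntax.
PATH_OPS = {"m", "l", "c", "v", "y", "h", "re"}               # path construction
PAINT_OPS = {"S", "s", "f", "F", "f*", "B", "B*", "b", "b*"}  # stroke / fill
TEXT_OPS = {"Tj", "TJ", "'", '"'}                             # text showing
# "Do" paints any XObject (an image OR a form), "BI" starts an inline image.
IMAGE_OPS = {"Do", "BI"}


def count_operators(stream: str) -> Counter:
    """Tally operator groups in an already-decoded content stream (hypothetical helper)."""
    # Drop literal strings "( ... )" and hex strings "< ... >" so their contents
    # are not mistaken for operators. Nested parentheses and inline image data
    # are NOT handled; this is a sketch, not a PDF parser.
    stream = re.sub(r"\((?:\\.|[^\\()])*\)", " ", stream)
    stream = re.sub(r"<[0-9A-Fa-f\s]*>", " ", stream)
    stream = stream.replace("[", " ").replace("]", " ")

    counts = Counter()
    for token in stream.split():
        if token in PATH_OPS:
            counts["path"] += 1
        elif token in PAINT_OPS:
            counts["paint"] += 1
        elif token in TEXT_OPS:
            counts["text"] += 1
        elif token in IMAGE_OPS:
            counts["image"] += 1
    return counts


def describe_page(stream: str) -> str:
    """Give a coarse label for how a page's graphics appear to be built."""
    counts = count_operators(stream)
    drawing = counts["path"] + counts["paint"]
    if drawing and counts["image"]:
        return "mixed"
    if drawing:
        return "vector drawing"
    if counts["image"]:
        return "image objects"
    return "no graphics"


# Made-up sample streams: a stroked outline with a text label, and one placed image.
vector_page = "0 0 m 100 0 l 100 50 l h S BT /F1 12 Tf 10 10 Td (Battery) Tj ET"
bitmap_page = "q 200 0 0 150 50 600 cm /Im1 Do Q"

if __name__ == "__main__":
    print(describe_page(vector_page), dict(count_operators(vector_page)))
    print(describe_page(bitmap_page), dict(count_operators(bitmap_page)))
```

A page that shows text operators alongside path operators fits the observation that a screen reader can read a diagram's labels. A page with many Do calls would be consistent with a picture stored as many small pieces, although Do can also invoke a form XObject, so the count is only suggestive.
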
>> There is one option I am aware of for a blind person to do this
>> independently: IVEO, like John suggested. However, IVEO isn't a cheap
>> option, and how much is to be done would determine whether it's worth
>> the money if providing accessible diagrams from PDF was its only use.
>> IVEO does not require a Tiger printer; swell paper would work, and other
>> embossers may (the outputting from IVEO is the question, as I think it
>> may only output to devices appearing as standard printers).
>> Interestingly, the IVEO route again requires a human to make the
>> decision on what forms the diagram.
>>
>> Michael Whapples
>>
>> -----Original Message----- From: Richard Baldwin
>> Sent: Friday, January 27, 2012 3:28 PM
>> To: Jamal Mazrui
>> Cc: Blind Math list for those interested in mathematics
>> Subject: Re: [Blindmath] Extracting bitmap images from pdf files
>>
>>
>> Hi Jamal,
>>
>> It is a great program, easy to use, and probably totally accessible. I
>> particularly like the fact that the program doesn't require a Windows
>> installation. The output data is well organized, and including the page
>> numbers in the bmp file names is a great help in analyzing them.
>>
>> Unfortunately, the output produced by the program suffers from the same
>> issues that I have encountered with all of the other image extractor
>> programs that I have tried. A few of the images come out intact. Most of
>> the images don't come out intact.
>>
>> For example, page three of one of the pdf files that I tested has a
>> single image of a battery. It is the same image that I enhanced and
>> posted in an earlier post. Your program produced 54 bmp files for that
>> page. A few of them were icons such as arrows, exclamation marks, etc.
>> The remaining bmp files appear to be very small pieces of the image of
>> the battery. By the way, I got the earlier image of the battery by taking
>> a screen shot of the page and using an image editing program to crop out
>> the battery image. None of the image extraction programs that I have
>> tested extract the image intact.
>>
>> I don't know anything at all about the internal structure of pdf files,
>> and this behavior of breaking an image into many small pieces may depend
>> on how the file is constructed in the first place. In any event, my
>> immediate problem has to do with a specific set of pdf files that are the
>> chapters from a specific physics book, so this program doesn't solve my
>> problem.
>>
>> Thanks for offering the program.
>> Dick Baldwin
>>
>> On Fri, Jan 27, 2012 at 5:18 AM, Jamal Mazrui <empower at smart.net> wrote:
>>
>>> In an attempt to facilitate a free, non-web dependent solution, I have
>>> written a Windows console-mode utility called PDF2Images, built with
>>> PowerBASIC and a PDF library.  The distribution archive, including
>>> documentation and source code, is available at
>>>
>>> http://empowermentzone.com/pdf2images.zip
>>>
>>>
>>> I am interested in any feedback on how well it works compared to other
>>> approaches.
>>>
>>> Jamal
>>>
>>>
>>>
>>>
>>
>> --
>> Richard G. Baldwin (Dick Baldwin)
>> Home of Baldwin's on-line Java Tutorials
>> http://www.DickBaldwin.com
>>
>> Professor of Computer Information Technology
>> Austin Community College
>> (512) 223-4758
>> mailto:Baldwin at DickBaldwin.com
>> http://www.austincc.edu/baldwin/
>> _______________________________________________
>> Blindmath mailing list
>> Blindmath at nfbnet.org
>> http://nfbnet.org/mailman/listinfo/blindmath_nfbnet.org
>> To unsubscribe, change your list options or get your account info for
>> Blindmath:
>> http://nfbnet.org/mailman/options/blindmath_nfbnet.org/mwhapples%40aim.com
>>
>
>
>




