[Blindmath] Extracting bitmap images from pdf files

Richard Baldwin baldwin at dickbaldwin.com
Fri Jan 27 20:29:25 UTC 2012


Maureen, you wrote:

"Sorry I suppose this is a tangent on the discussion, but if anyone has
ideas, please help."

I actually do have some ideas that may be helpful. In particular, within
the next few weeks, I will publish a free program that can be used to
enhance images for embossing prior to throwing away all of the color
information. In some cases -- I emphasize IN SOME BUT NOT ALL CASES --
experience has shown that performing the enhancement before embossing on a
Tiger can provide tactile images that are easier to understand. In all
cases, however, the tactile image should be accompanied by a good
description.

The file format will be a standard IVEO-compatible SVG file, so if you have
an IVEO system including Creator Pro, I believe you could open (not import)
the SVG files produced by my program into Creator Pro and do the overlay
thing that John mentioned in an earlier post to add audio for use with an
IVEO touchpad.

The program is currently being evaluated by selected users. I will announce
the availability of the program on this list when it becomes available to
everyone.

Dick Baldwin

On Fri, Jan 27, 2012 at 2:15 PM, Lewicki, Maureen
<mlewicki at bcsd.neric.org> wrote:

> Ah, yes I see your point. I knew I was not understanding something. I am
> not sure about the height of the dots, as I just received the equipment,
> but I can say that I have not been impressed by the viewplus graphics in
> the past. I really hope I am wrong. The graphs and diagrams being produced
> for our elementary students are being made in word paint program. I am not
> sure if the problem is the program, the producer, or the embosser.
>
> Sorry I suppose this is a tangent on the discussion, but if anyone has
> ideas, please help.
>
> Maureen Murphy Lewicki
> Teacher of Visually Impaired
> Bethlehem Central Schools
> (518)439-7681
> "When we do the best that we can, we never know what miracle is wrought in
> our life, or in the life of another." Helen Keller
>
>
> -----Original Message-----
> From: blindmath-bounces at nfbnet.org [mailto:blindmath-bounces at nfbnet.org]
> On Behalf Of Richard Baldwin
> Sent: Friday, January 27, 2012 3:07 PM
> To: Blind Math list for those interested in mathematics
> Subject: Re: [Blindmath] Extracting bitmap images from pdf files
>
> Maureen,
>
> Does the Tiger emboss the image in color, or does it emboss the image with
> different dot heights, of which you can probably distinguish three or
> four using your fingers?
>
> The conventional wisdom in the sonar world is that an experienced sonar
> operator can visually distinguish only about seven shades of gray. I would
> imagine that an experienced Tiger user would be able to distinguish among
> three or four different dot heights with the fingers -- maybe more, but
> clearly not 16 million. I may be wrong, but I believe that the Tiger
> attempts to produce 8 different dot heights.
>
> That is what I meant when I referred to three or four shades of gray in my
> earlier post. Converting from an image with 16 million colors to a tactile
> image with three or four distinguishable dot heights is analogous to
> converting the image from 16 million colors to three or four shades of gray.
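To make the dot-height analogy concrete, here is a minimal Python sketch. It assumes the image has already been reduced to 8-bit gray values (0 to 255) and simply splits that range into equal bands. The function name and the equal-band mapping are my own assumptions for illustration; this is not a description of how the Tiger actually assigns dot heights.

```python
def gray_to_dot_height(gray, levels=4):
    """Map an 8-bit gray value (0-255) to a band index in 0..levels-1.

    Hypothetical mapping: equal-width bands, used only to illustrate
    how many gray values collapse into each tactile level.
    """
    if not 0 <= gray <= 255:
        raise ValueError("gray must be in 0..255")
    return gray * levels // 256


# Distinct band indexes produced for 4 and for 8 levels.
heights_4 = sorted({gray_to_dot_height(g, 4) for g in range(256)})
heights_8 = sorted({gray_to_dot_height(g, 8) for g in range(256)})

# Number of gray values that land in each of the 4 bands.
grays_per_band = [sum(1 for g in range(256) if gray_to_dot_height(g, 4) == h)
                  for h in heights_4]
```

With four bands, 64 different gray values share each band, so any detail carried by differences inside a band is lost to the fingers.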
>
> Dick Baldwin
>
> On Fri, Jan 27, 2012 at 1:56 PM, Richard Baldwin
> <baldwin at dickbaldwin.com> wrote:
>
> > In a previous post I wrote:
> >
> > "By the way, I don't know how a blind person would carry out the
> > second of the following two steps in John's procedure:
> >
> > * import the PDF into IVEO Creator Pro.
> > * Check the PDF to find which pages have images of interest and emboss
> > those pages.
> >
> > It seems that checking the pdf to find which pages have images would
> > be similar to checking a screen shot of a page to find and crop the
> > image. It seems that you would need to be able to see the pdf on the
> > IVEO screen to know if it contains an image. I am working with pdf
> > files containing anywhere between 30 and 80 pages. Embossing every
> > page in order to identify the pages that contain images would not be
> > practical."
> >
> > I have learned how a blind person could find the pages containing the
> > images in a pdf file without having to see the screen. Here is one
> > procedure for doing that.
> >
> > When you import a pdf file into Creator Pro, a set of SVG files is
> > automatically created in the folder that contains the pdf file. There
> > is one SVG file for each page in the pdf file. The file names indicate
> > the pdf page number except that pages in a pdf file are typically
> > numbered beginning with 1 while the file numbers produced by Creator
> > Pro begin with 0. Thus, file number 0 will probably correspond to page
> > 1 in the pdf document.
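The page-number offset described above can be sketched in a few lines of Python. This assumes only what is stated above: pdf pages are numbered from 1 and the file numbers from 0. The function name and the list of page numbers are made up for illustration, and I have deliberately not guessed at the exact file names that Creator Pro writes.

```python
def svg_file_number(pdf_page):
    """Return the 0-based file number that probably matches a 1-based pdf page."""
    if pdf_page < 1:
        raise ValueError("pdf page numbers start at 1")
    return pdf_page - 1


# Hypothetical pages that the pdf text says contain images of interest.
pages_of_interest = [3, 17, 42]
file_numbers = [svg_file_number(p) for p in pages_of_interest]
```

If a particular pdf file numbers its pages differently (for example, with a front-matter section), the offset would need to be checked against that file.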
> >
> > Read the pdf file in your preferred pdf file reader. If from the pdf
> > text, you can determine which pages in the pdf file contain images of
> > interest, you can record those page numbers using whatever method you
> > use to record information of that sort.
> >
> > Then you can import the pdf file into Creator Pro, producing the set
> > of SVG files described above. Then you can open the SVG files that
> > contain interesting images in your IVEO viewer software, emboss the
> > pages, and proceed as John explained in an earlier post.
> >
> > Dick Baldwin
> >
> >
> > On Fri, Jan 27, 2012 at 12:47 PM, Richard Baldwin
> > <baldwin at dickbaldwin.com> wrote:
> >
> >> Michael wrote " There is one option I am aware of for a blind person
> >> to do this independently, IVEO like John suggested,"
> >>
> >> I may be wrong, but I didn't get the idea that John's solution will
> >> produce an output bitmap file - only an embossed image.
> >>
> >> I may be wrong again, but as near as I can tell, IVEO doesn't do any
> >> image enhancement prior to embossing the image. If I am wrong on
> >> these points, John will probably come online and set the record
> >> straight.
> >>
> >> IVEO seems to simply convert the bitmap image to gray scale and
> >> emboss the gray scale. While gray scale embossing is okay for some
> >> images (especially black and white images), it is definitely not the
> >> best option for many images. After all, if you convert 16 million
> >> colors to four levels of gray scale, each level of gray scale
> >> represents about 4 million different colors. Pixels belonging to each
> >> set of 4 million colors will not be distinguishable in the gray scale
> >> representation.
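The arithmetic above can be checked with a short Python sketch. It assumes 24-bit RGB color and, purely for illustration, a simple average of the three channels as the gray value. I do not know what conversion IVEO actually uses, so the function names and the averaging formula are my assumptions.

```python
TOTAL_COLORS = 256 ** 3                          # 16,777,216 24-bit RGB colors
GRAY_LEVELS = 4
colors_per_level = TOTAL_COLORS // GRAY_LEVELS   # average colors per gray level


def to_gray(r, g, b):
    """Assumed conversion: plain average of the three channels."""
    return (r + g + b) // 3


def gray_level(r, g, b, levels=GRAY_LEVELS):
    """Quantize the gray value into one of `levels` equal bands."""
    return to_gray(r, g, b) * levels // 256


pure_red = (255, 0, 0)
pure_blue = (0, 0, 255)

# Two visually very different colors end up in the same gray band.
same_band = gray_level(*pure_red) == gray_level(*pure_blue)
```

Under these assumptions pure red and pure blue fall into the same band, so a red object on a blue background would emboss as a uniform area. The figure of roughly 4 million colors per level is an average; the exact count in each band depends on the conversion formula.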
> >>
> >> My objective is to gain access to full-color bitmap images so that I
> >> can enhance the image for embossing prior to throwing away all of the
> >> color information.
> >>
> >> Embossed versions of bitmap images are often very difficult to
> >> understand, even with a decent description. I believe we need to do
> >> everything reasonable to improve the understandability of embossed
> >> bitmap images. In some cases, image enhancement techniques at the
> >> full-color stage can be used to provide those improvements.
> >>
> >> So, my quest continues, hopefully without having to pay $445.00 for
> >> Acrobat Pro, just to get access to the images.
> >>
> >> The fallback position, of course, is to use screen shots and an image
> >> editor program to crop out the individual images, but that approach
> >> is not possible for a blind person to use. You can't crop an image
> >> out of a screen shot unless you can see the image.
> >>
> >> By the way, I don't know how a blind person would carry out the
> >> second of the following two steps in John's procedure:
> >>
> >> * import the PDF into IVEO Creator Pro.
> >> * Check the PDF to find which pages have images of interest and
> >> emboss those pages.
> >>
> >> It seems that checking the pdf to find which pages have images would
> >> be similar to checking a screen shot of a page to find and crop the
> >> image. It seems that you would need to be able to see the pdf on the
> >> IVEO screen to know if it contains an image. I am working with pdf
> >> files containing anywhere between 30 and 80 pages. Embossing every
> >> page in order to identify the pages that contain images would not be
> >> practical.
> >>
> >> Dick Baldwin
> >>
> >>
> >> On Fri, Jan 27, 2012 at 11:48 AM, Richard Baldwin <
> >> baldwin at dickbaldwin.com> wrote:
> >>
> >>> Amanda and others,
> >>>
> >>> I have contacted Adobe technical support. Their solution to the
> >>> problem is to purchase Acrobat Pro for $445.00. The tech support rep
> >>> told me that their program will extract the pictures intact as
> >>> separate bitmap files.
> >>>
> >>> Dick Baldwin
> >>>
> >>>
> >>> On Fri, Jan 27, 2012 at 10:44 AM, Michael Whapples
> >>> <mwhapples at aim.com> wrote:
> >>>
> >>>> Hello,
> >>>> From what you are describing, my feeling is that the
> >>>> diagrams/images in the PDF in question are created from a number of
> >>>> drawing elements rather than a single image object. I'm not an
> >>>> expert on PDF, but I think you could think of it like the
> >>>> difference of a bitmap being a single element (I think PDF has a
> >>>> way to specify the start of a stream object like a bitmap) and an
> >>>> SVG being formed from lots of elements like lines and shapes (I
> >>>> think in PDF the lines and such like can be created with basic PDF
> >>>> drawing facilities so are not in a separate object). When the image
> >>>> is formed from lots of elements then it may be hard for the
> >>>> software to know what makes up a given diagram in the
> >>>> book/document, it just lays it out as specified and you work out
> >>>> what's related. I think one way to tell whether you have this sort
> >>>> of image is to see if NVDA will read some of the text labels of the
> >>>> image; if it does, then it's not a pure bitmap (you probably could
> >>>> use the Read Out Loud function of Adobe Reader as well). Therefore I
> >>>> imagine that without clever recognition algorithms you are unlikely
> >>>> to get something which will extract it as you want.
> >>>>
> >>>> There is one option I am aware of for a blind person to do this
> >>>> independently, IVEO like John suggested; however, IVEO isn't a cheap
> >>>> option, and how much is to be done would determine whether it's
> >>>> worth the money if providing accessible diagrams from PDF was its
> >>>> only use. IVEO does not require a Tiger printer; swell paper would
> >>>> work, and other embossers may (the outputting from IVEO is the
> >>>> question, as I think it may only output to devices appearing as
> >>>> standard printers). Interestingly, the IVEO route again requires a
> >>>> human to make the decision on what forms the diagram.
> >>>>
> >>>> Michael Whapples
> >>>>
> >>>> -----Original Message----- From: Richard Baldwin
> >>>> Sent: Friday, January 27, 2012 3:28 PM
> >>>> To: Jamal Mazrui
> >>>> Cc: Blind Math list for those interested in mathematics
> >>>> Subject: Re: [Blindmath] Extracting bitmap images from pdf files
> >>>>
> >>>>
> >>>> Hi Jamal,
> >>>>
> >>>> It is a great program, easy to use, and probably totally
> >>>> accessible. I particularly like the fact that the program doesn't
> >>>> require a Windows installation. The output data is well organized,
> >>>> and including the page numbers in the bmp file names is a great help
> >>>> in analyzing them.
> >>>>
> >>>> Unfortunately, the output produced by the program suffers from the
> >>>> same issues that I have encountered with all of the other image
> >>>> extractor programs that I have tried. A few of the images come out
> >>>> intact. Most of the images don't come out intact.
> >>>>
> >>>> For example, page three of one of the pdf files that I tested has a
> >>>> single image of a battery. It is the same image that I enhanced and
> >>>> posted in an earlier post. Your program produced 54 bmp files for
> >>>> that page. A few of them were icons such as arrows, exclamation
> >>>> marks, etc. The remaining bmp files appear to be very small
> >>>> pieces of the image of the battery. By the way, I got the earlier
> >>>> image of the battery by taking a screen shot of the page and using
> >>>> an image editing program to crop out the battery image. None
> >>>> of the image extraction programs that I have tested extract the
> >>>> image intact.
> >>>>
> >>>> I don't know anything at all about the internal structure of pdf
> >>>> files, and this behavior of breaking an image into many small
> >>>> pieces may depend on how the file is constructed in the first
> >>>> place. In any event, my immediate problem has to do with a specific
> >>>> set of pdf files that are the chapters from a specific physics
> >>>> book, so this program doesn't solve my problem.
> >>>>
> >>>> Thanks for offering the program.
> >>>> Dick Baldwin
> >>>>
> >>>> On Fri, Jan 27, 2012 at 5:18 AM, Jamal Mazrui <empower at smart.net>
> >>>> wrote:
> >>>>
> >>>>> In an attempt to facilitate a free, non-web dependent solution, I
> >>>>> have written a Windows console-mode utility called PDF2Images, built
> >>>>> with PowerBASIC and a PDF library.  The distribution archive,
> >>>>> including documentation and source code, is available at
> >>>>>
> >>>>> http://empowermentzone.com/pdf2images.zip
> >>>>>
> >>>>>
> >>>>>
> >>>>> I am interested in any feedback on how well it works compared to
> >>>>> other approaches.
> >>>>>
> >>>>> Jamal
> >>>>>
> >>>>>
> >>>>>
> >>>>>
> >>>>
> >>>> --
> >>>> Richard G. Baldwin (Dick Baldwin)
> >>>> Home of Baldwin's on-line Java Tutorials http://www.DickBaldwin.com
> >>>>
> >>>> Professor of Computer Information Technology Austin Community
> >>>> College
> >>>> (512) 223-4758
> >>>> mailto:Baldwin at DickBaldwin.com
> >>>> http://www.austincc.edu/baldwin/
> >>>> _______________________________________________
> >>>> Blindmath mailing list
> >>>> Blindmath at nfbnet.org
> >>>> http://nfbnet.org/mailman/listinfo/blindmath_nfbnet.org
> >>>> To unsubscribe, change your list options or get your account info
> >>>> for Blindmath:
> >>>> http://nfbnet.org/mailman/options/blindmath_nfbnet.org/mwhapples%40aim.com
> >>>>
> >>>>
> >>>
> >>>
> >>>
> >>>
> >>
> >>
> >>
> >>
> >
> >
> >
> >
>
>
>
> _______________________________________________
> Blindmath mailing list
> Blindmath at nfbnet.org
> http://nfbnet.org/mailman/listinfo/blindmath_nfbnet.org
> To unsubscribe, change your list options or get your account info for
> Blindmath:
>
> http://nfbnet.org/mailman/options/blindmath_nfbnet.org/mlewicki%40bcsd.neric.org
>
>



-- 
Richard G. Baldwin (Dick Baldwin)
Home of Baldwin's on-line Java Tutorials
http://www.DickBaldwin.com

Professor of Computer Information Technology
Austin Community College
(512) 223-4758
mailto:Baldwin at DickBaldwin.com
http://www.austincc.edu/baldwin/


