[Blindmath] Extracting bitmap images from pdf files
Richard Baldwin
baldwin at dickbaldwin.com
Mon Jan 30 03:03:23 UTC 2012
Jamal,
I just realized that the latest version does suppress the output of bmp,
jpg, and txt files.
Thanks,
Dick Baldwin
On Sun, Jan 29, 2012 at 8:55 PM, Richard Baldwin <baldwin at dickbaldwin.com> wrote:
> Great work Jamal,
>
> The program works great. I ran it on the largest pdf file in the set for
> Amanda's physics book with no problems.
>
> Here is what would work well for me.
>
> Two exe files in the same package -- one for 150 bpi and the other for 300
> bpi. The large (300 bpi) pages are hard to deal with on a small monitor,
> but they make it possible to go in and crop out high quality versions of
> small images that were created with Adobe vector graphics.
>
> On the other hand, the small (150 bpi) pages are entirely adequate for
> cropping out images that were originally bitmap images or large vector
> images. And, the small pages are easier to work with.
>
> Therefore, both versions are useful.
>
> If practical, suppress the output of the bmp, jpg, and txt files. I don't
> need them. If not practical, don't worry about it. It is easy enough to
> delete them.
>
> Thanks for taking the initiative and doing this.
>
> Dick Baldwin
>
>
> On Sun, Jan 29, 2012 at 6:02 PM, Jamal Mazrui <empower at smart.net> wrote:
>
>> Hi Dick,
>> Sorry my prior message was not clear about this. After copying the new
>> pdf2images.exe into the directory you used for PDF2Parts, you would then
>> run pdf2images.exe, passing it the file name of the PDF to analyze. I
>> suspect that you instead ran pdf2parts.exe again, which would, indeed,
>> produce the same result as before.
>>
>> I just tried this pdf2images.exe with a book that is 873 pages in size.
>> It appeared to create a .TIF for each page.
>>
>> For just converting PDFs to text, let me suggest my older, PDF2TXT
>> program, based on the same PDF library. It can convert batches of PDF with
>> a simple GUI dialog. It can also do OCR on image-only PDFs using the free,
>> open source Tesseract utility from Google. That OCR is not high quality by
>> today's standards.
>>
>> PDF2TXT is available as a Windows installer at
>>
>> http://EmpowermentZone.com/p2tsetup.exe
>>
>> Its full documentation may be browsed at
>>
>> http://empowermentzone.com/pdf2txt.htm
>>
>> Jamal
>>
>>
>>
>>
>> On 1/29/2012 6:42 PM, Richard Baldwin wrote:
>> > Hi Jamal,
>> > The output from this version is not much different from the previous
>> version. The program still crashed on page 17 of the small pdf file. I also
>> noticed that it skipped page 13.
>> > I tried a larger pdf file and it crashed on page 6 of that file.
>> > I don't believe the tiff files were actually created at 300 dpi. The
>> width of those files is 1275 pixels, which matches 8.5 inches at 150 dpi.
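The arithmetic behind that observation can be checked with a minimal Python sketch. The function names are made up for illustration, and the 8.5 inch page width is the one quoted above; nothing here comes from the PDF library itself.

```python
def effective_dpi(pixel_width, page_width_inches):
    """Return the resolution implied by an image width and a physical page width."""
    return pixel_width / page_width_inches


def expected_width(dpi, page_width_inches):
    """Return the pixel width a page of the given physical width should have."""
    return round(dpi * page_width_inches)


print(effective_dpi(1275, 8.5))   # 150.0
print(expected_width(300, 8.5))   # 2550
```

So a letter-width page rendered at a true 300 dpi should come out about 2550 pixels wide, twice the 1275 pixels observed.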
>> > I did discover one thing that may be different. Although I was unable
>> to successfully open the jpg files in Lview Pro, which is the image editor
>> program that I have used for years, I was able to successfully open them in
>> Windows Paint and also in a program named Paint.net that I occasionally
>> use. That was probably also true for the earlier version. I simply didn't
>> try it. Curiously, the jpg files seemed to be in reverse video when opened
>> in those paint programs.
>> > Don't spend time worrying about the jpg files. They add very little
>> benefit to the overall result. As far as I am concerned, you could suppress
>> the output of images from the individual pages, because they are of little
>> value.
>> > Amanda might be happy with the .txt files that appear to contain the
>> text from the pdf file in a plain text format on a page by page basis.
>> > Dick Baldwin
>> >
>> > On Sun, Jan 29, 2012 at 4:54 PM, Jamal Mazrui <empower at smart.net>
>> wrote:
>> >
>> > Hi Dick,
>> > With the PDF library I have, I do not see a way of adjusting the
>> format of JPG output, other than the DPI setting, unfortunately. Perhaps
>> the free Image Magick software could transform those files into something
>> more useful -- not sure.
>> >
>> > I think I may have found a way, however, to improve the reliability
>> of simply producing a TIF file for each whole page of the PDF. The library
>> has a function call for this that processes all pages at once. Memory
>> seems to be managed better than when iterating through each page of the PDF
>> separately, which I suspect is causing the crashes with PDFs that are not
>> relatively small in size.
>> >
>> > I just posted a utility that only does that task at 300 DPI. It
>> has the original PDF2Images name and is available at
>> >
>> > http://EmpowermentZone.com/pdf2images.zip
>> >
>> > Just unzip it to the same directory as PDF2Parts (it uses the same
>> PDF2Parts.dll).
>> >
>> > A minor annoyance is that this technique does not right justify
>> page numbers (the single function call mostly handles the names of
>> individual .tif files). So, the output files do not sort correctly in an
>> alphabetical directory listing. If files are sorted by time, however, the
>> right order is attained.
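One possible workaround for the sort order, sketched in Python under assumptions: the file names below (book_1.tif and so on) are hypothetical, since the thread does not say how the utility names its .tif files, and the helper names are invented for illustration. The idea is either to sort on the numeric page number or to zero-pad the numbers so a plain alphabetical listing works.

```python
import re


def page_number(file_name):
    """Extract the trailing page number from a name such as 'book_12.tif'."""
    match = re.search(r"(\d+)\.tif$", file_name, flags=re.IGNORECASE)
    return int(match.group(1)) if match else -1


def padded_name(file_name, width=4):
    """Return the name with its page number zero-padded so plain sorting works."""
    return re.sub(
        r"(\d+)(\.tif)$",
        lambda m: m.group(1).zfill(width) + m.group(2),
        file_name,
        flags=re.IGNORECASE,
    )


names = ["book_10.tif", "book_2.tif", "book_1.tif"]  # hypothetical names
print(sorted(names))                          # alphabetical: 1, 10, 2
print(sorted(names, key=page_number))         # numeric: 1, 2, 10
print(sorted(padded_name(n) for n in names))  # padded names sort correctly
```

Renaming the files with the padded names would make the alphabetical directory listing match page order without relying on file times.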
>> >
>> > Can you let me know how well this utility works? If I get it
>> working adequately, I will incorporate it into a single, coherent package.
>> >
>> > Jamal
>> >
>> >
>> >
>> > On 1/29/2012 4:47 PM, Richard Baldwin wrote:
>> >> Hi Jamal,
>> >> I ran the new version of the program for a relatively small pdf
>> file, which was one of the smallest chapters in the physics textbook. The
>> program stopped with an error on page 17 of about 24 pages. However, it did
>> produce a lot of output before stopping.
>> >> The tiff files that represent individual pages look good. If
>> possible, I would like to see if conversion to 300 dpi as opposed to 150
>> dpi would provide improved image quality.
>> >> The bmp and jpg files for the individual images on each page
>> suffer from the same problem discussed in previous posts. Mostly small
>> pieces of larger images. In addition, the jpg files appear to be corrupt.
>> They appear to suffer from some sort of synchronization problem that causes
>> them to consist mainly of vertical bars. However, it was possible for me to
>> correlate one of them to an actual image in the book. I suspect that these
>> are the images from the pdf file that are stored as raster images in the
>> pdf file.
>> >> Once you get the program to handle complete pdf files, I will
>> consider it superior to online conversion of pdf files to bitmap pages. If
>> you can fix the problem with the jpg files, that would be useful because
>> they contain images that a sighted assistant won't need to crop out of the
>> larger page images.
>> >> Thanks,
>> >> Dick Baldwin
>> >>
>> >> On Sun, Jan 29, 2012 at 11:21 AM, Jamal Mazrui <empower at smart.net>
>> wrote:
>> >>
>> >> Dick,
>> >> I just posted a revised and renamed version of my program,
>> which extracts both text and images. PDF2Parts is available at
>> >> http://EmpowermentZone.com/pdf2parts.zip
>> >>
>> >> Currently, it creates a .tif version of each PDF page at 150
>> DPI. Alternatively, I could make it save as .bmp or .jpg, and vary the
>> resolution. Would another image format or DPI work better for what you are
>> trying to do?
>> >>
>> >> Jamal
>> >>
>> >> P.S. The program seems to hang on large PDFs sometimes. I
>> have not figured out the pattern and debugged that yet.
>> >>
>> >>
>> >> On 1/28/2012 2:29 PM, Richard Baldwin wrote:
>> >>
>> >> I will be responding to questions and comments from
>> several different
>> >> individuals in this post, so I will refer to each person
>> by name.
>> >>
>> >> Maureen: I will be happy to send some files off list for
>> you to emboss and
>> >> evaluate if you would be interested in doing that. I would
>> be interested in
>> >> your feedback.
>> >>
>> >> Jamal: You wrote "In reviewing the documentation for the
>> PDF library I'm
>> >> using, I notice there is also the ability to save each
>> page as an image.
>> >> Would that be helpful?"
>> >>
>> >> That would be very helpful. I have generally concluded
>> (more on this in a
>> >> separate post) that the most practical way for a sighted
>> person to extract
>> >> images from a pdf file for a blind student is to deal with
>> each page as an
>> >> image file, crop, cut, copy, and paste. I have identified
>> a free website
>> >> that will convert a pdf file to a set of image files, but
>> the less often I
>> >> am required to download files from strange websites, the
>> happier I am. I
>> >> never know what may be riding those files into my
>> computer. Your
>> >> stand-alone command-line based program would make it
>> possible to make the
>> >> conversion locally. Please provide more information.
>> >>
>> >> Ben: You wrote "I have a question -- are you using the
>> most popular
>> >> university Physics textbook, whatever that may be?"
>> >>
>> >> Actually, I teach Computer Science and not physics. Amanda
>> is a Computer
>> >> Science student, and I am helping her in a required
>> physics course. Her
>> >> physics book is the only one that I know anything about.
>> However, I believe
>> >> this pdf-image issue applies to many college-level
>> textbooks, because many
>> >> blind college students probably receive their electronic
>> textbooks in pdf
>> >> format. Once again, however, the only one that I have any
>> personal
>> >> knowledge about is Amanda's physics book.
>> >>
>> >> I will send you a pdf file of one of the chapters from the
>> textbook off
>> >> list later today.
>> >>
>> >> Bente: You wrote "If we could stick with a text for more
>> than two years it
>> >> would be so helpful."
>> >>
>> >> I will simply say a loud AMEN to that. In my 18 years of
>> teaching, I have
>> >> never understood why community college instructors insist
>> on changing
>> >> textbooks so frequently, causing much more work for
>> themselves in the
>> >> process. I have gotten to the point that I tell my
>> students that the
>> >> textbook is for reference purposes only and the material
>> for the course is
>> >> published at http://www.dickbaldwin.com.
>> >>
>> >> Dick Baldwin
>> >>
>> >> On Sat, Jan 28, 2012 at 1:01 PM, Bente Casile<
>> bente at casilenc.com> wrote:
>> >>
>> >> Ben,
>> >>
>> >> My greatest wish for all the blind students out there
>> is that we in the
>> >> college system could have a repository of tactile
>> graphics for science and
>> >> math classes. If we could stick with a text for more
>> than two years it
>> >> would be so helpful. As someone who makes math
>> tactile graphics for our
>> >> students, I would love to see that happen. It would
>> allow us to get ahead
>> >> for students to benefit directly from the hard work of
>> others and not to
>> >> have to "re-invent" the wheel every time a new text
>> is adopted.
>> >>
>> >> Oh, and PS .. Austin is very nice..smiles
>> >>
>> >> Bente
>> >> Bente J. Casile
>> >> Math Learning Specialist
>> >> Wake Tech Community College
>> >> Raleigh NC
>> >>
>> >> -----Original Message-----
>> >> From: blindmath-bounces at nfbnet.org [mailto:blindmath-bounces at nfbnet.org]
>> >> On Behalf Of Ben Humphreys
>> >> Sent: Saturday, January 28, 2012 11:17 AM
>> >> To: Blind Math list for those interested in mathematics
>> >> Subject: Re: [Blindmath] Extracting bitmap images from
>> pdf files
>> >>
>> >> Hi Richard,
>> >>
>> >> As best I recall, it was a Microsoft Word file which
>> we typically
>> >> "saved as" HTML in order to get the graphics extracted
>> out in an
>> >> automated way. Some came out as GIF, others JPEG,
>> leading me to
>> >> believe that Word preserves the original file intact.
>> These were
>> >> .DOC, not .DOCX, so I don't believe they were really
>> ZIP files in
>> >> DOCX clothing.
>> >>
>> >> As my instructor routinely "pasted" in JPGs, GIFs, etc. from all around
>> from all around
>> >> the world into her Microsoft Word files, it's anyone's
>> guess why a
>> >> few got all broken up like that. Most remained intact.
>> >>
>> >> Part way through the class, I ended up having my
>> assistant extract by
>> >> hand the images as the automated way was too difficult
>> to distinguish
>> >> the garbage (i.e. little arrows and such) from the
>> meaningful calculus
>> >> graphs.
>> >>
>> >> I have a question -- are you using the most popular
>> university
>> >> Physics textbook, whatever that may be? If so, and we
>> get to the
>> >> bottom of this, we could conceivably have a repository
>> of labeled
>> >> graphics files so others wouldn't have to repeat this
>> step, and joy
>> >> of joys, I could take physics without moving to
>> Austin, :) This of
>> >> course is not to say Austin isn't a great place, it's
>> just that I
>> >> might have to move again when I want to take biology
>> or chemistry.
>> >>
>> >> As always, thanks for your continued enthusiasm.
>> >>
>> >> And as I said, you're welcome to send me a file or two
>> and we'll
>> >> throw our Acrobat Pro strategy at it, maybe even
>> consider how it
>> >> might be automated.
>> >>
>> >> Ben
>> >>
>> >> At 08:59 AM 1/28/2012, you wrote:
>> >>
>> >> But, no, I do not believe we were dealing with PDFs in
>> this case.
>> >>
>> >> Typically, when we have a PDF with a graphic, my
>> assistant draws a
>> >> box around it I think and saves it out separately.
>> I'm not clear on
>> >> the process but he did say it required Acrobat Pro and
>> once it's
>> >> extracted, it's easy to blow it up to fill the page
>> for easier
>> >> tactile understanding.
>> >>
>> >>
>> >> Hi Ben,
>> >>
>> >> I appreciate your frustration.
>> >>
>> >> Were the "30 itty bitty graphics files" that
>> apparently were small parts
>> >> of two actual graphs produced using Acrobat Pro,
>> or were you using some
>> >> different image extraction software during that
>> lost weekend?
>> >>
>> >> Thanks,
>> >> Dick Baldwin
>> >>
>> >> On Sat, Jan 28, 2012 at 5:55 AM, Ben Humphreys
>> >> <brh at opticinspiration.org> wrote:
>> >>
>> >> I suppose this procedure could work.
>> >>
>> >> But when it's this much effort to get to the
>> starting gate, while other
>> >> students are already moving forward and you're
>> falling behind, it's no
>> >>
>> >> fun,
>> >>
>> >> and the added time and complexity and
>> brainpower just takes all the
>> >> motivation out of you.
>> >>
>> >> I remember losing a whole weekend to the issue
>> of 30 itty bitty
>> >>
>> >> graphics
>> >>
>> >> files in a Calculus PDF. Having embossed
>> them, they were all told to
>> >>
>> >> "fit
>> >>
>> >> to page" and were thusly huge. I was thinking
>> they were all graphs and
>> >> problems to be interpreted and worked on and
>> understood, only to be
>> >>
>> >> told
>> >>
>> >> later that there were only two graphs and
>> having the benefit of a
>> >>
>> >> sighted
>> >>
>> >> person on Monday morning to finally tell me
>> that they were bits and
>> >>
>> >> pieces
>> >>
>> >> of the two relatively simple graphs.
>> >>
>> >> It's enough to make you want to be a Steve
>> Jobs and exit school
>> >> prematurely.
>> >>
>> >> Prof Baldwin, this is certainly not to say I
>> don't appreciate all your
>> >> efforts. In fact, if and when I ever need to
>> take physics, I am
>> >>
>> >> seriously
>> >>
>> >> considering relocating to Austin for a
>> semester.
>> >>
>> >> P.S. I do have Acrobat pro so if you can send
>> me the single page PDF in
>> >> question, we can attempt to extract as a
>> single image.
>> >>
>> >> Ben
>> >>
>> >>
>> >> At 02:56 PM 1/27/2012, you wrote:
>> >>
>> >> In a previous post I wrote:
>> >>
>> >> "By the way, I don't know how a blind
>> person would carry out the
>> >>
>> >> second
>> >> of
>> >>
>> >> the following two steps in John's
>> procedure:
>> >>
>> >> * import the PDF into IVEO Creator Pro.
>> >> * Check the PDF to find which pages have
>> images of interest and emboss
>> >> those
>> >> pages.
>> >>
>> >> It seems that checking the pdf to find
>> which pages have images would
>> >>
>> >> be
>> >>
>> >> similar to checking a screen shot of a
>> page to find and crop the
>> >>
>> >> image.
>> >> It
>> >>
>> >> seems that you would need to be able to
>> see the pdf on the IVEO screen
>> >>
>> >> to
>> >>
>> >> know if it contains an image. I am working
>> with pdf files containing
>> >> anywhere between 30 and 80 pages.
>> Embossing every page in order to
>> >> identify
>> >> the pages that contain images would not be
>> practical."
>> >>
>> >> I have learned how a blind person could
>> find the pages containing the
>> >> images in a pdf file without having to see
>> the screen. Here is one
>> >> procedure for doing that.
>> >>
>> >> When you import a pdf file into Creator Pro, a set of SVG files is
>> >> automatically created in the folder that contains the pdf file.
>> >> There is one SVG file for each page in the pdf file. The file names
>> >> indicate the pdf page number except that pages in a pdf file are
>> >> typically numbered beginning with 1 while the file numbers produced
>> >> by Creator Pro begin with 0. Thus, file number 0 will probably
>> >> correspond to page 1 in the pdf document.
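The off-by-one mapping described above can be written down as a small Python sketch. The function name and the example page numbers are invented for illustration, and it assumes (as the text says is only probable) that file number 0 corresponds to pdf page 1.

```python
def file_number_for_page(pdf_page):
    """Map a 1-based pdf page number to the 0-based SVG file number."""
    if pdf_page < 1:
        raise ValueError("pdf page numbers start at 1")
    return pdf_page - 1


pages_of_interest = [1, 3, 17]  # example page numbers noted while reading
print([file_number_for_page(p) for p in pages_of_interest])  # [0, 2, 16]
```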
>> >>
>> >> Read the pdf file in your preferred pdf
>> file reader. If from the pdf
>> >>
>> >> text,
>> >>
>> >> you can determine which pages in the pdf
>> file contain images of
>> >>
>> >> interest,
>> >>
>> >> you can record those page numbers using
>> whatever method you use to
>> >>
>> >> record
>> >>
>> >> information of that sort.
>> >>
>> >> Then you can import the pdf file into
>> Creator Pro, producing the set
>> >>
>> >> of
>> >>
>> >> SVG
>> >> files described above. Then you can open
>> the SVG files that contain
>> >> interesting images in your IVEO viewer
>> software, emboss the pages, and
>> >> proceed as John explained in an earlier
>> post.
>> >>
>> >> Dick Baldwin
>> >>
>> >> On Fri, Jan 27, 2012 at 12:47 PM, Richard Baldwin
>> >> <baldwin at dickbaldwin.com> wrote:
>> >>
>> >> Michael wrote " There is one option I
>> am aware of for a blind person
>> >>
>> >> to
>> >>
>> >> do this independently, IVEO like John
>> suggested,"
>> >>
>> >> I may be wrong, but I didn't get the
>> idea that John's solution will
>> >> produce an output bitmap file - only
>> an embossed image.
>> >>
>> >> I may be wrong again, but as near as I
>> can tell, IVEO doesn't do any
>> >>
>> >> image
>> >>
>> >> enhancement prior to embossing the
>> image. If I am wrong on these
>> >>
>> >> points,
>> >>
>> >> John will probably come online and set
>> the record straight.
>> >>
>> >> IVEO seems to simply convert the bitmap image to gray scale and
>> >> emboss the gray scale. While gray scale embossing is okay for some
>> >> images (especially black and white images), it is definitely not
>> >> the best option for many images. After all, if you convert 16
>> >> million colors to four levels of gray scale, each level of gray
>> >> scale represents 4 million different colors. Pixels belonging to
>> >> each set of 4 million colors will not be distinguishable in the
>> >> gray scale representation.
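The counting argument above can be illustrated with a short Python sketch. It is only an illustration under assumptions: how IVEO actually converts color to gray is not stated in this thread, so the sketch uses the common 0.299/0.587/0.114 luma weighting and the four-level example from the text, and the function name is made up.

```python
TOTAL_COLORS = 2 ** 24          # 24-bit RGB: 16,777,216 colors
GRAY_LEVELS = 4                 # the four-level example from the text


def gray_level(r, g, b, levels=GRAY_LEVELS):
    """Quantize an 8-bit RGB color to one of `levels` gray levels (0 = darkest)."""
    # Integer form of the common 0.299/0.587/0.114 luma weighting (an assumption).
    luma = (299 * r + 587 * g + 114 * b) // 1000   # 0..255
    return luma * levels // 256


print(TOTAL_COLORS // GRAY_LEVELS)   # 4194304 colors per level on average
print(gray_level(255, 0, 0))         # pure red   -> 1
print(gray_level(0, 128, 0))         # dark green -> 1 (same level as red)
print(gray_level(255, 255, 255))     # white      -> 3
```

Pure red and dark green look nothing alike on screen, yet in this sketch they land on the same gray level, which is the kind of lost distinction that enhancement at the full-color stage is meant to avoid.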
>> >>
>> >> My objective is to gain access to
>> full-color bitmap images so that I
>> >>
>> >> can
>> >>
>> >> enhance the image for embossing prior
>> to throwing away all of the
>> >>
>> >> color
>> >>
>> >> information.
>> >>
>> >> Embossed versions of bitmap images are
>> often very difficult to
>> >>
>> >> understand,
>> >>
>> >> even with a decent description. I
>> believe we need to do everything
>> >> reasonable to improve the
>> understandability of embossed bitmap
>> >>
>> >> images.
>> >>
>> >> In
>> >>
>> >> some cases, image enhancement
>> techniques at the full-color stage can
>> >>
>> >> be
>> >>
>> >> used to provide those improvements.
>> >>
>> >> So, my quest continues, hopefully
>> without having to pay $445.00 for
>> >> Acrobat Pro, just to get access to the
>> images.
>> >>
>> >> The fallback position, of course, is
>> to use screen shots and an
>> >>
>> >> image
>> >>
>> >> editor program to crop out the
>> individual images, but that approach
>> >>
>> >> is
>> >>
>> >> not
>> >>
>> >> possible for a blind person to use.
>> You can't crop an image out of a
>> >>
>> >> screen
>> >>
>> >> shot unless you can see the image.
>> >>
>> >> By the way, I don't know how a blind
>> person would carry out the
>> >>
>> >> second
>> >>
>> >> of
>> >>
>> >> the following two steps in John's
>> procedure:
>> >>
>> >> * import the PDF into IVEO Creator Pro.
>> >> * Check the PDF to find which pages
>> have images of interest and
>> >>
>> >> emboss
>> >>
>> >> those
>> >> pages.
>> >>
>> >> It seems that checking the pdf to find
>> which pages have images would
>> >>
>> >> be
>> >>
>> >> similar to checking a screen shot of a
>> page to find and crop the
>> >>
>> >> image.
>> >>
>> >> It
>> >>
>> >> seems that you would need to be able
>> to see the pdf on the IVEO
>> >>
>> >> screen
>> >>
>> >> to
>> >>
>> >> know if it contains an image. I am
>> working with pdf files containing
>> >> anywhere between 30 and 80 pages.
>> Embossing every page in order to
>> >>
>> >> identify
>> >>
>> >> the pages that contain images would
>> not be practical.
>> >>
>> >> Dick Baldwin
>> >>
>> >>
>> >> On Fri, Jan 27, 2012 at 11:48 AM,
>> Richard Baldwin<
>> >>
>> >> baldwin at dickbaldwin.com
>> >>
>> >> wrote:
>> >> Amanda and others,
>> >>
>> >> I have contacted Adobe technical
>> support. Their solution to the
>> >>
>> >> problem
>> >>
>> >> is to purchase Acrobat Pro for
>> $445.00. The tech support rep told
>> >>
>> >> me
>> >>
>> >> that
>> >>
>> >> their program will extract the
>> pictures intact as separate bitmap
>> >>
>> >> files.
>> >>
>> >> Dick Baldwin
>> >>
>> >>
>> >> On Fri, Jan 27, 2012 at 10:44 AM,
>> Michael Whapples
>> >>
>> >> <mwhapples at aim.com
>> >>
>> >> wrote:
>> >>
>> >> Hello,
>> >> From what you are describing, my feeling is that the
>> >> diagrams/images in the PDF in question are created from a number
>> >> of drawing elements rather than a single image object. I'm not an
>> >> expert on PDF, but I think you could think of it like the
>> >> difference between a bitmap being a single element (I think PDF
>> >> has a way to specify the start of a stream object like a bitmap)
>> >> and an SVG being formed from lots of elements like lines and
>> >> shapes (I think in PDF the lines and such like can be created with
>> >> basic PDF drawing facilities, so they are not in a separate
>> >> object). When the image is formed from lots of elements, it may be
>> >> hard for the software to know what makes up a given diagram in the
>> >> book/document; it just lays it out as specified and you work out
>> >> what's related. I think one way to tell whether you have this sort
>> >> of image is to see if NVDA will read some of the text labels of
>> >> the image. If it does, then it's not a pure bitmap (you probably
>> >> could use the read out loud function of Adobe Reader as well).
>> >> Therefore I imagine that without clever recognition algorithms you
>> >> are unlikely to get something which will extract it as you want.
>> >>
>> >> There is one option I am aware of for a blind person to do this
>> >> independently: IVEO, as John suggested. However, IVEO isn't a
>> >> cheap option, and how much there is to be done would determine
>> >> whether it's worth the money if providing accessible diagrams from
>> >> PDF was its only use. IVEO does not require a Tiger printer; swell
>> >> paper would work, and other embossers may as well (the outputting
>> >> from IVEO is the question, as I think it may only output to
>> >> devices appearing as standard printers). Interestingly, the IVEO
>> >> route again requires a human to make the decision on what forms
>> >> the diagram.
>> >>
>> >> Michael Whapples
>> >>
>> >> -----Original Message-----
>> From: Richard Baldwin
>> >> Sent: Friday, January 27, 2012
>> 3:28 PM
>> >> To: Jamal Mazrui
>> >> Cc: Blind Math list for those
>> interested in mathematics
>> >> Subject: Re: [Blindmath]
>> Extracting bitmap images from pdf files
>> >>
>> >>
>> >> Hi Jamal,
>> >>
>> >> It is a great program, easy to
>> use, and probably totally
>> >>
>> >> accessible. I
>> >>
>> >> particularly like the fact
>> that the program doesn't require a
>> >>
>> >> windows
>> >>
>> >> installation. The output data
>> is well organized and including the
>> >>
>> >> page
>> >>
>> >> numbers in the bmp file names
>> is a great help in analyzing them.
>> >>
>> >> Unfortunately, the output
>> produced by the program suffers from the
>> >>
>> >> same
>> >>
>> >> issues that I have encountered
>> with all of the other image
>> >>
>> >> extractor
>> >>
>> >> programs that I have tried. A
>> few of the images come out intact.
>> >>
>> >> Most
>> >>
>> >> of
>> >>
>> >> the images don't come out
>> intact.
>> >>
>> >> For example, page three of one of the pdf files that I tested has
>> >> a single image of a battery. It is the same image that I enhanced
>> >> and posted in an earlier post. Your program produced 54 bmp files
>> >> for that page. A few of them were icons such as arrows,
>> >> exclamation marks, etc. The remaining bmp files appear to be very
>> >> small pieces of the image of the battery. By the way, I got the
>> >> earlier image of the battery by taking a screen shot of the page
>> >> and using an image editing program to crop out the battery image.
>> >> None of the image extraction programs that I have tested extract
>> >> the image intact.
>> >>
>> >> I don't know anything at all
>> about the internal structure of pdf
>> >>
>> >> files,
>> >>
>> >> and
>> >> this behavior of breaking an
>> image into many small pieces may
>> >>
>> >> depend
>> >>
>> >> on
>> >>
>> >> how
>> >> the file is constructed in the
>> first place. In any event, my
>> >>
>> >> immediate
>> >>
>> >> problem has to do with a
>> specific set of pdf files that are the
>> >>
>> >> chapters
>> >>
>> >> from a specific physics book,
>> so this program doesn't solve my
>> >>
>> >> problem.
>> >>
>> >> Thanks for offering the
>> program.
>> >> Dick Baldwin
>> >>
>> >> On Fri, Jan 27, 2012 at 5:18
>> AM, Jamal Mazrui<empower at smart.net>
>> >>
>> >> wrote:
>> >>
>> >> In an attempt to facilitate a
>> free, non-web dependent solution, I
>> >>
>> >> have
>> >>
>> >> written a Windows
>> console-mode utility called PDF2Images, built
>> >>
>> >> with
>> >>
>> >> PowerBASIC and a PDF
>> library. The distribution archive,
>> >>
>> >> including
>> >>
>> >> documentation and source
>> code, is available at
>> >>
>> >>
>> >> http://empowermentzone.com/pdf2images.zip
>> >>
>> >>
>> >> I am interested in any
>> feedback on how well it works compared to
>> >>
>> >> other
>> >>
>> >> approaches.
>> >>
>> >> Jamal
>> >>
>> >>
>> >>
>> >>
>> >> --
>> >> Richard G. Baldwin (Dick Baldwin)
>> >> Home of Baldwin's on-line Java Tutorials
>> >> http://www.DickBaldwin.com
>> >>
>> >> Professor of Computer Information Technology
>> >> Austin Community College
>> >> (512) 223-4758
>> >> mailto:Baldwin at DickBaldwin.com
>> >> http://www.austincc.edu/baldwin/
>> >>
>> >> _______________________________________________
>> >> Blindmath mailing list
>> >> Blindmath at nfbnet.org
>> >> http://nfbnet.org/mailman/listinfo/blindmath_nfbnet.org
>> >> To unsubscribe, change your list options or get your account info
>> >> for Blindmath:
>> >> http://nfbnet.org/mailman/options/blindmath_nfbnet.org/mwhapples%40aim.com
>> >>
>> >> ______________________________
>> ******_________________
>> >> Blindmath mailing list
>> >> Blindmath at nfbnet.org
>> >>
>> >> http://nfbnet.org/mailman/******
>> listinfo/blindmath_nfbnet.org<http://nfbnet.org/mailman/****listinfo/blindmath_nfbnet.org>
>> <
>> >> http://nfbnet.or
>> >> g/mailman/**listinfo/blindmath**_nfbnet.org<http://blindmath_nfbnet.org>
>> >
>> >> <**http://nfbnet.org/mailman/****listinfo/blindmath_nfbnet.org<http://nfbnet.org/mailman/**listinfo/blindmath_nfbnet.org>
>> **<
>> >> http://nfbnet.o
>> >> rg/mailman/listinfo/blindmath_**nfbnet.org<http://blindmath_nfbnet.org>
>> >
>> >>
>> >> To unsubscribe, change your
>> list options or get your account info
>> >>
>> >> for
>> >>
>> >> Blindmath:
>> >>
>> >> http://nfbnet.org/mailman/******
>> options/blindmath_nfbnet.org/******<http://nfbnet.org/mailman/****options/blindmath_nfbnet.org/****>
>> <
>> >> http://nfbne
>> >> t.org/mailman/**options/**blindmath_nfbnet.org/**<http://t.org/mailman/**options/blindmath_nfbnet.org/**>
>> >
>> >>
>> >> baldwin%40dickbaldwin.com<**
>> http**://nfbnet.org/mailman/**options/**<http://nfbnet.org/mailman/options/**>
>> >>
>> >> blindmath_nfbnet.org/baldwin%****40dickbaldwin.com<
>> >> http://nfbnet.org/mailman/o
>> >> ptions/blindmath_nfbnet.org/**
>> baldwin%40dickbaldwin.com<http://blindmath_nfbnet.org/baldwin%40dickbaldwin.com>
>> >
>> >>
>> >>
>> >>
--
Richard G. Baldwin (Dick Baldwin)
Home of Baldwin's on-line Java Tutorials
http://www.DickBaldwin.com
Professor of Computer Information Technology
Austin Community College
(512) 223-4758
mailto:Baldwin at DickBaldwin.com
http://www.austincc.edu/baldwin/