Reading from a PDF

paragadosm · December 18, 2009, 8:04pm

I’m looking for a way to read text from a PDF document on Windows.
The catch is that this code will be packaged as an ocra executable and
run from a PC that does not have Ruby installed. I’ve searched Google,
and everything I read talks about needing to modify something
within the /Windows/System32 folder. I won’t be able to make this
modification on the PCs that this will run on (unless ocra does
something with it when it packages).

Are there any alternatives other than the PDFToolkit? I’m assuming the
toolkit will not work unless you make those /System32 changes?

paragadosm · December 18, 2009, 8:21pm

On Fri, Dec 18, 2009 at 2:04 PM, Max P.
[email protected] wrote:

Are there any alternatives other than the PDFToolkit? I’m assuming the
toolkit will not work unless you make those /System32 changes?

PDF::Reader is pure ruby but quite low-level.
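
For anyone landing here later, a minimal sketch of pulling text out page by page. It assumes a pdf-reader release that exposes `pages` and `page.text`; as far as I know, older releases used a callback-based receiver interface instead. The helper names `collect_page_text` and `read_pdf_text` are made up for illustration and are not part of the gem.

```ruby
# Join the text of every page in a reader-like object.
# `reader` is anything that responds to #pages, where each page
# responds to #text (assumed to match PDF::Reader in newer releases).
def collect_page_text(reader)
  reader.pages.map { |page| page.text.to_s.strip }.join("\n\n")
end

# Hypothetical helper: open a PDF on disk and return its text.
# The gem is required lazily so the file still loads without it;
# install it first with `gem install pdf-reader`.
def read_pdf_text(path)
  require "pdf/reader"
  collect_page_text(PDF::Reader.new(path))
end

# Usage (path is a placeholder):
#   puts read_pdf_text("C:/docs/report.pdf")
```

Splitting out `collect_page_text` keeps the text-joining logic testable without a real PDF. Whether ocra picks up a lazily required gem when it builds the executable is something I have not verified, so check that on your own setup.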

paragadosm · December 18, 2009, 8:27pm

Gregory B. wrote:

On Fri, Dec 18, 2009 at 2:04 PM, Max P.
[email protected] wrote:

Are there any alternatives other than the PDFToolkit? I’m assuming the
toolkit will not work unless you make those /System32 changes?

PDF::Reader is pure ruby but quite low-level.

GitHub - yob/pdf-reader: The PDF::Reader library implements a PDF parser conforming as much as possible to the PDF specification from Adobe.

Thanks for the reply. This looks like what I’m looking for. Man, it’s
amazing how different the results are when you search Google for
“reading from a pdf in ruby” versus “pdf reader in ruby” :o)

thanks again!

paragadosm · September 8, 2010, 9:39pm

Hey Max P. (chad locke)

Did you ever get this to work?

-bobsmyph

paragadosm · September 8, 2010, 10:18pm

http://www.darknet.org.uk/2009/10/origami-parse-analyze-forge-pdf-documents/

origami is a Ruby framework designed to parse, analyze, and forge PDF
documents. This is NOT a PDF rendering library. It aims at providing a
scripting tool to generate and analyze malicious PDF files. As well, it
can be used to create on-the-fly customized PDFs, or to inject (evil)
code into already existing documents.