Hello,
I’m looking for libraries to do text extraction from MS Office and PDF
file formats. Also looking for libraries to do HTML rendering of
documents in the same formats. I know of a couple of commercial
libraries from Oracle and Autonomy, but they only have C and/or Java
APIs. I also found the POI Ruby Bindings project.
Are there other open source alternatives, and/or alternatives with Ruby
bindings?
Thanks,
Vitali