I just joined the group and I have a question about a problem I’m facing.
I’m still learning Ruby on Rails, and I now have a task to parse
Microsoft Word documents and store their content in a database.
Do you have any suggestions on how to do it?
FYI, I develop under a Unix environment, so I can’t use win32ole
there, CMIIW.
I have also searched the internet about this, but all I found is that I
need to use either JRuby combined with Apache POI, or win32ole. As far
as I know, to use JRuby I would need to create the Rails project with
JRuby as well, but unfortunately I already created the project with
plain Ruby.
So I don’t know what to do anymore. Does anybody have a clue?
On Mar 16, 2011, at 2:51 PM, Hafiz Badrie Lubis wrote:
I have also searched the internet about this, but all I found is that I
need to use either JRuby combined with Apache POI, or win32ole. As far
as I know, to use JRuby I would need to create the Rails project with
JRuby as well, but unfortunately I already created the project with
plain Ruby.
So I don’t know what to do anymore. Does anybody have a clue?
I did a project in PHP quite a few years ago, and I used some
venerable Unix command-line converters to do this. I stored the files
as is, then used these converters to rip out their text and stored that
in the database for searching. They aren’t perfect, but they do a good
enough job for search results.
// These translators all write to stdout, which means that shell_exec()
// will return their text value.
$translators = array(
    'pdf' => '/usr/local/bin/pdftotext ./pdf/%s.pdf -',
    'ppt' => '/usr/local/bin/catppt -d ascii ./ppt/%s.ppt',
    'xls' => '/usr/local/bin/xls2csv -d ascii ./xls/%s.xls',
    'doc' => '/usr/local/bin/catdoc -d ascii ./doc/%s.doc'
);
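Since the original question was about Rails, here is a rough Ruby sketch of the same idea. It assumes the converters are installed at the same paths as in the PHP array above. The names TRANSLATORS, translator_command and extract_text are made up for illustration, and unlike the PHP version it takes a full file path rather than filling in a per-type directory.

```ruby
require "open3"

# Command templates keyed by file extension. The paths mirror the PHP array
# above and are assumptions about where the tools are installed.
TRANSLATORS = {
  "pdf" => ["/usr/local/bin/pdftotext", "%s", "-"],
  "ppt" => ["/usr/local/bin/catppt", "-d", "ascii", "%s"],
  "xls" => ["/usr/local/bin/xls2csv", "-d", "ascii", "%s"],
  "doc" => ["/usr/local/bin/catdoc", "-d", "ascii", "%s"]
}.freeze

# Build the argument list for a given file, or raise if the type is unknown.
def translator_command(path)
  ext = File.extname(path).delete_prefix(".").downcase
  template = TRANSLATORS.fetch(ext) { raise ArgumentError, "no translator for .#{ext}" }
  template.map { |arg| arg == "%s" ? path : arg }
end

# Run the converter without going through a shell and return its stdout.
def extract_text(path)
  stdout, status = Open3.capture2(*translator_command(path))
  raise "converter failed for #{path}" unless status.success?
  stdout
end
```

From a Rails model or controller you could then call extract_text on the uploaded file's path and save the returned string in a text column. Passing the command as an array keeps odd file names from being interpreted by the shell.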
On Mar 16, 2011, at 8:06 PM, Hafiz Badrie Lubis wrote:
To make a Rails project work together with JRuby code.
It has nothing whatsoever to do with JRuby. You can run Java apps from
Ruby exactly like any other command-line process. I don’t know if POI is
just a library, or has a full app utility as well. If it’s just a lib,
you’d have to write the program, probably a half-dozen lines of Java.
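To illustrate running a Java program from plain Ruby, here is a minimal sketch. It assumes you have written and packaged that small Java program yourself. The jar name lib/word_extract.jar is hypothetical, as are the helper names java_command and run_and_capture. It also assumes the program prints the document's text to stdout.

```ruby
require "open3"

# Hypothetical jar: a small Java program you would write yourself around POI
# that prints a document's text to stdout.
EXTRACTOR_JAR = "lib/word_extract.jar"

# Build the argument list for launching the Java program on one document.
def java_command(doc_path, jar: EXTRACTOR_JAR)
  ["java", "-jar", jar, doc_path]
end

# Run any command as a child process, returning stdout and raising with
# the process's stderr if it exits with a non-zero status.
def run_and_capture(command)
  stdout, stderr, status = Open3.capture3(*command)
  raise "command failed: #{stderr.strip}" unless status.success?
  stdout
end
```

Calling run_and_capture(java_command("report.doc")) would then return the text from an ordinary Ruby process, so the Rails project does not need to run on JRuby.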