Forum: Ruby on Rails Importing / Parsing Large Excel Files?

AN@S (Guest)
on 2008-10-16 23:03
(Received via mailing list)
Hello,

I'm working on a project where a client has large Excel files
(60,000+ records per file) and needs an application to import them
into a database so the data can be used for reporting, calculations,
and so on.

I know about the available libraries:

http://raa.ruby-lang.org/project/parseexcel/
http://rubyforge.org/projects/spreadsheet/
http://rubyforge.org/projects/roo/

I've used parseexcel before, but only for small files. The problem is that
the parseexcel and spreadsheet libraries have a reputation for not being
able to handle large Excel files (I don't know about 'roo').
The guy here (http://blade.nagaokaut.ac.jp/cgi-bin/scat.rb/ruby/ruby-talk/223230)
created a kind of bridge between Ruby and Java so he could use
'JavaExcelAPI', and it worked fine for him!

Has anyone tried importing large Excel files with Ruby? Do I really
need to use a Java library through a bridge to get the job done?
Personally, I would rather hire a Java developer to write the Excel
import module than build a bridge. Correct me if I'm wrong.

What do you suggest? Is there a good Ruby library for this that I
have never heard of? Or would you suggest a workaround such as
parsing the Excel file with 'parseexcel' in chunks of, say, 500
records at a time?
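
The "500 records at a time" idea can be sketched with plain Ruby, independent of whichever parser ends up being used. This is only a sketch under assumptions: import_in_batches is a hypothetical helper, the rows come from a stand-in Enumerator rather than a real spreadsheet, and the block is where a real import would write each batch to the database. It only batches the work after rows have been read; whether a given parser can itself read a file in chunks is not something this shows.

```ruby
# Hypothetical helper: hands rows to the block in fixed-size batches and
# returns the number of rows processed. `rows` is any Enumerable of rows.
def import_in_batches(rows, batch_size = 500)
  imported = 0
  rows.each_slice(batch_size) do |batch|
    yield batch              # e.g. insert the batch inside one transaction
    imported += batch.size
  end
  imported
end

# Stand-in data: an Enumerator that produces 60,000 rows one at a time.
rows = Enumerator.new do |y|
  1.upto(60_000) { |i| y << [i, "record #{i}"] }
end

batches = 0
total = import_in_batches(rows) { |_batch| batches += 1 }
puts "#{total} rows in #{batches} batches"   # 60000 rows in 120 batches
```

Keeping each batch in its own transaction would mean a failure part-way through only loses the current 500 rows rather than the whole file.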

I'm kinda lost and I need your help

Regards
Roy P. (Guest)
on 2008-10-16 23:46
(Received via mailing list)
Other options include Ruby's WIN32OLE and DBI libraries. For the
former, you'll have to be running on a machine with Excel installed. You'd
use the latter on a Windows machine to read the spreadsheets in via
ODBC. Your connect string would be *something* like:

connection_string = "dbi:ADO:" +
                    "Provider=MSDASQL;" +
                    "Persist Security Info=False;" +
                    "Extended Properties=\"dbq=c:/path/to/spreadsheet.xls\";"

HTH,

-Roy
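
For importing several files, the connect string above could be built per spreadsheet. A minimal sketch, assuming the string in the post is roughly right for the driver in use: excel_dsn is a hypothetical helper, and the commented-out DBI usage is untested, needs Windows plus the dbi gem, and assumes a worksheet named Sheet1.

```ruby
# Hypothetical helper: builds the connect string from the post above
# for a given .xls path.
def excel_dsn(path)
  "dbi:ADO:" \
    "Provider=MSDASQL;" \
    "Persist Security Info=False;" \
    "Extended Properties=\"dbq=#{path}\";"
end

puts excel_dsn("c:/path/to/spreadsheet.xls")

# Untested usage sketch (Windows only; the sheet name is an assumption):
#   require "dbi"
#   DBI.connect(excel_dsn("c:/path/to/spreadsheet.xls")) do |dbh|
#     dbh.select_all("SELECT * FROM [Sheet1$]") { |row| p row }
#   end
```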
AN@S (Guest)
on 2008-10-17 13:08
(Received via mailing list)
I'm running a Linux machine, so I think the WIN32OLE and DBI libraries are
not an option.
Jean-Marc (M2i3.com) (Guest)
on 2008-10-21 16:04
(Received via mailing list)
Actually, being on Linux does not rule out WIN32OLE and DBI if you use
Wine.

You would still need to install all the supporting software (in this case,
Excel).
Thomas P. (Guest)
on 2008-12-10 16:45
(Received via mailing list)
Hello,

On 16 Oct., 20:02, "AN@S" <removed_email_address@domain.invalid> wrote:
> Hello,
>
> I'm running into a project where a client has large Excel files
> (60.000+ records per file) and he needs an application to import them
> into a database to use this data for useful operations (reporting,
> calculations .. etc).
>
> I know about the available libraries:
>
> http://raa.ruby-lang.org/project/parseexcel/http:/...

I'm the author of the roo gem. Roo can handle spreadsheet files of that
size without problems, but it may be very slow. Try it and see whether
it is fast enough for your purposes.

-Thomas
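
One way to check "fast enough" before committing is to time a sample of rows and extrapolate. This is a sketch under assumptions, not roo's own benchmark: time_rows is a hypothetical helper that only needs an object responding to first_row, last_row and row(n), which as far as I know is the interface a roo spreadsheet offers, and FakeSheet is a stand-in so the example runs without the gem.

```ruby
require "benchmark"

# Hypothetical helper: reads up to `sample` rows from a sheet-like object
# and returns [rows_read, seconds_taken].
def time_rows(sheet, sample = 1_000)
  last = [sheet.last_row, sheet.first_row + sample - 1].min
  count = 0
  seconds = Benchmark.realtime do
    sheet.first_row.upto(last) do |n|
      sheet.row(n)
      count += 1
    end
  end
  [count, seconds]
end

# Stand-in for a real spreadsheet object (assumption: 1-based row numbers).
FakeSheet = Struct.new(:cells) do
  def first_row; 1; end
  def last_row; cells.size; end
  def row(n); cells[n - 1]; end
end

sheet = FakeSheet.new(Array.new(60_000) { |i| [i, "record #{i}"] })
count, seconds = time_rows(sheet)
puts "read #{count} rows in #{format('%.4f', seconds)}s"
```

With a real file, multiplying the sample time by (total rows / sample size) gives a rough estimate for the full import, assuming read time grows about linearly.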
Iain A. (Guest)
on 2008-12-10 18:57
(Received via mailing list)
If you decide to use roo, you can try running the processing task in
the background.

BackgrounDRb, perhaps?
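
The general shape of a background import, sketched with a plain Ruby Thread standing in for a real background worker such as BackgrounDRb (none of that library's API is used here). start_import and the status Hash are hypothetical names; a real job would write each row to the database where the comment indicates.

```ruby
# Hypothetical job runner: processes rows in a separate thread and exposes
# progress through a Hash the caller can poll.
def start_import(rows)
  status = { state: :running, done: 0 }
  thread = Thread.new do
    rows.each do |_row|
      # A real job would save the row to the database here.
      status[:done] += 1
    end
    status[:state] = :finished
  end
  [thread, status]
end

rows = (1..60_000).map { |i| [i, "record #{i}"] }
thread, status = start_import(rows)

# The caller is free to do other work here, polling `status` as needed.
thread.join
puts "state: #{status[:state]}, rows: #{status[:done]}"
```

In a Rails app, the upload request would only start the job and return, and a separate page or request would report the progress.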