Forum: Ruby on Rails
Topic: Fast Bulk Inserts?

Steve Sloan (Guest)
on 2005-12-15 20:34
(Received via mailing list)
I need to insert ~100k rows into the database, but using
"Table.new(...).save" is painfully slow (I notice in the logs that it's
wrapping each insert in a transaction -- that can't be fast).

Any suggestions on how I can improve the speed of bulk inserts like
this?

Thanks ...

-- Steve
Pat Maddox (pergesu)
on 2005-12-15 20:37
(Received via mailing list)
I don't know if this is the best option, but based on the fact that it
wraps each insert in a transaction, you could just start a
transaction, insert all the rows, and then commit.  Is that going to
be significantly faster?  I'm not sure.

Pat
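
A minimal sketch of what Pat is suggesting, assuming an ActiveRecord model
named Table and an array of attribute hashes named rows (both placeholder
names, not from the original post):

  # One transaction around all the saves, so the database commits once
  # instead of once per row.
  Table.transaction do
    rows.each do |attrs|
      Table.new(attrs).save
    end
  end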
Obie Fernandez (Guest)
on 2005-12-15 20:43
(Received via mailing list)
On 12/15/05, Pat Maddox <pergesu@gmail.com> wrote:
> I don't know if this is the best option, but based on the fact that it
> wraps each insert in a transaction, you could just start a
> transaction, insert all the rows, and then commit.  Is that going to
> be significantly faster?  I'm not sure.
>
> Pat

+1 It will be significantly faster in my experience. I used a REXML
and ActiveRecord Ruby script to do a mass import of XML data into one
of my app's tables back in June. Not necessarily the fastest
performance, but quick and easy to write. Probably faster now since AR
has had some performance optimizations since then.
Steve Sloan (Guest)
on 2005-12-15 23:49
(Received via mailing list)
Pat Maddox wrote:
> I don't know if this is the best option, but based on the fact that it
> wraps each insert in a transaction, you could just start a
> transaction, insert all the rows, and then commit.  Is that going to
> be significantly faster?  I'm not sure.

Yup, had the same thought, and after some benchmarking, wrapping the
whole thing in a single transaction gives about a 22% speed increase.
However, I'm still only (?) getting about 186 rows/sec insert velocity
on MySQL/InnoDB.  Any other ideas?

Also, I'm wondering if this method will work on DBs that don't support
transactions, e.g. MySQL/MyISAM.  Or are we not bothering to support
those silly people? ;)

-- Steve
Andreas S. (andreas)
on 2005-12-16 00:05
Steve Sloan wrote:
> Pat Maddox wrote:
>> I don't know if this is the best option, but based on the fact that it
>> wraps each insert in a transaction, you could just start a
>> transaction, insert all the rows, and then commit.  Is that going to
>> be significantly faster?  I'm not sure.
>
> Yup, had the same thought, and after some benchmarking, wrapping the
> whole thing in a single transaction gives about a 22% speed increase.
> However, I'm still only (?) getting about 186 rows/sec insert velocity
> on MySQL/InnoDB.  Any other ideas?

Try disabling indexes.
Steve Sloan (Guest)
on 2005-12-16 00:17
(Received via mailing list)
Andreas S. wrote:
>> However, I'm still only (?) getting about 186 rows/sec insert velocity on MySQL/InnoDB.
>> Any other ideas?
>
> Try disabling indexes.

Uh, how?  Short of dropping/adding them, that is.

-- Steve
Andreas S. (andreas)
on 2005-12-16 00:36
Steve Sloan wrote:
> Andreas S. wrote:
>>> However, I'm still only (?) getting about 186 rows/sec insert velocity on MySQL/InnoDB.
>>> Any other ideas?
>>
>> Try disabling indexes.
>
> Uh, how?  Short of dropping/adding them, that is.

MySQL has a function for it. I can't tell you off-hand what it's called;
you'll have to look it up in the manual.
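
For what it's worth, the statement Andreas probably has in mind is MySQL's
ALTER TABLE ... DISABLE KEYS. Note that it only defers maintenance of
non-unique indexes, and only on MyISAM tables, so it may not help much with
InnoDB. A sketch, issued through the ActiveRecord connection (the table name
is a placeholder):

  conn = ActiveRecord::Base.connection
  conn.execute("ALTER TABLE my_table DISABLE KEYS")
  # ... perform the bulk insert here ...
  conn.execute("ALTER TABLE my_table ENABLE KEYS")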
Jason Edgecombe (Guest)
on 2005-12-16 00:56
(Received via mailing list)
Steve Sloan wrote:

> Andreas S. wrote:
>
>>> However, I'm still only (?) getting about 186 rows/sec insert
>>> velocity on MySQL/InnoDB.  Any other ideas?
>>
>> Try disabling indexes.
>
> Uh, how?  Short of dropping/adding them, that is.

Dropping, then re-adding the indexes is correct. That's the only way I
know how on MySQL and PostgreSQL.

Jason
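
If you script the drop-and-recreate approach from Ruby, it might look
roughly like this (helper signatures follow current ActiveRecord; the
table, column, and index names are placeholders):

  conn = ActiveRecord::Base.connection
  conn.remove_index :my_table, column: [:column1, :column2]
  # ... perform the bulk insert here ...
  conn.add_index :my_table, [:column1, :column2]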
snacktime (Guest)
on 2005-12-16 01:14
(Received via mailing list)
On 12/15/05, Steve Sloan <steve@finagle.org> wrote:
> I need to insert ~100k rows into the database, but using "Table.new(...).save"
> is painfully slow (I notice in the logs that it's wrapping each insert in a
> transaction -- that can't be fast).
>
> Any suggestions on how I can improve the speed of bulk inserts like this?

Is using the native database tools an option, or do you need to use
ActiveRecord?  Both PostgreSQL and MySQL have fast methods for bulk
loading data.

Chris
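
For MySQL, the native loader Chris is alluding to is presumably LOAD DATA
INFILE. A rough sketch, issued through the ActiveRecord connection (the
file path, table, and column names are made up, and LOCAL INFILE has to be
enabled on both client and server):

  ActiveRecord::Base.connection.execute(<<-SQL)
    LOAD DATA LOCAL INFILE '/tmp/rows.csv'
    INTO TABLE my_table
    FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"'
    (column1, column2)
  SQL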
Jeroen Houben (Guest)
on 2005-12-16 11:01
(Received via mailing list)
snacktime wrote:
> ActiveRecord?  Both PostgreSQL and MySQL have fast methods for bulk
> loading data.

Does anybody know if ActiveRecord automatically takes advantage of
PostgreSQL prepared statements?

http://www.postgresql.org/docs/8.1/interactive/sql...

Jeroen
Nicholas Seckar (Guest)
on 2005-12-17 06:47
(Received via mailing list)
On 15-Dec-05, at 2:33 PM, Steve Sloan wrote:

> I need to insert ~100k rows into the database, but using "Table.new
> (...).save" is painfully slow (I notice in the logs that it's
> wrapping each insert in a transaction -- that can't be fast).
>
> Any suggestions on how I can improve the speed of bulk inserts like
> this?

It's pretty easy to add a class method #batch_insert to your AR
class, or to all classes.  If you are using MySQL, the syntax is
VALUES (...), (...), etc.

I suggest inserting a few thousand records per statement; the string
for 100k records would get pretty big. :-)
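
A rough sketch of such a #batch_insert class method, assuming MySQL's
multi-row VALUES syntax (the quoting, column handling, and chunk size here
are illustrative; this is not Nicholas's actual code):

  class ActiveRecord::Base
    # Builds INSERT ... VALUES (...), (...), ... statements in chunks so
    # no single statement string gets too large.
    def self.batch_insert(columns, rows, chunk_size = 1000)
      rows.each_slice(chunk_size) do |chunk|
        values = chunk.map do |row|
          "(" + row.map { |v| connection.quote(v) }.join(", ") + ")"
        end.join(", ")
        connection.execute(
          "INSERT INTO #{table_name} (#{columns.join(', ')}) VALUES #{values}"
        )
      end
    end
  end

  # Hypothetical usage:
  #   Reading.batch_insert(%w(sensor_id value), data_rows)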
Steve Sloan (Guest)
on 2005-12-17 13:42
(Received via mailing list)
Jason Edgecombe wrote:
> Steve Sloan wrote:
>> Andreas S. wrote:
>>>
>>> Try disabling indexes.
>
> Dropping, then re-adding the indexes is correct. That's the only way I
> know how on MySQL and PostgreSQL.

Dropping the indexes seems to have a negligible impact (5:23 instead of
5:29), but that's not too surprising, as the table only has 3 columns
(and the index is on two of them).

Any other ideas?  E.g., is there any way to insert multiple rows with a
single SQL statement?  I know MySQL will do it, but PgSQL won't, and I
don't think it's ANSI anyway.

-- Steve
Phillip Hutchings (Guest)
on 2005-12-17 13:42
(Received via mailing list)
On 16/12/2005, at 12:51 PM, Steve Sloan wrote:

> Any other ideas?  E.g., is there any way to insert multiple rows
> with a single SQL statement?  I know MySQL will do it, but PgSQL
> won't, and I don't think it's ANSI anyway.

PostgreSQL has a handy COPY command which imports TSV or CSV data,
either from stdin or a file:
COPY my_table(column1, column2) FROM STDIN WITH CSV;
"thing","stuff"
"foo","bar"

etc.

See http://www.postgresql.org/docs/8.1/interactive/sql-copy.html for
more details


--
Phillip Hutchings
phillip.hutchings@sitharus.com
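
If you want to drive that COPY ... FROM STDIN from Ruby rather than psql,
something along these lines should work with the pg gem (the gem,
connection parameters, table and column names are assumptions, not part of
Phillip's post; the CSV building here is naive, so real data may need
proper quoting):

  require 'pg'

  conn = PG.connect(dbname: 'mydb')
  rows = [%w(thing stuff), %w(foo bar)]   # placeholder data

  conn.copy_data("COPY my_table (column1, column2) FROM STDIN WITH CSV") do
    rows.each do |column1, column2|
      conn.put_copy_data("#{column1},#{column2}\n")
    end
  end
  conn.close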
Zach Dennis (Guest)
on 2005-12-17 13:42
(Received via mailing list)
Steve Sloan wrote:
> Any other ideas?  E.g., is there any way to insert multiple rows with a
> single SQL statement?  I know MySQL will do it, but PgSQL won't, and I
> don't think it's ANSI anyway.

Are you still using TableModel.new for each record? If so, this is
incredibly slow.

If you are using MySQL, contact me off-list if you want my ActiveRecord
extension (which only works for MySQL right now); it increases the speed
100x when inserting large sets of data into MySQL.

Zach