Multithreading to handle 7000 to 10000 messages per minute

Hello,

We want to process 7000 to 10000 messages per minute and store them in
the database. We are trying to do this with threads. Can you please
help us figure out how?

Regards
Kaja Mohaidee.A
Trichy

On Thu, 2008-08-14 at 14:22 +0900, Kaja M. wrote:

We want to process 7000 to 10000 messages per minute and store them in
the database. We are trying to do this with threads. Can you please
help us figure out how?

Use a quick language that’s designed for massively parallel operations,
not one of the slower scripting languages without proper support for
parallelism.

I’m not trying to be glib here (although I may well be succeeding
anyway). What you are saying here looks an awful lot to me like “I want
to hammer in this screw here. Which wrench is best for the job?”
Select your tools appropriately for the problem. Don’t try to reforge
your wrenches into bizarre combination screwdriver-hammers.

2008/8/14 Michael T. Richter [email protected]:

On Thu, 2008-08-14 at 14:22 +0900, Kaja M. wrote:

We want to process 7000 to 10000 messages per minute and store them in
the database. We are trying to do this with threads. Can you please
help us figure out how?

Use a quick language that’s designed for massively parallel operations, not
one of the slower scripting languages without proper support for
parallelism.

There may be a few ways to get this working with Ruby:

  1. Use Ruby to create a CSV file or similar, which is then loaded into
    the database via the database’s bulk loader (see the sketch below).

  2. If there is support for batch operations in DBD/DBI, these might work
    as well.

Since this task is mostly IO-bound (unless, of course, there are
expensive operations to be done on those messages), Ruby threads may
well work in this scenario.
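For option 1, here is a minimal sketch (the file name, table name, and
loader commands are illustrative; adjust them for your database):

require 'csv'

# stand-in for the real message stream
messages = Array.new(10_000).map { rand.to_s }

# dump the buffered messages to a CSV file ...
CSV.open('messages.csv', 'w') do |csv|
  messages.each { |m| csv << [m] }
end

# ... then hand the file to the database's bulk loader, e.g.
#   sqlite3 my.db '.mode csv' '.import messages.csv messages'
# or, for MySQL:
#   LOAD DATA LOCAL INFILE 'messages.csv' INTO TABLE messages (content);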

I’m not trying to be glib here (although I may well be succeeding anyway).
What you are saying here looks an awful lot to me like “I want to hammer in
this screw here. Which wrench is best for the job?” Select your tools
appropriately for the problem. Don’t try to reforge your wrenches into
bizarre combination screwdriver-hammers.

Well, maybe Kaja is just experimenting, trying to see how far he/she
can push Ruby. :-)

Kind regards

robert

On Aug 14, 2008, at 6:28 AM, M. Edward (Ed) Borasky wrote:

Yes … Erlang / Mnesia should be able to handle this, and then you
could write some Ruby code to extract the messages from Mnesia to a
“regular” RDBMS if needed.

I’d say use rabbitmq with the ruby amqp library, this will allow you
to easily push many thousands of messages/sec into the rabbitmq
message bus and then you can have a set of fanout workers consuming
the queues on the other side and putting the items in the database.
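Something along these lines, using the EventMachine-based amqp gem (the
exchange/queue names here are made up, and the exact API varies with the
gem version):

require 'rubygems'
require 'mq' # from the amqp gem

AMQP.start(:host => 'localhost') do
  mq       = MQ.new
  exchange = mq.fanout('messages')

  # worker side: bind a queue to the fanout exchange and write each
  # delivery to the database (the insert itself is left out here)
  mq.queue('db_worker').bind(exchange).subscribe do |payload|
    # ... buffer payload and batch-insert into the database ...
  end

  # producer side: push messages onto the bus
  10_000.times { |i| exchange.publish("message #{i}") }
end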

-Ezra

On Thu, 2008-08-14 at 16:52 +0900, Michael T. Richter wrote:

anyway). What you are saying here looks an awful lot to me like “I
want to hammer in this screw here. Which wrench is best for the job?”
Select your tools appropriately for the problem. Don’t try to reforge
your wrenches into bizarre combination screwdriver-hammers.

Yes … Erlang / Mnesia should be able to handle this, and then you
could write some Ruby code to extract the messages from Mnesia to a
“regular” RDBMS if needed.

M. Edward (Ed) Borasky
ruby-perspectives.blogspot.com

“A mathematician is a machine for turning coffee into theorems.” –
Alfréd Rényi via Paul Erdős

On Thu, Aug 14, 2008 at 12:06 PM, ara.t.howard [email protected]
wrote:

buffer them and insert them in a transaction 1000 at a time. even with ruby
this should be a piece of cake.

Do any of the ruby db libraries offer support for doing this
efficiently?

martin

On Aug 13, 2008, at 11:22 PM, Kaja M. wrote:

Hello,

We want to process 7000 to 10000 messages per minute and store them in
the database. We are trying to do this with threads. Can you please
help us figure out how?

Regards
Kaja Mohaidee.A
Trichy

buffer them and insert them in a transaction 1000 at a time. even
with ruby this should be a piece of cake.
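
A rough sketch of that buffering idea with plain Ruby threads and
sqlite3-ruby (the table name, schema, and batch size are made up for
illustration): producer threads push messages onto a Queue, and a single
writer thread drains it in chunks of up to 1000 rows, one transaction
per chunk.

require 'thread'
require 'rubygems'
require 'sqlite3'

BATCH = 1000
queue = Queue.new
db    = SQLite3::Database.new('messages.db')
db.execute('create table if not exists messages(content)')

writer = Thread.new do
  loop do
    batch = [queue.pop]                      # block until something arrives
    batch << queue.pop while batch.size < BATCH && !queue.empty?
    db.transaction do                        # one commit per chunk, not per row
      batch.each { |m| db.execute('insert into messages(content) values (?)', m) }
    end
  end
end

# the producers stand in for whatever actually receives the messages
producers = Array.new(4) do
  Thread.new { 2_500.times { queue.push(rand.to_s) } }
end
producers.each { |t| t.join }
sleep 0.1 until queue.empty?                 # crude drain before exiting

Keeping all database work on the single writer thread also sidesteps any
thread-safety questions around the database handle.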

a @ http://codeforpeople.com/

On Aug 14, 2008, at 1:10 PM, Martin DeMello wrote:

pretty much all of them

cfp:~/rails_root > cat a.rb
size = Integer(ARGV.shift || 10_000)

messages = Array.new(size).map{ rand.to_s }

Db = "#{ RAILS_ROOT }/db/#{ RAILS_ENV }.sqlite3"

# using sqlite directly

Message.delete_all
sql = messages.map{|message| "insert into messages(content) values(#{ message.inspect });"}.join("\n")

a = b = response = nil
IO.popen("sqlite3 #{ Db } 2>&1", "r+") do |sqlite3|
  a = Time.now.to_f
  sqlite3.puts "begin;"      # wrap all the inserts in a single transaction
  sqlite3.puts sql
  sqlite3.puts "end;"
  sqlite3.flush
  sqlite3.close_write
  response = sqlite3.read
  b = Time.now.to_f
end

abort response unless $?.exitstatus.zero?

puts "using sqlite3"
puts "elapsed: #{ b - a }"
puts "count: #{ Message.count }"

# using ar

Message.delete_all
a = Time.now.to_f

Message.transaction do
  messages.each{|message| Message.create! :content => message}
end

b = Time.now.to_f

puts "using ar"
puts "elapsed: #{ b - a }"
puts "count: #{ Message.count }"

cfp:~/rails_root > ./script/runner a.rb
using sqlite3
elapsed: 0.222311019897461
count: 10000

using ar
elapsed: 7.75591206550598
count: 10000

0.2 seconds for 10000 records seems plenty fast to me. 7 seconds, not
so much.

a @ http://codeforpeople.com/

Hi, I’ve found that these examples run the batch of message insertions
inside one transaction, which is what achieves the quite impressive
performance. But if I put every message save into its own transaction,
i.e. swap the messages.each do and db.transaction do:

messages.each do |m|
  db.transaction do |db_in_trans|
    db_in_trans.execute("insert into messages(content) values( '#{m}' )")
  end
end

ruby speed-test.rb 1
0.09 seconds to insert 1 records at 10.75 records per second

ruby speed-test.rb 10
1.20 seconds to insert 10 records at 8.31 records per second

ruby speed-test.rb 100
11.22 seconds to insert 100 records at 8.91 records per second

ruby speed-test.rb 1000
132.27 seconds to insert 1000 records at 7.56 records per second
the performance goes down pretty badly, to just several records per
second. (Each SQLite transaction forces a sync to disk, so per-row
transactions are bounded by fsync speed rather than by Ruby.)

On Fri, Aug 15, 2008 at 05:56:46AM +0900, ara.t.howard wrote:

pretty much all of them

[…]
If your standard of performance is 10,000 records inserted in a minute
(only about 170 inserts per second), any database should be able to
satisfy your requirements.
And here’s the amalgalite version of ara’s test… embedded SQLite in a
Ruby extension.

% cat am_inserts.rb
#!/usr/bin/env ruby
require 'rubygems'
require 'fileutils'
require 'amalgalite'

size = Integer(ARGV.shift || 10_000)

messages = Array.new(size).map{ rand.to_s }

Db = "speed-test.db"

FileUtils.rm_f Db if File.exist?( Db )
db = Amalgalite::Database.new( Db )
db.execute(" CREATE TABLE messages(content); ")

before = Time.now.to_f
db.transaction do |db_in_trans|
  messages.each do |m|
    # rand.to_s yields a numeric literal, so no quoting is needed here
    db_in_trans.execute("insert into messages(content) values( #{m} )")
  end
end
after = Time.now.to_f
elapsed = after - before
mps = size / elapsed
puts "#{ '%0.2f' % elapsed } seconds to insert #{ size } records at #{ '%0.2f' % mps } records per second"

% ruby am_inserts.rb
0.38 seconds to insert 10000 records at 25999.01 records per second

% ruby am_inserts.rb 100000
3.80 seconds to insert 100000 records at 26344.71 records per second

enjoy,

-jeremy