Database Concurrency Without a Web Server?

My Rails application requires a very busy worker process running in the
background at all times. I am launching this non-interactive process
using script/runner. This process is very busy and is communicating
with the MySQL server constantly. As I started putting some load on the
system to test, I started running into performance issues.

I have since discovered that Rails is only using 1 database connection
in this context. When monitoring the database, MySQL reports only 1
connection and 1 thread, even during very busy times. It doesn’t look
like Rails is using database connection pooling when ActiveRecord is
being used in this context (stand alone).

The big question: How can I get database connection pooling when
running a non-interactive Rails application via script/runner?

Without this I may be dead in the water.

Thanks!

  • Don

I could REALLY use a hand with this one. I’m completely stuck until I
find a way to get my background process to use more than 1 database
connection.

Any help would be greatly appreciated.

Thanks!

  • Don

Don S. wrote:

Anybody?

On Jul 12, 2006, at 8:55 AM, Don S. wrote:

Anybody?



Don-

Have a look at my BackgrounDRb plugin[1]. It aims to solve the
background worker task problem. It’s a multi-threaded environment that
runs with ActiveRecord::Base.allow_concurrency=true so you should get
more performance out of it.

Cheers-
-Ezra

[1]
http://backgroundrb.rubyforge.org/
http://brainspl.at/articles/tag/background

Don S. wrote:

Is there no way to get script/runner to use db connection pooling?

Actually you gave me a hint Ezra. I’ll try

ActiveRecord::Base.allow_concurrency = true

and see what results I get.

  • Don

Ezra Z. wrote:

Have a look at my BackgrounDRb plugin[1]. It aims to solve the
background worker task problem. It’s a multi-threaded environment that
runs with ActiveRecord::Base.allow_concurrency=true so you should get
more performance out of it.

Cheers-
-Ezra

Thanks Ezra. I’ll have a peek. But I would like to use what I have
without introducing anything extra. I don’t think that my background
service needs DRb functionality. It is working fine now with the
exception that it’s only using one db connection. That’s causing a
major bottleneck.

Is there no way to get script/runner to use db connection pooling?

  • Don

Ezra Z. wrote:

Be very careful with that setting. You do not want your main Rails
app running with that setting set to true or you will have major
problems. When you use ActiveRecord by itself then you can [...]

Thanks for the heads up, Ezra! I gave it a test drive and so far I
don’t see any difference.

I’m working on a network monitoring system. The background process is a
Poller service that is used to monitor remote systems. Each Poller
object has a Job object in a one-to-one relationship. The poller is
quite busy running many simultaneous jobs. So I need a high level of
concurrency.

In my tests, the script is instantiating 5000 ActiveRecord Job objects
per minute to poll remote systems / services. I am using Francis
Cianfrocca’s EventMachine[1] to invoke the Job.run() method of each
object at a specific time within a 1 minute window. After the job runs
it saves the results to the db (Job.save). I’m trying to run 5000 of
these Jobs per minute. Each Job only takes about 5-7 milliseconds to
complete.

It just doesn’t look like script/runner is using db connection pooling.
And I suspect that the db queries are blocking, causing a serious
bottleneck. One reason I think this is that system CPU utilization is
only around 72% and MySQL isn’t even breathing hard.

I really need to figure out what’s causing the delays. Many of my jobs
are running very late and I think it’s because my db access is being
serialized.
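
To put rough numbers on the figures above (5000 jobs per minute, 5-7 ms each), here is a back-of-the-envelope calculation in Ruby. The `blocking_save_ms` value is an assumed placeholder, not a measurement from this thread; it only shows how a blocking save on a single thread eats into the per-job time budget.

```ruby
jobs_per_minute = 5000
job_ms_range    = 5..7           # per-job run time quoted above

# Time available per job if one thread handles every job in turn.
budget_ms = 60_000.0 / jobs_per_minute

# Assumed placeholder: time the thread is stuck waiting on Job.save.
blocking_save_ms = 6.0

# Fraction of the budget used per job (above 1.0 means falling behind).
busy_low  = (job_ms_range.min + blocking_save_ms) / budget_ms
busy_high = (job_ms_range.max + blocking_save_ms) / budget_ms

puts "budget per job: #{budget_ms} ms"
puts "single-thread load: #{(busy_low * 100).round}% to #{(busy_high * 100).round}%"
```

With a 12 ms budget per job, any combination of run time plus blocking time above 12 ms means a single thread cannot keep up, and later jobs start late.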

Any suggestions would be greatly appreciated!

  • Don

[1] http://rubyforge.org/projects/eventmachine/

On Jul 12, 2006, at 11:57 AM, Don S. wrote:

  • Don

Don-

Be very careful with that setting. You do not want your main Rails
app running with that setting set to true or you will have major
problems. When you use ActiveRecord by itself, you can enable
allow_concurrency. But if you try to do that in a Rails app, the
combination of ActiveRecord, Action Controller, and Action View is not
thread safe, so you will not want to run things that way. You should
make your script run with that setting, but not your Rails app.

Cheers-
-Ezra

On 7/11/06, Don S. [email protected] wrote:

The big question: How can I get database connection pooling when
running a non-interactive Rails application via script/runner?

Without this I may be dead in the water.

Thanks!

  • Don

You’re sure (lack of) connection pooling is where your problem is
really at? Keep in mind that one connection is all you’ll ever need
for a single process.

Isak

On 7/12/06, Don S. [email protected] wrote:

Any suggestions would be greatly appreciated!

If you are using non-blocking IO, then everything needs to be
non-blocking. For example, if you use an event framework of some type to
fire off network connections via net/http, it’s not going to work,
because the IO in net/http blocks. The same goes for connection pooling.
You really need to use an event framework that has handlers for
everything that is IO based, or just use threads. I’ve used Python’s
Twisted framework in the past, and it works very well.

Chris

A better way to explain it might be this. Connection pooling won’t
help you unless the database driver is itself non blocking. The
moment that one of your Job.run methods makes a call to the database,
or does any other type of blocking IO, it’s going to block every other
Job.run method you have running. That’s why you only see one database
connection open at a time.
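
A minimal sketch of this effect, using a simulated clock rather than EventMachine or a real database. `simulate_single_thread` and the job timings are made up for illustration; the point is only that on one thread each callback starts when the previous one finishes, so blocking time in one job shows up as lateness in the others.

```ruby
# Each job is [scheduled_start_ms, run_ms, blocking_io_ms].
# Returns how late (in ms) each job started on a single thread.
def simulate_single_thread(jobs)
  clock = 0
  jobs.sort_by { |scheduled, _, _| scheduled }.map do |scheduled, run_ms, io_ms|
    start = [clock, scheduled].max      # cannot start before the loop is free
    clock = start + run_ms + io_ms      # loop is stuck until the job returns
    start - scheduled                   # lateness in ms
  end
end

# Five jobs scheduled 10 ms apart, each needing 6 ms of work.
no_io   = (0...5).map { |i| [i * 10, 6, 0] }
with_io = (0...5).map { |i| [i * 10, 6, 8] }   # plus 8 ms blocked on the db

p simulate_single_thread(no_io)     # => [0, 0, 0, 0, 0]
p simulate_single_thread(with_io)   # => [0, 4, 8, 12, 16]
```

The lateness grows with every job once run time plus blocking time exceeds the gap between scheduled starts.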

Isak H. wrote:

You’re sure (lack of) connection pooling is where your problem is
really at? Keep in mind that one connection is all you’ll ever need
for a single process.

Isak

Nope. I’m not sure. I’m just trying to rule it out. I don’t know
where the trouble is.

When I saw that the MySQL server only had one client connection I got
suspicious. I just assumed that Rails’ connection pooling meant that
multiple connections would be used if necessary. Maybe I’m wrong or
maybe I’m getting confused in the terminology. Maybe the word
“connection” is being used in a different way between the MySQL and
Rails worlds.

MySQL shows only one client connection with one associated thread. Is
that all that’s needed for Rails connection pooling? Will Rails and the
MySQL client (on Linux) make multiple / simultaneous / non-blocking db
queries through one MySQL connection when in the script/runner context?
And how does ActiveRecord::Base.allow_concurrency = true affect this
behaviour?

Thanks a ton for the help! I’ve been working on this for days!

  • Don

Don S. wrote:

Will Rails and the MySQL client (on Linux) make multiple / simultaneous
/ non-blocking db queries through one MySQL connection when in the
script/runner context?

OK. I think I answered that question. Using MySQL Administrator (the
GUI tool) I can see in the Connection Health monitor there are times
that more than one query is running at a time.

So if those queries are not blocking then I can rule out the MySQL
client as the cause of the work slow downs.

Does ActiveRecord::Base.allow_concurrency = true have any effect on how
Rails uses the MySQL client? Or is this more application framework
specific?

  • Don

On 7/12/06, Don S. [email protected] wrote:

So if those queries are not blocking then I can rule out the MySQL
client as the cause of the work slow downs.

Yes it does block. You need to use a framework that has a non
blocking connection pool mechanism built into it.

On Jul 12, 2006, at 3:01 PM, Don S. wrote:

GUI tool) I can see in the Connection Health monitor there are times

  • Don
Don-

AR::Base.allow_concurrency just allows multi-threaded access to the
database. It won’t do you any good with a single-threaded non-blocking
setup like EventMachine. So I don’t think that setting is going to help
you. I think, for how many db hits you are looking to do per second,
that maybe the creation of the Ruby AR objects is slowing you down more
than anything else. Have you tried using the mysql-ruby adapter and
hitting the db without going through ActiveRecord?
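
To illustrate the point about threads, here is a toy model in plain Ruby. It is not ActiveRecord's implementation: `FakeConnection` and `ConnectionRegistry` are hypothetical stand-ins, and the assumption is simply that each thread that touches the database gets its own connection. Under that assumption, a single-threaded event loop only ever opens one connection, however many jobs it runs.

```ruby
require "thread"

class FakeConnection; end   # stand-in for a real database connection

class ConnectionRegistry
  def initialize
    @lock = Mutex.new
    @by_thread = {}
  end

  # Return the calling thread's connection, opening one on first use.
  def connection
    @lock.synchronize { @by_thread[Thread.current] ||= FakeConnection.new }
  end

  def open_count
    @lock.synchronize { @by_thread.size }
  end
end

# One thread running 100 jobs in turn, as an event loop would.
single = ConnectionRegistry.new
100.times { single.connection }

# Four worker threads, each running 25 jobs.
threaded = ConnectionRegistry.new
4.times.map { Thread.new { 25.times { threaded.connection } } }.each(&:join)

puts single.open_count     # 1
puts threaded.open_count   # 4
```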

Cheers-
-Ezra

Don, would it help if we added something like this to the
EventMachine…

You know how Python’s Twisted has something called a “deferred”?
Basically an operation you kick off and you supply a callback that gets
called when the operation completes.

What if EventMachine had this:

op = proc {
  # You initiate a database operation here…
}
callback = proc { |result|
  # You process the result here.
}

EventMachine.schedule_operation(op, callback)

The call to #schedule_operation would probably have an options parameter
so you could specify concurrency levels, thread-safety, etc. The
operation block would be scheduled into a thread pool maintained by the
event-machine, but the result block would be called on the main
event-machine thread as an event like any other.
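
A rough sketch of how this proposal could work, written in plain Ruby with `Thread` and `Queue` rather than inside EventMachine. `DeferredPool`, `run_callbacks`, and `shutdown` are names invented for this illustration; only the `schedule_operation(op, callback)` shape comes from the post. Operations run on worker threads, and their callbacks run on the thread that calls `run_callbacks` (standing in for the main event loop).

```ruby
require "thread"

# Hypothetical illustration of the proposal, not EventMachine code.
class DeferredPool
  def initialize(size)
    @work    = Queue.new   # [op, callback] pairs waiting for a worker
    @done    = Queue.new   # [callback, result] pairs for the main thread
    @pending = 0           # only touched from the main thread
    @workers = Array.new(size) do
      Thread.new do
        while (pair = @work.pop)        # nil is the shutdown signal
          op, callback = pair
          result = begin
            op.call
          rescue StandardError => e
            e   # hand the exception to the callback instead of killing the worker
          end
          @done << [callback, result]
        end
      end
    end
  end

  # Queue a blocking operation; its callback will receive the result.
  def schedule_operation(op, callback)
    @pending += 1
    @work << [op, callback]
  end

  # Run callbacks on the calling thread until all operations have finished.
  def run_callbacks
    while @pending > 0
      callback, result = @done.pop
      @pending -= 1
      callback.call(result)
    end
  end

  def shutdown
    @workers.size.times { @work << nil }
    @workers.each(&:join)
  end
end

pool = DeferredPool.new(4)
main = Thread.current
seen = []

5.times do |i|
  op       = proc { sleep 0.01; i * 2 }   # stands in for a blocking db call
  callback = proc { |result| seen << [result, Thread.current == main] }
  pool.schedule_operation(op, callback)
end

pool.run_callbacks
pool.shutdown
```

Later EventMachine releases provide `EventMachine.defer(op, callback)`, which follows this same pattern: the operation runs on a thread pool and the callback runs on the reactor thread.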

Comments?

On 7/12/06, Francis C. [email protected] wrote:

event-machine, but the result block would be called on the main
event-machine thread as an event like any other.

Comments?

I think a thread pool for running code that will block is a great idea
while you are still working on all the different IO handlers. Say
someone is using it for network connections which are already
non-blocking, but also needs database or file access. They can use
threads for the stuff that isn’t yet non-blocking.

Good to see an event framework started for Ruby; it definitely needed
one.

Francis C. wrote:

Don, would it help if we added something like this to the
EventMachine…

You know how Python’s Twisted has something called a “deferred”?
Basically an operation you kick off and you supply a callback that gets

Guys, a huge “Thank You” for all the help!

Francis, if I understand everything correctly, I think that would do the
trick! The db client blocking certainly explains why 100 monitors per
minute runs like clockwork but 2000 introduces a bunch of delays.

Right now, the only call to the db that I make within EventMachine is to
save the record.

Here is how it looks:

EventMachine.add_timer(delay) { job.run; job.save }

With the exception of job.save(), all other db activity happens before I
add the job to EventMachine. And I don’t think there is any blocking
code in job.run. It simply executes my network connectivity code and
sets the Job object’s attributes based on the results.

As far as instantiating a bunch of objects, I think that’s working just
fine. I can instantiate 1000 Job objects and add them to EventMachine
in less than a second. So I’m very happy there.

If I could get job.save to execute in a thread pool I think I’ll be
cooking with gas. And looking forward, I can think of several instances
where I may need to leverage blocking code. For example, I’m planning
on writing historical results to RRD (a Round Robin Database) which I
believe requires file system access. Having the thread pool available
when needed would be great.

Thanks again for all the help guys. You’re awesome! This beginner
would be lost without you.

  • Don

Hi Francis, it’s good to finally see some Twisted fans in the Ruby
world ;-) I’ve always wondered why there isn’t a similar thing in Ruby;
it would be very cool. I think it would make a great Google SoC project.

snacktime wrote:

Good to see an event framework started for ruby, it definitely needed
one.

EventMachine is definitely intended to meet the needs addressed by
Twisted. What other features do you think are high on the list? Also,
can you think of any applications that would be easier to write with
such a Ruby framework?
