Database Concurrency Without a Web Server?

Don S. wrote:

Francis C. wrote:

Don, would it help if we added something like this to the
EventMachine…

You know how Python’s Twisted has something called a “deferred”?
Basically it’s an operation you kick off, and you supply a callback that
gets called when the result is available…

Guys, a huge “Thank You” for all the help!

Francis, if I understand everything correctly, I think that would do the
trick! The db client blocking certainly explains why 100 monitors per
minute runs like clockwork but 2000 introduces a bunch of delays.

Right now, the only call to the db that I make within EventMachine is to
save the record.

Here is how it looks:

EventMachine.add_timer(delay){job.run; job.save}

With the exception of job.save(), all other db activity happens before I
add the job to EventMachine. And I don’t think there is any blocking
code in job.run. It simply executes my network connectivity code and
sets the Job object’s attributes based on the results.

As far as instantiating a bunch of objects, I think that’s working just
fine. I can instantiate 1000 Job objects and add them to EventMachine
in less than a second. So I’m very happy there.

If I could get job.save to execute in a thread pool I think I’ll be
cooking with gas. And looking forward, I can think of several instances
where I may need to leverage blocking code. For example, I’m planning
on writing historical results to RRD (a Round Robin Database) which I
believe requires file system access. Having the thread pool available
when needed would be great.

Thanks again for all the help guys. You’re awesome! This beginner
would be lost without you.

  • Don
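As an aside, the hand-off Don is asking for, pushing blocking calls like
job.save off the event loop and onto a worker, can be sketched in plain
Ruby with a Queue and a single background thread. This is only an
illustration of the idea (the names are made up), not how EventMachine
does it:

require 'thread'

WORK_QUEUE = Queue.new

# One worker thread drains the queue and runs each blocking task in turn,
# so the event loop itself never waits on the database or the file system.
Thread.new do
  loop do
    task = WORK_QUEUE.pop   # blocks until something is enqueued
    task.call               # e.g. a db save or an RRD file write
  end
end

# In the timer block, enqueue the save instead of calling it directly:
# EventMachine.add_timer(delay) { job.run; WORK_QUEUE << proc { job.save } }

Francis’s #defer below wraps essentially this pattern in a pool of
threads and adds an optional callback for the result.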

Ok, I implemented it. I’ll send you a new EventMachine gem to your
private email, and if it works for you, I’ll cut a new public release
(will be version 0.6.0).

In your particular case you have no code that has to process the result
of job.save, so your callback can be empty. Your new code would look
like this:

EventMachine.add_timer(delay) {
  job.run
  EventMachine.defer proc { job.save }
}

If you had had a callback you wanted to execute after job.save
completes, you would have added it as a second proc argument to the
#defer call.
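For anyone who wants that two-proc form spelled out, a minimal sketch
(the logging in the callback is purely illustrative):

EventMachine.add_timer(delay) {
  job.run
  operation = proc { job.save }   # blocking work, runs on a pool thread
  callback  = proc { |result| puts "job.save returned #{result.inspect}" }
  EventMachine.defer(operation, callback)   # callback receives the operation's result
}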

Francis C. wrote:

Ok, I implemented it. I’ll send you a new EventMachine gem to your
private email, and if it works for you, I’ll cut a new public release
(will be version 0.6.0).

Man, that was fast! Overnight delivery to the west coast! ;) I’ll give
her a test drive later this morning and let you know how it goes.

Thanks!

-Don

On 7/13/06, Don S. [email protected] wrote:

The db client blocking certainly explains why 100 monitors per minute
runs like clockwork but 2000 introduces a bunch of delays.

With the exception of job.save(), all other db activity happens before I
add the job to EventMachine. And I don’t think there is any blocking
code in job.run. It simply executes my network connectivity code and
sets the Job object’s attributes based on the results.

As far as instantiating a bunch of objects, I think that’s working just
fine. I can instantiate 1000 Job objects and add them to EventMachine
in less than a second. So I’m very happy there.

If I could get job.save to execute in a thread pool I think I’ll be
cooking with gas.

I’m not sure introducing concurrency will boost your insert rate all
that much.

I whipped up a small test to check, but even with
AR::Base.allow_concurrency set to true it seems my db access is still
serialized. 10 connections are opened all right, but 9 are idle
whenever I check the db stats.

Any thoughts? - Source at http://pastebin.ca/87122

Isak
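The pastebin link has long since expired, but a test along those lines
would presumably look roughly like the sketch below. The model, table,
and column names are invented; only the shape of the test matters:

require 'active_record'

# (establish_connection and the rest of the db setup omitted)
ActiveRecord::Base.allow_concurrency = true

class Widget < ActiveRecord::Base   # hypothetical model
end

start = Time.now
threads = (1..10).map do
  Thread.new do
    2000.times { |i| Widget.create(:name => "row#{i}") }
  end
end
threads.each { |t| t.join }
puts "elapsed: #{Time.now - start}"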

Ezra Z. wrote:

Hi !

On Jul 13, 2006, at 5:17 AM, Francis C. wrote:

If you had had a callback you wanted to execute after job.save
completes, you would have added it as a second proc argument to the
#defer call.

Francis-

This looks sweet. Please do let me know when you release the version
with the deferred procs.

Thanks
-Ezra

Ezra, watch the Ruby-talk ML; I always announce EM releases there. If
you think it’s appropriate, I’ll [ANN] on this list as well.

On Jul 13, 2006, at 10:15 AM, Isak H. wrote:

I’m not sure introducing concurrency will boost your insert rate
all that much.

I whipped up a small test to check, but even with
AR::Base.allow_concurrency set to true it seems my db access is still
serialized. 10 connections are opened all right, but 9 are idle
whenever I check the db stats.

Any thoughts? - Source at http://pastebin.ca/87122

The postgres and oracle adapters on edge rails use nonblocking query
methods when allow_concurrency is set. Could you rake
rails:freeze:edge and try with postgres?

jeremy
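Concretely, that suggestion amounts to vendoring edge Rails and then
switching the flag on; something like the following, though where
exactly the flag gets set will depend on the app:

# After `rake rails:freeze:edge` has vendored edge Rails into the app,
# tell ActiveRecord it will be used from multiple threads, e.g. near the
# end of config/environment.rb:
ActiveRecord::Base.allow_concurrency = true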

On 7/13/06, Francis C. [email protected] wrote:

someone is using it for network connections which are already
nonblocking, but also needs database or file access. They can use
threads for the stuff that isn’t yet nonblocking.

Good to see an event framework started for Ruby; it definitely needed
one.

EventMachine is definitely intended to meet the needs addressed by
Twisted. What other features do you think are high on the list? Also,
can you think of any applications that would be easier to write with
such a Ruby framework?

Maybe something like Twisted has, where you can specify two methods for
a callback: one for success and one for when there is an error, so you
can catch it.

snacktime wrote:

Maybe something like Twisted has, where you can specify two methods for
a callback: one for success and one for when there is an error, so you
can catch it.

I’ve thought a lot about that, and I’m not convinced it’s necessary
because Ruby makes certain things easier than Python. What Twisted does
when it creates a deferred is give you back an object that you add your
own callbacks and errbacks to, in a chain. And you can pass it around to
other code that will add its own callbacks and errbacks. But with Ruby,
all the processing is there in the closure. EventMachine#defer as
currently implemented doesn’t return anything. It simply remembers the
callback block as the operation makes its way through the thread pool.
Whatever result was generated from the processing block gets passed to
the (single) callback, which can make its own decisions about whether an
error occurred or not.

I may be missing something, so can you think of a case where it actually
makes more sense to supply separate callbacks and errbacks?
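As a concrete illustration of the single-callback style described above,
the operation proc can rescue its own exceptions and hand either the
result or the error object to the one callback. This is just a sketch;
nothing in it is part of EventMachine beyond #defer itself:

operation = proc do
  begin
    job.save                 # blocking work on a pool thread
  rescue => e
    e                        # hand the exception to the callback as the result
  end
end

callback = proc do |result|
  if result.is_a?(Exception)
    puts "save failed: #{result.message}"
  else
    puts "save finished: #{result.inspect}"
  end
end

EventMachine.defer(operation, callback)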

Jeremy K. wrote:

The postgres and oracle adapters on edge rails use nonblocking query
methods when allow_concurrency is set. Could you rake
rails:freeze:edge and try with postgres?

jeremy

Interesting… maybe I’ll give Postgres and EdgeRails a spin. If I can
get the db update queries to run in a non-blocking manner it’ll probably
solve my problem.

  • Don

On 7/13/06, Jeremy K. [email protected] wrote:

The postgres and oracle adapters on edge rails use nonblocking query
methods when allow_concurrency is set. Could you rake
rails:freeze:edge and try with postgres?

It was indeed my adapter that wasn’t up to the task. After switching from
postgres-pr to a native postgres adapter, the connections got a whole
lot busier.

Might as well post my results, now that I got the thing to run.

I wouldn’t take this benchmark too seriously, but unless there are any
blatant bugs in my code, it should give some indication of what to
expect… (development mode shouldn’t cause too much overhead?)

Numbers below are the time taken, in seconds, to insert 20k rows.

Creating and saving the rows sequentially, each in a separate
transaction:
ar_basic: 122.281

Creating and saving the rows sequentially, split across transactions
of 2k rows. (bit large, but couldn’t be arsed to split into groups of
100)
ar_trans: 60.141

Using connection.execute(), transaction size of 2k rows
noar_trans: 21.141

10 concurrent threads creating 2000 rows each. One transaction per
insert.
ar_concurrent: 193.812

Conclusion: concurrency bad ;)

Isak
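For anyone wondering what the faster variants refer to, the noar_trans
case is just raw SQL through the adapter, batched into one transaction
per 2000 rows. A rough sketch, with invented table and column names:

conn = ActiveRecord::Base.connection

(1..10).each do |chunk|                  # 10 chunks of 2000 rows = 20k total
  ActiveRecord::Base.transaction do
    2000.times do |i|
      n = (chunk - 1) * 2000 + i
      conn.execute("INSERT INTO widgets (name, value) VALUES ('row#{n}', #{n})")
    end
  end
end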

Conclusion: concurrency bad ;)

Isak

How about update performance? I’m not inserting any rows in my
application, only updating existing rows.

  • Don

On 7/14/06, Don S. [email protected] wrote:

Conclusion: concurrency bad ;)

Isak

How about update performance? I’m not inserting any rows in my
application, only updating existing rows.

OK, since I already have the project here:

Updating 2k rows (times in seconds):

Load row, alter a value, save it.
ar_basic: 20.86

Same as above, but grouped in transactions of 100 rows.
ar_trans: 14.891

Load row just for the sake of it, then update using
connection.execute(). 100 rows/transaction.
noar_trans: 7.109

10 threads updating 200 rows each.
ar_concurrent: 30.703

Isak
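And for completeness, the noar_trans update case boils down to something
like the sketch below: load the rows as before, then issue the updates
through connection.execute, 100 rows per transaction. Table and column
names are invented again, reusing the hypothetical Widget model from the
earlier sketch:

conn = ActiveRecord::Base.connection
rows = Widget.find(:all, :limit => 2000)   # "load row just for the sake of it"

0.step(rows.size - 1, 100) do |start|      # transactions of 100 rows
  ActiveRecord::Base.transaction do
    rows[start, 100].each do |row|
      conn.execute("UPDATE widgets SET value = value + 1 WHERE id = #{row.id}")
    end
  end
end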

On 7/13/06, Francis C. [email protected] wrote:

I may be missing something, so can you think of a case where it actually
makes more sense to supply separate callbacks and errbacks?

OK, I should probably take some time to use it before making
suggestions. :) I just looked over the code and the short examples.