Create unique process to handle transactions

Hi,

I’m looking for a way to avoid race conditions on complicated
transactions. Some of my transactions need to update multiple values in
multiple tables and create new objects.

It looks like the safest way to do this is to set up a single process
that would just implement these transactions and have the rest of the
application call it for services.

Where can I find some information on how to set it up?

  • Create the transaction process
  • Communicate to it from the application processes (mongrel cluster)
  • Guarantee that this process is up after reboot

Thanks a million,
Amir

helzer wrote:

I’m looking for a way to avoid race conditions on complicated
transactions. Some of my transactions need to update multiple values in
multiple tables and create new objects.

It looks like the safest way to do this is to set up a single process
that would just implement these transactions and have the rest of the
application call it for services.

Where can I find some information on how to set it up?

  • Create the transaction process
  • Communicate to it from the application processes (mongrel cluster)
  • Guarantee that this process is up after reboot

You can run it either through a UNIX cron job (or Windows “at”) or
BackgrounDRb [1]. In both cases you’ll want to store state about its
execution somewhere external (e.g. the database) so that if the server
is rebooted during processing, it’ll pick the work back up when the
server comes online again.

I’ve found a combination of BackgrounDRb workers (in your case: a single
one) and a separate BackgrounDRb “maintenance worker” that runs on a
schedule to be a great match. You can communicate with the workers from
within Rails using BackgrounDRb’s MiddleMan.
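
If you’d rather roll the single “transaction process” yourself, here’s a
minimal sketch using Ruby’s standard DRb library (which BackgrounDRb
builds on). TransactionService, the port and the method name are all
hypothetical:

  require 'drb'

  # Standalone daemon that owns all money-moving transactions.
  class TransactionService
    def transfer(from_id, to_id, amount)
      # ... perform the multi-table updates inside one DB transaction ...
    end
  end

  DRb.start_service('druby://localhost:9999', TransactionService.new)
  DRb.thread.join  # keep the daemon alive

  # From any Rails (mongrel) process:
  #   service = DRbObject.new_with_uri('druby://localhost:9999')
  #   service.transfer(1, 2, 100)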

[1] http://backgroundrb.devjavu.com/


Roderick van Domburg

Hi Roderick,

Do you think that creating many middlemen at a rapid pace has a big
performance impact?

Thanks,
Amir

If you are using a single database, you could ensure that all database
activity related to an action occurs atomically by using ActiveRecord
transactions. See
http://wiki.rubyonrails.org/rails/pages/HowToUseTransactions
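
For example, a minimal sketch (the account and invoice objects are
hypothetical): either both rows are saved, or the whole block is rolled
back:

  ActiveRecord::Base.transaction do
    account.save!   # save! raises on failure, rolling back the block
    invoice.save!
  end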

Nick, that does seem to be more like what I need to use. When I use a
transaction, does that mean nothing else gets inserted into the database
in the middle?

I’m changing accounts and balances, creating account lines and creating
invoices. Each goes to a different table in the (same) database. If I
just wrap it all in a transaction, will that imply that all the read and
write actions within the transaction happen without any other writes to
the database by other Rails instances?

Thanks,
Amir

Amir Helzer wrote the following on 26.07.2007 20:29 :

Nick, that does seem to be more like what I need to use. When I use a
transaction, does that mean nothing else gets inserted into the database
in the middle?

I’m changing accounts and balances, creating account lines and creating
invoices. Each goes to a different table in the (same) database. If I
just wrap it all in a transaction, will that imply that all the read and
write actions within the transaction happen without any other writes to
the database by other Rails instances?

No. You must use serialized transactions for that; ActiveRecord doesn’t
use them by default. And even with them you only get transaction
semantics: either all of it succeeds (no incompatible concurrent writes)
or it rolls back all changes (and you are responsible for doing it all
again in the new context).

Lionel

Amir Helzer wrote:

Nick, that does seem to be more like what I need to use. When I use a
transaction, does that mean nothing else gets inserted into the database
in the middle?

I’m changing accounts and balances, creating account lines and creating
invoices. Each goes to a different table in the (same) database. If I
just wrap it all in a transaction, will that imply that all the read and
write actions within the transaction happen without any other writes to
the database by other Rails instances?

Sorry for misunderstanding – this is exactly the use case where you’d
want to use transactions. Transactions guarantee the robustness that you
seek by locking at the database level. It’s controlled by the database,
not by Rails, so it’s guaranteed that other Rails instances won’t
“interfere”.

If you’re worried about performance and really do have the rapid pace of
transactions that you describe, then I recommend picking up a book about
database management and weighing your options. Many simultaneous
database transactions can congest the database, in which case you might
still be better off with a batch run like the one I was referring to.
But really, that’d be a pretty complex scenario you’d be running –
don’t worry about it too much until you hit that wall.


Roderick van Domburg

Lionel B. wrote:

Amir Helzer wrote the following on 26.07.2007 20:29 :

I’m changing accounts and balances, creating account lines and creating
invoices. Each goes to a different table in the (same) database. If I
just wrap it all in a transaction, will that imply that all the read and
write actions within the transaction happen without any other writes to
the database by other Rails instances?

No. You must use serialized transactions for that; ActiveRecord doesn’t
use them by default. And even with them you only get transaction
semantics: either all of it succeeds (no incompatible concurrent writes)
or it rolls back all changes (and you are responsible for doing it all
again in the new context).

That’s a truly interesting point you raise.

It should be easy to patch this into ActiveRecord. All it needs is a
little #supports_serializable_transactions? and
#begin_db_transaction(*attr) love. I’ll add it to my todo list.

Of course, it seems that you could always do an
ActiveRecord::Base.connection.execute(database_dependent_serializable_start_statement)
yourself as a quick hack to start a serializable transaction.
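
Something like this, say (a sketch assuming PostgreSQL, where SET
TRANSACTION has to be issued before any query in the transaction):

  ActiveRecord::Base.transaction do
    # The statement is database-dependent; this is the PostgreSQL form.
    ActiveRecord::Base.connection.execute(
      'SET TRANSACTION ISOLATION LEVEL SERIALIZABLE')
    # ... reads and writes that need serializable isolation ...
  end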

What do you think? Any experience with this?


Roderick van Domburg

The problem is far more complex than that.

First, here’s a gift:

  # This library extends the ActiveRecord::Base class
  # to provide transactions with SERIALIZABLE isolation
  # level for PostgreSQL.
  class ActiveRecord::Base
    # This is a transaction in serialized mode
    def self.serialized_transaction(*objects, &block)
      if self.connection.is_a? ActiveRecord::ConnectionAdapters::PostgreSQLAdapter
        serial_sql = 'SET TRANSACTION ISOLATION LEVEL SERIALIZABLE'
        # Pass a modified block to the original transaction method
        self.transaction(*objects) {
          # Serialize on this connection
          connection.execute(serial_sql)
          objects.each { |o|
            # Serialize on each object's connection, if any
            o.connection.execute(serial_sql)
          }
          block.call
        }
      else
        self.transaction(*objects) {
          block.call
        }
      end
    end
  end

The problem is that this is far from enough. You either:

  • have to lock the affected rows in the transaction, as documented in
    ActiveRecord::Base.find (in this case serialized statements aren’t
    needed, but you’ll have to make sure that all affected rows are
    properly locked not only in the method you want to protect but in
    every other possible access to the same data), or
  • abstract these modifications and make sure you only issue statements
    that can be serialized by the database each time you touch the data.

For example:

  Person.transaction do
    person = Person.find(1, :lock => true)
    person.visits += 1
    person.save!
  end

is OK in itself, but if you have another Person(1) instance elsewhere
which is fetched before this transaction and saved after it, the lock
doesn’t matter: visits will be reset to its previous value.

Using serializable transactions gives better performance and leaves less
room for error:

  class Person
    def self.inc_visits(person_id)
      serialized_transaction do
        update_all('visits = visits + 1', [ 'id = ?', person_id ])
      end
    end
  end

The good thing about this code is that concurrent updates are supported:
the right number of visits will be stored at the end of all concurrent
Person.inc_visits calls, without explicit locking. You can update
different parts of the same row without the database even having to hand
you explicit locks (and the DB can optimize it).

But the same problem remains: if a separate instance of Person for the
affected row is created and saved at the wrong moment, you are screwed.

My advice would be to move data that must be updated like that into a
separate model and use serializable transactions (or locks if your DB
doesn’t support this type of transaction) on each and every access to
this model; see the sketch after this paragraph. If you don’t, you’ll
have to remember that all updates to a model where you had to lock
somewhere in your code must use locks too (:lock => true will then show
up everywhere in your code…) or serializable transactions…
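
A minimal sketch of that advice (the Balance model and its columns are
hypothetical, and serialized_transaction is the method defined above):

  class Balance < ActiveRecord::Base
    # Every write goes through class methods like this one, so no
    # caller can touch the table outside a serializable transaction.
    def self.credit(account_id, amount)
      serialized_transaction do
        update_all(['amount = amount + ?', amount],
                   ['account_id = ?', account_id])
      end
    end
  end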

Lionel.

On Jul 26, 6:41 pm, snacktime [email protected] wrote:

Some more information on your schema/model might help us give you some
more pointers on the simplest way to accomplish what you are after.

My case is like this:

  • Different users have accounts. Each account has a balance.
  • There’s a table for transactions. A transaction has an amount, a
    from_account_id and a to_account_id.
  • A separate table holds account_lines. For each transaction, there’s
    an account line per account, showing the balance after the
    transaction.
  • A final table is for invoices. Invoice has_many transactions. When
    an invoice is paid, all its transactions are executed, balances are
    updated and account lines are created.

Some transactions are created by user requests and some by the payment
gateway notifying of payment clearance. Some accounts will hardly ever
update concurrently (because they belong to individual users), while
other accounts (like the system accounts that hold fees) will update
very frequently.

That’s it, more or less.

As I understand it, a separate process for executing transactions would
completely guarantee that things are always coherent. Overall, money
movements in the system should be a small part of all DB queries (maybe
1%).

What do you think?

Amir

I’ll throw a couple more things in here.

  • If you use serialized transactions you also have to be prepared to
    deal with conflicts and redo transactions.

  • You should, and probably can, find a way to design your model so
    that you only need to serialize access to a single table at most. I
    really can’t think of a well designed schema off the top of my head
    that requires rows in different tables to all be inserted together
    in isolation. SELECT FOR UPDATE is another nice tool, at least in
    PostgreSQL (see the sketch after this list).
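
For instance, a sketch of a balance transfer with row locks, assuming a
MoneyAccount model for the accounts table (:lock => true makes Rails
append FOR UPDATE to the SELECT, so concurrent transfers touching these
rows block until the transaction commits):

  MoneyAccount.transaction do
    from = MoneyAccount.find(from_id, :lock => true)
    to   = MoneyAccount.find(to_id, :lock => true)
    from.update_attribute(:balance, from.balance - amount)
    to.update_attribute(:balance, to.balance + amount)
  end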

You also need to know how your database works. PostgreSQL (and, from
what I’ve heard, Oracle too) uses multi-version concurrency control as
opposed to explicit locking, which can change how you structure your
queries and transactions depending on which database you use.

Some more information on your schema/model might help us give you some
more pointers on the simplest way to accomplish what you are after.

Chris

Thanks for your thoughts and code Lionel!


Roderick van Domburg

Hi again,

I’m trying to use transactions, but it doesn’t seem like anything
transaction-related is actually being sent to MySQL.
This is my code:

logger.info " ---------- BEFORE: #{from_account.balance},
#{to_account.balance}"
logger.info " --------------- TRANSACTION BEGIN "
begin
MoneyAccount.transaction do
from_account.update_attributes(:balance => (from_account.balance -
amount))
some_error
to_account.update_attributes(:balance => (to_account.balance +
amount))
end
rescue
from_account.reload
to_account.reload
end
logger.info " --------------- TRANSACTION END "
logger.info " ---------- AFTER: #{from_account.balance},
#{to_account.balance}"


In the development log, I’m not seeing any transaction-related
statements, just:

  ---------- BEFORE: 1000.0, 0.0
  --------------- TRANSACTION BEGIN
  UserAccount Update (0.000391)  UPDATE money_accounts SET currency_id = 1, owner_id = 2, balance = '972.65', type = 'UserAccount' WHERE id = 1
  UserAccount Load (0.000330)  SELECT * FROM money_accounts WHERE (money_accounts.id = 1) AND ( (money_accounts.type = 'UserAccount' ) )
  BidAccount Load (0.000231)  SELECT * FROM money_accounts WHERE (money_accounts.id = 58) AND ( (money_accounts.type = 'BidAccount' ) )
  --------------- TRANSACTION END
  ---------- AFTER: 972.65, 0.0

My database type is MySQL-5 and the tables are all InnoDB.

I think this should all be wrapped in BEGIN and COMMIT, but they’re not
there, and the updates are not committed together.

Thanks,
Amir

On 7/27/07, helzer [email protected] wrote:

from_account_id and to_account_id.
they belong to users), while other accounts (like system account that
hold fees) will update very frequently.

That’s it, more or less.

As I understand it, a separate process for executing transactions would
completely guarantee that things are always coherent. Overall, money
movements in the system should be a small part of all DB queries (maybe
1%).

What do you think?

If it’s just the accounts table that has the balance column, then using
SELECT FOR UPDATE might be enough. It really depends on the details of
how you are updating everything. Here is a good place to start:

http://developer.postgresql.org/pgdocs/postgres/transaction-iso.html

helzer wrote the following on 30.07.2007 16:53 :

Oops. It’s only like that in test mode. Development and production
show the transactions.

Is there anything I can set so that I can have it in test too?

Transactional fixtures sometimes interfere with transactions (they wrap
each test in its own transaction, which can actually make your own
transactions no-ops).

In your test classes you may want to try:
self.use_transactional_fixtures = false

You can put it in Test::Unit::TestCase in test/test_helper.rb if you
want it for all your test classes.
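
For example, a sketch of the global version:

  # test/test_helper.rb
  class Test::Unit::TestCase
    self.use_transactional_fixtures = false
  end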

Lionel.

Thanks Lionel, Nick, Roderick and everyone who helped here.

I started out with an overly complicated idea and now I think I’ve got
it right, so let me summarize:

  • I use transaction blocks to make sure that any failure rolls back
    both the withdrawals and the deposits (so money doesn’t vanish into
    thin air).
  • To make sure that concurrent operations from different processes
    don’t spoil the data, I use optimistic locking (Agile Web
    Development with Rails, page 389).

The block of DB operations is wrapped in begin-rescue. In the rescue
section, I reload all written-to objects. This is a MUST if you want to
retry the operation, because otherwise the optimistic locking will just
keep failing (the objects never refresh their lock_version from the
updated database).

The application retries failed operations.

This is roughly how it all looks in my implementation now:

  ok = false
  attempts = 0
  while !ok && (attempts < MAX)
    begin
      Account.transaction do
        move_money()
        setup_stuff()
      end
      ok = true
    rescue ActiveRecord::StaleObjectError
      # Optimistic locking failed; refresh lock_version before retrying
      reload_accounts()
      attempts += 1
    end
  end

Amir

Oops. It’s only like that in test mode. Development and production
show the transactions.

Is there anything I can set so that I can have it in test too?

Amir