Validation, concurrency, and transactions

Michael_McGreevy · August 23, 2006, 10:44pm

Hi,

Something has been bothering me about model validation: how do I know
that
the database has not changed in between when I validate my model’s data,
and
when it actually gets saved to the database? As a simple example, say I
have
a User model like:

class User
validates_uniqueness_of :username
end

When this gets saved to the database, the validation code checks that
there is
not already a row containing the same value for username as in our
object.
The object is then saved to the database and everyone is happy. But
what if
there is another user trying to register with the same username at the
same
time? I am assuming that with load balancing situations is it possible
that
two rails processes could try to save a User object at the same time.
So
what happens if the User object in each process checks that its username
is
unique, and then each User object does an insert into the database. Is
the
uniqueness constraint not violated?

One option would be to have a database-level uniqueness constraint, but
I’ve
heard that this is not the “rails way”, and in any case, this is just a
simple example. In general the validation code could be arbitrarily
complex.

The other option that I can see is to use a transaction around the save
operation, and define an after_save callback that re-validates the data
in
the database. If (for the username example above) there is more than
one
identical username, the after_save callback could throw an exception
which
(as I understand it) would back out the transaction. Although if this
is
done, perhaps it is not worth validating the model before saving it in
the
first place, since we would catch any errors in after_save.

Is this the way that this problem is usually handled? Or am I seeing a
problem where one does not really exist?

Thanks,

Michael McGreevy.

Michael_McGreevy · August 23, 2006, 11:04pm

Michael McGreevy wrote:

Hi,

Something has been bothering me about model validation: how do I know
that
the database has not changed in between when I validate my model’s data,
and
when it actually gets saved to the database? As a simple example, say I
have
a User model like:

class User
validates_uniqueness_of :username
end

When this gets saved to the database, the validation code checks that
there is
not already a row containing the same value for username as in our
object.
The object is then saved to the database and everyone is happy. But
what if
there is another user trying to register with the same username at the
same
time? I am assuming that with load balancing situations is it possible
that
two rails processes could try to save a User object at the same time.
So
what happens if the User object in each process checks that its username
is
unique, and then each User object does an insert into the database. Is
the
uniqueness constraint not violated?

One option would be to have a database-level uniqueness constraint, but
I’ve
heard that this is not the “rails way”, and in any case, this is just a
simple example. In general the validation code could be arbitrarily
complex.

The other option that I can see is to use a transaction around the save
operation, and define an after_save callback that re-validates the data
in
the database. If (for the username example above) there is more than
one
identical username, the after_save callback could throw an exception
which
(as I understand it) would back out the transaction. Although if this
is
done, perhaps it is not worth validating the model before saving it in
the
first place, since we would catch any errors in after_save.

Is this the way that this problem is usually handled? Or am I seeing a
problem where one does not really exist?

Thanks,

Michael McGreevy.

Michael

As you rightly pointed out validate_uniqueness_of is not very reliable,
I would agree with you that you can catch the exception in after_save.
But that is not enough reason to remove validate_uniqueness_of
completely, because I think behaves like a first line of defense (though
not primary).

Let me know if you come across anyother ideas.

-daya

Michael_McGreevy · August 23, 2006, 11:04pm

not already a row containing the same value for username as in our object.
simple example. In general the validation code could be arbitrarily complex.
problem where one does not really exist?
I think I’d put a unique constraint in the database and be done with it.
Treat the problem at the source so to speak…

Rails shouldn’t care if it’s there… it will fail on save and you can
deal with it at that point…

It’s kind of like javascript form validation… it’s nice, but you also
need to do it on the server if you care about it at all…

2 cents.

-philip

Michael_McGreevy · August 23, 2006, 11:08pm

On 8/23/06, Michael McGreevy [email protected]
wrote:

same
I’ve
first place, since we would catch any errors in after_save.

Is this the way that this problem is usually handled? Or am I seeing a
problem where one does not really exist?

This is a problem with an validation which fetches data before the save.
Without a lock or serializing transactions, the validation is not a
guarantee. Use a database constraint as a fallback.

jeremy

Michael_McGreevy · August 23, 2006, 11:40pm

On 8/23/06, Daya S. [email protected] wrote:

As you rightly pointed out validate_uniqueness_of is not very reliable,
I would agree with you that you can catch the exception in after_save.
But that is not enough reason to remove validate_uniqueness_of
completely, because I think behaves like a first line of defense (though
not primary).

Let me know if you come across anyother ideas.

If more databases had coherent, parseable error messages, we could catch
the
database exception and populate record.errors. Otherwise we could
establish
a naming convention for database constraints so we can deduce which
validation failed. I tried these with PostgreSQL but was dissatisfied
with
its error reporting and naming conventions are a pain without migrations
support to do it for you.

jeremy

Michael_McGreevy · August 23, 2006, 11:30pm

On Wednesday 23 August 2006 22:52, Philip H. wrote:

Something has been bothering me about model validation: how do I know
that the database has not changed in between when I validate my model’s
data, and when it actually gets saved to the database?

…

One option would be to have a database-level uniqueness constraint, but
I’ve heard that this is not the “rails way”, and in any case, this is
just a simple example. In general the validation code could be
arbitrarily complex.

I think I’d put a unique constraint in the database and be done with it.
Treat the problem at the source so to speak…

Rails shouldn’t care if it’s there… it will fail on save and you can
deal with it at that point…

That’s true, but as I mentioned, I have other cases where I need to do
more
complex validations, which would require the use of stored procedures (I
think – I’ve never written one) if I wanted to validate at the database
level, and I think this would be highly undesirable.

I suppose the problem is that there are two types of validation, which
I’ll
call “local” and “global”. Local validation doesn’t care about any data
which is already stored in the database (e.g. validates_length_of,
validates_presence_of), and can safely be handled in isolation by
validating
before saving. Global validation is concerned with how the data in a
model
fits in with the existing data in the database (e.g. a custom validation
that
ensures that person is not an ancestor of himself) and cannot be handled
in
isolation. All the examples of Rails code that I have seen, however,
treat
“global” validation problems as “local” ones, at least to the extent
that
they don’t check the integrity of the database after the update.
Admittedly
I haven’t seen that much Rails code. So if anyone has a “best practice”
counter example, I’d love to see it.

Michael McGreevy.