How to validate uniqueness across multiple columns in ActiveRecord?

Hi,

From a model class, what is the standard way of ensuring uniqueness
across multiple columns (attributes) simultaneously?

In other words, I want to ensure only unique “rows” are persisted to
storage. I feel like there must be a straightforward way to do this.
Any suggestions?

Thanks,

Grar

On Feb 22, 8:52 pm, Grary [email protected] wrote:

Hi,

From a model class, what is the standard way of ensuring uniqueness
across multiple columns (attributes) simultaneously?

In other words, I want to ensure only unique “rows” are persisted to
storage. I feel like there must be a straightforward way to do this.
Any suggestions?

validate_uniqueness’ scope option does this (but you should really be
using a unique index as well.)

Fred

Thanks Fred.

What do you mean by ‘use a unique index’?

What I am concerned about is logical uniqueness, i.e., persist only
data objects with a unique combination of attributes.

Thanks,

Grar

On Feb 22, 3:54 pm, Frederick C. [email protected]

Rick,

Thanks. That sounds exactly like what I’m after…

Grary

On 23 February 2010 01:30, Rick DeNatale [email protected]
wrote:

resolve it.
If you add a column called “combined_params” or similar, you can add a
before_validate method to concatenate all of your parameters and write
it to this column, this will let you do AR “validates_uniqueness_of”
checks too.

It would actually remove the need to have the key added across the
columns in the DB, but I’d leave that as a safety net, as it would be
too easy to accidentally (or deliberately) change a row’s
“combined_params” column, and risk duplicates.

Another fine idea, it would appear. But why do both? Does checking a
combined parameters column before validation save me some kind of
performance hit?

Grar

On 23 February 2010 20:25, Grary [email protected] wrote:

Another fine idea, it would appear. But why do both? Does checking a
combined parameters column before validation save me some kind of
performance hit?

Yes it does a little… but you have to weigh up what hits are going
to happen (performance or otherwise) if non-unique rows get in the
table.
If your job is on the line, then having an index across the fields is
bullet proof, and processors are cheap… (ish)

Being able to check “combined parameters” is just gravy - an added
bonus that’ll make the Rails coding simpler, but the index is what
making sure.

On Mon, Feb 22, 2010 at 4:54 PM, Grary [email protected] wrote:

Thanks Fred.

What do you mean by ‘use a unique index’?

What I am concerned about is logical uniqueness, i.e., persist only
data objects with a unique combination of attributes.

So if I understand correctly, lets say there are two attributes a, and b

You want to make sure that no two models have the same COMBINATION of
values for a and b,

so having two models with:

a = 1, b = 2
a = 1, b = 3

would not be a conflict

If that’s the case then the standard validation

class Widget < ActiveRecord::Base
validates_uniqueness_of :a, :b
end

wouldn’t work since it tries to prevent saving two models with the
same value of a, OR with the same value of b

And even if that’s not what you’re trying to do, and you’re ok with
the example being a conflict, Fred’s point is that
validates_uniqueness_of doesn’t guarantee uniqueness if two users try
to save conflicting records simultaneously. The validation works by
first trying to find a record with the value, and if it doesn’t find
it inserting the ‘new’ record, and this can fail due to a concurrency
hole.

To fill this hole requires leaning on the database server, and the way
to do that in SQL is by having a unique index on the table which
covers the column or columns you want to be unique. This assume you
are using a database which supports it, e.g. MySql.

To create an index you can create a migration which includes a statement
like

add_index :widgets, [:a, :b], :unique => true)

Assuming that the table name for the model is ‘widgets’

Now if you do this, you also need to be aware that if you try to save
a record with a uniqueness conflict the save will raise an
ActiveRecord::StatementInvalid exception, which you’ll need to rescue
and do something like telling the user of the conflict so that he can
resolve it.

Rick DeNatale

Blog: http://talklikeaduck.denhaven2.com/
Twitter: http://twitter.com/RickDeNatale
WWR: http://www.workingwithrails.com/person/9021-rick-denatale
LinkedIn: Rick DeNatale - Developer - IBM | LinkedIn

@Robert - Yes, I read that portion of the documentation and thanks for
the reminder.

Otherwise, of potential interest to discussants…

I now have a migration adding a unique index to a model, like so:

add_index :projects, [:name, :street_address, :city, :state, :zip],
:unique => true

where only some subset of the attributes – [name, zip], [name, city,
state], [street_address, city, state], etc. – are required in the
Project model.

In testing, only when all the properties named in my add_index method
are passed is the exception properly thrown.

For example:

assert_raise ActiveRecord::StatementInvalid do
  exceptionable_proj = Project.new
  exceptionable_proj.city = city
  exceptionable_proj.street_address = street_address
  exceptionable_proj.state = state
  exceptionable_proj.name = name
  exceptionable_proj.zip = zip
  exceptionable_proj.save

  exceptionable_proj2 = Project.new
  exceptionable_proj2.city = city
  exceptionable_proj2.street_address = street_address
  exceptionable_proj2.state = state
  exceptionable_proj2.name = name
  exceptionable_proj2.zip = zip
  exceptionable_proj2.save
end

The above passes, whereas the test below does not:

assert_raise ActiveRecord::StatementInvalid do
  exceptionable_proj3 = Project.new
  exceptionable_proj3.street_address = street_address2
  exceptionable_proj3.zip = zip2
  exceptionable_proj3.save

  exceptionable_proj4 = Project.new
  exceptionable_proj4.street_address = street_address2
  exceptionable_proj4.zip = zip2
  exceptionable_proj4.save
end

Grar

Just to be crystal clear, what a number of these replies are attempting
to tell you is that you cannot rely on validates_uniqueness_of.

Excerpt from the Rails docs on validation:

Concurrency and integrity

Using this validation method in conjunction with ActiveRecord::Base#save
does not guarantee the absence of duplicate record insertions, because
uniqueness checks on the application level are inherently prone to race
conditions. For example, suppose that two users try to post a Comment at
the same time, and a Comment’s title must be unique. At the
database-level, the actions performed by these users could be
interleaved in the following manner:

If two separate requests are received at virtually the same instant then
validate_uniqueness_of can fail silently. You will end up with
duplicates in the database that your validation specifically tries to
avoid.

This is why the unique index across the two columns is necessary. This
can only be reliably prevented at the database level. If this index
exists and the validates_uniqueness_of fails due to a race condition the
database layer will raise and exception. So you need to be prepared for
this, even when using validates_uniqueness_of, by rescuing from the
possible exception.