About duplication in a (HABTM) join table

I’m new to Rails and databases, so needless to say I’m pretty confused
here. There’s an issue I don’t understand how to resolve; heck, I
don’t even know if it is an issue. I’ve searched for answers here, in
forums, and on IRC, and found nothing. Anyway, I’ll try to be as clear
as I can, and I apologize for the many questions in one thread.

Let’s say I have two tables, girls and boys (to spice up this topic),
with their respective models, which in turn have a
has_and_belongs_to_many relationship. For that relationship to work, I
have a join table called boys_girls, with two columns, boy_id and
girl_id.

In Rails, I create a boy called brad and a girl called angelina.
Now if I do:
brad.girls << angelina
brad.girls << angelina

Not only will angelina appear twice in his array (if only this stuff
worked in real life), but the relationship will also appear in two
rows of the join table.

First question: as far as database performance and size goes, is this
a problem?

Anyway, if I add :uniq => true to has_and_belongs_to_many in the
models, ActiveRecord will successfully ignore the duplicates when I
reload the objects. But adding still behaves the way I described
above: if I add a duplicate, it shows up in the already-loaded array
and a second row is still inserted into the table.

So, my second question is, how do I avoid this duplication?
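
One low-tech answer, sketched here in plain Ruby so it runs without Rails: check for membership before appending. add_once is a hypothetical helper name, not a Rails method; it only relies on the collection responding to include? and <<, which a HABTM association collection does as well as an Array. The check and the insert are two separate steps, so this alone does not protect against two requests adding the same pair at the same time.

```ruby
# Hypothetical helper: append item only if the collection does not already hold it.
# Returns true when the item was added, false when it was already present.
def add_once(collection, item)
  return false if collection.include?(item)
  collection << item
  true
end

# A plain array stands in for brad.girls here.
girls = []
add_once(girls, "angelina")
add_once(girls, "angelina") # second call leaves the array unchanged
```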

I found in the Agile Web Dev book that I can add an index to the join
table right in the migration, and pass :unique => true to the
add_index call. I have tried it, and saw no difference. I suppose it
only configures the index to ignore duplicates, which would then
resolve performance issues?
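
For reference, a sketch of what an index covering both columns could look like, written in the older migration style that matches the hash syntax used in this thread. The class name is made up, and newer Rails versions use a versioned superclass and a change method instead. A unique index spanning both columns is normally enforced by the database itself: inserting the same pair a second time fails with a database error rather than being silently ignored, so the calling code has to be ready to handle that error.

```ruby
# Sketch only: the class name is hypothetical.
class AddUniqueIndexToBoysGirls < ActiveRecord::Migration
  def self.up
    # One index over the pair, so each (boy_id, girl_id) combination can exist only once.
    add_index :boys_girls, [:boy_id, :girl_id], :unique => true
  end

  def self.down
    remove_index :boys_girls, :column => [:boy_id, :girl_id]
  end
end
```

If the table already contains duplicate rows, they would have to be removed before the database can build the unique index.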

Also, I have read that validates_uniqueness_of can cover several
columns through :scope, but I’m not sure how to do that. Am I right to
say that :scope only limits the uniqueness check to a given set of
rows? In that case it wouldn’t help, right?
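
On the :scope question, a plain-Ruby sketch of the check may help. It models the idea only; it is not how ActiveRecord implements the validation, and Row and pair_taken? are made-up names. Scoping boy_id to girl_id means that boy_id only has to be unique among the rows that share the same girl_id, which amounts to saying that each (boy_id, girl_id) pair may appear once.

```ruby
# Made-up stand-in for one row of the join table.
Row = Struct.new(:boy_id, :girl_id)

# True when a row with the same boy_id already exists among
# the rows that share the candidate's girl_id (the "scope").
def pair_taken?(rows, candidate)
  in_scope = rows.select { |row| row.girl_id == candidate.girl_id }
  in_scope.any? { |row| row.boy_id == candidate.boy_id }
end

rows = [Row.new(1, 1)]             # brad (1) is linked to angelina (1)
pair_taken?(rows, Row.new(1, 1))   # same pair again: taken
pair_taken?(rows, Row.new(1, 2))   # brad with a different girl: free
pair_taken?(rows, Row.new(2, 1))   # a different boy with angelina: free
```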

And, in case validation in the model is the way to go, where should I
put it? In the Girl and Boy models? Or should I create a model to
represent the join table rows and do that validation there?

(Boys and girls might not exactly illustrate what I need. In my case,
I really don’t need to add more information to the join table, so
validation would possibly be the only reason to create a third model.)

I resolved the issue with the last option, creating a join model
(let’s say I called it Bond), and a has_many through call within both
Girl and Boy. Then all I had to do was add this to the join model:

validates_uniqueness_of :boy_id, :scope => :girl_id

Now everything works perfectly: the models prevent duplicates from
being saved, returning a validation error if I try to save the same
item twice in the same collection.
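
For anyone reading later, the setup described above might look roughly like this. It is a reconstruction from the description, not the exact code: the association name bonds and a bonds table with boy_id and girl_id columns are assumptions.

```ruby
# Join model; assumes a "bonds" table with boy_id and girl_id columns.
class Bond < ActiveRecord::Base
  belongs_to :boy
  belongs_to :girl

  # A given boy may be bonded to a given girl only once.
  validates_uniqueness_of :boy_id, :scope => :girl_id
end

class Boy < ActiveRecord::Base
  has_many :bonds
  has_many :girls, :through => :bonds
end

class Girl < ActiveRecord::Base
  has_many :bonds
  has_many :boys, :through => :bonds
end
```

The validation runs in the application before the insert, so on its own it does not rule out two simultaneous requests saving the same pair; a unique database index on the two columns covers that case.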

As for having duplicate items in a join table, in database terms: I
talked to someone with much more database experience than me, who said
that first, it’s messy, and second, it could become a problem with a
very large user base (thousands of users with thousands of items…).
So I guess the right way is to avoid duplication, especially when
deploying for a large user base, but for an internal tool with few
users it’s not a terrible issue.

Thanks for posting your solution. I was getting into habtm/through and
I hadn’t seen how obvious it is to use through with validations.

This could qualify as the first installment of a Rails Quiz. Want to
start it?


View this message in context:
http://www.nabble.com/about-duplication-in-a-(HABTM)-join-table-tf4315183.html#a12311094
Sent from the RubyOnRails Users mailing list archive at Nabble.com.

hi peter. a rails quiz? :) go ahead, use it…

although further on I won’t be of much help: I might have a lot of
questions, but no answers. I’m really new to rails and backend work,
which is also a good reason not to take what I say for granted.

oh well. leaving tomorrow for holidays. only sand and trees. no join
tables and validations there.