How should I select rows from a join-model based on more tha

A concrete example:

We’re building a system to organize the information about workshops
being held in various conventions. A convention has more than one
workshop, and each workshop can be held in more than one convention.
Also each workshop has a host, who is specific to a workshop being
held in a specific convention.

For example, Matz may host an “Introduction to Ruby” workshop in
RubyConf. And _why may host a the same workshop in RailsConf. This
operates through the join-model convention_workshops, which has the
following rows: convention_id, workshop_id, host_id, start_time. It
belongs_to :convention, :workshop, and :host.

Associations make it very easy to select by one association. If I want
to get the start_time of all workshops held on RubyConf Matz, I just
find RubyConf’s model instance and call
#convention_workshops.collect{|cw| cw.star_time} on it. But suppose we
want to select the start_time for only the workshops hosted by Matz on
RubyConf. How should we do that?

The SQL way would be:

SELECT convention_workshops.start_time
FROM convention_workshops, conventions, hosts
WHERE conventions.id=convention_workshops.convention_id
AND conventions.name=‘RubyConf’
AND convention_workshops.host_id=hosts.id
AND hosts.name=‘Matz’;

I’m pretty sure there shouldn’t be a need to write all that SQL for
Rails, especially after we bothered to set up all the association on
the code level. But what is it?

For each table you will need the models:

Table Model
workshops -> workshop.rb
conventions -> convention.rb
hosts -> host.rb

Relations:

Tables:
create table convetions_workshops (
workship_id int null,
convention_id int null,
host_id in null,
start_at datetime null
);

In workshop.rb:
has_and_belongs_to_many :conventions

In convention.rb
has_and_belongs_to_many :workshops

---- If you have all set up as above the query would be

Query:
@conventions = Convention.find(:all, :condition => “name = ‘RubyConf’”)
@conventions.workshops.select {|p| p.host.name = ‘Matz’}
or if you only want the times:
@conventions.workshops.select {|p| p.host.name = ‘Matz’}.collect{|t|
t.start_at}

The other way is to decompose the sql into ruby syntax which would look
like:
Convention.find(:all, :select =>"", :joins => “”, :conditions =>"")

Alder G. wrote:

For example, Matz may host an “Introduction to Ruby” workshop in
RubyConf. And _why may host a the same workshop in RailsConf. This
operates through the join-model convention_workshops, which has the
following rows: convention_id, workshop_id, host_id, start_time. It
belongs_to :convention, :workshop, and :host.

Associations make it very easy to select by one association. If I want
to get the start_time of all workshops held on RubyConf Matz, I just
find RubyConf’s model instance and call
#convention_workshops.collect{|cw| cw.star_time} on it. But suppose we
want to select the start_time for only the workshops hosted by Matz on
RubyConf. How should we do that?

The SQL way would be:

SELECT convention_workshops.start_time
FROM convention_workshops, conventions, hosts
WHERE conventions.id=convention_workshops.convention_id
AND conventions.name=‘RubyConf’
AND convention_workshops.host_id=hosts.id
AND hosts.name=‘Matz’;

I’m pretty sure there shouldn’t be a need to write all that SQL for
Rails, especially after we bothered to set up all the association on
the code level. But what is it?

You’re right that doing a SQL SELECT just for the start_time is not
idiomatic Rails. You’re on the right track with your collect() filter.
The trick is getting the join model rows based on two set foreign keys.
By the way, I suggest calling your join model something like
Presentation or Engagement, rather than the convention_workshops table
name which is probably too easy to confuse with a habtm join table.

in Host:
has_many :engagements
has_many :workshops, :through => :engagements
has_many :conventions, :through => :engagements

matz = Host.find_by_name(“Matz”)
matz_engagements = matz.engagements.find(:all, :conditions =>
[“convention_id = ?”, con_id])

Now that you have the engagements, you can ask each for the workshop and
its start_time.

If you don’t like the query with the conditions option, you can hide
that by using an association:

has_many :engagements do
def by_convention(convention)
find(:all, :conditions => [“convention_id = ?”, convention])
end
end

Then you’d write
matz_engagements = matz.engagements.by_convention(convention)

Of course you can do the symmetric thing in the Convention model.


Josh S.
http://blog.hasmanythrough.com

On 5/27/06, Roland M. [email protected] wrote:

create table convetions_workshops (
In convention.rb

Thanks, Roland

I’m weighing your select solution against an alternative suggestion by
Josh below. You’re welcome to chime in if interested :slight_smile:

On 5/27/06, Josh S. [email protected] wrote:

#convention_workshops.collect{|cw| cw.star_time} on it. But suppose we
AND hosts.name=‘Matz’;
name which is probably too easy to confuse with a habtm join table.

in Host:
has_many :engagements
has_many :workshops, :through => :engagements
has_many :conventions, :through => :engagements

matz = Host.find_by_name(“Matz”)
matz_engagements = matz.engagements.find(:all, :conditions =>
[“convention_id = ?”, con_id])

Actually, I’d be looking for engagements when only the convention’s
name (as a string) is known, so it would be:

matz.engagements.find(:all, :conditions => [“convention_id = ?”,

Convention.find_by_name(‘RubyConf’).id])

But there’s the alternative:

matz.engagements.select {|e| e.convention.name == “RailsConf”}

Which should I pick? Seems like the former might be less efficient,
entailing two queries instead of just one in the later. But I’m sure
you’d have a clearer idea about this than me.

Alder G. wrote:

Actually, I’d be looking for engagements when only the convention’s
name (as a string) is known, so it would be:

matz.engagements.find(:all, :conditions => [“convention_id = ?”,
Convention.find_by_name(‘RubyConf’).id])

But there’s the alternative:

matz.engagements.select {|e| e.convention.name == “RailsConf”}

Which should I pick? Seems like the former might be less efficient,
entailing two queries instead of just one in the later. But I’m sure
you’d have a clearer idea about this than me.

If you already know the convention’s name, wouldn’t you have the model
object for it already? Unless you are storing the name somewhere else
besides the Convention object, which would be redundant. So I don’t
think you’d need an extra query because you’d already have done it.

As for which to pick, only you can decide. You’ll have to look at your
performance data to see which is faster for your setup. Do whatever is
simplest for you, then see if you need to optimize it later.


Josh S.
http://blog.hasmanythrough.com

On 5/27/06, Alder G. [email protected] wrote:

operates through the join-model convention_workshops, which has the
The SQL way would be:
the code level. But what is it?

This assumes the workshop has_and_belongs_to_many conventions.
If ConventionWorkshop is a mode in its own right, you can do the
‘find’ on it directly, making it easier to get at the start time.

Workshop.find :all, :include => [:conventions, :hosts],
:conditions => [“conventions.name = ? and hosts.name = ?”, ‘RubyConf’,
‘Matz’]

If ConventionWorkshop is a real model, you can say:

ConventionWorkshop.find :all, :include => [:hosts, :conventions], etc
etc

On 5/27/06, Josh S. [email protected] wrote:

Which should I pick? Seems like the former might be less efficient,
entailing two queries instead of just one in the later. But I’m sure
you’d have a clearer idea about this than me.

If you already know the convention’s name, wouldn’t you have the model
object for it already?

The convention-name string comes from user input.


As for which to pick, only you can decide. You’ll have to look at your
performance data to see which is faster for your setup. Do whatever is
simplest for you, then see if you need to optimize it later.

Yes. I was wondering since I’m still vague about how the various
method calls translate to actual database queries.

Assuming my newbish eyes haven’t missed some huge implementation
detail of AR, I guess your method should be better for large datasets,
especially if we only need to filter in a very small subset of it.
However, the #select method would be better if the dataset is
reasonably small and we later need some records we filtered out on the
first operation - which would necessitate more queries if filtered on
the database level.

Thanks for helping me again, Josh. I’ve read your post about has_many
:through, and in fact that post was the one which led me to use the
join-models mentioned above. Keep up the good work.

Regards,
-Alder