Looking for comments on a rails patch, lazy association instantiation

Prompted by some discussion on rails-talk last week, I worked up a
patch to add an option to ActiveRecord::Base#find which allows
associations to be :included for purposes of the where clause, but
defers instantiating the included association objects.

The use case is where there are a large number of association rows
returned by the query, the vast majority will not be used since they
are only being used to select a small set of root objects.

Here’s the ticket with the patch:
http://dev.rubyonrails.org/ticket/9923


Rick DeNatale

My blog on Ruby
http://talklikeaduck.denhaven2.com/

Sorry I intended this to go to the rails-talk list but gmail outsmarted
me.

On 10/24/07, Rick DeNatale [email protected] wrote:

http://dev.rubyonrails.org/ticket/9923


Rick DeNatale

My blog on Ruby
http://talklikeaduck.denhaven2.com/


Rick DeNatale

My blog on Ruby
http://talklikeaduck.denhaven2.com/

On Oct 24, 2007, at 3:55 PM, Rick DeNatale wrote:

Here’s the ticket with the patch:
http://dev.rubyonrails.org/ticket/9923

I don’t mean to be obtuse, but I don’t understand why a new feature
is needed. The :joins option works fine for me for finding records
based on joins to other tables. Inner joins are a better fit for
selection conditions when you don’t want the eagerly loaded records.
What you are doing with your approach is an outer join (since that’s
how eager loading works). That includes all the result data for the
eagerly loaded records, which you’ll be dropping on the floor and
ignoring. That’s making the database work a lot more for no reason.

If you wanted something friendlier than a SQL joins clause string,
like the hash/array syntax used by the :include option, that could be
a good alternate syntax for :joins. Much more Rails-like too. I’d
be interested in seeing something that lets you do:

Post.find(:all, :joins => {:comments => :author}, :conditions =>
[“authors.name = ?”, name]

That should generate SQL

SELECT posts.* FROM posts
INNER JOIN comments ON post.id = comments.post_id
INNER JOIN authors ON comment_id = authors.id
WHERE authors.name = ‘Bob Dobbs’;

It’s a radically different approach than yours in how you’d do it in
ActiveRecord code, so it might not be something you want to take on.
If not, I might give it a shot in my copious spare time.

Disclaimer: It’s late on a Friday. Take what I say with a few grains
of salt!


Josh S.
http://blog.hasmanythrough.com

On 10/26/07, Josh S. [email protected] wrote:

If you wanted something friendlier than a SQL joins clause string,
like the hash/array syntax used by the :include option, that could be
a good alternate syntax for :joins. Much more Rails-like too. I’d
be interested in seeing something that lets you do:

Post.find(:all, :joins => {:comments => :author}, :conditions =>
[“authors.name = ?”, name]

Okay, I just reread your post, and this seems to be the nugget.

You’re proposing an extension of the value of the :joins option which
if used would trigger the same kind of object identity preservation of
the instances of the AR class receiving the find message.

That makes sense I think.


Rick DeNatale

My blog on Ruby
http://talklikeaduck.denhaven2.com/

On 10/26/07, Josh S. [email protected] wrote:

returned by the query, the vast majority will not be used since they
What you are doing with your approach is an outer join (since that’s
how eager loading works). That includes all the result data for the
eagerly loaded records, which you’ll be dropping on the floor and
ignoring. That’s making the database work a lot more for no reason.

SELECT posts.* FROM posts
INNER JOIN comments ON post.id = comments.post_id
INNER JOIN authors ON comment_id = authors.id
WHERE authors.name = ‘Bob Dobbs’;

It’s a radically different approach than yours in how you’d do it in
ActiveRecord code, so it might not be something you want to take on.
If not, I might give it a shot in my copious spare time.

Well your SQL foo is no doubt stronger than mine, but in my use case I
tried :join and it didn’t work.

Here’s my motivating case:

I’ve got one model called Item which has_many Schedules. Schedules
have a start-time.
There are LOTS or schedules.

I’m trying to get all of the Items which have at least one Schedule
which hasn’t yet started, the purpose of this is to populate a
selection list.

Once one of those is selected, the list of schedules owned by that
Item is presented for selection.

So, it was suggested to me here that

Items.find(:all, :include => :schedules :conditions =>
[“schedules.begin_date >= ?”, horizon_time]

was the best way to do this.

It works, but since there are many, many schedules in this database,
instantiating all their AR objects is costly in time and space.

Now I tried substituting :joins and doing an inner join. The result
was that I got duplicate AR objects for my Items.

The deal is that when you use :include, AR does the query, and then
iterates over the result and instantiates only one object for each
unique row in the first table. It then adds the objects for the
included tables to the associations.

The join option does the query and just builds one AR object from each
row of the result which isn’t what I need.

Disclaimer: It’s late on a Friday. Take what I say with a few grains
of salt!

It’s even later here, (you are on the west coast aren’t you. )


Rick DeNatale

My blog on Ruby
http://talklikeaduck.denhaven2.com/