Find eager loads correct records but generates excess queries (Rails 2.1)


#1

Models:
S :has_many F; F :belongs_to P; S :has_many P :through F.

Query looks like this:
S.find_all_by_key([array of keys], :include => [:F, :P])

This produces the correct and expected result, but very slowly. On the
log:
SELECT * FROM S WHERE KEY IN (keys)
SELECT * FROM F WHERE S_ID IN (S.ids from first query)
SELECT * FROM F WHERE S_ID IN (S.ids from first query) [same query
twice]
SELECT * FROM P WHERE ID IN (ids from second query)

At this point all the required data has been retrieved and Rails is
looking good. Time to quit. However:
SELECT * FROM P WHERE ID = first id from second query
SELECT * FROM P WHERE ID = second id from second query
… [typically about 50-200 queries, one query per P.ID]

Q1. Why so many extraneous queries?
Q2. Is there an easy fix, or is this just too hard for Rails?
Q3. In general, is it possible to control eager loading at a per-query
level?


#2

On Apr 27, 3:18 pm, davidy removed_email_address@domain.invalid wrote:

SELECT * FROM F WHERE S_ID IN (S.ids from first query) [same query
twice]
SELECT * FROM P WHERE ID IN (ids from second query)

At this point all the required data has been retrieved and Rails is
looking good. Time to quit. However:
SELECT * FROM P WHERE ID = first id from second query
SELECT * FROM P WHERE ID = second id from second query
… [typically about 50-200 queries, one query per P.ID]

That’s odd. is that from your eager loading or from your subsequent
use of the data ( also anonymising the table names to S, F, P makes it
rather hard to read than if there were meaningful names) (in
particular eager loading is unidirectional - if person has_many posts,
then Person.find(:all, :include => :posts) doesn’t eager load
post.person

Fred


#3

All SQL queries are generated by that one, single find() call. This is
really scary – there is nothing in the documentation to suggest that
it can do this kind of “fake join”, or the performance impact it can
have.

Sorry about the anonymisation, but the real table names would only
confuse things worse.

I suspect this is a bug/feature of Rails eager loading, caused by the
“triangular” relationships. S loads F using has_many, then F loads P
using belongs_to, then S loads P again using has_many through. It
“should” be smart enough not to, but that logic isn’t in there. The
documentation is absolutely no help, and the source code is way too
complicated for this bear.

Q3 is the answer. I fixed the problem (with dramatic performance
improvement) using :include {hash}, like so.
S.find_all_by_key([array of keys], :include => {:F => :P})

Magic!