Complex query with multiple joins

Hi!

I have an app that manages products. I import the products from several
resellers and they all name their categories different. Because of this
I
have resellercategories that are mapped to my own subcategories.

Categories

  • Subcategories (belongs_to Category)

Resellercategories (belongs_to Subcategory)

Products (belongs_to Resellercategory)

Now I want to show the categories in a special way. If the user filters
products by gender, let’s say ‘female’, then I only want to show the
categories and subcategories which have products for females. This is
where
I’m stuck!

I created a query like this that actually gets the correct “Categories”.
But it doesn’t seem restrict the subcategories.

The
query:
http://snipt.net/Linuus/categories-1?key=2d5d54fd573f0afe60eaa3c47a23fd4d

Any ideas?

Regards
Linus

Man, I’m beating my head against this one…

The query above is probably not the best. Any other suggestions?

Regards
Linus

On Fri, Jan 6, 2012 at 2:13 AM, Linus P.
<[email protected]

wrote:

But it doesn’t seem restrict the subcategories.

The query:
http://snipt.net/Linuus/categories-1?key=2d5d54fd573f0afe60eaa3c47a23fd4d

How did you actually retrieve the subcategories.

By using the “joins” there, there is a chance your query only retrieves
the categories
(as in "SELECT “categories”.* FROM “categories” INNER JOIN
“subcategories”
…).
So maybe you have a second query for the subcategories (that is not
correctly filtered then)

Maybe you do category.subcategories

@menu_categories = Category.joins(:subcategories)
.joins(“INNER JOIN resellercategories AS r ON
subcategories.id = r.subcategory_id”)
.joins(“INNER JOIN products AS p ON r.id =
p.resellercategory_id”)
.group(“categories.id”)
.order(“categories.name ASC”)
.where(“p.gender = ‘unisex’ OR p.gender = ?”,
session[:gender])

@menu_categories.each do |mc|
mc.subcategories

end

This mc.subcategories executes a new query that is not taking into
account the filtering.
So, you are doing 1+n queries (and the n queries are not what you
expect).

Maybe you need

@menu_categories = Category.includes(:subcategories) # INCLUDES here
.joins(“INNER JOIN resellercategories AS r ON
subcategories.id = r.subcategory_id”)
.joins(“INNER JOIN products AS p ON r.id =
p.resellercategory_id”)
.group(“categories.id”)
.order(“categories.name ASC”)
.where(“p.gender = ‘unisex’ OR p.gender = ?”,
session[:gender])

That will get the subcategories from the db in 1 (more complex) query
with
results that are
filtered. If you then do

@menu_categories.each do |mc|
mc.subcategories # this should not trigger new SQL queries

end

Do this rails console and read the SQL carefully (or look in your
development log
carefully which SQL is executed).

HTH,

Peter


Peter V.
http://twitter.com/peter_v
http://rails.vandenabeele.com

Thank you Peter. I also found the related section in the rails guides
about
this:

However, I haven’t gotten it to work just yet. The above example you
wrote
doesn’t work because that subcategories.id is not present when I try to
join the next table:

@menu_categories = Category.includes(:subcategories) # INCLUDES here
.joins(“INNER JOIN resellercategories AS r ON
subcategories.id = r.subcategory_id”) # no such column: subcategories.id
.joins(“INNER JOIN products AS p ON r.id =
p.resellercategory_id”)
.group(“categories.id”)
.order(“categories.name ASC”)
.where(“p.gender = ‘unisex’ OR p.gender = ?”,
session[:gender])

Also, just to be clear. You are totally right that I just do
“mc.subcategories.each”…

This is what the view looks like:
http://snipt.net/Linuus/categories-view-1?key=8f1be321f6253bd74847066a719490ce

Regards
Linus

With any ORM there will be instances where pure object relationships
make for either very complex code or code that does not scale well
performance-wise as the data set involved grows when compared to the
efficiency of the underlying database’s SQL engine.

In those circumstances remember that views and stored procedures are
available - schema_plus is a gem that let’s you define views as a part
of your schema.r b file and while stored procedures violate the
ActiveRecord design pattern they can be used with AR and are sometimes
necessary in order to create a system that scales and performs well.

Max

On Fri, Jan 6, 2012 at 4:19 PM, Linus P.
<[email protected]

wrote:

@menu_categories = Category.includes(:**subcategories) # INCLUDES here
.joins(“INNER JOIN resellercategories AS r ON
subcategories.id = r.subcategory_id”) # no such column: subcategories.id
.joins(“INNER JOIN products AS p ON r.id =
p.resellercategory_id”)
.group(“categories.id”)
.order(“categories.name ASC”)
.where(“p.gender = ‘unisex’ OR p.gender = ?”,
session[:gender])

Well, if you look at the generated SQL, there must be

LEFT OUTER JOIN “subcategories” …
is there an “AS xyz” then you need to use that xyz.

You say you have the “belongs_to” relationships. Did you actually write
the
has_many relationships ?

Maybe show us some of the generated SQL in the log/development.log or in
the rails console.

HTH,

Peter


Peter V.
http://twitter.com/peter_v
http://rails.vandenabeele.com

(I wrote an answer before but it seems to not have been published so
here
we go again :slight_smile: )

No, there is no LEFT OUTER JOIN subcategories… I guess this is because
I
don’t have any condition on the subcategories. Then it will execute two
queries instead of using the LEFT OUTER JOIN.

I just ran this:

Category.includes(:subcategories)
.joins(“INNER JOIN resellercategories AS r ON
subcategories.id
= r.subcategory_id”)
.joins(“INNER JOIN products AS p ON r.id =
p.resellercategory_id”)
.group(“categories.id”)
.order(“categories.name ASC”)
.where(“p.gender = ‘unisex’ OR p.gender = ‘male’”)

And it generated this SQL query:

“SELECT “categories”.* FROM “categories” INNER JOIN resellercategories
AS r
ON subcategories.id =
r.subcategory_id INNER JOIN products AS p ON r.id =
p.resellercategory_id
WHERE (p.gender = ‘unisex’ OR p.gender = ‘male’)
GROUP BY categories.id ORDER BY categories.name ASC”

Still doesn’t work :frowning:

Generated SQL:
" SELECT “categories”.* FROM “categories” INNER JOIN resellercategories
AS
r ON subcategories.id =
r.subcategory_id INNER JOIN products AS p ON r.id =
p.resellercategory_id
WHERE (p.gender = ‘unisex’ OR p.gender = ‘male’)
AND (subcategories.id > 0) GROUP BY categories.id ORDER BY
categories.name
ASC"

Error:
“ActiveRecord::StatementInvalid: SQLite3::SQLException: no such column:
subcategories.id: SELECT “categories”.* FROM “categ
ories” INNER JOIN resellercategories AS r ON subcategories.id =
r.subcategory_id INNER JOIN products AS p ON r.id = p.rese
llercategory_id WHERE (p.gender = ‘unisex’ OR p.gender = ‘male’) AND
(subcategories.id > 0) GROUP BY categories.id ORDER B
Y categories.name ASC”

On Fri, Jan 6, 2012 at 7:58 PM, Linus P.
<[email protected]

wrote:

“SELECT “categories”.* FROM “categories” INNER JOIN resellercategories AS
r ON subcategories.id =
r.subcategory_id INNER JOIN products AS p ON r.id = p.resellercategory_id
WHERE (p.gender = ‘unisex’ OR p.gender = ‘male’)
GROUP BY categories.id ORDER BY categories.name ASC”

You are correct … I just retested here and the single query is only
triggered when
there is a condition on the associated table in the :includes

So, just to test that hypothesis … what happens if you add add a fake
condition on
the subcategories.

Category.includes(:subcategories)
.joins(“INNER JOIN resellercategories AS r ON
subcategories.id= r.subcategory_id”)
.joins(“INNER JOIN products AS p ON r.id =
p.resellercategory_id”)
.group(“categories.id”)
.order(“categories.name ASC”)
.where(“p.gender = ‘unisex’ OR p.gender = ‘male’”)
.where(“subcategories.id > 0”) # ADDED A FAKE CONDITION

Peter


Peter V.
http://twitter.com/peter_v
http://rails.vandenabeele.com

Hi Colin!

Absolutely. You mean for the models?

Here they are:
http://snipt.net/Linuus/category-and-subcategory?key=38ba590408ac4233927a06046eeca30d

Best Regards
Linus

On 6 January 2012 20:57, Linus P. [email protected]
wrote:

Hi Colin!

Absolutely. You mean for the models?

Here they are:

http://snipt.net/Linuus/category-and-subcategory?key=38ba590408ac4233927a06046eeca30d

That looks ok.
I think the group and order should be category.id and .name rather
than categories but I don’t see how that would cause the problem you
are seeing.

Colin

On 6 January 2012 20:18, Linus P. [email protected]
wrote:

Error:
“ActiveRecord::StatementInvalid: SQLite3::SQLException: no such column:
subcategories.id: SELECT “categories”.* FROM “categ
ories” INNER JOIN resellercategories AS r ON subcategories.id =
r.subcategory_id INNER JOIN products AS p ON r.id = p.rese
llercategory_id WHERE (p.gender = ‘unisex’ OR p.gender = ‘male’) AND
(subcategories.id > 0) GROUP BY categories.id ORDER B
Y categories.name ASC”

Can you show us the class definitions for Category and Subcategory
(snip the methods).

Colin

Wow! Thank you for all the help Peter! I really appreciate it. I will
test
that code tomorrow as it’s getting quite late here.

I did a small test now though. It seems that you can’t mix joins and
includes.

This doesn’t work:
Category.includes(:subcategories).joins(“INNER JOIN resellercategories
AS r
ON subcategories.id = r.s
ubcategory_id”).where(“subcategories.id > 0”)

“SELECT “categories”.* FROM “categories” INNER JOIN resellercategories
AS r
ON subcategories.id =
r.subcategory_id WHERE (subcategories.id > 0)
ActiveRecord::StatementInvalid: SQLite3::SQLException: no such column:
subcategories.id: SELECT “categories”.* FROM “categ
ories” INNER JOIN resellercategories AS r ON subcategories.id =
r.subcategory_id WHERE (subcategories.id > 0)”

But this do work:
Category.includes(:subcategories).where(“subcategories.id > 0”)

“SELECT “categories”.“id” AS t0_r0, “categories”.“name” AS t0_r1,
“categories”.“created_at” AS t0_r2, “categ
ories”.“updated_at” AS t0_r3, “categories”.“permalink” AS t0_r4,
“subcategories”.“id” AS t1_r0, “subcategories”.“name” AS
t1_r1, “subcategories”.“category_id” AS t1_r2,
“subcategories”.“created_at”
AS t1_r3, “subcategories”.“updated_at” AS t1_r
4, “subcategories”.“permalink” AS t1_r5 FROM “categories” LEFT OUTER
JOIN
“subcategories” ON “subcategories”.“category_id”
= “categories”.“id” WHERE (subcategories.id > 0)”

Quite odd I think…

Regards
Linus

On Fri, Jan 6, 2012 at 10:10 PM, Colin L. [email protected]
wrote:

That looks ok.
I think the group and order should be category.id and .name rather
than categories but I don’t see how that would cause the problem you
are seeing.

OK, let’s try over and keep it as simple as possible.

Rails 3.1.3.

This is a rails console session that reproduces the original problem and
shows the fix. Maybe Linus can try to reproduce this exactly and then
add more complexity until it breaks.

These are my models:

peterv@ASUS:~/data/backed_up/rails-apps/apps/base_app/app/models$ cat
user.rb
class User < ActiveRecord::Base
belongs_to :account
end
peterv@ASUS:~/data/backed_up/rails-apps/apps/base_app/app/models$ cat
account.rb
class Account < ActiveRecord::Base
has_many :users
end

The data is:

Account 1
=> user 1 “Peter”
=> user 2 “Sarah” (one of my kids)

Account 2

user 3 “Thomas”

This is the console with my comments:

$ rails console

016:0> a1 = Account.joins(:users).where(“users.name LIKE ‘peter’”)
Account Load (1.0ms) SELECT “accounts”.* FROM “accounts” INNER JOIN
“users” ON “users”.“account_id” = “accounts”.“id” WHERE (users.name LIKE
‘peter’)
=> [#<Account id: 1, number: “123”, created_at: “2011-12-01 22:24:16”,
updated_at: “2011-12-01 22:24:16”>]

This is the Relation with JOINS. It only loads the first SQL query and
only
SELECTs the accounts (not the users). We find the correct Account 1 with
an associated user “Peter”.

017:0> a1.size
=> 1

There is 1 account found.

The “first” below is to take that 1 account from the Relation

018:0> a1.first.users
User Load (0.8ms) SELECT “users”.* FROM “users” WHERE
“users”.“account_id” = 1
=> [#<User id: 1, name: “peter”, account_id: 1, created_at: “2011-12-01
22:24:16”, updated_at: “2011-12-01 22:24:16”>, #<User id: 3, name:
“Sarah”,
account_id: 1, created_at: “2012-01-06 21:23:22”, updated_at:
“2012-01-06
21:23:22”>]

Only now we do the second SQL query for the users from account 1.
NOT filtered on users.name, so we find ALL users for account 1:

  • Peter (OK)
  • Sarah (not OK)
    that was your original problem.

019:0> a1 = Account.includes(:users).where(“users.name LIKE ‘peter’”)
SQL (1.1ms) SELECT “accounts”.“id” AS t0_r0, “accounts”.“number” AS
t0_r1, “accounts”.“created_at” AS t0_r2, “accounts”.“updated_at” AS
t0_r3,
“users”.“id” AS t1_r0, “users”.“name” AS t1_r1, “users”.“account_id” AS
t1_r2, “users”.“created_at” AS t1_r3, “users”.“updated_at” AS t1_r4 FROM
“accounts” LEFT OUTER JOIN “users” ON “users”.“account_id” =
“accounts”.“id” WHERE (users.name LIKE ‘peter’)
=> [#<Account id: 1, number: “123”, created_at: “2011-12-01 22:24:16”,
updated_at: “2011-12-01 22:24:16”>]

Now I get the “single” , “complex” query that loads the accounts columns
and the associated
users columns in 1 query; and now only the users are included that also
match the filter
conditions (name LIKE ‘peter’).

020:0> a1.size
=> 1

Again, 1 account is found

021:0> a1.first.users
=> [#<User id: 1, name: “peter”, account_id: 1, created_at: “2011-12-01
22:24:16”, updated_at: “2011-12-01 22:24:16”>]

But now the users are not loaded from a second query but are already in
memory from the first
“complex” query.

Could you try to literally reproduce this behavior, so we have a common
ground to start ?

Or maybe you can simplify your code to the point of only having
categories
and subcategories
in the query? That should work …

These are the migrations:

peterv@ASUS:~/data/backed_up/rails-apps/apps/base_app/db/migrate$ cat *
class CreateAccounts < ActiveRecord::Migration
def change
create_table :accounts do |t|
t.string :number

  t.timestamps
end

end
end

class CreateUsers < ActiveRecord::Migration
def change
create_table :users do |t|
t.string :name
t.integer :account_id

  t.timestamps
end

end
end

HTH,

Peter

Hi again!

I posted a question about this issue on StackOverflow and got an
interesting response. To force Rails to join the included table you can
use
eager_load() instead of includes(). Using eager_load() and removing the
group() seems to make my query work perfect.

Correct version:

Category.eager_load(:subcategories)
.joins(“INNER JOIN resellercategories AS r ON
subcategories.id
= r.subcategory_id”)
.joins(“INNER JOIN products AS p ON r.id =
p.resellercategory_id”)
.order(“categories.name ASC”)
.where(“p.gender = ‘unisex’ OR p.gender = ‘male’”)

Here’s the discussion at SO:

Thanks for all the help!

Best Regards
Linus

On Sat, Jan 7, 2012 at 12:20 AM, Linus P. <
[email protected]> wrote:

Wow! Thank you for all the help Peter! I really appreciate it.

No prob, thank you for getting closer to the root cause.

I will test that code tomorrow as it’s getting quite late here.

I did a small test now though. It seems that you can’t mix joins and
includes.

Indeed, strange …

subcategories.id: SELECT “categories”.* FROM “categ
“subcategories”.“id” AS t1_r0, “subcategories”.“name” AS
t1_r1, “subcategories”.“category_id” AS t1_r2,
“subcategories”.“created_at” AS t1_r3, “subcategories”.“updated_at” AS t1_r
4, “subcategories”.“permalink” AS t1_r5 FROM “categories” LEFT OUTER JOIN
“subcategories” ON “subcategories”.“category_id”
= “categories”.“id” WHERE (subcategories.id > 0)”

Indeed. I added a Project class with User has_many :projects to my test
project and can confirm:

010:0> a = Account.includes(:users).where(“users.name like ‘peter’”)
SQL (1.1ms) SELECT “accounts”.“id” AS t0_r0, “accounts”.“number” AS
t0_r1, “accounts”.“created_at” AS t0_r2, “accounts”.“updated_at” AS
t0_r3,
“users”.“id” AS t1_r0, “users”.“name” AS t1_r1, “users”.“account_id” AS
t1_r2, “users”.“created_at” AS t1_r3, “users”.“updated_at” AS t1_r4 FROM
“accounts” LEFT OUTER JOIN “users” ON “users”.“account_id” =
“accounts”.“id” WHERE (users.name like ‘peter’)
=> [#<Account id: 1, number: “123”, created_at: “2011-12-01 22:24:16”,
updated_at: “2011-12-01 22:24:16”>]
011:0> a1 = a.first
=> #<Account id: 1, number: “123”, created_at: “2011-12-01 22:24:16”,
updated_at: “2011-12-01 22:24:16”>
012:0> a1.users.first
=> #<User id: 1, name: “peter”, account_id: 1, created_at: “2011-12-01
22:24:16”, updated_at: “2011-12-01 22:24:16”>

This works as expected. 1 large query for accounts and associated users.

013:0> a1.users.first.projects
Project Load (1.0ms) SELECT “projects”.* FROM “projects” WHERE
“projects”.“user_id” = 1
=> [#<Project id: 1, name: “project_of_peter”, user_id: 1, created_at:
“2012-01-07
10:51:51”, updated_at: “2012-01-07 10:51:51”>]

Also as expected. Since “projects” was not joined or included in the
original query,
we need a second query here (which is then not filtered and thus not the
results we want).

014:0> a = Account.includes(:users).where(“users.name like
‘peter’”).joins(:users => :projects)
Account Load (1.3ms) SELECT “accounts”.* FROM “accounts” INNER JOIN
“users” ON “users”.“account_id” = “accounts”.“id” INNER JOIN “projects”
ON
“projects”.“user_id” = “users”.“id” WHERE (users.name like ‘peter’)
User Load (0.9ms) SELECT “users”.* FROM “users” WHERE
“users”.“account_id” IN (1)
=> [#<Account id: 1, number: “123”, created_at: “2011-12-01 22:24:16”,
updated_at: “2011-12-01 22:24:16”>]

This is the getting closer to the root cause as you discovered now. As
soon
as there is a “joins” added,
the :includes fails to Eagerly load only the filtered users in a
single
complex query, but goes to a second
query (that is not filtered on users; that would be “not filtered on
subcategories” in your code).

015:0> a = Account.includes(:users).where(“users.name like
‘peter’”).includes(:users => :projects)
SQL (1.5ms) SELECT “accounts”.“id” AS t0_r0, “accounts”.“number” AS
t0_r1, “accounts”.“created_at” AS t0_r2, “accounts”.“updated_at” AS
t0_r3,
“users”.“id” AS t1_r0, “users”.“name” AS t1_r1, “users”.“account_id” AS
t1_r2, “users”.“created_at” AS t1_r3, “users”.“updated_at” AS t1_r4,
“projects”.“id” AS t2_r0, “projects”.“name” AS t2_r1,
“projects”.“user_id”
AS t2_r2, “projects”.“created_at” AS t2_r3, “projects”.“updated_at” AS
t2_r4 FROM “accounts” LEFT OUTER JOIN “users” ON “users”.“account_id” =
“accounts”.“id” LEFT OUTER JOIN “projects” ON “projects”.“user_id” =
“users”.“id” WHERE (users.name like ‘peter’)
=> [#<Account id: 1, number: “123”, created_at: “2011-12-01 22:24:16”,
updated_at: “2011-12-01 22:24:16”>]
016:0> a.size
=> 1
017:0> a.first.users
=> [#<User id: 1, name: “peter”, account_id: 1, created_at: “2011-12-01
22:24:16”, updated_at: “2011-12-01 22:24:16”>]
018:0> a.first.users.first.projects
=> [#<Project id: 1, name: “project_of_peter”, user_id: 1, created_at:
“2012-01-07
10:51:51”, updated_at: “2012-01-07 10:51:51”>]

So, using :includes everywhere seems to not trigger this problem.

I don’t know if that is an acceptable fix for your project (depending on
much
unneeded data that loads for you)?

I don’t know if this is a “bug” or just misunderstanding the behavior of
:joins
and :includes in Eager loading from our side?

If I assume the latter (us not understanding it well enough), then
conclusions
could be:

  • :includes is just a convenience for eager loading (performance
    improvement)
    and not designed to work together with joins in the way we expected
  • trying to use has_many relationships with conditions may be a solution

First added a “lang” column to projects and trying a condition on the
projects table
(fails in the same way):

012:0> a = Account.includes(:users).joins(:users =>
:projects).where([‘projects.lang = ?’, ‘ruby’])
Account Load (1.2ms) SELECT “accounts”.* FROM “accounts” INNER JOIN
“users” ON “users”.“account_id” = “accounts”.“id” INNER JOIN “projects”
ON
“projects”.“user_id” = “users”.“id” WHERE (projects.lang = ‘ruby’)
User Load (0.5ms) SELECT “users”.* FROM “users” WHERE
“users”.“account_id” IN (1)
=> [#<Account id: 1, number: “123”, created_at: “2011-12-01 22:24:16”,
updated_at: “2011-12-01 22:24:16”>]

This is still correct. Only 1 account with id 1 has a user (peter) with
a
project with lang “ruby”.

013:0> a1 = a.first
=> #<Account id: 1, number: “123”, created_at: “2011-12-01 22:24:16”,
updated_at: “2011-12-01 22:24:16”>

Getting that first (and only) account from the array.

014:0> a1.users
=> [#<User id: 1, name: “peter”, account_id: 1, created_at: “2011-12-01
22:24:16”, updated_at: “2012-01-07 11:28:20”, gender: “M”>, #<User id:
3,
name: “Sarah”, account_id: 1, created_at: “2012-01-06 21:23:22”,
updated_at: “2012-01-07 11:28:24”, gender: “F”>]

But this is not what we want. We get all users from account 1, not just
the
users with a “ruby” project.

015:0> a1.users.first.projects
Project Load (0.9ms) SELECT “projects”.* FROM “projects” WHERE
“projects”.“user_id” = 1
=> [#<Project id: 1, name: “project_of_peter”, user_id: 1, created_at:
“2012-01-07
10:51:51”, updated_at: “2012-01-07 11:38:15”, lang: “ruby”>, #<Project
id:
2, name: “coffeescript_project”, user_id: 1, created_at: “2012-01-07
11:39:10”,
updated_at: “2012-01-07 11:39:10”, lang: “coffee”>]

And similar problem, for user ‘peter’ we get all projects, also the ones
in
a different language.

016:0> a1.users[1].projects
Project Load (0.8ms) SELECT “projects”.* FROM “projects” WHERE
“projects”.“user_id” = 3
=> []

And user ‘sarah’ has not projects at all, but still in the list of
users.

OK, trying another solution …

class Account < ActiveRecord::Base
has_many :users

attr_accessor :lang

has_many :users_by_project_lang,
:class_name => “User”,
:include => :projects,
:conditions => Proc.new { [“projects.lang = ?”, lang] }
end

I add a special has_many relationship on the Account class
(that would be your Category class). Now using that to
find the “users” LIMITED to a certain project_lang
(in your case that would be subcategories, limited to a certain
product.gender).

018:0> a1.lang=‘ruby’
=> “ruby”

The conditions will be late evaluated against the “self” (that is a1
here).
So
we need to set it on a1.

019:0> a1.users_by_project_lang
SQL (1.5ms) SELECT “users”.“id” AS t0_r0, “users”.“name” AS t0_r1,
“users”.“account_id” AS t0_r2, “users”.“created_at” AS t0_r3,
“users”.“updated_at” AS t0_r4, “users”.“gender” AS t0_r5,
“projects”.“id”
AS t1_r0, “projects”.“name” AS t1_r1, “projects”.“user_id” AS t1_r2,
“projects”.“created_at” AS t1_r3, “projects”.“updated_at” AS t1_r4,
“projects”.“lang” AS t1_r5 FROM “users” LEFT OUTER JOIN “projects” ON
“projects”.“user_id” = “users”.“id” WHERE “users”.“account_id” = 1 AND
(projects.lang = ‘ruby’)
=> [#<User id: 1, name: “peter”, account_id: 1, created_at: “2011-12-01
22:24:16”, updated_at: “2012-01-07 11:28:20”, gender: “M”>]

Now we get back our “complex” query, filtered on a parameter of the the
joined products table.

021:0> a1.users_by_project_lang.each{|user| puts “user = #{user} with
projects #{user.projects}”}
user = #User:0x94cd7c8 with projects [#<Project id: 1, name:
“project_of_peter”, user_id: 1, created_at: “2012-01-07 10:51:51”,
updated_at: “2012-01-07 11:38:15”, lang: “ruby”>]
=> [#<User id: 1, name: “peter”, account_id: 1, created_at: “2011-12-01
22:24:16”, updated_at: “2012-01-07 11:28:20”, gender: “M”>]

And now, only the correct users (‘peter’) with the filtered projects
(with
lang ‘ruby’) are returned.

So, I presume, if you make a similar has_many in Category

has_many :subcategories_by_product_gender

this would work.

But … since the has_many only has a :include (and not a :joins)
options,
this seems
equivalent to the first option of using all :includes for the entire
chain
(and
no :joins). So, in reality, it does not seem to be better than that
first
option
(which has the disavantage of loading all the associated info, even if
you
so not need it).

An alternative might be to make a dedicated has_many and use :finder_sql
to hand-code the SQL …

“:finder_sql
Specify a complete SQL statement to fetch the association. This is a
good
way
to go for complex associations that depend on multiple tables.”

It seems this feature request seems complex enough to warrant that.

I would certainly be interested in a “proper” solution for this issue
(using a
mix of :includes and :joins, where the :includes are eagerly loaded in
the SELECT part in the first query (so these results are filtered) and
the
:joins are only used to eveluate conditions or ordering, etc. but not
have
to be included in the SELECT part).

I looked into scopes, but I failed to find a proper way to apply a scope
to
a has_many relationship … (maybe with_scope {}, but that seems to be
depricated ?).

HTH,

Peter


Peter V.
http://twitter.com/peter_v
http://rails.vandenabeele.com

On Sat, Jan 7, 2012 at 7:07 PM, Linus P.
<[email protected]

wrote:

Hi again!

I posted a question about this issue on StackOverflow and got an
interesting response. To force Rails to join the included table you can use
eager_load() instead of includes(). Using eager_load() and removing the
group() seems to make my query work perfect.

Thanks, good to know.

It seems not extensively documented …

Google (“ActiveRecord eager_load”) results:

no match for eager_load

  1. Active Record Query Interface — Ruby on Rails Guides
    no match for eager_load

  2. eager_load (ActiveRecord::QueryMethods) - APIdock
    a match :slight_smile: => This method has no description. You can help the Ruby on
    Rails community by adding new notes.

etc …

I will see if I can contribute some doc for this,

Peter