Polymorphic join table not loading with includes

I am trying to optimize a slow query with an includes statement.

But a join table with a polymorphic association seem to be preventing
it.

The following gist tries to illustrate the
issue: Problems joining through polymorphic association with active record · GitHub

When running the code from the gist it will fail to load the join table
and
therefore fail.

Any help would be greatly appreciated.

On Wednesday, January 13, 2016 at 9:59:34 PM UTC, Flemming Thesbjerg
wrote:

When running the code from the gist it will fail to load the join table
and therefore fail.

Any help would be greatly appreciated.

I don’t think includes can help you here. The preload strategy can’t
work
here, because your a_comments association has a condition on the join
table. The include (join based strategy) doesn’t work because the
association is polymorphic and you can’t write a sql query that joins
tables conditionally. It’s possible that you could get this to run in 2
steps , if you put the condition on the post artefacts association and
made
a_comments go through that association. You might then be able to use
includes to load the scoped post artefacts association and then invoke
the
preloader manually to load the comments
(See

)

Fred

On Wednesday, 13 January 2016 16:59:34 UTC-5, Flemming Thesbjerg wrote:

When running the code from the gist it will fail to load the join table
and therefore fail.

Any help would be greatly appreciated.

It would be helpful to see the exact SQL generated when the tests in
that
Gist run. People can run the example, but it’s an extra step.

The query fails because the post_artifacts table isn’t joined. This is
an
expected behavior of includes; it chooses between a preload (which
requires one additional query but fewer joins) and an eager load (which
widens the query with a join). To do this, it relies on checking to see
if
the included tables are referenced in the SQL. Putting conditions on the
join table (post_artifacts, here) in a through association (a_comments,
here) without hinting will cause this behavior.

There are at least two ways to work around this problem:

  • first alternative: explicitly build an association of PostArtifacts
    that
    have the condition applied. Change the Post model to:

class Post < ActiveRecord::Base
has_many :post_artifacts
has_many :comments, through: :post_artifacts, source: :artifact,
source_type: ‘Comment’
has_many :a_post_artifacts, -> { a }, class_name: ‘PostArtifact’
has_many :a_comments, through: :a_post_artifacts, source: :artifact,
source_type: ‘Comment’
end

This moves the condition to a place where ActiveRecord understands the
post_artifacts table will be referenced when preloading a_comments.
The
resulting SQL looks like:

SELECT “posts”.* FROM “posts”
SELECT “post_artifacts”.* FROM “post_artifacts” WHERE
“post_artifacts”.“rule” = ? AND “post_artifacts”.“artifact_type” = ? AND
“post_artifacts”.“post_id” = 1 [[“rule”, 0], [“artifact_type”,
“Comment”]]
SELECT “comments”.* FROM “comments” WHERE “comments”.“id” IN (1, 2, 3,
4,
5, 6, 7, 8, 9, 10)

  • second alternative: explicitly specify references at the callsite.
    Leave the associations as-is from the Gist and change the second assert
    in
    the test to:

assert_equal 10,
Post.includes(:a_comments).references(:post_artifacts).flat_map(&:a_comments).count

This uses references to inform ActiveRecord that loading the requested
Posts also requires post_artifacts.

The generated SQL looks different than the previous case, as
references
forces eager-load instead of preload:

SELECT “posts”.“id” AS t0_r0, “posts”.“title” AS t0_r1, “comments”.“id”
AS
t1_r0, “comments”.“content” AS t1_r1 FROM “posts” LEFT OUTER JOIN
“post_artifacts” ON “post_artifacts”.“post_id” = “posts”.“id” AND
“post_artifacts”.“artifact_type” = ? LEFT OUTER JOIN “comments” ON
“comments”.“id” = “post_artifacts”.“artifact_id” AND
“post_artifacts”.“rule” = ? [[“artifact_type”, “Comment”], [“rule”, 0]]


One thing that doesn’t work yet: specifying references in the scope
passed to has_many.

–Matt J.

I will remember to post the sql output next time.

I tried using the references solution earlier but that was not an
acceptable solution because it created a really large query and crippled
my
performance.

After testing it. It seems you’re first proposal will do the job for me.

I was hoping for a solution that didnt involve multiple join
associations
because i will end up with quite a few of them since i have like five
different comment types.

Anyway thanks for your help Matt and Fred.

Flemming

Den torsdag den 14. januar 2016 kl. 15.44.24 UTC+1 skrev Matt J.: