Forum: Ruby on Rails Seeing performance problems with has_many :through relation

Announcement (2017-05-07): www.ruby-forum.com is now read-only since I unfortunately do not have the time to support and maintain the forum any more. Please see rubyonrails.org/community and ruby-lang.org/en/community for other Rails- und Ruby-related community platforms.
Af1ee2208795752559b049502ded660c?d=identicon&s=25 Andrew Cowan (k2mia)
on 2006-04-29 19:39
Greetings all, I am looking for some insight from some of the more
experienced with has_many :through relationships as I am having a hard
time reconciling the performance I am seeing.

I have 2 tables that are linked together via has_many :through

Table 1: urls ( there are about 7,000 records )
2 fields, id and url

Table 2: phrases ( there are about 87,000 records )
2 fields, id and phrase (w/ an index defined on phrase)

A given url can be associated w/ many phrases and a given phrase can be
associated with many urls.

Table 3: relations ( there are about 235,000 records )
id, url_id, phrase_id and isrel (int)
w/ an index created on url_id,phrase_id

In Relation model I have
   belongs_to :url
   belongs_to :phrase

In Url model I have
   has_many :relations
   has_many :phrases, :through => :relations

In Phrase model I have
   has_many :relations
   has_many :urls, :through => :relations

For a given phrase I can access phrase.urls wthout any performance
issues, I see this in my Lighttpd console:

 Url Load (0.423328)   SELECT urls.* FROM urls INNER JOIN relations ON
urls.id = relations.url_id WHERE (relations.phrase_id = 12)


The problem is apparent when I try to access url.phrases for any url
object, my console screen fills up with these:

 Phrase Load (0.006682)   SELECT phrases.* FROM phrases INNER JOIN
relations ON phrases.id = relations.phrase_id WHERE (relations.url_id =
987)

one of those messages for each url, and there can be up to 7k urls for a
given scan.

I know little about optimizing MySQL databases and I am wondering if I
need to tweak the way the tables are structured or if its something I
can fine-tune in rails.

If it helps the indices were defined on the relations table as follows:

CREATE unique index phrase_url on relations (phrase_id, url_id)
CREATE unique index url_phrase on relations (url_id,phrase_id)

( started with just the first index and added the second when these
issues started popping up )

I am hoping someone can spot the problem and suggest a modification to
either the DB or the relationship in rails that leads to a solution.

Thanks for any suggestions!
Andy
Af1ee2208795752559b049502ded660c?d=identicon&s=25 Andrew Cowan (k2mia)
on 2006-04-30 19:39
Anyne have any ideas what may be wrong ( am hoping my post just slipped
under the radar of someone who may be able to help )...


 Greetings all, I am looking for some insight from some of the more
 experienced with has_many :through relationships as I am having a hard
 time reconciling the performance I am seeing.

 I have 2 tables that are linked together via has_many :through

 Table 1: urls ( there are about 7,000 records )
 2 fields, id and url

 Table 2: phrases ( there are about 87,000 records )
 2 fields, id and phrase (w/ an index defined on phrase)

 A given url can be associated w/ many phrases and a given phrase can be
 associated with many urls.

 Table 3: relations ( there are about 235,000 records )
 id, url_id, phrase_id and isrel (int)
 w/ an index created on url_id,phrase_id

 In Relation model I have
    belongs_to :url
    belongs_to :phrase

 In Url model I have
    has_many :relations
    has_many :phrases, :through => :relations

 In Phrase model I have
    has_many :relations
    has_many :urls, :through => :relations

 For a given phrase I can access phrase.urls wthout any performance
 issues, I see this in my Lighttpd console:

  Url Load (0.423328)   SELECT urls.* FROM urls INNER JOIN relations ON
 urls.id = relations.url_id WHERE (relations.phrase_id = 12)


 The problem is apparent when I try to access url.phrases for any url
 object, my console screen fills up with these:

  Phrase Load (0.006682)   SELECT phrases.* FROM phrases INNER JOIN
 relations ON phrases.id = relations.phrase_id WHERE (relations.url_id =
 987)

 one of those messages for each url, and there can be up to 7k urls for
a
 given scan.

 I know little about optimizing MySQL databases and I am wondering if I
 need to tweak the way the tables are structured or if its something I
 can fine-tune in rails.

 If it helps the indices were defined on the relations table as follows:

 CREATE unique index phrase_url on relations (phrase_id, url_id)
 CREATE unique index url_phrase on relations (url_id,phrase_id)

 ( started with just the first index and added the second when these
 issues started popping up )

 I am hoping someone can spot the problem and suggest a modification to
 either the DB or the relationship in rails that leads to a solution.

 Thanks for any suggestions!
 Andy
C9c7a65848f13e2b1a226bbe43bc3672?d=identicon&s=25 Pete Yandell (pete)
on 2006-05-03 18:56
(Received via mailing list)
Andrew,

So it sounds like you're doing a find to pull out a big list of urls,
and then calling url.phrases on each one?

If this is the case, you should be able to add ':include => :phrases'
to that initial find to make it all happen in one query.

Pete Yandell
http://9cays.com
This topic is locked and can not be replied to.