Seeing performance problems with has_many :through relation


#1

Greetings all, I am looking for some insight from some of the more
experienced with has_many :through relationships as I am having a hard
time reconciling the performance I am seeing.

I have 2 tables that are linked together via has_many :through

Table 1: urls ( there are about 7,000 records )
2 fields, id and url

Table 2: phrases ( there are about 87,000 records )
2 fields, id and phrase (w/ an index defined on phrase)

A given url can be associated w/ many phrases and a given phrase can be
associated with many urls.

Table 3: relations ( there are about 235,000 records )
id, url_id, phrase_id and isrel (int)
w/ an index created on url_id,phrase_id

In Relation model I have
belongs_to :url
belongs_to :phrase

In Url model I have
has_many :relations
has_many :phrases, :through => :relations

In Phrase model I have
has_many :relations
has_many :urls, :through => :relations

For a given phrase I can access phrase.urls wthout any performance
issues, I see this in my Lighttpd console:

Url Load (0.423328) SELECT urls.* FROM urls INNER JOIN relations ON
urls.id = relations.url_id WHERE (relations.phrase_id = 12)

The problem is apparent when I try to access url.phrases for any url
object, my console screen fills up with these:

Phrase Load (0.006682) SELECT phrases.* FROM phrases INNER JOIN
relations ON phrases.id = relations.phrase_id WHERE (relations.url_id =
987)

one of those messages for each url, and there can be up to 7k urls for a
given scan.

I know little about optimizing MySQL databases and I am wondering if I
need to tweak the way the tables are structured or if its something I
can fine-tune in rails.

If it helps the indices were defined on the relations table as follows:

CREATE unique index phrase_url on relations (phrase_id, url_id)
CREATE unique index url_phrase on relations (url_id,phrase_id)

( started with just the first index and added the second when these
issues started popping up )

I am hoping someone can spot the problem and suggest a modification to
either the DB or the relationship in rails that leads to a solution.

Thanks for any suggestions!
Andy


#2

Anyne have any ideas what may be wrong ( am hoping my post just slipped
under the radar of someone who may be able to help )…

Greetings all, I am looking for some insight from some of the more
experienced with has_many :through relationships as I am having a hard
time reconciling the performance I am seeing.

I have 2 tables that are linked together via has_many :through

Table 1: urls ( there are about 7,000 records )
2 fields, id and url

Table 2: phrases ( there are about 87,000 records )
2 fields, id and phrase (w/ an index defined on phrase)

A given url can be associated w/ many phrases and a given phrase can be
associated with many urls.

Table 3: relations ( there are about 235,000 records )
id, url_id, phrase_id and isrel (int)
w/ an index created on url_id,phrase_id

In Relation model I have
belongs_to :url
belongs_to :phrase

In Url model I have
has_many :relations
has_many :phrases, :through => :relations

In Phrase model I have
has_many :relations
has_many :urls, :through => :relations

For a given phrase I can access phrase.urls wthout any performance
issues, I see this in my Lighttpd console:

Url Load (0.423328) SELECT urls.* FROM urls INNER JOIN relations ON
urls.id = relations.url_id WHERE (relations.phrase_id = 12)

The problem is apparent when I try to access url.phrases for any url
object, my console screen fills up with these:

Phrase Load (0.006682) SELECT phrases.* FROM phrases INNER JOIN
relations ON phrases.id = relations.phrase_id WHERE (relations.url_id =
987)

one of those messages for each url, and there can be up to 7k urls for
a
given scan.

I know little about optimizing MySQL databases and I am wondering if I
need to tweak the way the tables are structured or if its something I
can fine-tune in rails.

If it helps the indices were defined on the relations table as follows:

CREATE unique index phrase_url on relations (phrase_id, url_id)
CREATE unique index url_phrase on relations (url_id,phrase_id)

( started with just the first index and added the second when these
issues started popping up )

I am hoping someone can spot the problem and suggest a modification to
either the DB or the relationship in rails that leads to a solution.

Thanks for any suggestions!
Andy


#3

Andrew,

So it sounds like you’re doing a find to pull out a big list of urls,
and then calling url.phrases on each one?

If this is the case, you should be able to add ‘:include => :phrases’
to that initial find to make it all happen in one query.

Pete Y.
http://9cays.com