Is It Possible to Re-Work This for Better Performance?

I have four AR classes:

Artists (has many Releases)
Releases (has many ReleaseTracks)
Release Tracks (has many AudioTracks)
Audio Tracks

They are all also connected in the other direction (i.e. belongs_to).

My page displays audio tracks, grouped by release track, then release
and
then by artist.

Right now I fetch the artists and simply use the defined relationships
to
display everything:

Loop through Artists

  • Display Releases
    ** Display Release Tracks
    *** Display Audio Tracks

This works but performance is terrible. With several thousand audio
tracks
and release tracks connected to hundreds of releases and dozens of
artists,
thousands of queries are processed each time this page loads.

AR is selecting the releases for each artist, the tracks for each
release
and the audio for each track, when many of the artists and releases have
no
audio information.

Can anyone think of how I can optimize this?

Obviously, I could simply select the audio tracks but then I’m not sure
how
I would build the display so that artists and releases are grouped
correctly
in their hierarchy.

Any help is appreciated. Thanks much.

Cheers.

Instead of performing thousands of queries, perform one:
write the quad-join SQL yourself and pull all the data
down in one burst. Also, leave off any db sorting and
instead perform any sorting after the data is pulled.

Hi, Hunter,

seems as if you haven’t got any answers yet. So here are some thoughts:

  1. You want to display AudioTracks: Then you write something about
    several ‘thousand audio tracks’. Hopefully you don’t mean to load all of
    them for one page everytime? If so: Have a look at pagination, this will
    limit the audio tracks to let’s say 10 and obviously will decrease the
    number of necessary associations that need to be fetched.
  2. That said: Have you got indexes on all of your foreign_keys? At if
    already the database work is terribly slow than you might want to put
    some work in optimizing your index-structure.
  3. Think about your database design: Is it really necessary to traverse
    all these relations (from AudioTracks to Release Tracks to Releases to
    Artists). For me it seems reasonable that a AudioTrack belongs as well
    as to a ReleaseTrack to a artist and so on). If you want to display
    AudioTracks than things should centralise around this AudioTrack (as
    well as around Releases if you want to display them)
  4. Finally: Eager Loading: Right now you are fetching relations as you
    are iterating over AudioTracks. You might as well prefetch this
    relations on the serverside: Have a look at
    http://api.rubyonrails.com/classes/ActiveRecord/Associations/ClassMethods.html
    (search for Eager loading of associations)
  5. And post finally: Have a look at the sophisticated caching mechanisms
    that Rails provides out of the box. If you are doing this right than you
    wouldn’t need any database contact after you’ve fetched the data at the
    first request and unless you haven’t updated any data (after that you
    have to recreate the cache obviously)

Regards
Jan

Lou V. wrote:

Instead of performing thousands of queries, perform one:
write the quad-join SQL yourself and pull all the data
down in one burst. Also, leave off any db sorting and
instead perform any sorting after the data is pulled.

Why would you sort the data in ruby instead of in the DB? Whenever I
have to go down to SQL I always sort the data there. Is there some
reason not to?


Jack C.
[email protected]

On 12/8/05, Jack C. [email protected] wrote:

reason not to?
Generally, it’s a good idea to keep business logic in Ruby if
possible. Only reason I can think of, anyways.

On 12/8/05, Harm de Laat [email protected] wrote:

Why would you sort the data in ruby instead of in the DB? Whenever I
have to go down to SQL I always sort the data there. Is there some
reason not to?

Generally, it’s a good idea to keep business logic in Ruby if
possible. Only reason I can think of, anyways.

It is probably faster to let the database do the sorting.
The DB engine is mostlikely better optimized to do this kind of thing.

It’s probably faster to write everything in highly-optimized assembly,
but that doesn’t mean it’s a good idea. Speed isn’t everything.

It is probably faster to let the database do the sorting.
The DB engine is mostlikely better optimized to do this kind of thing.

Jack C. wrote:

reason not to?
for lightly-loaded dbs sorting on the db server is sufficient.
for heavily-loaded dbs sorting on the server does not scale,
and it turns out to be faster in many cases when it is done
either in the app server or on the end-user’s machine.
it’s just an optimization that one would do if they were having
efficiency problems, which the original poster says he has.
The 3 most expensive things you can do to slow an (otherwise optimized)
db:

  • thrash it with unnecessary queries
  • joins
  • sorts
    This is based on my experience. Yours may be different.

I never said it was better :wink:
However, in this case, sorting is typicaly something you might leave up
to
the database.
Especially if you are dealing with large collections. In-memory sorting
can
take a lot of time and consume a lot of resources.
So in general, I would suggest to let the database do all the sorting
unless
this is impossible for some other (technical) reason.