Dbase query performance vs. iterating over large result arra

amin1426 · July 24, 2007, 11:31pm

Hello Folks. Please excuse my beginner’s ignorance …

My question regards database query performance vs. iterating over nested
results in an array. In the example below, a Project has many Tasks.
Tasks have a due date. When I display Projects, I would like to also
display the cumulative delay for all open Project Tasks based on their
due date, and today’s date.

Something like …

Project | Delay
XYZ | 5 days (4 tasks late, 1.25 day average delay)
… 100 records

As I learn RoR, I’ve discovered a couple ways I can do this. I can
perform the calculations in a single query. I can use the :include
option to build an array with nested children, and iterate over each
Project’s Tasks with something like parent.child.each.do. Or, finally, I
can issue 100 queries to the database, once for each record. Each has
advantages, but which is the most commonly accepted method in the RoR
community?

Thanks in advance

John

Building a website is a piece of cake.
Yahoo! Small Business gives you all the tools to get online.

amin1426 · July 25, 2007, 9:37am

That’s one of the instances where I’d have a stored procedure running
on the database server, and call it from Rails. I figure all this
sort of work should be done completely within the database server, and
not tying up my Rails server.

I’m sure you’ll get other perspectives on this, but that’s my 2c worth.

Dave M.

amin1426 · July 25, 2007, 9:37am

On 7/24/07, John C. [email protected] wrote:

Project | Delay
XYZ | 5 days (4 tasks late, 1.25 day average delay)
… 100 records

As I learn RoR, I’ve discovered a couple ways I can do this. I can perform
the calculations in a single query. I can use the :include option to build
an array with nested children, and iterate over each Project’s Tasks with
something like parent.child.each.do. Or, finally, I can issue 100 queries to
the database, once for each record. Each has advantages, but which is the
most commonly accepted method in the RoR community?

I’m hard pressed to see the advantages of issuing a query for each
child instead of using a single join. That’s what databases are for.

Chris

amin1426 · July 25, 2007, 6:53pm

A couple of comments…

Stored procedures are a very _un_rails approach. No offense
intended to the author, but Rails is in a world that recognizes that
sprocs are another aspect of application programming and it’s
(opinionated) answer is to avoid them in favor of consolidating the
code base into one repository/language/etc. Feel free to use them
just make sure you have somehow integrated them and recognize them as
a full-fledged part of the code base.
Another alternative to both the sproc and the query that includes
the tasks would be to do three optimized queries:
1. Obtain the Project:
  project = Project.find(params[:id])
2. Calculate the last due date for the tasks on the project:
  last_due_date = project.tasks.maximum(:due_date)
3. Calculate the number of tasks:
  task_count = project.tasks.count
  I think that gives you enough info to render the view you’re
  interested in and you avoid sprocs and excessive queries. You also
  avoid the potential delays of having the calculations (and objects)
  performed in memory.

HTH,
AndyV

On Jul 25, 12:04 pm, Jean N. [email protected]

amin1426 · July 25, 2007, 6:04pm

snacktime wrote:

I’m hard pressed to see the advantages of issuing a query for each
child instead of using a single join. That’s what databases are for.

Chris

/agreed

CRUD functionality I always use the ActiveRecord stuff, but complex
queries I drop the ORM layer and use queries right into the database.
Either by issuing a find_by_sql query or an execute command for updates
and such.

The hard fact is 1 query > multiple queries in an iteration ANY day.
(With respect to network IO, database load, table / row locking etc
etc)

Hope this helps.