When is it okay to resort to find_by_sql?

When is it appropriate to resort to find_by_sql?

I’m all for portable, easily maintained code. I’m a HUGE fan the new
ActiveRecord::Relation model in Rails 3.0 I keenly feel the peril of
dropping into find_by_sql().

Question 1: is there’s a snowball’s chance in Hades for the following
code to be written WITHOUT resorting find_by_sql()?

inner_query =<<EOQ
SELECT DATE(st.datetime) AS start_time,
DATE(et.datetime - INTERVAL 1 DAY) AS end_time,
DATEDIFF(et.datetime, st.datetime) AS day_count,
sbs.quantity AS quantity,
sbs.cost AS cost
FROM service_bills as sbs
INNER JOIN time_dimensions AS st ON st.id = sbs.start_time_id
INNER JOIN time_dimensions AS et ON et.id = sbs.end_time_id
WHERE sbs.metered_service_id IN (#{metered_service_ids})
EOQ

I’m sure it’s possible using a ServiceBill.find() construct where each
clause is essentially a quoted string from the above, but by that point,
I’d be hard pressed to say that is clearer or more maintainable.

Question 2: Nested queries. In my system, inner_query is, in fact, an
inner query that appears within this little monster:

outer_query =<<EOQ
SELECT st.id AS start_time_id,
et.id AS end_time_id,
DATE(st.datetime) AS time,
SUM(bills.quantity/bills.day_count) AS daily_quantity,
SUM(bills.cost/bills.day_count) AS daily_cost
FROM time_dimensions AS st
INNER JOIN time_dimensions AS et ON DATE(st.datetime + INTERVAL 1 DAY)
= DATE(et.datetime)
INNER JOIN (#{inner_query}) AS bills
WHERE st.datetime BETWEEN bills.start_time AND bills.end_time
AND st.is_midnight = 1
GROUP BY st.id
EOQ

Even if both of these queries could be cast into a find() construct, I
don’t see how to create the equivalent of nested queries: it seems like
I’d have to “pop back up” to the Rails level, which is problematic for
really large datasets.

So: If ActiveRecord::Relation.find() really cannot handle queries of
this complexity, I’m okay with that. But if it can, I’d welcome
pointers or documentation on how to code them.

TIA.

  • ff

Fearless F. wrote:

When is it appropriate to resort to find_by_sql?

When you need something that can’t be done any other way.

I’m all for portable, easily maintained code. I’m a HUGE fan the new
ActiveRecord::Relation model in Rails 3.0 I keenly feel the peril of
dropping into find_by_sql().

Question 1: is there’s a snowball’s chance in Hades for the following
code to be written WITHOUT resorting find_by_sql()?

inner_query =<<EOQ
SELECT DATE(st.datetime) AS start_time,
DATE(et.datetime - INTERVAL 1 DAY) AS end_time,
DATEDIFF(et.datetime, st.datetime) AS day_count,
sbs.quantity AS quantity,
sbs.cost AS cost
FROM service_bills as sbs
INNER JOIN time_dimensions AS st ON st.id = sbs.start_time_id
INNER JOIN time_dimensions AS et ON et.id = sbs.end_time_id
WHERE sbs.metered_service_id IN (#{metered_service_ids})
EOQ

I’m sure it’s possible using a ServiceBill.find() construct where each
clause is essentially a quoted string from the above, but by that point,
I’d be hard pressed to say that is clearer or more maintainable.

If it is possible to use find instead of find_by_sql, then always do so.
If nothing else, ActiveRecord may be able to reason better about the
query string.

Question 2: Nested queries. In my system, inner_query is, in fact, an
inner query that appears within this little monster:

outer_query =<<EOQ
SELECT st.id AS start_time_id,
et.id AS end_time_id,
DATE(st.datetime) AS time,
SUM(bills.quantity/bills.day_count) AS daily_quantity,
SUM(bills.cost/bills.day_count) AS daily_cost
FROM time_dimensions AS st
INNER JOIN time_dimensions AS et ON DATE(st.datetime + INTERVAL 1 DAY)
= DATE(et.datetime)
INNER JOIN (#{inner_query}) AS bills
WHERE st.datetime BETWEEN bills.start_time AND bills.end_time
AND st.is_midnight = 1
GROUP BY st.id
EOQ

Even if both of these queries could be cast into a find() construct, I
don’t see how to create the equivalent of nested queries: it seems like
I’d have to “pop back up” to the Rails level, which is problematic for
really large datasets.

There are relatively few cases (not none, but few) where a subquery
should not be replaced by a join. Are you sure you need that subquery?

So: If ActiveRecord::Relation.find() really cannot handle queries of
this complexity, I’m okay with that. But if it can, I’d welcome
pointers or documentation on how to code them.

Can’t help there; haven’t started playing with Rails 3 yet.

TIA.

  • ff

Best,

Marnen Laibow-Koser
http://www.marnen.org
[email protected]

Marnen Laibow-Koser wrote (among other things…)

Question 2: Nested queries. In my system, inner_query is, in fact, an
inner query that appears within this little monster:

There are relatively few cases (not none, but few) where a subquery
should not be replaced by a join. Are you sure you need that subquery?

Well, then, Manen, help me understand if I need that subquery.

The inner_query construct fetches “monthly” utility bills. (I say
“monthly” in quotes because the start and end dates don’t generally fall
on month boundaries.) The outer_query construct “explodes” the monthly
bills into daily bills, dividing by the number of days spanned by the
bill, and summing multiple bills that fall on the same day (since they
can overlap). I’ve thought about this for a while, but don’t see how to
do both of those functions via a join.

I’ll sleep on it, but if you have any insights, let me know. Thanks.

  • ff

Fearless F. wrote:

Well, then, Manen, help me understand if I need that subquery.

Fiddlesticks. Marnen, pardon my typo (again) on the spelling of your
name.

  • Frealess Foole

Fearless F. wrote:

Marnen Laibow-Koser wrote (among other things…)

Question 2: Nested queries. In my system, inner_query is, in fact, an
inner query that appears within this little monster:

There are relatively few cases (not none, but few) where a subquery
should not be replaced by a join. Are you sure you need that subquery?

Well, then, Manen, help me understand if I need that subquery.

The inner_query construct fetches “monthly” utility bills. (I say
“monthly” in quotes because the start and end dates don’t generally fall
on month boundaries.) The outer_query construct “explodes” the monthly
bills into daily bills, dividing by the number of days spanned by the
bill, and summing multiple bills that fall on the same day (since they
can overlap). I’ve thought about this for a while, but don’t see how to
do both of those functions via a join.

I’ll have to think about it some more. However, a quick look at your
two queries suggest that you may be trying to do too much in the
database and too little in the application.

I’ll sleep on it, but if you have any insights, let me know. Thanks.

I’ll have to spend some more time on it.

  • ff

Best,

Marnen Laibow-Koser
http://www.marnen.org
[email protected]

Marnen Laibow-Koser wrote:

There are relatively few cases (not none, but few) where a subquery
should not be replaced by a join. Are you sure you need that subquery?

Marnen: I owe your a debt of gratitude. Your note spurred me to
really grok SQL joins. After thirty six hours of study and
anti-social behavior, I’ve got it. I think I’ll write a blog post
called “The Zen of SQL JOIN: a gentle guide for procedural programmers”.
(Teaser: Use JOINs to generate cartesian products of tables. Use ON,
WHERE, and HAVING to filter the combinatorial explosion to only those
rows that you want.)

And you are absolutely right: subqueries can (always? often?) be
replaced by studlier joins. In my system, I’ve re-cast what was
previously a four-level (!!) nested query with a “flat” join across
seven tables. And it now runs like a bat out of hell.

I STILL don’t know how to ask the ActiveRecord query system to alias a
table in a join, as in:

INNER JOIN vertices AS v1 ON v1.id = edge.v1_id
INNER JOIN vertices AS v2 ON v2.id = edge.v2_id

so I’m still using find_by_sql(). If someone can suggest a clean
ActiveRecord syntax for this kind of query, I’m all ears.

  • ff