Forum: Rails-core (closed, excessive spam) Eager loading sort order

Fd79850ca312fe682c9b5e4b21f3442b?d=identicon&s=25 bhus (Guest)
on 2008-04-29 18:52
(Received via mailing list)
Sorry, just thought I'd do a repost because of the junk mail that my
message got stuck in between. So, again.... I have tried to wade
through the various tickets on this topic and have come away confused.
Is the abandonment of an association's sort order during eager loading
something we just 'have to live with'? I have a wee patch that I've
concocted that honors an association's sort order when loaded eagerly.
I'm just wondering : has this been attempted before? Should I save my
breath?

/Brad.
81b61875e41eaa58887543635d556fca?d=identicon&s=25 Frederick Cheung (Guest)
on 2008-04-29 19:27
(Received via mailing list)
On 29 Apr 2008, at 17:52, bhus wrote:

>
> Sorry, just thought I'd do a repost because of the junk mail that my
> message got stuck in between. So, again.... I have tried to wade
> through the various tickets on this topic and have come away confused.
> Is the abandonment of an association's sort order during eager loading
> something we just 'have to live with'? I have a wee patch that I've
> concocted that honors an association's sort order when loaded eagerly.
> I'm just wondering : has this been attempted before? Should I save my
> breath?

Depends what you are talking about. for the new eager loading
mechanism, this should already work. For the old eagerloading i
suspect that the base case (eager load one association) would be fine,
but once you start having multiple (possibly nested associations), all
with sort orders it would start to get really really hairy.

Fred
Fd79850ca312fe682c9b5e4b21f3442b?d=identicon&s=25 bhus (Guest)
on 2008-04-29 19:46
(Received via mailing list)
I'm sorry, but I'm not sure what you mean by 'new' vs 'old' eager
loading. The eager loading I mean took place when the :include option
was given on a find. I looked at the source and it did not seem to
apply ordering in this case.

Is there another way of doing eager loading? I would gladly change my
code if there is a better way.

As for my solution, I will have to put it through it's paces for sure,
but I believe my principals are sound. I just wanted to know if this
would be of any value to a broader audience.

/Brad.
81b61875e41eaa58887543635d556fca?d=identicon&s=25 Frederick Cheung (Guest)
on 2008-04-29 20:15
(Received via mailing list)
On 29 Apr 2008, at 18:45, bhus wrote:

>
> I'm sorry, but I'm not sure what you mean by 'new' vs 'old' eager
> loading. The eager loading I mean took place when the :include option
> was given on a find. I looked at the source and it did not seem to
> apply ordering in this case.
Edge rails has a new eager loading scheme (have a look in
association_preload.rb). :include still falls back to the old
implementation in some cases (there's a quick summary at
http://www.spacevatican.org/2008/4/29/include-and-conditions
  )

Fred
Fd79850ca312fe682c9b5e4b21f3442b?d=identicon&s=25 bhus (Guest)
on 2008-04-29 20:37
(Received via mailing list)
Thanks so much. That's brilliant!
But I guess I need to be "on edge" to have this new eager loading
mechanism, eh? Not long I guess.
81b61875e41eaa58887543635d556fca?d=identicon&s=25 Frederick Cheung (Guest)
on 2008-04-29 20:43
(Received via mailing list)
On 29 Apr 2008, at 19:36, bhus wrote:

>
> Thanks so much. That's brilliant!
> But I guess I need to be "on edge" to have this new eager loading
> mechanism, eh? Not long I guess.

Yup, you have to be on edge. It would still be interesting to see what
you've done, since we fall back to the old code in some places.

Fred
27c170f482104299af279902be0a9c26?d=identicon&s=25 Trevor Squires (Guest)
on 2008-04-29 21:38
(Received via mailing list)
Fred,

I thought you might be interested in this:

http://rails.lighthouseapp.com/projects/8994-ruby-...

Regards,
Trevor

On 4/29/08, Frederick Cheung <frederick.cheung@gmail.com> wrote:
> Edge rails has a new eager loading scheme (have a look in
>  >
>
--
--
Trevor Squires
http://somethinglearned.com
81b61875e41eaa58887543635d556fca?d=identicon&s=25 Frederick Cheung (Guest)
on 2008-04-29 21:59
(Received via mailing list)
On 29 Apr 2008, at 20:37, Trevor Squires wrote:

>
> Fred,
>
> I thought you might be interested in this:
>
> http://rails.lighthouseapp.com/projects/8994-ruby-...
>
Funny, I was thinking about has_one just this afternoon. It's a
toughie though - while fiddling with some of the internals can get you
the right comment, you're still loading all comments. Not sure off the
top of my head how you get around that (what does the old :include do ?)

Fred
27c170f482104299af279902be0a9c26?d=identicon&s=25 Trevor Squires (Guest)
on 2008-04-29 23:54
(Received via mailing list)
On 4/29/08, Frederick Cheung <frederick.cheung@gmail.com> wrote:
>  >
>
> Funny, I was thinking about has_one just this afternoon. It's a
>  toughie though - while fiddling with some of the internals can get you
>  the right comment, you're still loading all comments. Not sure off the
>  top of my head how you get around that (what does the old :include do ?)
>
>  Fred
>

Hey Fred,

"what does the old :include do?" - it ignores your :order so it's
arbitrary as far as I'm concerned.

Ironically and by happy coincidence, using mysql and a has_one :order
where you want the most recent record tends to actually give you the
most recent record even though your :order clause is discarded.  Yes,
this means that fallback-eager is (probably) favoring the last record
in the set.

"you're still loading all comments" - yeah, new-eager and
fallback-eager both ask the database for rows that will be discarded.

What I'm bringing up in the ticket is that there are currently 3
different results, none of which are consistent with any others.

1 - Non eager is definitively correct.
2 - fallback-eager ignores :order so can't be trusted [*]
3 - new-eager does the opposite of #1 so is *never* correct

[*] sorta, if you are aware of the issues/limitations it might still
work out for you

Both eager styles discard records so the impact is the same - it's
just that new-eager is guaranteed to always be wrong.

So now that I've gotten that bit out of the way, you mentioned "still
loading all comments".

I've spent a bit of time on the resultset trimming issue for has_one
with :order and found:

If all databases supported a postgres-style "distinct on" operation
the solution would be trivial.  Of course, mysql doesn't have anything
like this.

The only query-based approach to trimming the results that I've found
to work reliably in mysql is to craft a rather heinous self-join:

select comments.* from comments
left join comments_prune on comments.post_id = comments_prune.post_id
and
comments.created_at < comments_prune.created_at
where comments_prune.id is null and
comments.post_id in (1,2,3,4,5)
order by comments.created_at desc

It would be possible to build such a query by parsing the :order
option and adding appropriate 'and' join conditions but it gets
quickly out of hand - imagine :order => 'foo asc, bar desc, wibble
asc'.  BLECH.

The most bulletproof solution when :order is present on a has_one is
to just issue 2 queries: first fetch the id and foreign_key ordered
appropriately, build a list of ids that discards
non-first-for-foreign_key ids and then fetch again using your pruned
id list.

But as I said, trimming the resultset isn't relevant to my ticket and
I have to admit I'm not sure how much value would be generated by
coding up trimming logic anyhow.

Regards,
Trevor
--
--
Trevor Squires
http://somethinglearned.com
81b61875e41eaa58887543635d556fca?d=identicon&s=25 Frederick Cheung (Guest)
on 2008-04-30 01:26
(Received via mailing list)
On 29 Apr 2008, at 22:53, Trevor Squires wrote:

>>>
>> do ?)
> 2 - fallback-eager ignores :order so can't be trusted [*]
>
Yup, agree with you on all that. I was just wondering if we could be
smart about that, in addition to resolving the issue on the ticket

> select comments.* from comments
> asc'.  BLECH.
Yeah, ugh. If we grouped by post_id are we guaranteed anything
sensible (I seem to remember that what the db gives for non agregate,
not grouped on columns can't be relied upon to be anything specific
other than one of the rows but I can't remember).
> coding up trimming logic anyhow.
>
It would be nice to have in addition to what's in your ticket, it
seems non trivial though.

Fred
27c170f482104299af279902be0a9c26?d=identicon&s=25 Trevor Squires (Guest)
on 2008-04-30 02:00
(Received via mailing list)
On 4/29/08, Frederick Cheung <frederick.cheung@gmail.com> wrote:
> Yeah, ugh. If we grouped by post_id are we guaranteed anything
>  sensible (I seem to remember that what the db gives for non agregate,
>  not grouped on columns can't be relied upon to be anything specific
>  other than one of the rows but I can't remember).
>

The main hurdle I found with trying to use "group by" is that any
"order by" in your query is applied *after* the grouping so you can't
use "group by" to try and simulate "distinct on".

If anyone with some cross-database smarts wants to chime in with the
ultimate solution I'd love to see it.  It's a rare use-case but it
seriously bugs me that I can't find a good, cheap, cross-database
solution :-P

Trev

--
--
Trevor Squires
http://somethinglearned.com
81b61875e41eaa58887543635d556fca?d=identicon&s=25 Frederick Cheung (Guest)
on 2008-04-30 02:16
(Received via mailing list)
On 30 Apr 2008, at 00:59, Trevor Squires wrote:

> use "group by" to try and simulate "distinct on".
Ah yes, of course. (the only way round that i know of is to order in a
subselect and then select from that and group, but then we're back in
the realm of fugliness).

Fred
Fd79850ca312fe682c9b5e4b21f3442b?d=identicon&s=25 bhus (Guest)
on 2008-05-01 20:44
(Received via mailing list)
OK. This all sounds quite overwhelming. But I decided to have a go at
solving my own problem and seeing if it is of any value to anyone
else.
So I dove into associations.rb and ended up generating an 'order'
chain in the JoinDependency when it is created. The way I did it, the
order chain will only reference those associations that specifically
had an 'order' clause given (as well as the primary keys of it's
parents in order to maintain 'hierarchy').
I then use that chain later on in
construct_finder_sql_for_association_limiting, inside a custom
add_order routine I created.
This routine basically first generates the standard ordering
statements given in the find, and then adds association ordering on
after (thereby giving precedence to the order statements from the call
to find).
It has served my purposes. Is there broader applicability? I am
willing to share code, although it's rough around the edges since I
haven't delved in here much. And it still needs to be tested. But it
does seem to work for the rinky-dink multiple associations and 1-level
nested associations I tried it on.

/Brad.
This topic is locked and can not be replied to.