ActiveRecord/Arel delete with joins?

Hello,

What’s the best/easiest way to write this delete statement using
ActiveRecord 3.1?

DELETE e1
FROM events e1
JOIN events e2
WHERE e1.subject_type = e2.subject_type
AND e1.subject_id = e2.subject_id
AND e1.origin_type = e2.origin_type
AND e1.origin_id = e2.origin_id
AND e1.id > e2.id

Thanks for the help.

On 28 October 2011 22:46, Christopher J. Bottaro [email protected]
wrote:

AND e1.origin_id = e2.origin_id
AND e1.id > e2.id

Not answering the question I am afraid, but I think it is unwise to
assume anything about the id sequence. Presumably here you are
assuming that id values are assigned in an increasing sequence, but I
don’t think this is necessarily guaranteed in the general case. I
think it might be better to use created_at, if that is what you really
mean. On the other hand if in reality you do not care which one you
delete and have the id test only to make sure that you delete only one
of them then please ignore my comment.

Since you are interested in the best way to code it (rather than just
hacking in the sql) then presumably it is something that happens
routimnely rather than some tidying up operation that you have to do
once. Would it not be possible using validations or similar to ensure
that the duplicate record situation does not happen in the first
place?

Colin

On Sat, Oct 29, 2011 at 4:31 AM, Colin L. [email protected]
wrote:

AND e1.subject_id = e2.subject_id
delete and have the id test only to make sure that you delete only one
of them then please ignore my comment.

Since you are interested in the best way to code it (rather than just
hacking in the sql) then presumably it is something that happens
routimnely rather than some tidying up operation that you have to do
once. Would it not be possible using validations or similar to ensure
that the duplicate record situation does not happen in the first
place?

Colin

Hmm. Interesting. On the one hand, I’m glad people are looking out for
each other and advice is given on best practices. On the other hand, I
forgot what it’s like to ask for help on the internet and have
everything
you do under heavy scrutiny :slight_smile:

Presumption incorrect. It is a one off and not routine code, but that
doesn’t stop me from wanting to learn how to better use AR/Arel. Also,
I
created a unique index as soon as I realized there were dupes and I
cleaned
them out. I also added validation (which isn’t guaranteed to work,
hence
the unique index in the db), and test cases/specs for the situation.

About the id vs created_at I disagree and consciously chose the former.
I
think either will work fine and it’s ok to make assumptions about the
uniqueness (for sure) and order (comfortably sure) of primary keys for a
given adapter. I’m familiar with the Postgres and MySQL adapters and I
know
they create unique, auto incrementing primary keys for each table.

That said, I am open to notion that I’m wrong or do not fully understand
something though, so

Thanks for the help,
– C

On Oct 29, 1:31am, Colin L. [email protected] wrote:

AND e1.subject_id = e2.subject_id
delete and have the id test only to make sure that you delete only one
of them then please ignore my comment.

Since you are interested in the best way to code it (rather than just
hacking in the sql) then presumably it is something that happens
routimnely rather than some tidying up operation that you have to do
once. Would it not be possible using validations or similar to ensure
that the duplicate record situation does not happen in the first
place?

All of this is true, and you should try these things before the
following.

You don’t need a join. You’re defining all your conditions on one
table just fine:

Event.where(:subject_type => e1.subject_type, … ).where(‘id > ?’,
e1.id).destroy_all

On Oct 29, 3:26pm, Kurt W. [email protected] wrote:

On Oct 29, 1:31am, Colin L. [email protected] wrote:
You don’t need a join. You’re defining all your conditions on one
table just fine:

Event.where(:subject_type => e1.subject_type, … ).where(‘id > ?’,
e1.id).destroy_all

This doesn’t do the same thing as the original query. This deletes all
the events which have the same subject_type etc as a specific event
e1, but created after e1, i.e. remove what you might consider
duplicates of the specific event e1. The original query on the other
hand deletes all duplicates, not only those that are duplicates of a
specific even.

Personally I would just use the raw sql in a call to delete_all. A
complicated use of arel isn’t necessarily “better” or easier to use
than sql (portability concerns aside, but then the whole idea that you
can move any significantly sized app just by changing a line in
database.yml is a bit of a myth anyway).

you could do something like

class Event < ..
  def self.duplicate_events
    event_alias = Event.arel_table.alias
    scoped.joins(event_alias).where(:subject_id =>

event_alias[:subject_id],
…).where(Event.arel_table[:id].gt(event_alias[:id]))
end
end

which gives you an Event.duplicate_events scope

unfortunately (at least with the version of ActiveRecord/Arel in rails
3.0.x you can’t do Event.duplicate_events.delete_all, because the bit
of Arel that deals with deletes always does “delete from blah” and
doesn’t allow you to say “delete t1 from t1 join t2, …”

Fred

On 29 October 2011 18:23, Christopher J. Bottaro
[email protected] wrote:

JOIN events e2
think it might be better to use created_at, if that is what you really

Colin

Hmm. Interesting. On the one hand, I’m glad people are looking out for
each other and advice is given on best practices. On the other hand, I
forgot what it’s like to ask for help on the internet and have everything
you do under heavy scrutiny :slight_smile:

I did start by apologising for not answering the question :slight_smile:
I suspect it may be that for maybe 50% of questions asked here the
best result for the OP is not to have his question directly answered
but to suggest alternative ways of approaching the problem. No I have
not done the research to prove that, it is just my feeling. To answer
a question by suggesting alternatives is therefore a perfectly valid,
and often helpful response.

Presumption incorrect. It is a one off and not routine code, but that
doesn’t stop me from wanting to learn how to better use AR/Arel. Also, I
created a unique index as soon as I realized there were dupes and I cleaned
them out. I also added validation (which isn’t guaranteed to work, hence
the unique index in the db), and test cases/specs for the situation.

OK, I did not realise that this was an academic question. In that
case my suggestion is of no use to you. You never know, it may be of
use to someone else finding this thread in the future, in which case I
have not entirely wasted my time.

About the id vs created_at I disagree and consciously chose the former. I
think either will work fine and it’s ok to make assumptions about the
uniqueness (for sure) and order (comfortably sure) of primary keys for a
given adapter. I’m familiar with the Postgres and MySQL adapters and I know
they create unique, auto incrementing primary keys for each table.

Certainly the id values will be unique, there is no question about
that. I seem to remember reading about the situation with multiple
servers where each server will get given a batch of id values it could
use, so that the id values would not necessarily be in the same order
as created_at. I may be mistaken however. Also consider the
possibility in a few years time of someone migrating the code onto a
different db adaptor. The code might then break.
I think the point is that Rails does not guarantee that id values will
increase monotonically and therefore it is not a good idea to rely on
this

That said, I am open to notion that I’m wrong or do not fully understand
something though, so
Thanks for the help,

I don’t think I have been much help. I can answer part of the
question though. You ask for the “best/easiest” way to write the
statement. The easiest way is just to code in the sql as you
already have the sql available, so it is easy. I still don’t know
whether there is any better way though. It is certainly not obvious
to me, sorry again.

Colin

On Sun, Oct 30, 2011 at 6:24 AM, Frederick C. <
[email protected]> wrote:

Personally I would just use the raw sql in a call to delete_all. A

   scoped.joins(event_alias).where(:subject_id =>

event_alias[:subject_id],
…).where(Event.arel_table[:id].gt(event_alias[:id]))
end
end

That results in the following error:

RuntimeError: unknown class: Arel::Nodes::TableAlias

Distilled down:

Event.joins(Event.arel_table.alias)

Any ideas?

And yeah it seems like just putting SQL into a delete_all call is the
easiest way to go. Still interested in this Arel stuff though.

– C