Merging two heavily referenced records into one


#1

Here’s the situation: I have a “people” table whose “id” column serves
as a foreign key to several other tables (authorships, editorships,
members, users, etc.). Every now and then I will find two distinct
records in the “people” table that represent the same human being. They
generally differ by small differences in spelling but should really be
the same record.

In these situations, I want a elegant solution to merge the two records
into one, and have all the foreign keys in the other tables use the new
id. I suppose I could keep record A, attach everything B had to A and
then remove B.

What kind of help can Rails provide ?

Ben


#2

On Tue, Jan 24, 2006 at 01:09:31AM -0500, Benoit G. wrote:
} Here’s the situation: I have a “people” table whose “id” column serves
} as a foreign key to several other tables (authorships, editorships,
} members, users, etc.). Every now and then I will find two distinct
} records in the “people” table that represent the same human being.
They
} generally differ by small differences in spelling but should really be
} the same record.
}
} In these situations, I want a elegant solution to merge the two
records
} into one, and have all the foreign keys in the other tables use the
new
} id. I suppose I could keep record A, attach everything B had to A and
} then remove B.
}
} What kind of help can Rails provide ?

This is a data problem and, as such, should be handled by the database.
Rails isn’t going to be much help. I’d recommend a stored procedure that
takes the “primary” record id (i.e. the one you want to keep) and the
duplicate record id as arguments, then begins a transaction, updates all
the foreign keys in the dependent tables, and deletes the duplicate
record.
Something like (SQL pseudocode):

CREATE PROCEDURE Merge_People_Records(primary int, duplicate int)
AS (
BEGIN
UPDATE DependentTable1 SET person_id = primary
WHERE person_id = duplicate;

– …

DELETE FROM People
WHERE person_id = duplicate;

COMMIT
);

If you have proper foreign key constraints (you do like referential
integrity, don’t you?), this will either succeed in its entirety or let
you
know that you’ve missed something.

To do the same things through ActiveRecord would be a pain. The only
remaining question is whether you want to execute this stored procedure
from your Rails app or whether it’s just as easy to do it from your
RDBMS’s
commandline app.

} Ben
–Greg


#3

On 2006-01-24 07:46:47 -0500, Gregory S.
removed_email_address@domain.invalid said:

had to A and } then remove B.
AS (
BEGIN
UPDATE DependentTable1 SET person_id = primary
WHERE person_id = duplicate;

– …

DELETE FROM People
WHERE person_id = duplicate;
COMMIT
);

I like the idea of a stored procedure :slight_smile: How well is this supported in
mySQL ? Could Rails call this procedure from an ActiveRecord object
(borrowing its connection) ?

If you have proper foreign key constraints (you do like referential
integrity, don’t you?), this will either succeed in its entirety or let you
know that you’ve missed something.

I have many constraints and integrity checks :slight_smile: I actually had to drop
a few in order to use acts_as_list. Rails would fail to update the
“position” column because of composite UNIQUE constaints.

Thanks for your input

Ben


#4

On 1/23/06, Benoit G. removed_email_address@domain.invalid wrote:

then remove B.

What kind of help can Rails provide ?

The Scaffolding Extensions plugin
(http://wiki.rubyonrails.org/rails/pages/Scaffolding+Extensions+Plugin)
has a merge records feature as part of its default scaffolding.
Assuming relationships to related tables are defined with has_one,
has_many, and has_and_belongs_to_many associations, you’d just create
the scaffold in the controller (scaffold :person) and go to the
merge_person page.