Complete index rebuild using AAF trunk

I am using AAF trunk and want a way to rebuild an index on a
production site with little or no interruption to service. The DRb
server documentation states that when an index is rebuilt, it is
built in a separate location and then swapped into place when
finished. So to do a complete rebuild on a live site, one must take
into account objects that were created or modified in the
meantime. To achieve this, I have come up with the following
solution:

http://pastie.textmate.org/66602

[1] Does this look like a complete solution? I suppose it relies on
timestamp consistency between system components. It is possible
that between setting “start = …” and performing the rebuild,
another thread in the system will have created an earlier timestamp
for an object that is not committed until after the rebuild has
begun. Is it possible to do a perfect rebuild, or would that require
building a layer of concurrency logic into AAF?

[2] Is the behavior described in the DRb server documentation
different from how AAF behaves when not using the DRb server?

Thanks,
John

You can sync your server clocks using ntpd, and you can always
re-index records going back a few extra seconds before the start time
to work around latency.

-Kyle

Jens, any thoughts on this?

On Fri, Jun 08, 2007 at 06:13:51AM -0400, John B. wrote:

Yeah, that’s OK, I still haven’t caught up with the list ;)

see below.

The scenario you describe might happen and cause a record not to be
indexed, but I’d implement it just like you did.

To be safe you can subtract a minute or so from your recorded start
time ;)
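
As a rough illustration of this advice (it is not John's pastie, which is not reproduced in this thread), the catch-up pass with a padded start time might look like the sketch below. Everything in it is a hypothetical in-memory stand-in: `Record`, `SAFETY_MARGIN` and `rebuild_with_catch_up` are made-up names, the "index" is a plain Hash, and no acts_as_ferret call is used.

```ruby
# Hypothetical in-memory stand-ins; none of these names come from acts_as_ferret.
Record = Struct.new(:id, :body, :updated_at)

SAFETY_MARGIN = 60 # seconds to subtract from the recorded start time

# Rebuilds a fresh index (a plain Hash of id => body) from a snapshot of the
# store, then re-indexes every record touched since (start - margin).
# The optional block runs "during" the rebuild to simulate concurrent writes.
def rebuild_with_catch_up(store, now: Time.now, margin: SAFETY_MARGIN)
  start = now - margin
  snapshot = store.map(&:dup)   # what the long-running rebuild gets to see
  yield if block_given?         # writes that land while the rebuild runs
  fresh = {}
  snapshot.each { |r| fresh[r.id] = r.body }
  # Catch-up pass: anything stamped at or after the padded start time.
  store.select { |r| r.updated_at >= start }
       .each { |r| fresh[r.id] = r.body }
  fresh
end

t0 = Time.at(1_000_000)
store = [Record.new(1, "old text", t0 - 3600)]
index = rebuild_with_catch_up(store, now: t0) do
  # Stamped 5 s before the rebuild started, but committed while it was running.
  store << Record.new(2, "late commit", t0 - 5)
end
puts index.keys.sort.inspect
```

With `margin: 0` the late commit in this toy example would be skipped, which is the race described in [1]. The margin only narrows the window; it does not close it for commits that lag by more than the margin.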

If it is really critical for you to have all records indexed, and
relying on the timestamps is a no-go, you’ll have to implement your own
synchronisation mechanism, for example by checking for a running
rebuild on each index update and recording the affected records
somewhere for later indexing.
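
One possible shape for such a mechanism, as a minimal single-process sketch: `RebuildCoordinator` and its methods are hypothetical names, the store and index are plain Hashes, and a real multi-process setup would need the rebuild flag and the pending list kept somewhere shared (a database table, for instance) rather than in memory.

```ruby
require "set"

# Hypothetical coordinator; not part of acts_as_ferret.
class RebuildCoordinator
  attr_reader :index

  def initialize(store)
    @store = store        # id => text, stand-in for the database
    @index = {}           # id => text, stand-in for the live index
    @lock = Mutex.new
    @rebuilding = false
    @pending = Set.new    # ids saved while a rebuild was running
  end

  # Called on every record save: update the live index as usual and,
  # if a rebuild is in progress, remember the id for later indexing.
  def save(id, text)
    @lock.synchronize do
      @store[id] = text
      @index[id] = text
      @pending << id if @rebuilding
    end
  end

  # Builds a fresh index from a snapshot, replays the ids recorded in the
  # meantime, then swaps. The block simulates saves arriving mid-rebuild.
  def rebuild
    snapshot = @lock.synchronize do
      @rebuilding = true
      @pending.clear
      @store.dup
    end
    fresh = {}
    snapshot.each { |id, text| fresh[id] = text } # slow part, lock not held
    yield if block_given?
    @lock.synchronize do
      @pending.each { |id| fresh[id] = @store[id] } # replay missed updates
      @index = fresh
      @rebuilding = false
      @pending.clear
    end
  end
end

coord = RebuildCoordinator.new({ 1 => "first" })
coord.rebuild do
  coord.save(2, "written during rebuild")
  coord.save(1, "first, edited")
end
puts coord.index.inspect
```

Because the replay and the swap happen under the same lock as `save`, no update can slip in between them, so this version does not depend on timestamps at all.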

[2] Is the behavior described in the DRb server documentation
different from how AAF behaves when not using the DRb server?

Without the DRb server, AAF won’t use index versions; it rebuilds
the index in place. I didn’t introduce versioning there because the
usual non-DRb scenarios (test cases and development systems) don’t
require it. In non-DRb multi-process scenarios it would be hard to
implement anyway.
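
To make the difference concrete, here is a generic sketch of the "build elsewhere, then swap" idea. It is not how acts_as_ferret implements index versions. It assumes a POSIX filesystem, writes one text file per document in place of a real index, and `build_and_swap` and the `current` symlink are invented for illustration.

```ruby
require "fileutils"
require "tmpdir"

# Writes a new "index version" into its own directory, then repoints the
# `current` symlink at it. Readers following `current` see either the old
# version or the new one, never a half-built index.
def build_and_swap(base_dir, version, documents)
  version_dir = File.join(base_dir, version)
  FileUtils.mkdir_p(version_dir)
  documents.each do |id, text|
    File.write(File.join(version_dir, "#{id}.txt"), text)
  end
  tmp_link = File.join(base_dir, "current.tmp")
  FileUtils.rm_f(tmp_link)
  File.symlink(version_dir, tmp_link)
  # rename(2) replaces the old symlink in a single step on POSIX systems.
  File.rename(tmp_link, File.join(base_dir, "current"))
  version_dir
end

Dir.mktmpdir do |base|
  build_and_swap(base, "v1", { 1 => "old" })
  build_and_swap(base, "v2", { 1 => "new", 2 => "added" })
  puts File.read(File.join(base, "current", "1.txt"))
end
```

An in-place rebuild, by contrast, would overwrite the files under a single directory while searches are still reading from it.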

Jens


Jens Krämer

Hi!

On Fri, Jan 25, 2008 at 07:23:06PM -0500, John B. wrote:

Hey folks.

Here’s an update to my Super Duper Ferret Single Index Rebuild that
we were discussing back in June.

[…]

I’ve come up with this rake task:

http://pastie.textmate.org/private/4xyk2o0obibzi2tmpbog

Jens, what do you think? Anyone have any improvements to offer?

Looks great. Mind if I add this as an example to acts_as_ferret?

Cheers,
Jens


Jens Krämer
