Large External Data Integration


I’m looking for best practices / ideas on how to refresh large external
data sources.

Suppose your Rails app relies on a large data set that you get via some
service, and it needs to be updated periodically. For instance, let's say
your Rails app uses weather reports and airline flight information. You
can get both data sets via some feed mechanism, and you would like to use
data that is no more than 1 hour old.

One approach would be to simply clear and reload the tables
containing this data on a periodic basis, but I would not like to
interrupt service. The next idea I had was to have two sets of tables
and a switching mechanism to point to one side while the other is being
reloaded.

Any thoughts or ideas? I’m also looking at using the Cron Plugin
to manage this.
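
To make the two-sets-of-tables idea concrete, here is a minimal sketch in plain Ruby. It is only an illustration under stated assumptions: the class name SwitchedStore, the :a/:b labels, and the sample flight rows are all hypothetical, and in-memory arrays stand in for the real database tables.

```ruby
# Hypothetical in-memory stand-in for two identical tables plus a pointer
# that records which one is live. In a real app these would be database
# tables and the pointer would be stored somewhere durable.
class SwitchedStore
  def initialize
    @tables = { a: [], b: [] } # two copies of the same table
    @active = :a               # the copy that readers currently see
  end

  # Readers only ever see the active copy.
  def rows
    @tables[@active]
  end

  # Load fresh data into the idle copy, then flip the pointer.
  def refresh(new_rows)
    idle = @active == :a ? :b : :a
    @tables[idle] = new_rows.dup # slow part: happens out of readers' view
    @active = idle               # the switch itself is one assignment
  end
end

store = SwitchedStore.new
store.refresh([{ flight: "XX100", status: "on time" }])
store.refresh([{ flight: "XX100", status: "delayed" }])
puts store.rows.first[:status]
```

The point of the design is that the slow reload never touches the side being read, so the only moment readers are affected is the pointer flip.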



Use a version marker in the affected tables.
Store the current version in another table, maybe an application settings table.
Load the updated data with its version set to current version + 1.
Once the load is complete, update the current version.
On the site, only show rows with the current version.
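
The version marker steps above could look roughly like this in plain Ruby. It is a sketch, not a definitive implementation: VersionedFeed, visible_rows, and the sample weather rows are hypothetical names, an array stands in for the data table, and an instance variable stands in for the settings table. The cleanup of old rows at the end is an added assumption, not part of the steps above.

```ruby
# Hypothetical in-memory model of the version marker approach.
class VersionedFeed
  attr_reader :current_version

  def initialize
    @rows = []           # stand-in for the data table; each row has :version
    @current_version = 0 # stand-in for the value kept in a settings table
  end

  # What the site shows: only rows tagged with the current version.
  def visible_rows
    @rows.select { |r| r[:version] == @current_version }
  end

  def refresh(new_rows)
    next_version = @current_version + 1
    # 1. Load the new rows tagged with current version + 1.
    #    Readers filter on the current version, so they do not see these yet.
    new_rows.each { |r| @rows << r.merge(version: next_version) }
    # 2. Publish the new data by bumping the current version.
    @current_version = next_version
    # 3. Assumption: drop rows from older versions so the table does not grow.
    @rows.reject! { |r| r[:version] < @current_version }
  end
end

feed = VersionedFeed.new
feed.refresh([{ city: "Boston", temp_f: 41 }])
feed.refresh([{ city: "Boston", temp_f: 38 }])
puts feed.visible_rows.inspect
```

In a database-backed app, step 2 would be a single-row update, which is what keeps the switch from interrupting service.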

On 3/3/06, Mike L. removed_email_address@domain.invalid wrote:

containing this data on a periodic basis, but I would not like to


Rails mailing list

Never be afraid to try something new. Remember, amateurs built the
ark; professionals built the Titanic!