How well does Rails handle brownfield data in a greenfield application?

Let me give an example: say I have an e-commerce storefront whose data
is provided by a third-party vendor, rather than just a handful of
items that I’m selling myself (which seems to be how most Rails
e-commerce sites are set up, for example Shopify). My vendor has a list
of about 28,000 product SKUs, which they provide to me as a huge
tab-delimited flat file containing product, manufacturer and category
information (i.e. not relational in any way, shape or form, just a
glorified spreadsheet). I will need to parse it several times to
extract the fields into different tables. This file has predefined
IDs that act as the primary key for each entity, and they need to be
used in order to reference items properly (e.g. Product #12543 is made
by Manufacturer #17835 and belongs to Category #34324, based on the
columns “ProductID”, “ManufacturerID” and “CategoryID”). I can’t
deviate from this structure.

Would Rails have any issues with its ID key being predetermined? In
the database it can still be kept as an autoincrementing integer; the
ID just gets assigned ahead of time. I’m going to have to write a task
of some kind to parse the file into chunks and then load the product
data into the correct models using those chunks. If I were dealing with
a database directly this might cause some issues, but I’m not sure
about Rails itself.
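
A minimal sketch of what such a parsing task might look like in plain
Ruby, before any models are involved. The ID column names come from the
description above; the name columns (“ProductName”, “ManufacturerName”,
“CategoryName”) and the method name split_vendor_feed are assumptions
made up for illustration, since the real file layout isn’t shown here.
It splits the flat file into three hashes keyed by the vendor’s IDs in a
single pass:

```ruby
# Split a tab-delimited vendor feed into product, manufacturer and
# category hashes keyed by the vendor's own integer IDs.
# The *Name columns are hypothetical; only the *ID columns are known.
def split_vendor_feed(text)
  lines = text.lines.map(&:chomp).reject(&:empty?)
  header = lines.shift.split("\t")

  products = {}
  manufacturers = {}
  categories = {}

  lines.each do |line|
    # Pair each header name with the value in the same position.
    row = header.zip(line.split("\t", -1)).to_h

    manufacturer_id = Integer(row["ManufacturerID"])
    category_id = Integer(row["CategoryID"])

    # The flat file repeats manufacturer and category data on every
    # row, so keep only the first occurrence of each.
    manufacturers[manufacturer_id] ||= { name: row["ManufacturerName"] }
    categories[category_id] ||= { name: row["CategoryName"] }

    products[Integer(row["ProductID"])] = {
      name: row["ProductName"],
      manufacturer_id: manufacturer_id,
      category_id: category_id
    }
  end

  { products: products, manufacturers: manufacturers, categories: categories }
end
```

Each of the three hashes could then be handed to whatever code loads
the corresponding model.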

I want to make sure I won’t run into any pitfalls before I make an
attempt at this.

On Fri, May 22, 2009 at 8:33 AM, Wayne M. [email protected] wrote:

> [...] IDs for the primary key for each column, that needs to be used in
> [...]
> chunks of relevant data. If I was dealing with a database directly
> this might cause some issues, but I’m not sure about Rails itself.
>
> I want to make sure I won’t run into any pitfalls before I make an
> attempt at this.

I would recommend leaving the Rails record IDs intact and simply
creating another field that represents the SKU. This could be a
non-autoincrement field in your table, but this really depends on how
the vendor introduces new products to the system. In short, your record
ID should be different from your product ID (i.e. SKU) because this
would give you a bit more flexibility in the future if the SKU changes.
For example, if the SKU, 12543, changed to one of the following:

CAT12543

or

ZIP12543

or

BIF12543

You get the idea.

Good luck,

-Conrad

The SKU would be a separate field in and of itself. What I’m saying is
that the numeric ID (what Rails would set as an autoincrement) is
pre-assigned from the data instead of being assigned automatically by
Rails.

Be very careful doing this. When you change assumed Rails conventions
and behavior, it tends to bite you in unexpected ways. While you can
override the ID column, I’d suggest not doing it.

I’d seriously just let Rails add an extra ID column and put the big
table in other columns. The cost is small and you are not potentially
breaking future Rails stuff.
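
As a rough illustration of that layout, here is a plain-Ruby stand-in
for a table that keeps its own auto-assigned id and stores the vendor’s
ID in a separate column. The class SurrogateKeyTable and the vendor_id
field are hypothetical names invented for this sketch; in a Rails app
the equivalent would presumably be an extra, uniquely indexed column
next to the default id. The import looks rows up by vendor ID and links
records through the local id:

```ruby
# In-memory stand-in for a table with an auto-incrementing primary key
# plus a separate column holding the vendor's ID (hypothetical design).
class SurrogateKeyTable
  attr_reader :rows

  def initialize
    @rows = []
    @by_vendor_id = {}
  end

  # Insert a new row or update the existing one for this vendor ID.
  # The local id is assigned once, on first insert, and never changes.
  def upsert(vendor_id, attrs)
    row = @by_vendor_id[vendor_id]
    if row
      row.merge!(attrs)
    else
      row = { id: @rows.size + 1, vendor_id: vendor_id }.merge(attrs)
      @rows << row
      @by_vendor_id[vendor_id] = row
    end
    row
  end

  def find_by_vendor_id(vendor_id)
    @by_vendor_id[vendor_id]
  end
end

manufacturers = SurrogateKeyTable.new
products = SurrogateKeyTable.new

# Vendor IDs are used only for lookup; associations use the local id.
acme = manufacturers.upsert(17835, name: "Acme")
products.upsert(12543, name: "Widget", manufacturer_id: acme[:id])
```

Re-running the import with the same vendor IDs updates the existing
rows instead of creating new ones, so the local ids stay stable even if
the vendor later renumbers or reformats its identifiers.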

On Fri, May 22, 2009 at 10:32 AM, Wayne M. [email protected] wrote:

> The SKU would be a separate field in and of itself, what I’m saying is
> the numeric ID (what Rails would set as an autoincrement) is
> pre-assigned from the data instead of being assigned automatically
> from Rails.

Again, I would leave the Rails ID intact so as not to invalidate the
conventions set forth, and create other fields as needed for your
application. The ID used by Rails is a record identifier and shouldn’t
be used for other purposes like an actual product ID, order ID, and so
on.

Good luck,

-Conrad

Brendon Whateley wrote:

> Be very careful doing this. When you change assumed Rails conventions
> and behavior, it tends to bite you in unexpected ways. While you can
> override the ID column, I’d suggest not doing it.
>
> I’d seriously just let Rails add an extra ID column and put the big
> table in other columns. The cost is small and you are not potentially
> breaking future Rails stuff.

+1. Mapping legacy IDs onto default Rails IDs will bite.


Roderick van Domburg
http://www.nedforce.com