I have a “data warehouse” application I’ve been writing and I would like
to solicit some assistance with how to design it. Basically, I harvest
data from a lot of sources and dump them into my database.
For background on what I’m doing, this “data warehouse” takes
HR/Personnel/Company/Organizational data and associates that with User
Account and Group data from 5 platforms (ie, Active Directory, UNIX,
others).
I began first by scaffolding and designing the Models, and then
importing the data with a Rake task–roughly doing:
1.) Insert/Update nightly data into Table.
2.) Flag data that was not in my nightly source data (maybe it was
deleted?)
3.) Call method to create relationships (“hooks”) between these various
sources.
I wrote this in Perl (with mysql) and I displayed this data in a webpage
by using a set of SQL Views and a meta-table, describing what elements
of these views is displayed.
It was not a good solution for a lot of reasons so I decided to redo my
database and learn Ruby/RoR in the process.
Now I need to interact with this data in a lot of ways. The web portion
is only one aspect of this database’s use. As a result, I have a lot
more data here than the user really needs to see.
I want to have the initial page be a search form. Some columns maps
one-to-one with other columns in the database. Other search fields act
as a generic filter across all (or most) columns.
When they submit the search, it displays a subset of data / results,
which can then be clicked on and more detail is shown about
particular[/groups of] records.
All told, I have 17 tables (so, 17 nightly dumps). A majority of the
searches are going to be against 3 to 8 of these tables.
Given all of this, my issues/questions are as follows:
- I’m really trying to avoid hard coding as much as possible. Ideally,
I’d have some kind of admin interface to make changes to how data is
displayed. The idea of hard coding each text box for the search fields,
or the displayed columns, etc… isn’t appealing to me.
If this really is the best solution, I suppose I can go that route. The
nice thing of having my Views in MySQL was that it was easy to map meta
data to what to display.
-
I’m trying to figure out how to design this to make it very scalable
and easy to modify. Particularly, if two tables have a column that have
something in common, I’d like to (perhaps) make those into a link to
each other. -
As above, I’d also like to add extra functionality. I setup
acts_as_revisable in my database and it works very well. If a user has
an elevated role, I’d like to have them be able to review the history of
a record.
These are pretty lofty goals and this will take some baby steps. I’m
still new to Ruby/RoR, but I think it’s been hard (so far) to find good
documentation that addresses some of these issues. Or maybe I don’t
know the key words or how to approach this the Ruby Way (probably the
case)?
I’d like to use my experiences to do a write-up/tutorial on how to
address data warehousing problems with RoR (and starting RoR from
scratch). If I become good enough, I may eventually attempt a plugin
for combined data warehousing, versioning, cross-referencing, etc (if I
still think there’s a use for it).
I really appreciate everyone’s help/advice on the issue.