Forum: Ruby on Rails Flexible data schema in a database?

Announcement (2017-05-07): www.ruby-forum.com is now read-only since I unfortunately do not have the time to support and maintain the forum any more. Please see rubyonrails.org/community and ruby-lang.org/en/community for other Rails- und Ruby-related community platforms.
William G. (Guest)
on 2006-04-07 02:16
(Received via mailing list)
I figured I pose this to the list and see if any one can build a
better mousetrap.

The problem:

1. Store hierarchical data in a database, but without knowing the
hierarchy beforehand.
2. Allow the user to define there own hierarchy, and allow the system
to support multiple hierarchies.

Call it a cataloging system, we have an object and we want to add data
about that object to the system.

Now my thoughts on the problem:

It may be possible to achieve this with just 2 tables.

fields
-----------------------------
id - primary key
parent_id - tree like structure
name - string for the name of the field
hierarchy_id - id of the hierarchy this field belongs to
repeatable - can a record contain more then one of these fields?

data
-----------------------------
id - primary key
field_id - reference to the field to which this data belongs
hierarchy_id - id of the hierarchy this data belongs to
content - string for the data itself
object_id - the object that this data corresponds to

Now this is very simplistic but could certainly be expanded to support
many more options.

Has anyone run into this problem?  Have you devised a more elegant
solution? Have I just lost my mind, plain and simple?

-Will
Phillip H. (Guest)
on 2006-04-07 04:23
(Received via mailing list)
On 7/04/2006, at 10:15 AM, William G. wrote:

> [snip]
>
> Now this is very simplistic but could certainly be expanded to support
> many more options.
>
> Has anyone run into this problem?  Have you devised a more elegant
> solution? Have I just lost my mind, plain and simple?

There's no really elegant solution, I've been pondering it for a
while. The solution you propose works, but it's efficient from either
Rails or the DB point of view, storing all values as varchars removes
all possibility of sensible indexing.

My current solution revolves around dynamically manipulating tables.
There's slightly more ROR overhead with reloading schema details, but
not a huge amount. It looks somewhat like this:
  - Data fields
   - Contains all options, field type, display type, names

  - type_1234_data
   - Table created, dropped and manipulated by after_create/
after_update/after_delete actions on the DataField model. Contains
normal columns

In the ROR side
  - Override const_missing to check for type_\d+_data and generate a
model on the fly.
  - Have model generated discard all column data on each request. Easy.

So the RoR overheads are
  - Dynamic models
  - Reloading table details on each request
  - Can't use Rails' automatic foreign keys, can't do has_many :type_#
{self.id}_data, but can be worked around.

Advantages
  - Real table, easy to look up a row manually
  - Can index if one type searches on a certain column repeatedly

Disadvantages
  - Web user has to have access to change the DB

My current system is very postgres specific, but I'll post it if
anyone's interested.
--
Phillip H.
removed_email_address@domain.invalid
http://www.sitharus.com/
William G. (Guest)
on 2006-04-07 04:53
(Received via mailing list)
I'd love to have a look Phil.  The more options I can check out the
better.

-Will
William G. (Guest)
on 2006-04-07 05:02
(Received via mailing list)
> There's no really elegant solution, I've been pondering it for a
> while. The solution you propose works, but it's efficient from either
> Rails or the DB point of view, storing all values as varchars removes
> all possibility of sensible indexing.

Well, technically you don't need to store all the values as varchar.
In the field type
you could have data type specifier.  Then simply add columns for the
various data
types to the data table.

ie:

data
-----------------------------
id - primary key
field_id - reference to the field to which this data belongs
hierarchy_id - id of the hierarchy this data belongs to
string_content - string data
number_content - numeric data
date_content - date data
.
.
.
object_id - the object that this data corresponds to

It would be a parse table, much like aggregate table descripted in the
agile book, but indexing should be workable.

I'm leary of the dynamic table based approach for one reason.  I've
seen a couple of the schemas destined to be included in this system,
and at least 3 have 200+ fields.  That's a boat load of database
tables.  But maybe there is some way to whittle that down to something
more reasonable.

-Will
Phillip H. (Guest)
on 2006-04-07 05:57
(Received via mailing list)
On 7/04/2006, at 1:01 PM, William G. wrote:

>> There's no really elegant solution, I've been pondering it for a
>> while. The solution you propose works, but it's efficient from either
>> Rails or the DB point of view, storing all values as varchars removes
>> all possibility of sensible indexing.
>
> Well, technically you don't need to store all the values as varchar.
> In the field type
> you could have data type specifier.  Then simply add columns for the
> various data
> types to the data table.

That's a very good point, hadn't thought of that...

> I'm leary of the dynamic table based approach for one reason.  I've
> seen a couple of the schemas destined to be included in this system,
> and at least 3 have 200+ fields.  That's a boat load of database
> tables.  But maybe there is some way to whittle that down to something
> more reasonable.

The system I'm working on is essentially an address book, but with
customisable fields. A similar project I'm working on uses the two
table approach you suggested first, and with 2.5 million rows there's
no noticeable performance loss, though trying to order searches
numerically is an issue. My current system is somewhat of a hack,
I'll figure out an easy way to describe it.
Phillip H. (Guest)
on 2006-04-07 07:05
(Received via mailing list)
On 7/04/2006, at 12:53 PM, William G. wrote:

> I'd love to have a look Phil.  The more options I can check out the
> better.

It got quite long, so I blogged it rather than emailing it to the
list. http://www.sitharus.com/articles/2006/04/07/dynami...
ruby-on-rails
William G. (Guest)
on 2006-04-07 23:33
(Received via mailing list)
Thanks Phillip, very interesting indeed.
This topic is locked and can not be replied to.