Generic extension of an ActiveRecord model

I’m about to get started on a Rails app, and before getting started I’ve
been thinking a bit about how I’m going to design the database. The app I’m
building will have a model object to represent a person. The person will
have a core set of attributes that identify them (such as first/last name).
I also need the capability to associate arbitrary attributes with the people
within the system. For different sets of people I’ll have different sets of
attributes available. For example, for one set of people I might have the
following:

favorite programming language
favorite technical book
age

And for another set I might have:

age
language
height
weight

There are a few obvious ways to support this in the database. One way would
be to have a person_attributes table that contains a set of name/value pairs
associated with a person. Other options, which are less ideal due to the
restrictions they impose on querying, are storing the attributes as an XML
blob or in some other format stuffed into a single field.
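
For anyone following along, the name/value idea can be sketched in plain
Ruby without a database. This is only an in-memory stand-in, not
ActiveRecord code: the PersonAttribute struct, the SAMPLE_ROWS data and the
attributes_for helper are all made up for illustration, and storing every
value as text is an assumption about how a single value column would be
used.

```ruby
# Hypothetical in-memory stand-in for a person_attributes table.
# Each row is one name/value pair tied to a person by id.
PersonAttribute = Struct.new(:person_id, :name, :value)

# Assumed sample data; values are kept as text, as a single
# "value" column would typically force.
SAMPLE_ROWS = [
  PersonAttribute.new(1, "favorite programming language", "Ruby"),
  PersonAttribute.new(1, "age", "31"),
  PersonAttribute.new(2, "age", "45"),
  PersonAttribute.new(2, "height", "180")
].freeze

# Collect all pairs for one person into a Hash of name => value.
def attributes_for(rows, person_id)
  rows.select { |row| row.person_id == person_id }
      .map { |row| [row.name, row.value] }
      .to_h
end
```

Each set of people can then carry a different set of names without any
schema change, which is the appeal of this layout.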

My question is this: has anyone had to do something similar, and/or seen a
plugin that will do something similar to what I’m after?

Cheers,
Steve

I would say go with the first option you mentioned, but keep in mind that if
for some reason the second option seems more usable to you, you can use the
‘acts_as_ferret’ plugin to easily add extremely quick full text searching on
a column in your model/table… meaning you could use an XML blob fairly
easily and not restrict yourself that much on query ability.

Something else to potentially take a look at (in regards to the XML blob
kind of thing) is the serialize option in models… it can be used to allow
you to have the column stored as text (YAML text, to be exact) in a single
field in your table, but reference it as a Hash when talking to your model.
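
To make the Hash-in, YAML-text-out idea concrete, here is a rough sketch
using only Ruby's standard YAML library. It is not the serialize option
itself, just the round trip that the paragraph above describes; the
dump_extras and load_extras helper names are invented for this example.

```ruby
require "yaml"

# Turn a Hash of extra attributes into the YAML text that would be
# stored in the single text column (hypothetical helper).
def dump_extras(extras)
  YAML.dump(extras)
end

# Turn the stored YAML text back into a Hash, as the model would
# present it to calling code (hypothetical helper).
def load_extras(column_text)
  YAML.load(column_text)
end
```

For example, load_extras(dump_extras({ "age" => 31 })) gives back a Hash
equal to the one passed in, while the database only ever sees one text
field.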

This, combined with the acts_as_ferret plugin, might allow you to find the
mix of functionality/lean database that you want.

I don’t know of any off the top of my head, but I’d imagine someone else on
the list has probably run into this problem before.
A lot of times, if an on-list thread starts being between just 2 people,
other people will begin ignoring it… you may want to re-post your new
question (about other apps that have had the same problem) as a new message
to bring renewed interest to it.

Very interesting idea regarding the acts_as_ferret plugin. I definitely
wouldn’t have thought of that as an option but I think it makes a lot of
sense and could work out nicely. Of course I have no experience with the
plugin, so I’ll have to check it out to see if it will support all my needs.

Regarding my original option, do you know of any apps that do something
similar? I’ve been looking around but haven’t found anything. It seems like
a pretty common problem that I would have thought others would have run
into by now.

Thanks!

  • Steve

I’m about to get started on a Rails app and before getting started
I’ve been thinking a bit about how I’m going to design the database. The app
I’m building will have a model object to represent a person. The person will
have a core set of attributes that identify them (such as first/last name).
I also need the capability to associate arbitrary attributes with the people
within the system. For different sets of people I’ll have different sets of
attributes which are available. For example for one set of people I might
have the following:

Would single table inheritance not solve your problem?

It theoretically could solve his problem, but you would end up with a table
with a ton of fields, only a few of which were being used by any given
record.

On Saturday 24 February 2007, Steve E. wrote:

I’m about to get started on a Rails app and before getting started
I’ve been thinking a bit about how I’m going to design the database.
The app I’m building will have a model object to represent a person.
The person will have a core set of attributes that identify them
(such as first/last name). I also need the capability to associate
arbitrary attributes with the people within the system. For different
sets of people I’ll have different sets of attributes which are
available. For example for one set of people I might have the
following:
[snip]

As someone who has been there before, ages before Rails, I suggest that
you have probably misled yourself into thinking that you need that much
genericity.

If you truly have arbitrary attributes in the database, you will need a
way to support them at the UI, too. How about business logic? Can it
count on any attributes being there? Also, most likely you need to
impose some constraints on the attributes.

My personal experience – as someone who loves abstraction and
generality – is that the perceived need for great genericity often
originates from a lack of understanding of the requirements. Then,
genericity looks like a safety net.

In effect, I think that there’s a good chance that a closer examination of
what you really need will show that there’s a manageable number of
variants, each with fixed attributes.

There are a few obvious ways to support this in the database. One way
would be to have a person_attributes table that contains a set of
name/value pairs associated with a person.
[snip]

If all you need is to store arbitrary(!) name/value pairs per person, then
such a table is fine. It gets troublesome when you need to access these
attributes in queries; be prepared for a significant performance hit
compared to ordinary columns. Also, let me reiterate, for queries over
these attributes to make any sense at all, they can’t be the amorphous
mass you make them out to be. There must be structure in there that can
be queried and that structure you can represent in your db schema.
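
A small plain-Ruby sketch may help show why queries over name/value pairs
get awkward. The EXAMPLE_PAIRS data and both helper methods are made up for
illustration, and the comparison to SQL self-joins is only a rough analogy,
not a measurement.

```ruby
# Hypothetical name/value rows: [person_id, name, value]; values are text.
EXAMPLE_PAIRS = [
  [1, "age", "31"], [1, "language", "Ruby"],
  [2, "age", "45"], [2, "language", "Ruby"],
  [3, "age", "28"], [3, "language", "Java"]
].freeze

# Ids of people whose attribute `name` has a value accepted by the block.
def ids_where(pairs, name)
  pairs.select { |_, attr_name, value| attr_name == name && yield(value) }
       .map(&:first)
end

# "Ruby users older than 30" needs one pass per attribute, a cast from
# text to integer, and an intersection of the two id lists. In SQL this
# is roughly one self-join per attribute being filtered on.
def ruby_users_over_30(pairs)
  by_language = ids_where(pairs, "language") { |value| value == "Ruby" }
  by_age = ids_where(pairs, "age") { |value| value.to_i > 30 }
  by_language & by_age
end
```

With ordinary columns the same question would be a single WHERE clause
over two typed fields.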

Michael


Michael S.
http://www.schuerig.de/michael/

As someone who has been there before, ages before Rails, I suggest that
you have probably misled yourself into thinking that you need that much
genericity.

I’ve been there much before Rails too, although I’m not sure how that’s
relevant to the question at hand. For the application I’m building I do
need that much genericity which is precisely why I posted the question.

If you truly have arbitrary attributes in the database, you will need a
way to support them at the UI, too. How about business logic? Can it
count on any attributes being there? Also, most likely you need to
impose some constraints on the attributes.

I appreciate your concern regarding the approach I’m taking; however, I do
know what I need and I have thought about all of these concerns.

My personal experience – as someone who loves abstraction and
generality – is that the perceived need for great genericity often
originates from a lack of understanding of the requirements. Then,
genericity looks like a safety net.

Or perhaps it actually is needed?

In effect, I think that there’s a good chance that a closer examination of
what you really need will show that there’s a manageable number of
variants, each with fixed attributes.

That examination has been done, and there is in fact not a manageable
number of variants.

If all you need is to store arbitrary(!) name/value pairs per person, then
such a table is fine. It gets troublesome when you need to access these
attributes in queries; be prepared for a significant performance hit
compared to ordinary columns. Also, let me reiterate, for queries over
these attributes to make any sense at all, they can’t be the amorphous
mass you make them out to be. There must be structure in there that can
be queried and that structure you can represent in your db schema.

I’m assuming you’re referring to the option where they’re stored in the
database as a blob of data. As I stated in my earlier response, I haven’t
used acts_as_ferret so I don’t know how rich of a query experience it
provides. I do know from working with Lucene that there are ways that you
can have it index XML and other structures, so I’m assuming that
acts_as_ferret has a similar capability. If not then you win.

Thanks,
Steve

Michael

On Tuesday 27 February 2007, Steve E. wrote:

haven’t used acts_as_ferret so I don’t know how rich of a query
experience it provides. I do know from working with Lucene that
there are ways that you can have it index XML and other structures,
so I’m assuming that acts_as_ferret has a similar capability. If not
then you win.

Oh, I didn’t know this was about winning. Anyway, if you really need all
the flexibility you can get, use an attributes table of name/value
pairs or consider using several of these tables, one for each value
type that you need. For constraints/validations I suggest a look at the
ActiveSpec plugin.
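
As a rough illustration of the one-table-per-value-type suggestion, here is
a plain-Ruby sketch with one in-memory list per type. The TypedAttributes
class, its method names, and the idea of separate integer and string tables
are assumptions made up for this example; it does not use ActiveRecord or
ActiveSpec.

```ruby
# Hypothetical per-type stores standing in for separate tables, for
# example one table for integer values and one for string values.
class TypedAttributes
  TABLE_FOR = { Integer => :integers, String => :strings }.freeze

  def initialize
    @tables = { integers: [], strings: [] }
  end

  # Route a value to the table for its type; reject unsupported types.
  def set(person_id, name, value)
    table = TABLE_FOR.fetch(value.class) do
      raise ArgumentError, "no table for #{value.class}"
    end
    @tables[table] << [person_id, name, value]
  end

  # Integer values compare as numbers, with no casting from text.
  def integers_over(name, minimum)
    @tables[:integers]
      .select { |_, attr_name, value| attr_name == name && value > minimum }
      .map(&:first)
  end
end
```

The type check in set is a very small example of the kind of constraint
that a single text value column cannot enforce by itself.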

If the volume of data does not grow beyond what fits into main memory,
ActiveRecord with RDBMS backend may not be the best choice to begin
with. Madeleine (http://madeleine.rubyforge.org/) might be better
suited.

Michael


Michael S.
http://www.schuerig.de/michael/

I don’t think that he’s saying he needs to have arbitrary name/value
pairs… simply that there are several subclasses of Person, all of whom
have many different sub attributes that aren’t related… he’s wondering on
the best way to deal with that. Basically, that STI would be ridiculous
because having 10 name/value pairs for each subtype that aren’t related to
the other subtypes, you’d end up with a 100-field table that was only using
15 fields per row… BAD design.

I could be wrong… he did say arbitrary values, but I think he only meant
in the sense that it’s not a certain set of values for every single
variation of a person.

Oh, I didn’t know this was about winning.

It’s always about winning, isn’t it? ;)

For constraints/validations I suggest a look at the ActiveSpec plugin.

If the volume of data does not grow beyond what fits into main memory,
ActiveRecord with RDBMS backend may not be the best choice to begin
with. Madeleine (http://madeleine.rubyforge.org/) might be better
suited.

Thanks, I’ll take a look at ActiveSpec and Madeleine to see if they might be
helpful.

Cheers,
Steve