Best way to model arbitrary-keyed Hash

Hi all,

I’ve got an object that looks a lot like a hash, and its valid
parameters are essentially arbitrary. I’m successfully storing these
objects, but now I need to be able to query them.

This is for my project Puppet[1], which manages many different classes
of objects, like files, services, packages, and users. Each of these
classes has its own set of valid parameters, and I’m adding new classes
and modifying existing classes all of the time, so I don’t want to
create a separate Rails table for each class. I also don’t want to
create a separate table for each class because I have a ‘host’ class
that functions as a collection of these objects, along with a bit of
other data (like its IP address); if I have a separate table for each
class, I’ll need a separate association for each one, also.

I’ve been storing them in two associated tables: the main info (class
and name) in one table, and the parameters (e.g., the file’s owner,
group, and mode) in a separate table.

However, I expect these queries won’t work well:

rails_parameters.name = 'owner' and rails_parameters.value = 'root' OR
rails_parameters.name = 'owner' and rails_parameters.value = 'bin'
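
For illustration only, here is a minimal sketch of how a query like this could be generated for a name/value parameters table. The `rails_resources` table, the `rails_resource_id` column, and the helper name `resources_by_params_sql` are assumptions made for this example, not Puppet's actual schema. It uses one EXISTS subquery per parameter, so conditions on several parameters can be combined:

```ruby
# Hypothetical schema (names are assumptions, not Puppet's real tables):
#   rails_resources(id, restype, title)
#   rails_parameters(id, rails_resource_id, name, value)

# Build SQL that finds resources matching every name => value(s) pair.
# Each pair becomes its own EXISTS subquery against the parameters table.
# Returns the SQL string plus the bind values, in placeholder order.
def resources_by_params_sql(conditions)
  binds = []
  clauses = conditions.map do |name, values|
    values = Array(values)
    binds << name.to_s
    binds.concat(values.map(&:to_s))
    placeholders = (["?"] * values.size).join(", ")
    "EXISTS (SELECT 1 FROM rails_parameters p " \
      "WHERE p.rails_resource_id = r.id " \
      "AND p.name = ? AND p.value IN (#{placeholders}))"
  end
  sql = "SELECT r.* FROM rails_resources r"
  sql += " WHERE " + clauses.join(" AND ") unless clauses.empty?
  [sql, binds]
end

sql, binds = resources_by_params_sql({ :owner => %w[root bin], :mode => "644" })
puts sql
puts binds.inspect
```

Whether this performs acceptably depends on the database and on an index covering the name and value columns, which this sketch does not address.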

What is the best way to store these objects so that I can easily find
them? I’m assuming I can’t serialize the parameters and query against
them, since SQL has no clue about the serialized YAML.
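
To make the serialization concern concrete, here is a small sketch in which an in-memory array stands in for a table with a serialized YAML column (the row layout and titles are made up for the example). Because the database only sees opaque text, the filtering has to happen in Ruby after every row has been loaded and deserialized:

```ruby
require 'yaml'

# Each row mimics a table record with a serialized "params" text column.
rows = [
  { :title => "/etc/passwd", :params => YAML.dump({ "owner" => "root", "mode" => "644" }) },
  { :title => "/usr/bin/ls", :params => YAML.dump({ "owner" => "bin",  "mode" => "755" }) },
  { :title => "/etc/shadow", :params => YAML.dump({ "owner" => "root", "mode" => "600" }) }
]

# Finding by owner means deserializing every row and filtering in Ruby.
matches = rows.select { |row| YAML.safe_load(row[:params])["owner"] == "root" }
titles = matches.map { |row| row[:title] }
puts titles.inspect
```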

Thanks,
Luke

1 - http://reductivelabs.com/projects/puppet


Progress isn’t made by early risers. It’s made by lazy men trying to
find easier ways to do something. --Robert Heinlein

Luke K. | http://reductivelabs.com | http://madstop.com

Hey Luke,

You might want to take a look at ferret and the acts_as_ferret plugin.
On the surface it seems like a good fit for what you are trying to
accomplish.

ferret - http://ferret.davebalmain.com/trac
aaf - http://projects.jkraemer.net/acts_as_ferret/

Will Groppe

William G. wrote:

Hey Luke,

You might want to take a look at ferret and the acts_as_ferret plugin.
On the surface it seems like a good fit for what you are trying to
accomplish.

ferret - http://ferret.davebalmain.com/trac
aaf - http://projects.jkraemer.net/acts_as_ferret/

It’s good to know those are options, but I’d prefer something that maps
to the database a bit more directly.


I don’t know the key to success, but the key to failure is trying to
please everybody. – Bill Cosby

Luke K. | http://reductivelabs.com | http://madstop.com

On Tue, 03 Oct 2006 12:26:28 -0500
Luke K. wrote:

and modifying existing classes all of the time, so I don’t want to
create a separate Rails table for each class. I also don’t want to
create a separate table for each class because I have a ‘host’ class
that functions as a collection of these objects, along with a bit of
other data (like its IP address); if I have a separate table for each
class, I’ll need a separate association for each one, also.

No way, you’re doing a Rails frontend to Puppet? Contact me off list
and I’ll help you out no problem.

I think there’s a few little Rails features you can use, but it sounds
like you’ve gotta get the data model worked out to match what Puppet
uses. Shouldn’t be that hard, but it is possible to get them all
organized into the database. It also might be that you have to
generalize what all of these things actually are and use attributes to
differentiate them.

Anyway, contact me since I’d love to help out on Puppet.


Zed A. Shaw, MUDCRAP-CE Master Black Belt Sifu

http://mongrel.rubyforge.org/
http://www.lingr.com/room/3yXhqKbfPy8 – Come get help.

On 10/4/06, Luke K. wrote:

It’s good to know those are options, but I’d prefer something that maps
to the database a bit more directly.


I don’t know the key to success, but the key to failure is trying to
please everybody. – Bill Cosby

Luke K. | http://reductivelabs.com | http://madstop.com

How about using Ferret instead of the database. Ferret basically
stores hash objects which would seem to fit well for your free form
classes. You could either create a new index for each class or store
all objects in a single index. Querying the index will obviously be
very simple.

Zed A. Shaw wrote:

No way, you’re doing a Rails frontend to Puppet? Contact me off list and I’ll help you out no problem.

Well, there’s already a limited Rails front-end to Puppet:

http://www.reductivelabs.com/projects/puppet/documentation/puppetshow.html

But this is actually only using ActiveRecord so I don’t have to deal
with SQL or db agnosticism.

I think there’s a few little Rails features you can use, but it sounds like you’ve gotta get the data model worked out to match what Puppet uses. Shouldn’t be that hard, but it is possible to get them all organized into the database. It also might be that you have to generalize what all of these things actually are and use attributes to differentiate them.

Yep, that’s exactly the trouble I’m having.

Every object has a type (e.g., file) and title (e.g., “/etc/passwd”), so
right now I’m just limiting querying to those two fields when using
Rails. I’d like to find a db model that would allow me to find any of
these objects by arbitrary attributes, rather than just these two.

Anyway, contact me since I’d love to help out on Puppet.

Will do; I’d love the help.


Love is a snowmobile racing across the tundra and then suddenly it
flips over, pinning you underneath. At night, the ice weasels come.
–Matt Groening

Luke K. | http://reductivelabs.com | http://madstop.com

David B. wrote:

How about using Ferret instead of the database. Ferret basically
stores hash objects which would seem to fit well for your free form
classes. You could either create a new index for each class or store
all objects in a single index. Querying the index will obviously be
very simple.

Does ferret do complex object queries? That is, could I do something
like search for ‘obj[:owner] == “root”’?

I assume that I can’t easily share a ferret index across multiple
servers; it’s critical that I retain the ability to have multiple
servers hit my database, both for performance and service availability.


The Number 1 Sign You Have Nothing to Do at Work…
The 4th Division of Paperclips has overrun the Pushpin Infantry
and General White-Out has called for a new skirmish.

Luke K. | http://reductivelabs.com | http://madstop.com

On 10/5/06, Luke K. wrote:

like search for ‘obj[:owner] == “root”’?

Yes. The query would look like this: ‘owner:“root”’. But searching for
‘obj1[:owner] == obj2’ would be a little more difficult. Mind you, I
don’t think it would be any easier in a database.
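
As a rough sketch of how such query strings might be assembled from a hash of wanted attributes, the helper below (`ferret_query` is a made-up name, not part of Ferret) ORs together the accepted values for one field and ANDs the fields. It does no escaping of special characters, so treat it as an illustration of the query syntax rather than production code:

```ruby
# Build a Ferret-style query string from field => accepted value(s).
# Values for one field are OR'ed; different fields are AND'ed.
# NOTE: special characters in values are not escaped in this sketch.
def ferret_query(conditions)
  conditions.map do |field, values|
    terms = Array(values).map { |v| %(#{field}:"#{v}") }
    terms.size > 1 ? "(#{terms.join(' OR ')})" : terms.first
  end.join(" AND ")
end

query = ferret_query({ :owner => %w[root bin], :type => "file" })
puts query
# prints: (owner:"root" OR owner:"bin") AND type:"file"
```

The resulting string could then be passed to a search call such as index.search_each.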

I assume that I can’t easily share a ferret index across multiple
servers; it’s critical that I retain the ability to have multiple
servers hit my database, both for performance and service availability.

You could either use NFS to access the index (you can have multiple
readers reading the index at one time) or set up a DRb server to the
index. Anyway, here is a random example. I don’t really know how this
would fit with how Puppet works but hopefully it gives you an idea how
Ferret works.

Cheers,
Dave

require 'rubygems'
require 'ferret'

obj1 = {:id => 1, :children => [3, 4], :size => "large"}
obj2 = {:id => 2, :children => [3, 4], :size => "small"}
obj3 = {:id => 3, :parents => [1, 2], :size => "large"}
obj4 = {:id => 4, :parents => [1, 2], :size => "small"}

index = Ferret::I.new

[obj1, obj2, obj3, obj4].each {|obj| index << obj}

index.search_each('size:small AND (parents:1 OR children:3)') do |id, score|
  puts index[id].load.inspect
end
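
For the DRb option mentioned above, here is a minimal sketch of the general shape using only Ruby's standard DRb library. The `SharedIndex` class is a hypothetical in-memory stand-in; a real setup would presumably wrap a Ferret index behind the same kind of front object. Server and client run in one process here purely to keep the example self-contained:

```ruby
require 'drb/drb'

# Hypothetical stand-in for a real index: holds hashes in memory and
# matches on exact field values. A real server would wrap Ferret instead.
class SharedIndex
  def initialize
    @docs = []
    @lock = Mutex.new
  end

  # Store one document (a plain hash).
  def add(doc)
    @lock.synchronize { @docs << doc }
    nil
  end

  # Return every document whose field equals value.
  def find(field, value)
    @lock.synchronize { @docs.select { |d| d[field] == value } }
  end
end

# Server side: port 0 asks the OS for any free port.
DRb.start_service("druby://localhost:0", SharedIndex.new)

# Client side (normally another process or host, given the server's URI).
remote = DRbObject.new_with_uri(DRb.uri)
remote.add({ :id => 1, :owner => "root" })
remote.add({ :id => 2, :owner => "bin" })
found = remote.find(:owner, "root")
puts found.inspect

DRb.stop_service
```

All readers and writers go through the one server process, which simplifies sharing but makes that process a single point of failure.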

David B. wrote:

Yes. The query would look like this: ‘owner:“root”’. But searching for
‘obj1[:owner] == obj2’ would be a little more difficult. Mind you, I
don’t think it would be any easier in a database.

Ok.

You could either use NFS to access the index (you can have multiple
readers reading the index at one time) or set up a DRb server to the
index. Anyway, here is a random example. I don’t really know how this
would fit with how Puppet works but hopefully it gives you an idea how
Ferret works.

Hmmm. I’ll have to look into this more; there’s already a thread this
week on puppet-dev about how important it is to be able to scale
horizontally, so I don’t want to complicate that any more than I need
to.

index = Ferret::I.new

[obj1, obj2, obj3, obj4].each {|obj| index << obj}

index.search_each('size:small AND (parents:1 OR children:3)') do |id, score|
  puts index[id].load.inspect
end

Okay, cool, thanks for the example. I’ll look into it more closely, now
that I know it will work for what I need.

Thanks!


Hoare’s Law of Large Problems:
Inside every large problem is a small problem struggling to get out.

Luke K. | http://reductivelabs.com | http://madstop.com