Persistent Hash

kyle_s · May 9, 2008, 11:23pm

I’m looking for some sort of persistent hash, or object store for ruby.

I just need some advice about what works well & doesn’t, so I don’t
attempt to *re-invent the wheel.

I’d be using it for some small client-server apps, maybe with a web
front end, maybe not, but either way, I’m guessing multiple processes
will want to access the data simultaneously.

The options I’ve come up with are
1 DBM/SDBM + YAML
2 PStore
3 Marshal
4 ActiveRecord

1 I really didn’t see much in the way of DBM/SDBM documentation, but
maybe I was looking in the wrong places. All the same, I can see
pretty easily how I could do something like a persistent hash by
combining yaml and DBM, but it does make me wonder, has anyone already
done something like that?
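A minimal sketch of the YAML-plus-DBM idea, for illustration only. The class name YamlBackedHash is made up, and the backend is assumed to be anything that maps string keys to string values the way DBM/SDBM do; a plain Hash stands in for it here so the snippet runs without a DBM library installed.

```ruby
require 'yaml'

# Hypothetical wrapper: serializes values to YAML before handing them to a
# string-to-string backend (DBM-style stores only hold strings).
class YamlBackedHash
  def initialize(backend)
    @backend = backend
  end

  # Store any YAML-serializable value under a string key.
  def []=(key, value)
    @backend[key.to_s] = YAML.dump(value)
  end

  # Fetch and deserialize; returns nil for a missing key.
  def [](key)
    raw = @backend[key.to_s]
    raw.nil? ? nil : YAML.load(raw)
  end
end

backend = {} # stand-in; a DBM/SDBM handle would go here instead
store = YamlBackedHash.new(backend)
store['kyle'] = { 'movies' => ['LOTR II'], 'visits' => 3 }
loaded = store['kyle']
puts loaded.inspect
```

This only covers get and set for simple values (strings, numbers, arrays, hashes); iteration, deletion, and locking between processes are left out.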

2 PStore seems a little clunky, but rather effective: How is it with
medium sized data sets (hashes of a few thousand elements)? How is it
over the longer term, data resiliency wise?

3 Marshalling and un-marshalling I know will work, but at the same
time it seems… well a bit prosaic.

4 ActiveRecord I know will work, and I’ve played with it before, but I
really don’t like how it forces you to hammer your data structure into
its mold.

*Yes, re-inventing the wheel is a time-honored and important pastime
for programmers. It’s fun, it teaches you things, but when you really
want to work on another part of a project, you have to force yourself
not to.

kyle_s · May 10, 2008, 12:12am

DataMapper [ http://datamapper.org/ ] and Sequel [
http://sequel.rubyforge.org/ ] might be options for you. I haven’t used
them (yet); I just know that they’re there.

Regards,
Craig

kyle_s · May 10, 2008, 12:36am

DataMapper +1


kyle_s · May 10, 2008, 3:24am

On Fri, May 9, 2008 at 10:22 PM, Kyle S. wrote:

I’m looking for some sort of persistent hash, or object store for ruby.

I quite like FSDB. It persists
data in the filesystem, is multi-thread and multi-process safe (as far
as flock is safe) and works nicely with yaml.

From the docs:

  require 'fsdb'

  db = FSDB::Database.new('/tmp/my-data')

  db['recent-movies/myself'] = ["LOTR II", "Austin Powers"]
  puts db['recent-movies/myself'][0]  # ==> "LOTR II"

  db.edit 'recent-movies/myself' do |list|
    list << "A la recherche du temps perdu"
  end

Not sure if it hasn’t been updated because it’s stable or because
development has stopped. I used it for a note-taking application which
I had running for a couple of years. Never had a problem with it
(though I wasn’t exactly stressing it).

Regards,
Sean

kyle_s · May 10, 2008, 10:31am

On 09.05.2008 23:22, Kyle S. wrote:

2 PStore seems a little clunky, but rather effective: How is it with
medium sized data sets (hashes of a few thousand elements)? How is it
over the longer term, data resiliency wise?

It might be an option, depending on how long your transactions are. If
they are short you might get away with doing a file lock on the store to
ensure concurrency safety. If transactions are longer or load is higher
you had better use a relational database.
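For reference, a short sketch of what PStore transactions look like. The path and the :recent key are made up for the example; the calls themselves are the standard PStore API.

```ruby
require 'pstore'
require 'tmpdir'

# Hypothetical location; any writable path works.
path = File.join(Dir.mktmpdir, 'movies.pstore')
store = PStore.new(path)

# Read-write transaction: changes are saved when the block finishes.
store.transaction do
  store[:recent] = ['LOTR II', 'Austin Powers']
end

# Aborting discards everything done inside this transaction.
store.transaction do
  store[:recent] = []
  store.abort
end

# Read-only transaction: passing true forbids writes.
recent = store.transaction(true) { store[:recent] }
puts recent.inspect
```

Keeping each transaction block short matters because the whole store is one file, so other processes wait while a read-write transaction is open.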

3 Marshalling and un-marshalling I know will work, but at the same
time it seems… well a bit prosaic.

Marshal isn’t really an option as it requires too much work around it to
make it concurrency safe IMHO. PStore is basically Marshal with
transactional behavior added.
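To give an idea of the work around plain Marshal, here is a rough sketch of a locked read-modify-write cycle. The helper name locked_update and the file name are invented for the example, and it assumes a platform where fork and flock are available (e.g. Linux).

```ruby
require 'tmpdir'

# Hypothetical helper: hold an exclusive flock for the whole
# read-modify-write cycle on a Marshal file, then return the data.
def locked_update(path)
  File.open(path, File::RDWR | File::CREAT | File::BINARY, 0o644) do |f|
    f.flock(File::LOCK_EX)            # blocks until no other process holds it
    raw = f.read
    data = raw.empty? ? {} : Marshal.load(raw)
    yield data
    f.rewind
    f.write(Marshal.dump(data))
    f.flush
    f.truncate(f.pos)                 # drop leftover bytes from a longer old dump
    data
  end                                 # closing the file releases the lock
end

counter_path = File.join(Dir.mktmpdir, 'counter.bin')

# Four processes each bump the same counter 25 times.
pids = 4.times.map do
  fork do
    25.times do
      locked_update(counter_path) { |h| h['hits'] = h.fetch('hits', 0) + 1 }
    end
  end
end
pids.each { |pid| Process.wait(pid) }

total = locked_update(counter_path) { |_h| }['hits']
puts total
```

This still has no protection against a crash halfway through the write, which is part of what PStore adds on top.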

You do not mention DBI, i.e. writing your SQL yourself. If your
application has to cope with significant concurrency (i.e. a high number
of concurrent requests and / or long transactions) an RDBMS will give you
the best results IMHO because it contains mechanisms that deal with
this. Note that you can, for example, use the free version of Oracle, or
Postgres, MySQL, etc.

Kind regards

robert

kyle_s · May 10, 2008, 7:35pm

Sean O’Halpin wrote:

development has stopped. I used it for a note-taking application which
I had running for a couple of years. Never had a problem with it
(though I wasn’t exactly stressing it).

Regards,
Sean

It’s stable, AFAIK. It’s running as part of a couple of my projects and
I just sort of forget it’s there.

Whether it is a good fit depends on the requirements. OP mentioned:
small apps, multiple processes accessing the database. That sounds like
fsdb. In this case the main advantage over PStore will be breaking up “a
few thousand elements” into separate files. PStore uses one big file,
FSDB uses the file system itself. You can decide what granularity you
want–how much gets dumped into files vs. how many files/dirs. And
you’re free to choose the serialization method for each file (typically
based on filename extension)–yaml, marshal, or something else. YAML is
a bit slower and more limited than Marshal, but you end up with a
database that can be edited with a text editor.
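To illustrate the one-file-per-object idea with the serializer picked by extension, here is a toy sketch. It is not FSDB’s API or implementation; the class TinyFileStore and the ".yml means YAML, everything else means Marshal" rule are assumptions made for the example, and it does no locking at all.

```ruby
require 'yaml'
require 'fileutils'
require 'tmpdir'

# Hypothetical toy store: each key is a relative path under root, and the
# file extension decides the format (.yml => YAML, anything else => Marshal).
class TinyFileStore
  def initialize(root)
    @root = root
  end

  def []=(key, value)
    path = File.join(@root, key)
    FileUtils.mkdir_p(File.dirname(path))
    if yaml?(path)
      File.write(path, YAML.dump(value))        # human-editable text
    else
      File.binwrite(path, Marshal.dump(value))  # compact binary
    end
  end

  def [](key)
    path = File.join(@root, key)
    return nil unless File.file?(path)
    yaml?(path) ? YAML.load(File.read(path)) : Marshal.load(File.binread(path))
  end

  private

  def yaml?(path)
    File.extname(path) == '.yml'
  end
end

root = Dir.mktmpdir
db = TinyFileStore.new(root)
db['recent-movies/myself.yml'] = ['LOTR II', 'Austin Powers']
db['cache/counts.bin'] = { hits: 3 }

movies = db['recent-movies/myself.yml']
counts = db['cache/counts.bin']
yaml_text = File.read(File.join(root, 'recent-movies/myself.yml'))
puts yaml_text
```

The granularity choice shows up in the keys: one big file per user, or many small files per item, depending on how often things change together.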

There are limitations, of course. For example, there’s no automatic way
to manage references between objects stored in separate files.

KirbyBase is something else that’s often mentioned in these discussions,
but I don’t know it well…

kyle_s · May 11, 2008, 3:08am

ara.t.howard wrote:

On May 10, 2008, at 11:34 AM, Joel VanderWerf wrote:

It’s stable, AFAIK. It’s running as part of a couple of my projects
and I just sort of forget it’s there.

i think this project is one of the most under-rated out there - you
should push it on the list a bit as i’ve personally found it really
useful in the past and am always surprised more people don’t know about it.

Er, you mean like making a gem out of it… yeah I guess so

kyle_s · May 11, 2008, 12:48am

On May 10, 2008, at 11:34 AM, Joel VanderWerf wrote:

It’s stable, AFAIK. It’s running as part of a couple of my projects
and I just sort of forget it’s there.

i think this project is one of the most under-rated out there - you
should push it on the list a bit as i’ve personally found it really
useful in the past and am always surprised more people don’t know about
it.

cheers.

a @ http://codeforpeople.com/