Looking for a Fast Persistent Store

On 8/13/06, Bob H. [email protected] wrote:

I absolutely agree, which is why I’ve implemented the filesystem route
first. There is no performance problem with it.

The fastest thing I ever wrote was a transactional persistent store
for a guaranteed-delivery messaging system. It used a decidedly
space-wasting design, and profiled approximately 100 times faster than
the filesystem on all supported platforms (Windows, Linux, Solaris) at
the time it was written (about eight years ago), assuming enough I/O
and memory bandwidth was available to make the comparison meaningful.
Most of the speed came by taking advantage of the tunings in the
kernel’s VMM.

But I can’t imagine writing such a thing in Java. What does Perst do
to be so fast?

On Mon, 14 Aug 2006, Francis C. wrote:

On 8/13/06, Jon S. [email protected] wrote:

You are rebuilding many of the features of git.
http://git.or.cz/index.html

Write-once/read-many is the optimization profile for archival storage,
but not for a persistent session cache. You’re proving yet again that
there are almost as many valid and useful approaches to this problem
as there are applications.

A persistent session cache was actually my original motivation when I
started working on the code.

IOWA’s default way of managing sessions is via a cache. For a busy
server, this obviously imposes memory constraints on just how many
sessions one can keep around, or for just how long.

So I created a bilevel cache that uses an in-memory LRU cache for the L1
cache, and when an element is expired from the L1 cache, it is inserted
into the L2 cache. If the L2 cache is a persistent store, then that
session can be retrieved again when required.

By making that persistent store itself an LRU cache, I can still let it
self-manage its ultimate size or the ultimate age of elements within it,
while maintaining reasonably fast access to elements that have gone out
to the persistent store. I can also do things like have the server flush
its L1 cache to L2 and shut down; when it’s started again, all of the
sessions which were available when it was shut down are still available.
Etc…
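
In code, the idea is roughly the sketch below (the class and method names
are illustrative only, not the actual IOWA classes; the L2 can be anything
that responds to put/get, such as a persistent store):

```ruby
# Rough sketch of a bilevel LRU cache. Names are illustrative, not the
# actual IOWA classes; @l2 can be any object responding to put/get.
class BilevelCache
  def initialize(l1_max, l2)
    @l1_max = l1_max
    @l1     = {}
    @order  = []   # least-recently-used key first
    @l2     = l2
  end

  def [](key)
    if @l1.key?(key)
      touch(key)
      @l1[key]
    elsif (val = @l2.get(key))
      self[key] = val          # promote back into L1
      val
    end
  end

  def []=(key, val)
    @l1[key] = val
    touch(key)
    if @l1.size > @l1_max      # expire the LRU entry into L2
      old = @order.shift
      @l2.put(old, @l1.delete(old))
    end
    val
  end

  # e.g. on shutdown: flush L1 to L2 so sessions survive a restart
  def flush
    @l1.each { |k, v| @l2.put(k, v) }
    @l1.clear
    @order.clear
  end

  private

  def touch(key)
    @order.delete(key)
    @order << key
  end
end
```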

Kirk H.

On 8/13/06, Francis C. [email protected] wrote:

On 8/13/06, Jon S. [email protected] wrote:

You are rebuilding many of the features of git.
http://git.or.cz/index.html

Write-once/read-many is the optimization profile for archival storage,
but not for a persistent session cache. You’re proving yet again that
there are almost as many valid and useful approaches to this problem
as there are applications.

Active sessions would stay as files and generate a new key each time
they are changed. Once a day you would sweep them into a pack. You
will need to make some small modifications for the git code to do this
but all of the hard code is already written.

With git’s compression you will be able to store hundreds of millions
of session snapshots in a couple of gigs of disk. It would provide
interesting data for later analysis.
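
Mechanically it would look something like the sketch below. The Ruby
wrapper is made up, but hash-object, cat-file, update-ref and repack are
the real git plumbing commands; keeping a ref per snapshot (or driving
pack-objects directly) is part of the small modification needed so the
daily repack actually picks the blobs up.

```ruby
# Illustrative sketch only: store session snapshots as git blobs and
# sweep them into a pack periodically. The class is made up; the git
# plumbing commands are real.
require 'open3'

class GitSessionStore
  def initialize(repo_dir)
    @repo = repo_dir
    system('git', 'init', '--quiet', @repo) unless File.directory?(File.join(@repo, '.git'))
  end

  # Store one snapshot; returns its content-derived key (SHA-1).
  def put(data)
    sha, = Open3.capture3('git', 'hash-object', '-w', '--stdin',
                          stdin_data: data, chdir: @repo)
    sha = sha.strip
    # keep the blob reachable so repack/gc won't ignore or prune it
    system('git', 'update-ref', "refs/sessions/#{sha}", sha, chdir: @repo)
    sha
  end

  # Fetch a snapshot back by its key.
  def get(sha)
    out, _err, status = Open3.capture3('git', 'cat-file', 'blob', sha, chdir: @repo)
    status.success? ? out : nil
  end

  # The once-a-day sweep: compress all loose snapshots into a single pack.
  def pack!
    system('git', 'repack', '-a', '-d', '-q', chdir: @repo)
  end
end
```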

On 8/13/06, Jon S. [email protected] wrote:

Active sessions would stay as files and generate a new key each time
they are changed. Once a day you would sweep them into a pack. You
will need to make some small modifications for the git code to do this
but all of the hard code is already written.

With git’s compression you will be able to store hundreds of millions
of session snapshots in a couple of gigs of disk. It would provide
interesting data for later analysis.

Once again, you’re solving a rather different problem (or perhaps
you’re redefining the problem to fit your solution): permanently
storing ephemeral session caches for later analysis is interesting but
different from just having a session cache that survives process
crashes. Yes, the “hard work” is done, but if you didn’t need to do
the hard work in the first place, it’s wasteful. Again, this whole
subthread is about Kirk’s filesystem approach, which is attractive for
some applications (not including yours) because it’s dead easy and it
may be “fast enough.”

Bob H. wrote:

  • it is fast […] I can’t find anything in the Ruby world […] I’ve used
    it on several projects.

I’m already using that (version 0.5 – I can’t get to RAA right now for
some reason, and there are no files on RubyForge for fsdb, so I don’t know
if there is a more recent version). In version 0.5 the transactions were
not sufficient, I think (it would be nice if I were wrong).

I apologize for the lack of fsdb stuff on the RubyForge page. I’ve never
found a comfortable way to automate gem uploads, so I still use my old
scripts for building this page:

http://redshift.sourceforge.net/

A quick scan there shows that fsdb-0.5 is the latest.

It’s on my list to figure out how to automate the RubyForge dance (and I
think Ara or someone did this a while ago, so maybe it’s a solved
problem now).

Now, on to your question…

Transactions in fsdb can be nested (and the transactions can be on
different dbs)–is that sufficient? This may not be what you mean by
“single transaction”, though.

But, like PStore, FSDB isn’t very fast – it’s pure Ruby, it marshals
objects (by default – but you can tell fsdb to write the value strings
directly rather than via marshal), and it pays the cost of thread- and
process-safety. One advantage over PStore is finer granularity.
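
Roughly, the sort of thing I mean looks like this (an off-the-cuff sketch,
not copied from the docs, so the details may be slightly off):

```ruby
# Off-the-cuff sketch; details may be slightly off.
require 'fsdb'

sessions = FSDB::Database.new('/tmp/sessions')
index    = FSDB::Database.new('/tmp/session-index')

sessions['user/42'] = { logins: 0 }   # marshalled by default
index['by-user/42'] = []

# edit takes an exclusive lock and writes the object back when the
# block exits; transactions can nest, here across two databases
sessions.edit 'user/42' do |session|
  session[:logins] += 1
  index.edit 'by-user/42' do |log|
    log << Time.now
  end
end

# browse takes a shared (read) lock
sessions.browse('user/42') { |session| p session }
```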

Anyway, I hope your search is fruitful, whether it lands at bdb or
sqlite or …

On 8/14/06, Francis C. [email protected] wrote:

On 8/14/06, Joel VanderWerf [email protected] wrote:

I apologize for the lack of fsdb stuff on the RubyForge page. I’ve never
found a comfortable way to automate gem uploads, so I still use my old
scripts for building this page:

Austin Z. has automated the “Rubyforge dance” – look at
PDF::Writer or post to him on this list.

Actually, the Net::LDAP Rakefile is the best thing to look at right now.
:wink:

-austin

On 8/14/06, Joel VanderWerf [email protected] wrote:

I apologize for the lack of fsdb stuff on the RubyForge page. I’ve never
found a comfortable way to automate gem uploads, so I still use my old
scripts for building this page:

Austin Z. has automated the “Rubyforge dance” – look at
PDF::Writer or post to him on this list.

On 8/14/06, Austin Z. [email protected] wrote:

Actually, the Net::LDAP Rakefile is the best thing to look at right now. :wink:

-austin

And he passes the proverbial buck! :slight_smile:

Austin contributed all the code in Net::LDAP for doing the Rubyforge
and Gmail dances, and it works very well, but if you have trouble with
it, it’s probably because I munged it some after Austin checked it in.
So any problems will be my fault, not his.

There are some dependencies, like minitar.

Bob H. [email protected] writes:

That’s a good point. I don’t think I want to involve Java in this if
I can help it. The fellow who wrote Perst, Konstantin Knizhnik
(http://www.garret.ru/~knizhnik/databases.html), has also written
GOODS, FastDB, and GigaBASE. If I were going to write a C extension
I’d go with one of those. Konstantin has also written something
called DyBASE, which is specifically for dynamic languages like Ruby
and Python, and comes with bindings for Ruby 1.6.x. I’ve asked
Konstantin about the state of DyBASE and am trying to work out whether
it is worth updating to Ruby 1.8.4.

I’ve just discovered RScheme’s PStore,
http://www.rscheme.org/rs/a/2005/persistence/

“a system that allows objects in persistent storage (i.e., on disk)
to be mapped into memory for direct manipulation by an application
program. The approach is based on the concept of pointer swizzling
at page-fault time as described in Paul Wilson’s Pointer Swizzling
at Page-Fault Time.”

Now, having that in Ruby would rock very much, but I’ve no idea how
to implement it. Any courageous Ruby hackers? :slight_smile:

Austin Z. wrote:

On 8/14/06, Francis C. [email protected] wrote:

On 8/14/06, Joel VanderWerf [email protected] wrote:

I apologize for the lack of fsdb stuff on the RubyForge page. I’ve never
found a comfortable way to automate gem uploads, so I still use my old
scripts for building this page:

Austin Z. has automated the “Rubyforge dance” – look at
PDF::Writer or post to him on this list.

Actually, the Net::LDAP Rakefile is the best thing to look at right now. :wink:

I will look at both before I start dancing. Thanks to both of you :slight_smile:

Just FYI, I still haven’t finished packing things up into a separate
release, but I do have some comparative benchmarks.

On a very modest server (AMD Athlon 2600 with generic IDE drives, ext3,
on a Linux 2.4 kernel) that does have a small load on it, the DiskCache,
which implements overhead to maintain a linked list of files on disk in
order to support LRU semantics, averages around 500-600 writes per
second, 700-800 updates per second, 850-950 reads per second, and 1000
deletes per second.

Removing all of the LRU overhead results in a dramatic speedup. Writes
experience the least speedup, going to around 2000 per second, give or
take a couple hundred depending on other activity. Updates move into the
3800-4200 range, reads move into the 6000-6200 range, and deletes move
to around 2000 per second.

Given the hardware, I found those numbers pleasingly high. They clearly
demonstrate that unless one really needs the LRU capabilities in order to
limit the total number of entries to some arbitrary value, there is much
to be gained by using the basic DiskStore and just having an external job
run periodically to flush old elements from the disk.
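
That external job can be very simple, something along these lines (just a
sketch; the directory and the one-day cutoff are made up for illustration):

```ruby
# Sketch of the periodic sweep: remove cache files that haven't been
# touched in a day. Directory and cutoff are illustrative.
require 'fileutils'

cache_dir = '/var/iowa/sessions'
cutoff    = Time.now - (24 * 60 * 60)

Dir.glob(File.join(cache_dir, '**', '*')).each do |path|
  next unless File.file?(path)
  FileUtils.rm_f(path) if File.mtime(path) < cutoff
end
```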

Kirk H.