Rinda::TupleBag


#1

Hello

In the current version of Ruby
(http://svn.ruby-lang.org/repos/ruby/trunk/lib/rinda/tuplespace.rb)
Rinda::TupleBag is implemented as a hash keyed by tuple size, with an
array of tuples under each key.
This implementation is simple and fast on very small amounts of data, but
it scales very badly (especially when searching).

How will it be rewritten? Are there any attempts to make the
standard version faster?


#2

Hello

In the current version of Ruby
(http://svn.ruby-lang.org/repos/ruby/trunk/lib/rinda/tuplespace.rb)
Rinda::TupleBag is implemented as a hash keyed by tuple size, with an
array of tuples under each key.
This implementation is simple and fast on very small amounts of data (e.g.
when writing), but
it scales very badly when reading and searching.
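
To make the scaling concern concrete, here is a minimal sketch of the structure described above. It is a simplified stand-in, not Rinda's actual code: the class name `NaiveTupleBag` and its methods are hypothetical, and the matching rule (nil as a wildcard, `===` for everything else) is a simplifying assumption.

```ruby
# Simplified stand-in for the structure described above; NOT Rinda's code.
class NaiveTupleBag
  def initialize
    @hash = {} # tuple size => array of tuples of that size
  end

  # Writing is cheap: one hash lookup plus an append.
  def push(tuple)
    (@hash[tuple.size] ||= []) << tuple
  end

  # Reading walks the whole bucket for that size, so the cost grows
  # linearly with the number of stored tuples of the same length.
  # Assumption: nil in the template is a wildcard, other elements use ===.
  def find(template)
    bucket = @hash.fetch(template.size, [])
    bucket.find do |tuple|
      template.each_with_index.all? { |pat, i| pat.nil? || pat === tuple[i] }
    end
  end
end

bag = NaiveTupleBag.new
bag.push([:point, 1, 2])
bag.push([:point, 3, 4])
bag.push([:name, "rinda"])

result = bag.find([:point, Integer, 4]) # scans the 3-element bucket in order
```

With many tuples of the same length, every read may have to test each of them in turn, which is the behaviour being complained about.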

How can it be rewritten?
Maybe other implementations of tuplespaces are more scalable and faster?
Are there any attempts to make the standard version faster?


#3

On Oct 15, 2007, at 9:29 AM, Alex V. Breger wrote:

Are there any attempts to make the standard version faster?

I’m working on creating a faster tuple space implementation called
Marinda that is completely independent of Rinda, though it won’t be
ready for release in the near future. All the tuple space
functionality is implemented, and I’m using it in a production
setting, but the scalable tuple space matching algorithm isn’t
implemented (which is the main reason why I haven’t released it
yet). A scalable algorithm is challenging, but it should be more
doable with my implementation than in Rinda since I restrict tuples
to be values rather than allowing object references (via DRb) like
Rinda does. I’m currently reading the literature for some insights
on a suitable algorithm. Unfortunately, I have a number of things to
work on so implementing a scalable algorithm is lower down in my
priorities. Anyway, for details on the design, see my talk slides at

http://www.caida.org/publications/presentations/2007/young_ark_syslunch/

–Young


#4

On Oct 15, 2007, at 09:29 , Alex V. Breger wrote:

In the current version of Ruby
(http://svn.ruby-lang.org/repos/ruby/trunk/lib/rinda/tuplespace.rb)
Rinda::TupleBag is implemented as a hash keyed by tuple size, with an
array of tuples under each key.
This implementation is simple and fast on very small amounts of data (e.g.
when writing), but
it scales very badly when reading and searching.

How can it be rewritten?

I have heard of attempts at backing TupleSpace with a database, but
I’ve not heard of any being completed.

Maybe other implementations of tuplespaces are more scalable and faster?
Are there any attempts to make the standard version faster?

In Ruby, not to my knowledge.


#5

Shawn A. wrote:

I’ve thrown something together that uses KirbyBase (flat text Ruby DBMS)
as the back end. Let me know if that interests you.

This does NOT support transactions.

/Shawn

Sorry to disturb the entire list.

Shawn,

I’d like to see your implementation in KirbyBase.

I’m at: WayneFChin @ gmail DOT com

Thanks,

–Wayne


#6

I think we need a rewrite of the standard Rinda implementation, e.g.
with Hashes rather than arrays.
We need a faster version in Ruby, because the current version is too slow
even on databases that are not very large.


#7

I’ve thrown something together that uses KirbyBase (flat text Ruby DBMS)
as the back end. Let me know if that interests you.

This does NOT support transactions.

/Shawn


#8

Alex V. Breger wrote:

I think we need a rewrite of the standard Rinda implementation, e.g.
with Hashes rather than arrays.
We need a faster version in Ruby, because the current version is too slow
even on databases that are not very large.

Is this even possible? IIRC, Rinda uses #=== to match each element of
the tuples. A Hash lookup won’t do that.
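
A small sketch of the difference, using plain Ruby rather than Rinda itself. The tuple and template values are made up for illustration.

```ruby
tuple    = [:temperature, 21]
template = [:temperature, Integer] # Integer is a pattern, not a literal value

# Element-wise case equality: :temperature === :temperature and Integer === 21.
matches = template.zip(tuple).all? { |pat, val| pat === val }

# A Hash keyed by whole tuples only finds keys that are eql? to the lookup key.
index = { tuple => true }
by_template = index.key?(template)           # the pattern is not eql? to the tuple
by_literal  = index.key?([:temperature, 21]) # an exact literal key is found
```

Here `matches` and `by_literal` are true while `by_template` is false: hash lookup compares with `hash`/`eql?`, so a template containing classes, ranges or regexps cannot be used directly as a key.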


#9

On Oct 25, 2007, at 08:35 , Alex V. Breger wrote:

I think we need a rewrite of the standard Rinda implementation, e.g.
with Hashes rather than arrays.

Last I read the code, Rinda would work with either an Array or a Hash
as the underlying data structure. I seem to recall that anything
supporting #each would work, but I’m not sure.

We need a faster version in Ruby, because the current version is too slow
even on databases that are not very large.

Don’t rewrite, profile.
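
As a starting point for measuring, here is a rough timing sketch. It does not use Rinda: the workload size, the `match?` and `elapsed` helpers, and the nil-as-wildcard rule are all assumptions made for illustration, and the numbers will vary by machine.

```ruby
# Hypothetical workload: 50_000 three-element tuples in a single bucket.
bucket   = Array.new(50_000) { |i| [:item, i, "payload #{i}"] }
template = [:item, 49_999, nil] # worst case: the match is the last tuple

# Assumption: nil is a wildcard, everything else is compared with ===.
def match?(template, tuple)
  template.each_with_index.all? { |pat, i| pat.nil? || pat === tuple[i] }
end

# Returns the block's result together with the elapsed wall-clock seconds.
def elapsed
  start = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  result = yield
  [result, Process.clock_gettime(Process::CLOCK_MONOTONIC) - start]
end

found, seconds = elapsed { bucket.find { |t| match?(template, t) } }
puts format("linear scan: %.6f s, found %p", seconds, found)
```

Running the same measurement at a few different bucket sizes shows whether the linear scan is really where the time goes before anything is rewritten.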


#10

On Oct 15, 2007, at 11:54 PM, Eric H. wrote:

I have heard of attempts at backing TupleSpace with a database, but
I’ve not heard of any being completed.

i’ve backed it by sqlite, but not all functionality works due to the
requirements of remote objects being connected in memory to other
host in the current impl - i hacked in ‘reconnect/unmarshal’
functionality but it’s not fully tested. it’s way faster for big
objects and persistent. ultimately i decided that a total rewrite
based on sqlite would be the way to go.

cheers.

a @ http://codeforpeople.com/


#11

@hash[size] ||= []
@hash[size].push(ary)

Tuples are hashed by size, with an array storing the tuples of each size.

Can we use a tree of hashes here instead of a hash of arrays? It could be
useful for any hierarchical data pushed into TupleBag when we have a
lot of tuples of the same length (e.g. RDF or XML data).
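
One possible reading of the "tree of hashes" idea, sketched under assumptions: tuples are indexed by size and then by their first element, and a template whose first element is a literal Symbol or String goes straight to one bucket. `IndexedTupleBag` and its methods are hypothetical names, not part of Rinda, and only two levels are shown.

```ruby
# Sketch only: size => first element => array of tuples. Not Rinda's code.
class IndexedTupleBag
  def initialize
    @index = Hash.new { |h, size| h[size] = Hash.new { |h2, key| h2[key] = [] } }
  end

  def push(tuple)
    @index[tuple.size][tuple.first] << tuple
  end

  def find(template)
    by_first = @index.fetch(template.size) { return nil }
    candidates(by_first, template.first).find { |tuple| match?(template, tuple) }
  end

  private

  # Only Symbols and Strings are used as literal keys in this sketch, because
  # for them === agrees with hash equality. Anything else (nil, classes,
  # ranges, numbers such as 1 vs 1.0) falls back to scanning every bucket.
  def candidates(by_first, key)
    case key
    when Symbol, String then by_first.fetch(key, [])
    else by_first.values.flatten(1)
    end
  end

  # Assumption: nil is a wildcard, everything else is compared with ===.
  def match?(template, tuple)
    template.each_with_index.all? { |pat, i| pat.nil? || pat === tuple[i] }
  end
end

bag = IndexedTupleBag.new
bag.push([:rdf, "subject", "predicate", "object"])
bag.push([:xml, "node", "attr", "value"])

direct   = bag.find([:xml, nil, "attr", nil])   # looks only at the :xml bucket
fallback = bag.find([nil, "subject", nil, nil]) # wildcard first: scans all buckets
```

The index helps only when templates start with a literal; pattern elements still need the `===` scan, which is the limitation raised earlier in the thread.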