Nested hash with arrays for default value

#1

I’m trying to find a “nice” way to make a nested hash with an empty array as
the default “leaf” value.

Basically I’d like to be able to make an assignment as follows:

data[2][3][4][5] << 3

I can get close but I can’t get it right. The data is going to be coming
straight out of a log so I can’t really build the hash ahead of time.

“Hey brother Christian with your high and mighty errand, Your actions speak
so loud, I can’t hear a word you’re saying.”

-Greg Graffin (Bad Religion)

#2

An empty array?
Well… You can try this:
http://trevoke.net/blog/2009/11/06/auto-vivifying-hashes-in-ruby/
As indicated, I didn’t come up with this, and it’ll take care of
creating the hashes for you. You can probably do a check: if nil, then
create the array… then add to the array.
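
A minimal sketch of that check-then-append idea, assuming the linked technique is the usual default-proc auto-vivifying hash (the link's exact code is not reproduced here). One caveat under that assumption: a missing leaf comes back as a new empty hash rather than nil, so the sketch tests with key? instead of comparing to nil. The helper name append_leaf is hypothetical.

```ruby
# Assumed auto-vivifying hash: every missing key creates a nested hash.
data = Hash.new { |hash, key| hash[key] = Hash.new(&hash.default_proc) }

# Hypothetical helper: walk to the parent of the leaf, then create the
# array only if the leaf key is absent, and append the value.
def append_leaf(data, keys, value)
  *path, leaf = keys
  parent = path.inject(data) { |hash, key| hash[key] }
  parent[leaf] = [] unless parent.key?(leaf)
  parent[leaf] << value
end

append_leaf(data, [2, 3, 4, 5], 3)
append_leaf(data, [2, 3, 4, 5], 4)
p data[2][3][4][5]   # prints [3, 4]
```

This keeps the hash itself simple, at the cost of going through a helper instead of writing data[2][3][4][5] << 3 directly.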

#3

On Mon, Jan 25, 2010 at 6:48 PM, Glen H. removed_email_address@domain.invalid
wrote:


I’ve not tested this too much, but what I tried was to set up a proxy
object that inserts a hash if the [] method is called on it, or
an array if the << method is called:

class ProxyDefault
  def initialize hash, key
    @hash = hash
    @key = key
  end

  # Going one level deeper: put a hash in the proxy's place.
  def [] key
    @hash[@key] = Hash.new {|hash,key| ProxyDefault.new(hash, key)}
    @hash[@key][key]
  end

  # Appending: put an array in the proxy's place.
  def << value
    @hash[@key] = []
    @hash[@key] << value
  end
end

h = Hash.new {|hash,value| ProxyDefault.new(hash, value)}

h[1][2][3] << "value"

p h
p h[1][2][3]

/temp$ ruby nested_hash_array.rb
{1=>{2=>{3=>["value"]}}}
["value"]

Hope this helps,

Jesus.

#4

On Jan 25, 2010, at 2:24 PM, Glen H. wrote:

I’ll play around with your solution. I have the following:

data = Hash.new { |l, k| l[k] = Hash.new { |l, k| l[k] = Hash.new { |l, k|
l[k] = Hash.new([]) }}}

I’m assuming you want ‘infinite’ depth. Consider:

default = lambda { |h,k| h[k] = Hash.new(&default) }
top = Hash.new(&default)
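
A quick check of what those two lines give, as a sketch only (the variable leaf and the sample keys are illustrative). Intermediate hashes appear on demand, and a missing key at any depth yields another Hash, which has no << method.

```ruby
default = lambda { |hash, key| hash[key] = Hash.new(&default) }
top = Hash.new(&default)

top[1][2][3] = "value"   # intermediate hashes are created on demand
leaf = top[:a][:b]       # a missing key always yields another Hash

puts top[1][2][3]            # prints value
puts leaf.class              # prints Hash
puts leaf.respond_to?(:<<)   # prints false
```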

Gary W.

#5

2010/1/25 Jesús Gabriel y Galán removed_email_address@domain.invalid

Thanks Jesus,

I’ll play around with your solution. I have the following:

data = Hash.new { |l, k| l[k] = Hash.new { |l, k| l[k] = Hash.new { |l, k|
l[k] = Hash.new([]) }}}

Believe me I know it’s ugly and not in any way flexible. Plus it behaves in
ways which make me uncomfortable when I try to print the contents.
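
A likely cause of the odd printing, offered as a guess since the exact output is not shown: Hash.new([]) uses one shared array as the default for every key, and << on that default never stores a key. The sketch below isolates the innermost level; the names shared and own are illustrative.

```ruby
shared = Hash.new([])     # one array object is the default for every key
shared[:a] << 1           # mutates the shared default; no key is stored
puts shared.inspect       # prints {} because the hash is still empty
puts shared[:b].inspect   # prints [1], an unrelated key sees the value

own = Hash.new { |hash, key| hash[key] = [] }   # block form stores a fresh array
own[:a] << 1
puts own.key?(:a)         # prints true
```

The block form assigns into the hash, so each key gets its own array and shows up when the hash is printed.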


#6

2010/1/25 Jesús Gabriel y Galán removed_email_address@domain.invalid

I’m assuming you want ‘infinite’ depth. Consider:

default = lambda { |h,k| h[k] = Hash.new(&default) }
top = Hash.new(&default)

The problem is that he wants the leaves of the hash to be arrays, and
not hashes.

Jesus.

Exactly, infinite depth would be nice as it would make a more temporally
portable solution. The proxy looks to be working great. I am a bit
confused as to why the << method in the proxy doesn’t overwrite a leaf with
a new array though. I’m not complaining as it works the way I want it to,
I’m just perplexed.

Thanks Jesus.


#7

On Mon, Jan 25, 2010 at 10:28 PM, Gary W. removed_email_address@domain.invalid wrote:

default = lambda { |h,k| h[k] = Hash.new(&default) }
top = Hash.new(&default)

The problem is that he wants the leaves of the hash to be arrays, and
not hashes.

Jesus.

#8

On Jan 25, 2010, at 4:49 PM, Glen H. wrote:

2010/1/25 Jesús Gabriel y Galán removed_email_address@domain.invalid

The problem is that he wants the leaves of the hash to be arrays, and
not hashes.

Exactly, infinite depth would be nice as it would make a more temporally
portable solution. The proxy looks to be working great. I am a bit
confused as to why the << method in the proxy doesn’t overwrite a leaf with
a new array though. I’m not complaining as it works the way I want it to,
I’m just perplexed.

Oops. Sorry for the confusion. The trick with the proxy is that the
first time << is called on the proxy, it replaces itself with an empty
array. Further lookups will return the array and not the original
proxy.

Gary W.

#9

2010/1/26 Jesús Gabriel y Galán removed_email_address@domain.invalid

Sorry, I should have been more specific when stating my confusion. I am
confused as to why appending a second item into a leaf results in a
multi-item array rather than a new array with only the second item.

data[1][2][3] << 4
data[1][2][3] << 5

yields
{1=>{2=>{3=>[4,5]}}}
Looking at the proxy, I was expecting
{1=>{2=>{3=>[5]}}}

The behavior I’m seeing is what I want; I just didn’t expect it. From the
code it looks like << assigns an array to the key, then appends a value. I
was expecting that to overwrite the array created with the first << call at
that level with a new single-item array.


#10

On Mon, Jan 25, 2010 at 10:49 PM, Glen H. removed_email_address@domain.invalid
wrote:

Exactly, infinite depth would be nice as it would make a more temporally
portable solution.

With the proxy, you have infinite depth until, in some branch, you decide
to stop by appending (<<) a value.
That fixes the depth of that branch.
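
A small sketch of what fixing the depth means in practice, using a copy of the proxy class from this thread (written with parentheses, no tracing output). The keys and values are illustrative.

```ruby
class ProxyDefault
  def initialize(hash, key)
    @hash = hash
    @key = key
  end

  def [](key)
    @hash[@key] = Hash.new { |hash, k| ProxyDefault.new(hash, k) }
    @hash[@key][key]
  end

  def <<(value)
    @hash[@key] = [value]
  end
end

h = Hash.new { |hash, key| ProxyDefault.new(hash, key) }

h[1][2] << "value"       # the branch h[1][2] is now an Array
h[1][3][4] << "deeper"   # a sibling branch can still grow deeper

p h[1][2].class          # prints Array
p h[1][3].class          # prints Hash

begin
  h[1][2][3] << "too deep"   # Array#[] returns nil here, and nil has no <<
rescue NoMethodError => e
  puts "cannot go deeper: #{e.class}"
end
```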

The proxy looks to be working great. I am a bit
confused as to why the << method in the proxy doesn’t overwrite a leaf with
a new array though. I’m not complaining as it works the way I want it to,
I’m just perplexed.

When you access h[1][2][3], the default proc hands back a proxy object
for that key. The proxy object remembers the hash and the key. When you
call << on the proxy object, it stores an array in the hash under that
key and appends the value to it. So further calls to
h[1][2][3] will return that array and no proxy objects anymore. It
works the same for the upper levels: calling h[1] returns a proxy.
When you call [] on it (for example h[1][2]) it sets
h[1] to a hash.

Maybe this clarifies a bit more:

/temp$ cat nested_hash_array.rb && ruby nested_hash_array.rb
class ProxyDefault
  def initialize hash, key
    @hash = hash
    @key = key
  end

  def [] key
    puts "the hash is: #{@hash.inspect} when calling [] on the proxy object"
    @hash[@key] = Hash.new {|hash,key| ProxyDefault.new(hash, key)}
    puts "the hash is: #{@hash.inspect} after replacing the proxy with a hash"
    @hash[@key][key]
  end

  def << value
    puts "the hash is: #{@hash.inspect} when calling << on the proxy object"
    @hash[@key] = [value]
    puts "the hash is: #{@hash.inspect} after replacing the proxy with an array"
    @hash[@key]
  end
end

h = Hash.new {|hash,value| ProxyDefault.new(hash, value)}

h[1][2] << "value"

p h
p h[1][2]

the hash is: {} when calling [] on the proxy object
the hash is: {1=>{}} after replacing the proxy with a hash
the hash is: {} when calling << on the proxy object
the hash is: {2=>["value"]} after replacing the proxy with an array
{1=>{2=>["value"]}}
["value"]

Jesus.

#11

On Tue, Jan 26, 2010 at 3:29 PM, Glen H. removed_email_address@domain.invalid
wrote:

The behavior I’m seeing is what I want I just didn’t expect it. From the
code it looks like << assigns an array to the key then appends a value. I
was expecting that to overwrite the array created with the first << call at
that level with a new single item array.

I understood your question, so this means I explained myself really
badly :-).
When you do this:

h[1] << 4

The following things happen:

  • The method [] of h is called with parameter 1
  • The hash detects that there’s no entry for that key, and so calls
    the default proc
  • The default proc creates a Proxy object for that key
    (this proxy object remembers the hash and the key)
  • The result of the default proc (which is the proxy object itself) is
    returned
  • The method << with parameter 4 is called on the proxy object
  • That method puts an array with element 4 inside into the hash under
    that key, in the place the proxy was standing in for.

From now on, every time you call h[1] there is actually a value in the
hash, which is the array created by the proxy, and so the hash doesn’t
call the default proc anymore, and no other proxy object is involved.
Subsequent calls to h[1] << some_value will actually call the <<
method of the array.
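
The sequence above can be observed directly. This is a sketch with the proxy class trimmed to the << path only (the [] method is left out because a single level is enough here); the variables before and after are illustrative.

```ruby
class ProxyDefault
  def initialize(hash, key)
    @hash = hash
    @key = key
  end

  # Store a one-element array in the hash under the remembered key.
  def <<(value)
    @hash[@key] = [value]
  end
end

h = Hash.new { |hash, key| ProxyDefault.new(hash, key) }

before = h[1].class   # default proc runs: a ProxyDefault comes back
h[1] << 4             # ProxyDefault#<< stores [4] under key 1
after = h[1].class    # the key now exists, so this is the stored Array
h[1] << 5             # Array#<<, appends to the same array

p before   # prints ProxyDefault
p after    # prints Array
p h[1]     # prints [4, 5]
```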

Hope this clears up the issue a little bit more.

Jesus.

#12

2010/1/26 Jesús Gabriel y Galán removed_email_address@domain.invalid

No, you didn’t actually. I just wasn’t thinking about it properly. When I
think about your explanation further, the whole “no more proxy object”
point makes perfect sense and answers my question.

Thanks for your patience and help Jesus. It is appreciated.


#13

Since ~2014, you can also use the XKeys gem.

require 'xkeys'

data = {}.extend XKeys::Hash
data[2, 3, 4, 5, :[]] = 3 # :[] is the next array index
data[2, 3, 4, 5, :[]] = 4
# {2=>{3=>{4=>{5=>[3, 4]}}}}
data[1] # nil
data[1, :else => []] # []
# Note: data[int1, int2] will try to array slice
# Use data[1, 2, {}] (empty option hash) for data[1][2]