Ruby Koans regarding Hashes

aris · December 28, 2012, 6:27am

I am trying to understand this, so let me know how I do. I know
this is probably a breeze for you veterans, but I need to put this in
words. heh

I have a C++/Perl background, so I do understand “associative arrays”,
but understanding Ruby hash objects, and the methods and operations that
can be used with them is pretty new to me.

In the “Ruby Koans” file, “about_hashes.rb”, I completed this method.
It was just a little tough to understand, initially, but I think I got
it. Here’s the test method with helpful line numbers:

def test_default_value_is_the_same_object
  hash = Hash.new([])
  hash[:one] << "uno"
  hash[:two] << "dos"
  assert_equal ["uno", "dos"], hash[:one]
  assert_equal ["uno", "dos"], hash[:two]
  assert_equal ["uno", "dos"], hash[:three]
  assert_equal true, hash[:one].object_id == hash[:two].object_id
end

Here is my breakdown:

1. A new hash is created, using an empty array as the default value
   for any key that has no value of its own, and assigned to "hash". This
   wasn't too bad, I just had to look up: Hash.new()
2. Using indexing, the string "uno" is appended to the default value
   returned for the key :one, which was initially an empty array and is
   now ["uno"]. Since "<<" was used instead of the assignment operator
   "=", it is an append rather than an assign. For example, if the
   statement was hash[:one] = "uno", then :one would get its own String
   value instead of the default array.
3. Same as 2., but what I found interesting was that a new key was used,
   and the append still modified the default value for the entire
   hash! So the default value is now: ["uno", "dos"]

4 - 6. Test cases to prove that a single default value was created, with
steps 2 and 3 appending to it. Because these were not assignment
expressions, the keys tested (:one, :two, :three) all return the modified
default value.

7. Final proof that it is the same default value, i.e. the same Array
object_id, being used for the two utilized keys.

So, when you create a Hash object, you can provide a default value. In
this case, the default value is an Array object. Because of this being
an Array object, the “<<” method can be used to modify the default
value. Using indexing on the Hash, this can be tested by querying the
hash:

hash = Hash.new([])
=> {}
hash[:foo]
=> []
hash[:foo] << "hello"
=> ["hello"]
hash[:foo]
=> ["hello"]
hash.inspect
=> {}

The last line “hash.inspect” is what I also find interesting, because if
there are default values, why is it still empty? I then assign a value,
overriding the default value:

hash[:foo] = "bar"
=> "bar"
hash.inspect
=> "{:foo=>\"bar\"}"

Now, inspect no longer shows an empty hash. Hmmm…
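
One way to poke at the question above is to ask the hash directly what it has stored. This is only a minimal sketch added for illustration; it uses the standard Hash#key?, Hash#size and Hash#default methods, and the key name :other is made up here.

```ruby
hash = Hash.new([])
hash[:foo] << "hello"      # mutates the default object; no key is stored

p hash.key?(:foo)          # false: :foo was never assigned
p hash.size                # 0, which is why inspect shows an empty hash
p hash.default             # ["hello"]: the one shared default array

hash[:foo] = "bar"         # assignment is what actually creates the key
p hash.key?(:foo)          # true
p hash[:other]             # ["hello"]: other keys still fall back to the default
```

So the default value lives on the hash itself, not under any key, and only an assignment adds an entry.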

Ruby hashes are an interesting topic and I look forward to your veteran
insight on this topic.

Thanks!

derr1ck · December 28, 2012, 6:47am

Well, it seems that if you don't assign an explicit value to a hash key,
then nothing is stored.
When you pass the default value to the Hash constructor
(Hash.new("default value")), you have to think of it as an object. You
can change it, of course, since it is an object, and it is the value that
is returned when you ask for a key that doesn't have a specific value.

hash = Hash.new("go fish")
hash[:a_key_that_does_not_have_assigned_an_explicit_value].capitalize!
puts hash[:another_one]
#=> "Go fish"
hash[:third_with_no_explicit_value].upcase!
puts hash[:another_more]
#=> "GO FISH"
hash[:another_more] = "something"
puts hash[:another_more]
#=> "something"

Well, I don't know if it helps, cheers!

derr1ck · December 28, 2012, 9:15am

Derrick B. wrote in post #1090445:

hash = Hash.new([])
=> {}
hash[:foo]
=> []
hash[:foo] << "hello"
=> ["hello"]
hash[:foo]
=> ["hello"]
hash.inspect
=> {}

The last line “hash.inspect” is what I also find interesting, because if
there are default values, why is it still empty? I then assign a value,
overriding the default value:

You never write

Hash.new([])

… (or with any other mutable type as the argument) because the same
array is the default for every key, and that is never useful.

When you write:

hash[:non_existent_key]

… you get a reference to the single default array. Unless you
assign that value to the key, then that key will still not have
a value. So you need to write:

hash[:non_existent_key] <<= "hello"
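
A caveat worth checking in irb, offered as a sketch rather than a correction: "<<=" does store the key, but as far as I can tell the array it stores is still the single default array, so appends made through different keys keep landing in the same object. Building a new array with Array#+ avoids that. The variable names shared and separate are made up for this example.

```ruby
shared = Hash.new([])
shared[:one] <<= "uno"     # same as: shared[:one] = (shared[:one] << "uno")
shared[:two] <<= "dos"     # stores the *same* default array under :two
p shared[:one].equal?(shared[:two])   # true
p shared[:one]                        # ["uno", "dos"]

separate = Hash.new([])
separate[:one] += ["uno"]  # Array#+ builds a new array, then assigns it
separate[:two] += ["dos"]
p separate[:one]           # ["uno"]
p separate[:two]           # ["dos"]
p separate.default         # []: the default itself was never mutated
```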

hash[:foo] = "bar"
=> "bar"
hash.inspect
=> "{:foo=>\"bar\"}"

Now, inspect no longer shows an empty hash. Hmmm…

Yes. In that example you assigned something to the key.

Ruby hashes are an interesting topic and I look forward to your veteran
insight on this topic.

This is what you want:

hash = Hash.new {|hash, key| hash[key] = []}

hash[:one] << 'hello'
hash[:two] << 'goodbye'

p hash

–output:–
{:one=>["hello"], :two=>["goodbye"]}

In this case, the block executes every time you access a
non-existent key, and the block creates a new array
and assigns it to the key, and the return value is a
reference to the array that was assigned to the key.
As a result, all you need to do is append to the array.
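
A small sketch of that behaviour, added for illustration. It also shows a side effect that is easy to miss: a plain read of a missing key runs the block and stores the key, while Hash#key? and Hash#fetch do not go through the default block.

```ruby
hash = Hash.new { |h, key| h[key] = [] }

hash[:one] << "uno"
hash[:two] << "dos"
p hash[:one].equal?(hash[:two])   # false: each key got its own array

hash[:three]                      # a plain read also runs the block...
p hash.keys                       # [:one, :two, :three]

p hash.key?(:four)                # false: key? does not trigger the block
p hash.fetch(:four, :missing)     # :missing: fetch ignores the default block too
p hash.keys                       # [:one, :two, :three]
```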

It can be useful to use the first type of hash creator with non-mutable
types as an argument:

hash = Hash.new 0

hash[:one] += 1
hash[:two] += 1

p hash

–output:–
{:one=>1, :two=>1}
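
The reason this is safe can be sketched as follows (an illustration, not part of the original post): "+=" is shorthand for a read followed by an assignment, and Integer#+ returns a new value instead of changing the default. The name counts is made up for the example.

```ruby
counts = Hash.new(0)
counts[:one] += 1          # expands to: counts[:one] = counts[:one] + 1
counts[:one] += 1
counts[:two] += 1

p counts.default           # 0: the default itself is untouched
p counts[:missing]         # 0
p counts.key?(:missing)    # false: reading a default never stores a key
p counts.fetch(:one)       # 2
```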

derr1ck · December 28, 2012, 9:32am

hash = Hash.new do |hash, key|
  arr = []
  p arr.object_id
  hash[key] = arr
end

p hash[:one].object_id
p hash[:two].object_id

p hash

–output:–
2156082500
2156082500
2156082420
2156082420

derr1ck · December 28, 2012, 9:14pm

…which I currently have no idea how that could be useful, but it is
still interesting… heh

string = "Hello world, hello mars, goodbye world."
word_count_for = Hash.new(0)

string.scan(/\b \w+ \b/x) do |word|
  word = word.downcase
  word_count_for[word] += 1
end

p word_count_for

–output:–
{"hello"=>2, "world"=>2, "mars"=>1, "goodbye"=>1}

derr1ck · December 28, 2012, 6:17pm

7stud – wrote in post #1090450:

You never write

Hash.new([])

… (or with any other mutable type as the argument) because the same
array is the default for every key, and that is never useful.

That makes sense, and I got that same uselessness feeling when I saw the
assertions made against different indexes.

This is what you want:

hash = Hash.new {|hash, key| hash[key] = []}

hash[:one] << 'hello'
hash[:two] << 'goodbye'

p hash

–output:–
{:one=>["hello"], :two=>["goodbye"]}

That was actually the very next test block in that “about_hashes.rb”
Koans file:

def test_default_value_with_block
  hash = Hash.new {|hash, key| hash[key] = [] }

  hash[:one] << "uno"
  hash[:two] << "dos"

  assert_equal ["uno"], hash[:one]
  assert_equal ["dos"], hash[:two]
  assert_equal [], hash[:three]
end

In this case, the block executes every time you access a
non-existent key, and the block creates a new array
and assigns it to the key, and the return value is a
reference to the array that was assigned to the key.
As a result, all you need to do is append to the array.

This is one of those “I get it, but then I do not get it” paradoxes. I
understand whatever is returned from the block is assigned to “hash”,
right? So, inside the block, a hash key is generated with an array as
its value:

hash[key] = []

But, since an append method was used, the value being appended, in this
case “uno” or “dos”, now becomes the value for whatever key index, :one
or :two, is provided:

hash[:one] = ["uno"]

Since the block is executed every time, a new array object is created,
so each key has different array object id. So this:

hash = hash[:one] = ["uno"]

Did this:

Unique array id ([]), appended with a string (“uno”), assigned to hash
key (:one), assigned to variable (hash).

It can be useful to use the first type of hash creator with non-mutable
types as an argument:

hash = Hash.new 0

hash[:one] += 1
hash[:two] += 1

p hash

–output:–
{:one=>1, :two=>1}

Or as an accumulator:

hash = Hash.new 0
=> {}
hash[:one]
=> 0
hash[:one] += 1
=> 1
hash[:one]
=> 1
hash[:one] += 1
=> 2
hash[:one] += 1
=> 3
hash[:one] += 1
=> 4

…which I currently have no idea how that could be useful, but it is
still interesting… heh

Looks like I am not the only one that found that Ruby Koans lesson a bit
confusing:

Thanks, that helped clear some things up.

derr1ck · December 28, 2012, 9:28pm

This is one of those “I get it, but then I do not get it” paradoxes.

You said you know C++ and perl. They both employ the concept of
“references”.

I understand whatever is returned from the block is assigned to “hash”,
right?

No. Inside the block, the code assigns an array reference to the key.
Then the block returns a reference to the same array. Two references
to the same array. Either one can be used to change the array. Here is
a simple example of that:

ref1 = {}
ref2 = ref1

ref1[:a] = 10
ref2[:b] = 20

p ref1

–output:–

{:a=>10, :b=>20}

In ruby, the following is an array constructor:

[]

Every time that array constructor executes it creates a new array:

puts [].object_id
puts [].object_id
puts [].object_id

–output:–
2156185940
2156185880
2156185840

When you write:

hash = Hash.new []

that array constructor only executes once. But when you write:

hash = Hash.new { |hash, key| hash[key] = [] }

…ruby stores the block somewhere, and then every time you try to
access a non-existent key, ruby executes the block.
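
The "somewhere" can be inspected: as far as I know, Hash#default_proc returns the stored block and Hash#default returns the stored default object. A minimal sketch, with the variable names with_value and with_block and the key :manual made up for illustration:

```ruby
with_value = Hash.new([])
with_block = Hash.new { |h, key| h[key] = [] }

p with_value.default_proc         # nil: no block was given
p with_value.default.class        # Array: the single stored default object

p with_block.default_proc.class   # Proc: the stored block
p with_block.default              # nil: there is no fixed default object

# The stored block can be called by hand, much as Hash#[] does for a missing key
with_block.default_proc.call(with_block, :manual) << "hi"
p with_block.fetch(:manual)       # ["hi"]
```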

derr1ck · December 28, 2012, 11:51pm

7stud – wrote in post #1090511:

…ruby stores the block somewhere, and then every time you try to
access a non-existent key, ruby executes the block.

I don’t know if you know it or not but some of ruby’s built in methods
have strange names. For instance, when you write:

hash[key] << 'hello'

…the left side is syntactic sugar for this method call:

hash.[](key)

In that line, the method’s name is “[]”. And when the argument to the
method is a non-existent key, the [] method returns an array
reference. And of course, method calls in your code are replaced by the
method’s return value, so if you write:

hash[:non_existent] << 'hello'

which is equivalent to:

hash.[](:non_existent) << 'hello'

…then that becomes:

array_ref << 'hello'
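
The different spellings of the same call can be tried side by side. This is just a sketch for illustration; the keys :a to :d are arbitrary.

```ruby
hash = Hash.new { |h, key| h[key] = [] }

hash[:a] << "hello"                    # the usual spelling
hash.[](:b) << "hello"                 # explicit call to the method named []
hash.public_send(:[], :c) << "hello"   # same call, method name passed as a symbol

hash.[]=(:d, ["hello"])                # assignment is a call to the method named []=

p hash.values.uniq                     # [["hello"]]
p hash.keys                            # [:a, :b, :c, :d]
```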

derr1ck · December 29, 2012, 12:08am

7stud – wrote in post #1090511:

No. Inside the block, the code assigns an array reference to the key.
Then the block returns a reference to the same array. Two references
to the same array. Either one can be used to change the array. Here is
a simple example of that:

ref1 = {}
ref2 = ref1

ref1[:a] = 10
ref2[:b] = 20

p ref1

–output:–

{:a=>10, :b=>20}

Whoops. An example of two references to the same array would look like
this:

ref1 = []
ref2 = ref1

ref1 << 10
ref2 << 20

p ref1

–output:–
[10, 20]

my_hash = {}

array_ref = my_hash[:one] = []
array_ref << 'hello' << 'world'

p my_hash

–output:–
{:one=>["hello", "world"]}

derr1ck · December 29, 2012, 12:49am

7stud – wrote in post #1090510:

…which I currently have no idea how that could be useful, but it is
still interesting… heh

string = "Hello world, hello mars, goodbye world."
word_count_for = Hash.new(0)

string.scan(/\b \w+ \b/x) do |word|
  word = word.downcase
  word_count_for[word] += 1
end

p word_count_for

–output:–
{"hello"=>2, "world"=>2, "mars"=>1, "goodbye"=>1}

Hey, I said “currently have no idea”! Very cool example.

derr1ck · December 29, 2012, 1:38am

7stud – wrote in post #1090511:

This is one of those “I get it, but then I do not get it” paradoxes.

You said you know C++ and perl. They both employ the concept of
“references”.

It wasn’t the reference part that was confusing, just the contents of
the block and what it was doing.

{ |hash, key| hash[key] = [] }

I understand whatever is returned from the block is assigned to “hash”,
right?

No. Inside the block, the code assigns an array reference to the key.
Then the block returns a reference to the same array. Two references
to the same array.

I had that mostly correct in my head, but two references to the same
array??? After creating the new Hash object, the hash variable is just
the Hash reference, but the indexed Hash reference ( like hash[:one] )
points to the array reference, right?

hash = Hash.new { |hash, key| hash[key] = [] }
=> {}

hash
=> {}

hash[:one]
=> []

hash.object_id
=> 269913890

hash[:one].object_id
=> 269881860

Two different id’s. Or did I do that wrong? Two references to the same
array in that assignment expression above doesn’t sound right. Your
other example:

ref1 = []
ref2 = ref1

ref1 << 10
ref2 << 20

…makes sense of two references to the same array.

In ruby, the following is an array constructor:

[]

That I do understand, because Perl is similar: my $a = [];
Ok, maybe you are about to tell me otherwise here, too, but it sure
looks similar to me! heh

perl -le '$a = []; push @$a, "hello"; print @$a'
–output–
hello

derr1ck · December 29, 2012, 2:01am

7stud – wrote in post #1090519:

7stud – wrote in post #1090511:

…ruby stores the block somewhere, and then every time you try to
access a non-existent key, ruby executes the block.

I don’t know if you know it or not but some of ruby’s built in methods
have strange names. For instance, when you write:

hash[key] << 'hello'

…the left side is syntactic sugar for this method call:

hash.[](key)

Yep, read about it in section 4.4 of "The Ruby Programming Language",
but don't quiz me on it yet!

derr1ck · December 29, 2012, 3:08am

Derrick B. wrote in post #1090528:

I had that mostly correct in my head, but two references to the same
array??? After creating the new Hash object, the hash variable is just
the Hash reference, but the indexed Hash reference ( like hash[:one] )
points to the array reference, right?

hash = Hash.new { |hash, key| hash[key] = [] }
=> {}

hash
=> {}

hash[:one]
=> []

hash.object_id
=> 269913890

hash[:one].object_id
=> 269881860

Two different id’s. Or did I do that wrong? Two references to the same
array in that assignment expression above doesn’t sound right. Your
other example:

The block does not create the hash. If you look at the block:

Hash.new { |hash, key| hash[key] = [] }

…you can see that the block takes a hash as an argument. The hash gets
created elsewhere.

You cannot look at a block and know what a method returns. A block
is like an argument to a method. In essence, you are passing a function
to the method. Here is an example:

def do_stuff(x, y, &block)  # &block captures the block in a variable
  if block_given?
    block_return_val = block.call
    puts "Block returns: %s" % block_return_val
  end

  'hello'  # This is the return val of do_stuff
end

do_stuff_return_val = do_stuff(10, 20) { 'goodbye' }
puts "do_stuff returns: %s" % do_stuff_return_val

–output:–
Block returns: goodbye
do_stuff returns: hello

However, it is pretty obvious if you are a programmer that a method
named new is going to return a new object. So when you write:

hash = Hash.new { |hash, key| hash[key] = [] }

… hash is going to be a reference to a new Hash. Therefore, you can
deduce that the new() method of the Hash class returns a reference to a
new Hash. Note that the block returns whatever a method named []= in
the Hash class returns.

Here is a simple example of how things might work in the Hash class:

class Dog

  def initialize    # This method is similar to a constructor
    @hash = {}      # "@ variables" are instance variables

    # lambda creates a function
    @func = lambda { |hash, key| hash[key] = [] }
  end

  def hash          # getter method
    @hash
  end

  def []=(key, val)
    hash[key] = val   # The return val of the method is the right hand side
  end

  def [](key)
    if hash.keys.include? key
      hash[key]
    else
      @func.call hash, key
    end
  end

  def to_s
    @hash.inspect
  end

end

d = Dog.new
d[:non] << 'hello'

puts d    # puts calls to_s; p would call the default inspect instead

–output:–
{:non=>["hello"]}

When the key doesn’t exist, the [] method returns whatever func()
returns, which is whatever the []= method returns, which is val, which
is the newly created array.
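
One nuance, offered as a sketch and not as a correction of the example: when []= is invoked through assignment syntax, Ruby evaluates the whole expression to the right-hand side, whatever the method itself returns. In the Dog class the two coincide because []= ends with the assignment. The class Loud below is hypothetical and exists only to show the difference.

```ruby
# Hypothetical class, only for illustration
class Loud
  def initialize
    @store = {}
  end

  def []=(key, val)
    @store[key] = val
    :ignored              # the method's own return value
  end
end

loud = Loud.new
result = (loud[:a] = [1, 2])         # assignment syntax
p result                             # [1, 2]: the right-hand side
p loud.public_send(:[]=, :b, [3])    # :ignored: an explicit call sees the real return value
```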

derr1ck · December 29, 2012, 4:26am

7stud – wrote in post #1090536:

The block does not create the hash. If you look at the block:

Hash.new { |hash, key| hash[key] = [] }

Agreed. Since we went from a simple:

hash = Hash.new # no argument to new

to

hash = Hash.new { |hash, key| hash[key] = [] } # block as argument to new

I just wanted to understand how the block supplied the argument to new.
Done!

Thanks for the examples,

derr1ck · December 29, 2012, 3:53am

7stud – wrote in post #1090536:

def to_s
@hash.inspect
end

end

To be consistent, instead of using @hash in the to_s() method, the code
should use the getter:

def to_s
hash.inspect
end

derr1ck · December 29, 2012, 5:52am

Great discussion here! I now understand how to use that hash block in
initializing a hash.

derr1ck · December 29, 2012, 4:41am

Derrick B. wrote in post #1090541:

I just wanted to understand how the block supplied the argument to new.

To be clear, the block does not execute when new() is called. Instead,
new() stores the block somewhere, and only when a
non-existent key is accessed does the block execute.
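
This is easy to confirm with a counter. A minimal sketch for illustration; the counter name calls is made up.

```ruby
calls = 0
hash = Hash.new do |h, key|
  calls += 1
  h[key] = []
end

p calls            # 0: Hash.new only stored the block
hash[:one] << "uno"
p calls            # 1: first access to a missing key ran the block
hash[:one] << "ein"
p calls            # 1: :one now exists, so the block is not run again
hash[:two]
p calls            # 2
```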

Thanks for the examples,

Sure. Good luck with ruby.