Making a counter for each word's occurrences in a string

Looking at the method below, there are a few things I couldn't understand.
Let me list the method first!

  1. def count_frequency(word_list)
  2.   counts = Hash.new(0)
  3.   for word in word_list
  4.     counts[word] += 1
  5.   end
  6.   counts
  7. end

p count_frequency(["sparky", "the", "cat", "sat", "on", "the", "mat"])


The code above produces {"sparky"=>1, "the"=>2, "cat"=>1, "sat"=>1,
"on"=>1, "mat"=>1}

What I don't understand is why line 6 is there. And on line 4, it seems
counts[word] pinpoints a word and adds 1 to the counter for that word,
right? I'm confused about how Ruby works: it's so compact and nice, but
I have no idea what happens under the hood. So line 2 creates an empty
hash with 0 items and stores it in counts (an object)? Then the for loop
goes through each word in word_list, adding 1 to counts[word], but I
thought counts[word] gave the position of a word, not a counter. OK, you
can say I'm completely confused by the whole code above.

Help?

Thanks in advance.

Also, I forgot to ask: how on earth does Ruby know in the end to produce
sparky => 1, the => 2, cat => 1, etc.? The code above doesn't seem to
relate the counter to the word anywhere, and yet with the p command it
knows to pair each word with its number of occurrences.

On Friday 25 December 2009, Ben B. wrote:

|
|create an empty hash with 0 item and deposits as counts (object)? Using
|for loop to go through each word in word_list, adding 1 to counts[word],
|but I thought counts[word] produces a position of a word and not a
|counter. OK, you can say I’m completely confused about the whole code
|above.
|
|Help?
|
|Thanks in advance.

Line two creates an empty hash which uses 0 as its default value. This
means that if you ask for the value of a key which doesn't exist, it'll
return 0. If this default value hadn't been specified (that is, if that
line had been

counts = Hash.new

or, equivalently,

counts = {}

), the default value would have been nil instead.
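
A small sketch of the difference between the two kinds of hash. The variable names with_default and without_default are made up for this illustration and are not part of the original method.

```ruby
# Hash with a default value of 0 versus a plain hash (default nil).
with_default = Hash.new(0)
without_default = {}

p with_default["missing"]     # => 0
p without_default["missing"]  # => nil

# Reading a missing key does not create it; only assignment does.
p with_default.size           # => 0

with_default["the"] += 1      # reads the default 0, adds 1, stores 1
p with_default["the"]         # => 1

# Without the default, the same line fails because nil has no + method.
begin
  without_default["the"] += 1
rescue NoMethodError => e
  puts "plain hash raised #{e.class}"
end
```

Note that asking for a missing key only returns the default; the key appears in the hash once something is assigned to it.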

Inside the for loop (that is, for each word in the list) the value
associated with the word inside the hash is increased by one (here you
see the usefulness of setting the default value of the hash to 0: if we
hadn't done this, we would have had to check every time whether the
entry was a number or nil). This has nothing to do with the position of
the word. Each value in the hash is simply the number of times the
corresponding word has been seen in the word list.

For example, suppose the word list is:

['a', 'b', 'a', 'c', 'b', 'a']

Here's how the hash (which is initially empty) changes over the
iterations:

First iteration (word: 'a'): {'a' => 1}
Second iteration (word: 'b'): {'a' => 1, 'b' => 1}
Third iteration (word: 'a'): {'a' => 2, 'b' => 1}
Fourth iteration (word: 'c'): {'a' => 2, 'b' => 1, 'c' => 1}
Fifth iteration (word: 'b'): {'a' => 2, 'b' => 2, 'c' => 1}
Sixth iteration (word: 'a'): {'a' => 3, 'b' => 2, 'c' => 1}
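
The trace above can be reproduced with a few lines of Ruby. This is only a sketch: the trace array is a hypothetical helper added here to keep a snapshot of the hash after each step, and each_with_index is used instead of the for loop just to get an iteration number.

```ruby
counts = Hash.new(0)
trace = []   # hypothetical helper: one snapshot of the hash per iteration

['a', 'b', 'a', 'c', 'b', 'a'].each_with_index do |word, i|
  counts[word] += 1
  trace << counts.dup   # dup, so later increments do not change the snapshot
  puts "iteration #{i + 1} (word: #{word}): #{counts.inspect}"
end
```

The exact text printed by inspect differs slightly between Ruby versions, but the keys and counts should match the list above.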

The last line of the method, counts, is there because in Ruby a method
returns the value of the last expression it evaluates (unless the
"return" keyword is used). Without that last line, the last expression
would be the "for" loop, which returns the object we're iterating over
(in this case, the word list). We're not interested in returning the
word list, however: we need the word count, which is stored in the
variable counts. By putting a line containing only the name of the
variable at the end of the method, we make sure that the value contained
in the variable is returned. If it's clearer to you, you can think of
the last line as if it were:

return counts

In this case, the return keyword has no effect, since we're already at
the end of the method. However, it may make the meaning of the
expression clearer. (The return keyword more or less means: don't go on
executing the method, stop immediately and return to the calling method
the value given to return, or nil if return is called without
arguments.)
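
To see the effect of that last line, here is a sketch comparing the method from the thread with a copy that leaves the line out. The name count_without_last_line is invented for this comparison.

```ruby
# Copy of the method WITHOUT the final `counts` line (hypothetical name).
# The last expression is the for loop, so the word list comes back.
def count_without_last_line(word_list)
  counts = Hash.new(0)
  for word in word_list
    counts[word] += 1
  end
end

# The method from the thread: the last expression is `counts`.
def count_frequency(word_list)
  counts = Hash.new(0)
  for word in word_list
    counts[word] += 1
  end
  counts
end

words = ["the", "cat", "the"]
p count_without_last_line(words)   # prints the word list itself
p count_frequency(words)["the"]    # => 2
```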

I hope this helps

Stefano

Thanks Stefano, I got the idea! Your explanation of line 6 in particular
made things clearer. So whatever the for loop did before line 6 is
stored in memory, and then line 6 returns, in one go, the values that
were just added to the hash, right? I hope that is what you meant.

On 25.12.2009 21:58, Ben B. wrote:

Also I forgot to add that how on earth ruby in the end knows to produce
sparky =>1, the =>2, cat =>1, etc… even though the code above doesn’t
seem to be relate the counter with the word anywhere, and yet using p
command it knows to relate each word with its occurrence’s counts.

I modified your script a little:

def count_frequency(word_list)
  counts = Hash.new(0)
  for word in word_list
    counts[word] += 1
  end
  puts "counts' class: #{counts.class}"
  puts "inspect counts: #{counts.inspect}"
  puts "counts' Hash keys: #{counts.keys.join("; ")}"
  counts
end

p count_frequency(["sparky", "the", "cat", "sat", "on", "the", "mat"])

Output:
c:\Scripts>ruby word_freq.rb
counts' class: Hash
inspect counts: {"mat"=>1, "cat"=>1, "sat"=>1, "the"=>2, "on"=>1, "sparky"=>1}
counts' Hash keys: mat; cat; sat; the; on; sparky
{"mat"=>1, "cat"=>1, "sat"=>1, "the"=>2, "on"=>1, "sparky"=>1}

The mystery is solved in line 4:
counts[word] += 1
which tells Ruby to use the current value of word as the key. If the key
doesn't exist yet, it is created with a count of 1 (the default of 0,
plus 1). Further occurrences increment the count (obviously enough).
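
Written out in long form, line 4 is an ordinary read followed by an assignment. A minimal sketch with a single word (the variable word is set by hand here instead of coming from the loop):

```ruby
counts = Hash.new(0)
word = "the"

# counts[word] += 1 is shorthand for: counts[word] = counts[word] + 1
counts[word] = counts[word] + 1   # first time: default 0 + 1, key "the" is created
counts[word] = counts[word] + 1   # second time: 1 + 1

p counts.keys     # => ["the"]
p counts[word]    # => 2
```

The string held in word is the key and the running total is the value, which is how each word ends up paired with its own count.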

Phillip G. wrote:

On 25.12.2009 22:47, Ben B. wrote:

Thanks Stefano, I got the idea! Especially when you explain the line 6,
it’s clearer to me now. So whatever for loop did before line 6, all of
those are stored in memory, then when you called line 6, it will return
only the value part that got added to the hash a moment ago at one go
right? I hope that is what you mean.

Close. “counts”, or, by extension, “return counts”, hands the result of
the method back to the caller. It includes all the results the method
has produced (in your example, the word count).

To illustrate with a metaphor:
A method is like a production line at a factory.

It takes raw materials (input), and produces a good (a result). This
production line can range from simple, like making nails out of metal
bits (a word count), to complex, like building a whole car (a webapp, or
a desktop application).

Thanks Phillip, I got it now. Awesome! Ya, it is going to take me a
while before I can truly know Ruby and programming in general.

On 26.12.2009 00:21, Ben B. wrote:

Thanks Phillip, I got it now. Awesome! Ya, it is going to take me a
while before I can truly know Ruby and programming in general.

You are welcome. :)

Hi,

The "inject" way to do what you want is:

["sparky", "the", "cat", "sat", "on", "the", "mat"].inject(Hash.new(0)) { |c, w| c[w] += 1; c }
=> {"sparky"=>1, "the"=>2, "cat"=>1, "sat"=>1, "on"=>1, "mat"=>1}

That's very short to write, isn't it?

Ruby is awesome :D
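
For comparison, two other ways to get the same counts. Both are additions that were not discussed in this thread: each_with_object avoids the trailing "; c" because the hash is passed along automatically, and Enumerable#tally only exists in Ruby 2.7 and later, so the sketch falls back to the first result on older versions.

```ruby
words = ["sparky", "the", "cat", "sat", "on", "the", "mat"]

# each_with_object hands the same hash to every iteration and returns it,
# so the block does not have to return the accumulator itself.
with_object = words.each_with_object(Hash.new(0)) { |w, c| c[w] += 1 }

# Enumerable#tally (Ruby 2.7+) does the counting directly.
tallied = words.respond_to?(:tally) ? words.tally : with_object

p with_object["the"]        # => 2
p tallied == with_object    # => true
```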

2009/12/26 Phillip G.