Hashes versus Arrays

jeromeqc · February 3, 2010, 7:27pm

Hello,

Can someone please explain how you would go about deciding whether to
use a Hash or an Array for a given requirement. Can someone give an
idiot proof simple explanation including any pros versus cons.

Sal

jeromeqc · February 3, 2010, 7:45pm

On Wednesday, February 3, 2010, Jerome David S. wrote:

Hello,

Can someone please explain how you would go about deciding whether to
use a Hash or an Array for a given requirement. Can someone give an
idiot proof simple explanation including any pros versus cons.

An array is indexed by an integer (myarray[0]), while a hash is indexed
by any Ruby object (myhash["drinks"]).

In Ruby 1.9, Hashes are also ordered: they preserve insertion order. In
Ruby 1.8 they did not, which was an important difference between a
typical hash and an array.

If you need to access the entries based on a numeric index, Arrays are
the right choice. If not, Hashes are great.
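
A minimal sketch of the difference described above. The variable names
and data are made up for illustration, and the ordering shown assumes
Ruby 1.9 or later:

```ruby
# Arrays are indexed by integer position, starting at zero.
drinks = ["tea", "coffee", "water"]
first_drink = drinks[0]

# Hashes are indexed by arbitrary objects: strings, symbols, even arrays.
prices = { "tea" => 2, :coffee => 3, [1, 2] => "an array as a key" }
tea_price = prices["tea"]

# Since Ruby 1.9, a hash remembers the order in which keys were inserted.
key_order = prices.keys

puts first_drink
puts tea_price
p key_order
```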

jeromeqc · February 4, 2010, 1:00am

On Feb 3, 12:27 pm, Jerome David S. wrote:

Hello,

Can someone please explain how you would go about deciding whether to
use a Hash or an Array for a given requirement. Can someone give an
idiot proof simple explanation including any pros versus cons.

Sal

Posted via http://www.ruby-forum.com/.

I think that the Hash container (may I use that term?) can be thought
of primarily as a “Dictionary” - for (fast) random access. What is
so darned “cool” about Hashes is that at any time the data can be: 1.
an Array, 2. another Hash, and so on… until you can give yourself a
headache.

Arrays are useful for “Stack” operations. I love “pop” and “push”; they
remind me of registers in a calculator. The Wee library comes with
an “HP” reverse Polish “calculator” implemented in 10 lines of code or
so - amazing and so powerful - but not the stuff of most apps except
maybe sometimes inside Hash. Arrays are useful when the ORDER is all
you need. If you want the last element you can fetch it and remove it
with the thingyAr.pop command. This is a pretty common situation.
(Think keyboard commands and such.)

Of course we are entirely OO in Ruby so any element can be anything -
a Web component or - oh no!!! a Hash!

Not sure it is “best practice” and I’m waiting to see how it works out
BUT I have been edging toward fewer containers and using them for more
and more. Many of my methods in my web related work return Hashes now
so I can put names to the data. Using an array and trying to remember
position for the same purpose would be really error prone.

Hashes have some really nice features also!

Like this one: myValue = someHash.fetch( :thingyAr, [] )

What’s that about? If the symbol :thingyAr is not found, an empty
array is returned - thus you can fall right into your code to do
something with the Array without testing for that annoying “nil”.
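
A short sketch of how Hash#fetch behaves with a default, compared with
plain [] lookup. The hash contents and variable names here are
hypothetical:

```ruby
settings = { :colors => ["red", "green"] }

# Key present: fetch returns the stored value and ignores the default.
colors = settings.fetch(:colors, [])

# Key missing: fetch returns the default instead of nil.
sizes = settings.fetch(:sizes, [])

# Plain [] lookup returns nil for a missing key, so calling an Array
# method on the result would raise NoMethodError.
missing = settings[:sizes]

# The default is only returned, it is not stored in the hash.
stored = settings.key?(:sizes)

# Safe to iterate right away, no nil check needed.
sizes.each { |size| puts size }
```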

jeromeqc · February 4, 2010, 6:57am

On Wed, Feb 3, 2010 at 12:27 PM, Jerome David S. wrote:

Can someone please explain how you would go about deciding whether to
use a Hash or an Array for a given requirement. Can someone give an
idiot proof simple explanation including any pros versus cons.

Understanding what happens under the covers will help you know what is
right for your situation, so first a quick explanation of arrays and
hashes.

An array is a section of consecutive memory (Ruby’s Array class is more
abstract, but generally you can expect something like this). Since the
array doesn’t actually store the object, but instead stores the
reference to the object, and all references are the same size, the
array can always know where the nth item is (for some number n). It
also means that the items are inherently ordered, i.e. the first
element you put in will be at index zero, the second will be at index
1, and so on. If you want to see that order again, you look at index 0,
then index 1, and so on. So if you want order, then an array makes
sense. If you want to be able to call the sixth item, then you know
right where it is: it is in index 5 (not 6, because we start counting
at zero). And we know where index 5 is, because the references to the
objects have a fixed width. So it’s in memory location:
location_of_array + size_of_reference * desired_index, and bam, we are
there. So it is very efficient.

But what happens if you don’t know where the item you are interested in
is located? Well, then you have to search through the whole array
(there are also faster searching algorithms, if you can guarantee
certain properties about the array, such as it being sorted).
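
To make the difference concrete, here is a small sketch with made-up
data. Array#index scans from the front; Array#bsearch is a binary
search that assumes the array is already sorted (it was added in Ruby
2.0, so it was not available when this thread was written):

```ruby
names = ["carol", "alice", "bob", "dave"]

# Known position: direct access by index, no searching involved.
third = names[2]

# Unknown position: Array#index walks the array until it finds a match.
position = names.index("dave")

# If the array is sorted, a binary search can skip most of the elements.
sorted = names.sort
found = sorted.bsearch { |name| name >= "carol" }

puts third
puts position
puts found
```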

However, sometimes you are not so interested in the order things went
in as you are in relating two pieces of information together.

Enter the hash table. A hash table has a “key” and a “value”; it uses
the key in the same way that an array uses an index, to retrieve the
value. Underneath the hash table is an array, and every key is mapped
to an array index. To do this, it looks at the key object’s contents
and comes up with a number (the algorithm used will depend on the
object, and how the creator decided to implement it). The index derived
from that number is then likely to be where the object is within the
array. The key implies an index, but within the array, there is no
ordering of the contents.

You figure out what index it should be located at by deriving some
number from the object, then translating that number onto the array
holding the items. Perhaps the array has ten indexes, and six items in
it; then when it is asked to look for the index containing some object,
it figures out that object’s hash number, and translates it to an array
index (probably mods it by ten).

You can see these hash numbers by calling #hash on the objects. Here is
an example:

x = "abc"
y = "abc"
z = "def"

x.object_id # => 75900
y.object_id # => 75890

x.hash # => 833038373
y.hash # => 833038373

z.hash # => 858800354

x.hash % 10 # => 3
y.hash % 10 # => 3
z.hash % 10 # => 4

Notice that x and y are different objects (they have different object
IDs), but they contain the same information: two different strings of
“abc”. Look at their hash values: they are the same. When String
defines #hash, it somehow looks at the character array underneath the
string, and comes up with a number (in this case, 833038373). This is
why two different objects with the same contents have the same hash
value, and if we assume an array of size ten, then we would expect the
string “abc” to be in index number 3. The string “def” has different
contents, and thus a different hash value, and it maps to a different
index.
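
If you run this yourself, expect different numbers: Ruby does not
promise the same #hash values across runs or versions. The
relationships still hold, though. A sketch that checks them without
relying on specific values (the .dup calls are only there to guarantee
separate string objects):

```ruby
x = "abc".dup
y = "abc".dup
z = "def".dup

# x and y are separate objects with equal contents.
different_objects = !x.equal?(y)

# Equal contents give equal hash values, and so the same slot.
same_hash = (x.hash == y.hash)
same_slot = (x.hash % 10 == y.hash % 10)

# z lands somewhere in 0..9. It usually differs from x's slot, but
# with only ten slots two different keys can share one (a collision).
z_slot = z.hash % 10

puts different_objects
puts same_hash
puts same_slot
```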

So by looking at a key, we determine which index the object will map
to, and go see if it is there (though there can be some complications
when things collide).
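
The idea can be sketched as a toy hash table. This is only an
illustration of the scheme described above, not how Ruby's Hash is
actually implemented. ToyHashTable and its methods are hypothetical
names, and collisions are handled by keeping a small list of pairs in
each slot:

```ruby
# A toy hash table with a fixed number of slots (buckets).
class ToyHashTable
  def initialize(size = 10)
    # Each slot holds a list of [key, value] pairs that landed there.
    @buckets = Array.new(size) { [] }
  end

  # Turn the key's hash number into an array index.
  def bucket_index(key)
    key.hash % @buckets.size
  end

  def []=(key, value)
    bucket = @buckets[bucket_index(key)]
    pair = bucket.find { |k, _| k.eql?(key) }
    if pair
      pair[1] = value          # key already present: overwrite
    else
      bucket << [key, value]   # new key: add to this slot
    end
  end

  def [](key)
    pair = @buckets[bucket_index(key)].find { |k, _| k.eql?(key) }
    pair && pair[1]
  end
end

table = ToyHashTable.new
table["abc"] = 1
table["def"] = 2
table["abc"] = 3               # same key again, replaces the 1

value_abc = table["abc"]
value_def = table["def"]
value_missing = table["xyz"]
```

Only the one slot the key maps to is searched, which is why a lookup
stays fast as long as the slots stay short.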

So, generally, if you are interested in simply storing a bunch of
items, or in storing a bunch of items with an ordering, then an array
makes sense. If you are interested in mapping one object to another,
then a hash makes sense. (Note that in 1.9, hashes have ordering also,
but you still can’t say “give me the 8th item”.) If you have ten
strings and you just want to keep them somewhere for later, use an
array. If you have four database records, and want to display them in
order, use an array. If you want to pass a bunch of fields from a form
that was submitted, where the form input will be accessed based on the
name of the field, use a hash.
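
A rough illustration of those cases, with made-up data. Values looked
up by name fit a hash; values kept in sequence fit an array. The last
lines show what “you can’t ask for the 8th item” means in practice:

```ruby
# A handful of items kept in order: an array.
records = ["record 1", "record 2", "record 3", "record 4"]
in_order = records.map { |record| "Showing #{record}" }

# Form fields looked up by name: a hash.
form = { "name" => "Sal", "email" => "sal@example.com", "age" => "30" }
email = form["email"]

# A hash is ordered (1.9+), but [] takes a key, not a position, so
# form[1] looks for the key 1 and finds nothing.
by_integer_key = form[1]

# To get at a position you have to convert to an array first.
second_pair = form.to_a[1]

puts in_order
puts email
p second_pair
```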

These are not hard rules; you need to think about what you are doing,
and which data structure fits your needs. Also, depending on how you
use them, arrays can serve as most other data structures. As you saw,
underneath a hash table is an array; the hash table just defines
different rules for how to access elements. There are lots of examples
like this. As thunk pointed out, using the array methods push and pop
will give you a stack (place an item into the array, and remove it
again, where the first one you put in is the last one you get back out;
think of a stack of plates: newly cleaned plates go on top of the
stack, and you pull the plate you want from the top of the stack).
Using push and shift will give you a queue (same as a stack, except the
first one you put in will be the first one you get out; think of a line
of people waiting for food in a cafeteria). But what I said earlier
should probably give you a fairly good idea which of the two you should
be considering first.
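
The stack and queue behaviour can be sketched with plain arrays (the
plate and person names are just for illustration):

```ruby
# Stack: push adds to the end, pop removes from the end.
stack = []
stack.push("plate 1")
stack.push("plate 2")
stack.push("plate 3")
top = stack.pop                # the last plate added comes off first

# Queue: push adds to the end, shift removes from the front.
queue = []
queue.push("first person")
queue.push("second person")
queue.push("third person")
served = queue.shift           # the first person in line is served first

puts top
puts served
```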

On Wed, Feb 3, 2010 at 6:00 PM, thunk wrote:

I think that the Hash container (may I use that term?)

I think the common terms are “hash” or “hash table”.

On Wed, Feb 3, 2010 at 9:44 PM, Marnen Laibow-Koser wrote:

BTW, camelCase is considered poor style in Ruby; use underscore_case
instead.

I’ve always heard underscore_case called snake_case. TextMate, for
example, calls them camelCase / snake_case / PascalCase
(you can toggle between the three with ^_ as defined in the “Source”
bundle).

jeromeqc · February 4, 2010, 4:44am

thunk wrote:

On Feb 3, 12:27 pm, Jerome David S. wrote:

Hello,

Can someone please explain how you would go about deciding whether to
use a Hash or an Array for a given requirement. Can someone give an
idiot proof simple explanation including any pros versus cons.

Sal

Posted via http://www.ruby-forum.com/.

I think that the Hash container (may I use that term?) can be thought
of primarily as a “Dictionary” - for (fast) random access.

So can an Array. The only difference is that array indices have to be
numeric.

What is
so darned “cool” about Hashes is that at any time the data can be: 1.
an Array, 2 Another Hash and so on… until you can give yourself a
headache.

The same is true of arrays.

Arrays are useful for “Stack” operations.

But that’s not really their primary purpose.

I love the “pop” and “push”

reminds me of registers in a calculator. The Wee library comes with
a “HP” reverse polish “calculator” implemented in 10 lines of code or
so - amazing and so powerful - but not the stuff of most apps except
maybe sometimes inside Hash.

Uh, what? That last phrase appears not to make sense.

That’s when Arrays are useful when the
ORDER is all you need. If you want the last element you can fetch it
and remove it with the thingyAr.pop command. This is a pretty common
situation. (Think keyboard commands and such)

Right, although you’d probably want a more sophisticated Stack object…

Of course we are entirely OO in Ruby so any element can be anything -
a Web component or - oh no!!! a Hash!

Right. Or anything else.

Not sure it is “best practice” and I’m waiting to see how it works out
BUT I have been edging toward fewer containers and using them for more
and more. Many of my methods in my web related work return Hashes now
so I can put names to the data.

If you want to put names to the data, you should be using value objects,
not hashes.

Using an array and trying to remember
position for the same purpose would be really error prone.

Yup. And using a Hash is also error-prone. Define a custom value
object.
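
One lightweight way to get a value object in Ruby is Struct. This is
just a sketch of the idea, not necessarily what the poster had in mind;
Person and its fields are hypothetical. The practical difference shows
up with typos: a misspelled hash key quietly gives nil, while a
misspelled accessor raises an error.

```ruby
# A small value object with named fields.
Person = Struct.new(:name, :email)

person = Person.new("Tom", "tom@example.com")
name = person.name

# The same data as a hash.
person_hash = { :name => "Tom", :email => "tom@example.com" }

# A misspelled key silently returns nil...
typo_in_hash = person_hash[:nmae]

# ...while a misspelled accessor fails loudly.
typo_raised = begin
  person.nmae
  false
rescue NoMethodError
  true
end

puts name
p typo_in_hash
puts typo_raised
```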

Hashes have some really nice features also!

Like this one: myValue = someHash.fetch( :thingyAr, [] )

What’s that about? if the symbol :thingyAr is not found an empty
array is returned - thus you can fall right into your code to do
something with the Array without testing for that annoying “nil”.

Pretty useless, considering that you can use || for the same thing
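
For the common case of a missing key the two do give the same result.
They differ when the key exists but holds nil or false, as this sketch
with a made-up options hash shows:

```ruby
options = { :verbose => false, :tags => nil }

# Missing key: both approaches fall back to the default.
missing_or = options[:missing] || []
missing_fetch = options.fetch(:missing, [])

# Stored false: || replaces it with the default, fetch keeps it.
verbose_or = options[:verbose] || true
verbose_fetch = options.fetch(:verbose, true)

# Stored nil: || replaces it with the default, fetch returns the nil.
tags_or = options[:tags] || []
tags_fetch = options.fetch(:tags, [])

p verbose_or
p verbose_fetch
p tags_or
p tags_fetch
```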

BTW, camelCase is considered poor style in Ruby; use underscore_case
instead.

Best,
--
Marnen Laibow-Koser
http://www.marnen.org
[email protected]

jeromeqc · February 6, 2010, 8:41pm

People may be over-complicating things a bit.

On Wednesday 03 February 2010 12:27:10 pm Jerome David S. wrote:

Can someone please explain how you would go about deciding whether to
use a Hash or an Array for a given requirement.

Is it just a list of values, ordered or not? Put it in an Array.
Example:

friends_list = ['Tom', 'Susie', 'Yosef']

A hash is an associative array, not necessarily sorted. Do you have
things you need to associate?

people_list = {'Tom' => :friend, 'Joe' => :enemy}
people_list['Tom'] # should return :friend
people_list['Joe'] = :friend # guess we made up

These are entirely different patterns. You almost never have a
requirement that would make sense for either.

People are complicating this by the fact that arrays can be indexed by
position. For example:

friends_list[1] # who’s my second friend?

But in high-level languages like Ruby, especially when you’re starting
out, you probably don’t need to be doing that.
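
For instance, most everyday work on a list goes through iteration
rather than positions. A small sketch reusing the friends_list idea
(the greeting text is made up):

```ruby
friends_list = ['Tom', 'Susie', 'Yosef']

# Do something with every friend, no index needed.
greetings = friends_list.map { |friend| "Hello, #{friend}!" }

# Ask a question about the list as a whole.
knows_susie = friends_list.include?('Susie')

puts greetings
puts knows_susie
```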

Can someone give an
idiot proof simple explanation

No. You’re a programmer. If you’re also an idiot, you end up on
thedailywtf.
“Idiot-proof programming” is neither.

jeromeqc · February 7, 2010, 12:01am

On Feb 6, 1:40 pm, David M. wrote:

friends_list = ['Tom', 'Susie', 'Yosef']

I am touched to be included in this example.

jeromeqc · February 6, 2010, 10:58pm

Thank you David M., an explanation and a patronising comment, what
more could one ask of a fellow human on a Saturday night.

jeromeqc · February 7, 2010, 2:13am

On Saturday 06 February 2010 03:58:17 pm Jerome David S. wrote:

Thank you David M., an explanation and a patronising comment, what
more could one ask of a fellow human on a Saturday night.

If by this you mean my “idiot-proof” comment… my point here wasn’t to
accuse you of being an idiot, but rather, that attempts to make this
kind of thing “idiot-proof” don’t work. I’d much rather start by
assuming you’re intelligent.

jeromeqc · February 7, 2010, 2:55pm

I must admit to you that yes, I am an idiot and proud of it.

jeromeqc · February 8, 2010, 9:57am

2010/2/7 Jerome David S.:

I must admit to you that yes, I am an idiot and proud of it.

Wrong group, please repost at alt.soc.confessions.

Cheers

robert