Forum: Ruby Hashes versus Arrays

Posted by Jerome David Sallinger (jdsallinger)
on 2010-02-03 19:27
Hello,

Can someone please explain how you would go about deciding whether to
use a Hash or an Array for a given requirement? Can someone give an
idiot proof simple explanation including any pros versus cons?

Sal
Posted by Iñaki Baz Castillo (Guest)
on 2010-02-03 19:45
(Received via mailing list)
On Wednesday, 3 February 2010, Jerome David Sallinger wrote:
> Hello,
> 
> Can someone please explain how you would go about deciding whether to
> use a Hash or an Array for a given requirement. Can someone give an
> idiot proof simple explanation including any pros versus cons.

An array is indexed by an integer (myarray[0]), while a hash is indexed
by any Ruby object (myhash["drinks"]).

In Ruby 1.9, hashes are also ordered by insertion (in Ruby 1.8 they were
not, which was an important difference between a hash and an array).

If you need to access the entries by a numeric index, an Array is the
right choice. If not, Hashes are great.
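
A minimal sketch of the difference, assuming Ruby 1.9 or later for the ordering part (the variable names and values are made up for illustration):

```ruby
# An array is indexed by integer position.
my_array = ["coffee", "tea", "water"]
first = my_array[0]            # "coffee"

# A hash is indexed by any object used as a key.
my_hash = { "drinks" => ["coffee", "tea"], :count => 2 }
drinks = my_hash["drinks"]     # ["coffee", "tea"]

# From Ruby 1.9 on, a hash iterates in insertion order.
keys_in_order = my_hash.keys   # ["drinks", :count]
```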
Posted by thunk (Guest)
on 2010-02-04 01:00
(Received via mailing list)
On Feb 3, 12:27 pm, Jerome David Sallinger <imran.na...@yahoo.co.uk>
wrote:
> Hello,
>
> Can someone please explain how you would go about deciding whether to
> use a Hash or an Array for a given requirement. Can someone give an
> idiot proof simple explanation including any pros versus cons.
>
> Sal
> --
> Posted via http://www.ruby-forum.com/.

I think that the Hash container (may I use that term?) can be thought
of primarily as a "Dictionary" for (fast) random access. What is so
darned "cool" about Hashes is that at any time the data can be: 1. an
Array, 2. another Hash, and so on... until you give yourself a
headache.

Arrays are useful for "Stack" operations. I love the "pop" and "push";
they remind me of registers in a calculator. The Wee library comes with
an "HP" reverse Polish "calculator" implemented in 10 lines of code or
so: amazing and so powerful, but not the stuff of most apps except
maybe sometimes inside a Hash. Arrays are useful when the ORDER is all
you need. If you want the last element, you can fetch it and remove it
with thingyAr.pop. This is a pretty common situation. (Think keyboard
commands and such.)

Of course we are entirely OO in Ruby so any element can be anything -
a Web component or - oh no!!!  a Hash!

Not sure it is "best practice" and I'm waiting to see how it works out
BUT I have been edging toward fewer containers and using them for more
and more.  Many of my methods in my web related work return Hashes now
so I can put names to the data.  Using an array and trying to remember
position for the same purpose would be really error prone.

Hashes have some really nice features also!

Like this one:    myValue = someHash.fetch( :thingyAr, [] )

What's that about? If the symbol :thingyAr is not found, an empty
array is returned, so you can fall right into your code and do
something with the Array without testing for that annoying "nil".
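
A short sketch of how Hash#fetch with a default behaves next to a plain [] lookup. The hash contents here are invented for illustration, and the names use snake_case:

```ruby
some_hash = { :name => "thunk" }

# fetch returns the stored value when the key exists...
name = some_hash.fetch(:name, "unknown")

# ...and the supplied default when it does not.
thingy_ar = some_hash.fetch(:thingy_ar, [])
thingy_ar.each { |item| puts item }  # safe: iterates zero times

# Plain [] returns nil for a missing key instead.
missing = some_hash[:thingy_ar]

# Note that fetch does not store the default in the hash.
still_absent = !some_hash.key?(:thingy_ar)
```

Called without a default or block, fetch raises an error for a missing key, which can be handy when a missing key is a bug.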
Posted by Marnen Laibow-Koser (marnen)
on 2010-02-04 04:44
thunk wrote:
> On Feb 3, 12:27 pm, Jerome David Sallinger <imran.na...@yahoo.co.uk>
> wrote:
>> Hello,
>>
>> Can someone please explain how you would go about deciding whether to
>> use a Hash or an Array for a given requirement. Can someone give an
>> idiot proof simple explanation including any pros versus cons.
>>
>> Sal
>> --
>> Posted via http://www.ruby-forum.com/.
> 
> I think that the Hash container (may I use that term?) can be thought
> of primarily as a "Dictionary" -  for (fast) random access.  

So can an Array.  The only difference is that array indices have to be 
numeric.

> What is
> so darned "cool" about Hashes is that at any time the data can be: 1.
> an Array, 2 Another Hash and so on.... until you can give yourself a
> headache.

The same is true of arrays.

> 
> Array are useful for "Stack" operations.  

But that's not really their primary purpose.

> I love the "pop" and "push"
> - reminds me of registers in a calculator.  The Wee library comes with
> a "HP" reverse polish "calculator" implemented in 10 lines of code or
> so - amazing and so powerful - but not the stuff of most apps except
> maybe sometimes inside Hash.

Uh, what?  That last phrase appears not to make sense.

>  That's when Arrays are useful when the
> ORDER is all you need.  If you want the last element you can fetch it
> and remove it with the thingyAr.pop command.  This is a pretty common
> situation.  (Think keyboard commands and such)

Right, although you'd probably want a more sophisticated Stack object...

> 
> Of course we are entirely OO in Ruby so any element can be anything -
> a Web component or - oh no!!!  a Hash!

Right.  Or anything else.

> 
> Not sure it is "best practice" and I'm waiting to see how it works out
> BUT I have been edging toward fewer containers and using them for more
> and more.  Many of my methods in my web related work return Hashes now
> so I can put names to the data.  

If you want to put names to the data, you should be using value objects, 
not hashes.

> Using an array and trying to remember
> position for the same purpose would be really error prone.
> 

Yup.  And using a Hash is also error-prone.  Define a custom value 
object.
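
One way to read the "value object" suggestion is a Struct. This is only a sketch; the Page name and its fields are hypothetical and not from the thread:

```ruby
# Hypothetical value object with named fields.
Page = Struct.new(:title, :body)

page = Page.new("Home", "Welcome")
title = page.title      # "Home"

# A misspelled field fails loudly...
error_class = nil
begin
  page.titel
rescue NoMethodError => e
  error_class = e.class
end

# ...whereas a hash lookup with a misspelled key quietly returns nil.
as_hash = { :title => "Home" }
typo = as_hash[:titel]
```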

> Hashes have some really nice features also!
> 
> Like this one:    myValue = someHash.fetch( :thingyAr, [] )
> 
> What's that about?    if the symbol :thingyAr is not found an empty
> array is returned - thus you can fall right into your code to do
> something with the Array without testing for that annoying "nil".

Pretty useless, considering that you can use || for the same thing.
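
For reference, a sketch of the || form next to fetch (the hash contents are made up). The two agree for a missing key and differ only when the key exists but holds nil or false:

```ruby
some_hash = { :flag => false }

# The || form falls back whenever the lookup is nil or false.
list = some_hash[:thingy_ar] || []

# For a missing key, fetch with a default gives the same result.
same = some_hash.fetch(:thingy_ar, [])

# They differ when the key exists but holds false (or nil).
with_or    = some_hash[:flag] || true
with_fetch = some_hash.fetch(:flag, true)
```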

BTW, camelCase is considered poor style in Ruby; use underscore_case 
instead.

Best,
-- 
Marnen Laibow-Koser
http://www.marnen.org
marnen@marnen.org
Posted by Josh Cheek (Guest)
on 2010-02-04 06:57
(Received via mailing list)
On Wed, Feb 3, 2010 at 12:27 PM, Jerome David Sallinger <
imran.nazir@yahoo.co.uk> wrote:

> Can someone please explain how you would go about deciding whether to
> use a Hash or an Array for a given requirement. Can someone give an
> idiot proof simple explanation including any pros versus cons.
>

Understanding what happens under the covers will help you know what is
right for your situation, so first a quick explanation of arrays and
hashes.

An array is a section of consecutive memory (Ruby's Array class is more
abstract, but generally you can expect something like this). Since the
array doesn't actually store the object, but instead stores a reference
to the object, and all references are the same size, the array can
always know where the nth item is (for some number n). It also means
that the items are inherently ordered, i.e. the first element you put in
will be at index zero, the second will be at index 1, and so on. If you
want to see that order again, you look at index 0, then index 1, and so
on. So if you want order, then an array makes sense. If you want the
sixth item, then you know right where it is: it is at index 5 (not 6,
because we start counting at zero). And we know where index 5 is,
because the references to the objects have a fixed width. So it's in
memory location:
location_of_array + size_of_reference * desired_index, and bam, we are
there. So it is very efficient.
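
To make the arithmetic concrete, here it is with invented numbers. The start address and reference size are assumptions for illustration, not values Ruby exposes:

```ruby
# Hypothetical numbers, only to make the arithmetic concrete.
location_of_array = 1000  # assumed start address
size_of_reference = 8     # assumed bytes per reference
desired_index     = 5     # the sixth item

# One multiplication and one addition, no matter how long the array is.
location = location_of_array + size_of_reference * desired_index
```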

But what happens if you don't know where the item you are interested in
is located? Well, then you have to search through the whole array (there
are also faster searching algorithms, if you can guarantee certain
properties about the array, such as it being sorted).
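
A sketch of the three cases: known position, unknown position, and a sorted array. The names are borrowed from later in the thread, and Array#bsearch assumes Ruby 2.0 or later:

```ruby
names = ["Tom", "Susie", "Yosef"]

# Known position: direct access.
second = names[1]

# Unknown position: index scans from the front until it finds a match.
position = names.index("Yosef")
absent   = names.index("Joe")     # nil, after checking every element

# If the array is sorted, a binary search avoids scanning everything.
sorted = names.sort
found  = sorted.bsearch { |n| n >= "Tom" }
```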

However, sometimes you are not so interested in the order things went
in as you are in relating two pieces of information together.

Enter the hash table. A hash table has "keys" and "values"; it uses the
key in the same way that an array uses an index, to retrieve the value.
Underneath the hash table is an array, and every key is mapped to an
array index. To do this, it looks at the key object's contents and comes
up with a number (the algorithm used will depend on the object, and how
its creator decided to implement it). That number is translated into an
index, which is where the entry is likely to be within the array. The
key implies an index, but within the array there is no ordering of the
contents.

You figure out what index it should be located at by deriving some
number from the object, then translating that number onto the array
holding the items. Perhaps the array has ten indexes and six items in
it. When it is asked to look for the index containing some object, it
figures out that object's hash number and translates it to an array
index (probably mods it by ten).

You can see these hash numbers by calling #hash on the objects. Here is
an example:

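# The exact numbers can vary between Ruby versions and between runs.
x = "abc"
y = "abc"
z = "def"

x.object_id # => 75900
y.object_id # => 75890  (a different object from x)

x.hash # => 833038373
y.hash # => 833038373  (same contents, same hash)

z.hash # => 858800354

x.hash % 10 # => 3
y.hash % 10 # => 3
z.hash % 10 # => 4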

Notice that x and y are different objects (they have different object
IDs), but they contain the same information: two different strings of
"abc". Look at their hash values; they are the same. When String defines
#hash, it somehow looks at the character array underneath the string and
comes up with a number (in this case, 833038373). This is why two
different objects with the same contents have the same hash value, and
if we assume an array of size ten, then we would expect the string "abc"
to be at index 3. The string "def" has different contents, and thus a
different hash value, and it maps to a different index.

So by looking at a key, we determine which index the object will map to,
and go see if it is there (though there can be some complications when
things collide).
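
The idea can be sketched as a toy hash table. This is illustrative only: ToyHashTable is a made-up class, Ruby's real Hash is implemented differently, and here collisions are handled by keeping a small list of pairs in each slot:

```ruby
# Toy hash table with a fixed number of buckets. Illustrative only:
# Ruby's real Hash is implemented differently and resizes itself.
class ToyHashTable
  def initialize(size = 10)
    @buckets = Array.new(size) { [] }  # each bucket holds [key, value] pairs
  end

  def []=(key, value)
    bucket = bucket_for(key)
    pair = bucket.find { |k, _| k.eql?(key) }
    if pair
      pair[1] = value          # key already present: overwrite
    else
      bucket << [key, value]   # new key (possibly a collision): append
    end
  end

  def [](key)
    pair = bucket_for(key).find { |k, _| k.eql?(key) }
    pair && pair[1]
  end

  private

  # Map the key's hash number onto a bucket index.
  def bucket_for(key)
    @buckets[key.hash % @buckets.size]
  end
end

table = ToyHashTable.new
table["abc"] = 1
table["def"] = 2
table["abc"] = 3   # a different String object with equal contents
```

Looking up "abc" afterwards gives 3, because both "abc" strings hash to the same number and compare equal, so the second assignment overwrote the first.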


So, generally, if you are interested simply in storing a bunch of items,
or in storing a bunch of items with an ordering, then an array makes
sense. If you are interested in mapping one object to another, then a
hash makes sense (note that in 1.9 hashes have ordering also, but you
still can't say "give me the 8th item"). If you have ten strings and
just want to keep them somewhere for later, use an array. If you have
four database records and want to display them in order, use an array.
If you want to pass a bunch of fields from a form that was submitted,
where the form input will be accessed based on the name of the field,
use a hash.

These are not hard rules; you need to think about what you are doing
and which data structure fits your needs. Also, depending on what you
do, arrays can ultimately be most other data structures. As you saw,
underneath it a hash table holds an array; it just defines different
rules for how to access elements. There are lots of examples like this.
As thunk pointed out, using the array methods push and pop will give you
a stack (place an item into the array and remove it again, where the
first one you put in is the last one you get back out; think of a stack
of plates: they put newly cleaned plates on top of the stack, and you
pull the plate you want from the top of the stack). Using push and shift
will give you a queue (same as a stack, except the first one you put in
will be the first one you get out; think of a line of people waiting for
food in a cafeteria). But what I said earlier should probably give you a
fairly good idea which of the two you should be considering first.
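
A quick sketch of both uses of a plain Array (the symbols are placeholders):

```ruby
# Stack: push and pop work on the same end, so last in is first out.
stack = []
stack.push(:plate1)
stack.push(:plate2)
top = stack.pop        # :plate2

# Queue: push adds at the back, shift removes from the front,
# so first in is first out.
queue = []
queue.push(:alice)
queue.push(:bob)
front = queue.shift    # :alice
```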

On Wed, Feb 3, 2010 at 6:00 PM, thunk <gmkoller@gmail.com> wrote:

> I think that the Hash container (may I use that term?)


I think the common terms are "hash" or "hash table".

On Wed, Feb 3, 2010 at 9:44 PM, Marnen Laibow-Koser
<marnen@marnen.org> wrote:

> BTW, camelCase is considered poor style in Ruby; use underscore_case
> instead.
>

I've always heard underscore_case called snake_case. TextMate, for
example, calls them camelCase / snake_case / PascalCase
(you can toggle between the three with ^_ as defined in the "Source"
bundle).
Posted by David Masover (Guest)
on 2010-02-06 20:41
(Received via mailing list)
People may be over-complicating things a bit.

On Wednesday 03 February 2010 12:27:10 pm Jerome David Sallinger wrote:
> Can someone please explain how you would go about deciding whether to
> use a Hash or an Array for a given requirement.

Is it just a list of values, ordered or not? Put it in an Array.
Example:

friends_list = ['Tom', 'Susie', 'Yosef']

A hash is an associative array, not necessarily sorted. Do you have
things you need to associate?

people_list = {'Tom' => :friend, 'Joe' => :enemy}
people_list['Tom']            # should return :friend
people_list['Joe'] = :friend  #guess we made up

These are entirely different patterns. You almost never have a
requirement that would make sense for either.

People are complicating this by the fact that arrays can be indexed by
position. For example:

friends_list[1]   # who's my second friend?

But in high-level languages like Ruby, especially when you're starting
out, you probably don't need to be doing that.
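
For instance, most everyday work on these two containers never mentions a position. A sketch reusing the examples above, assuming Ruby 1.9 or later (where Hash#select returns a Hash):

```ruby
friends_list = ['Tom', 'Susie', 'Yosef']

# Walk the list without touching positions.
greetings = friends_list.map { |friend| "Hi, #{friend}!" }

# Ask about membership rather than position.
knows_tom = friends_list.include?('Tom')

people_list = { 'Tom' => :friend, 'Joe' => :enemy }

# Pick out entries by what they are associated with.
friends_only = people_list.select { |_name, status| status == :friend }.keys
```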

> Can someone give an
> idiot proof simple explanation

No. You're a programmer. If you're also an idiot, you end up on
thedailywtf. "Idiot-proof programming" is neither.
Posted by Jerome David Sallinger (jdsallinger)
on 2010-02-06 22:58
Thank you David Masover, an explanation and a patronising comment, what
more could one ask of a fellow human on a Saturday night.
Posted by Yossef Mendelssohn (Guest)
on 2010-02-07 00:01
(Received via mailing list)
On Feb 6, 1:40 pm, David Masover <ni...@slaphack.com> wrote:
> friends_list = ['Tom', 'Susie', 'Yosef']

I am touched to be included in this example.
Posted by David Masover (Guest)
on 2010-02-07 02:13
(Received via mailing list)
On Saturday 06 February 2010 03:58:17 pm Jerome David Sallinger wrote:
> Thank you David Masover, an explanation and a patronising comment, what
> more could one ask of a fellow human on a Saturday night.

If by this you mean my "idiot-proof" comment... my point here wasn't to
accuse you of being an idiot, but rather, that attempts to make this
kind of thing "idiot-proof" don't work. I'd much rather start by
assuming you're intelligent.
Posted by Jerome David Sallinger (jdsallinger)
on 2010-02-07 14:55
I must admit to you that yes I am an idiot and proud of it ;)
Posted by Robert Klemme (Guest)
on 2010-02-08 09:57
(Received via mailing list)
2010/2/7 Jerome David Sallinger <imran.nazir@yahoo.co.uk>:
> I must admit to you that yes I am an idiot and proud of it ;)

Wrong group, please repost at alt.soc.confessions. ;-)

Cheers

robert