Symbol vs string for hash keys

What are the important factors to consider when deciding whether to
use symbols or strings for hash keys?

I ask b/c I noticed that fileutils.rb uses strings[1], though the keys
represent method names. Seems to me that symbols would be more
appropriate.

Even so it would be good to know the general criteria to consider.

[1]https://github.com/ruby/ruby/blob/trunk/lib/fileutils.rb
(OPT_TABLE)

Maybe so:

if you want to print out the key somewhere, use string as key; if not,
use symbols.

> Even so it would be good to know the general criteria to consider.

Conceptually, I like to use these rules of thumb (I think Jim W.
noted it originally, but I am not sure):

1.) If the content and exact sequence of characters is the important
part, use a string.

2.) If the identity is the important part, use a symbol.

~ jf

John F.
Principal Consultant, BitsBuilder
LI: http://www.linkedin.com/in/johnxf
SO: User John Feminella - Stack Overflow

On Sun, Jul 3, 2011 at 6:33 PM, Chad P. wrote:

> Basically, if it is more important to be able to operate on the names of
> your hash keys, strings are appropriate; if performance is more
> important, or you are sure no such operation on the names will be
> necessary, symbols are appropriate.

And, of course, you can convert to and from symbols/strings. Beware
the spaces, though:

"a space".to_sym
=> :"a space"
:"a space".to_s
=> "a space"
:symbolic_snake.to_s
=> "symbolic_snake"


Phillip G.

twitter.com/phgaw
phgaw.posterous.com

A method of solution is perfect if we can foresee from the start,
and even prove, that following that method we shall attain our aim.
– Leibniz

On Mon, Jul 04, 2011 at 12:28:03AM +0900, Intransition wrote:

> (OPT_TABLE)

Strings are mutable; symbols are not.

There is only one instance of any given literal symbol; there can be many
instances of a given literal string.
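
A minimal sketch of the two points above (mutability and single instance). The method name `compare_identity` and the sample value "name" are made up for illustration; the strings are built with `String.new` so the result does not depend on frozen-string-literal settings.

```ruby
# Hypothetical helper: collects a few observations about strings vs symbols.
def compare_identity
  s1 = String.new("name")
  s2 = String.new("name")

  report = {
    strings_equal_content: s1 == s2,          # same characters
    strings_same_object: s1.equal?(s2),       # but two separate objects
    symbols_same_object: :name.equal?(:name), # one object per symbol
    string_frozen: s1.frozen?,                # strings can be changed in place
    symbol_frozen: :name.frozen?              # symbols cannot
  }

  # Mutating one string leaves the other untouched.
  s1 << "_changed"
  report[:second_string_after_mutation] = s2
  report
end
```

Calling `compare_identity` returns a hash of the observations, so it can be inspected in irb with `p compare_identity`.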

Basically, if it is more important to be able to operate on the names of
your hash keys, strings are appropriate; if performance is more
important, or you are sure no such operation on the names will be
necessary, symbols are appropriate.

It also makes sense to consider the fact that if you’re selecting hash
keys based on some kind of input, strings are easier – because inputs
tend to be strings rather than symbols, and would thus need to be
translated into symbols if you use symbols as your hash keys.
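
To make the input point concrete, here is a small sketch. The two tables and the option names (`verbose`, `force`) are hypothetical and not taken from fileutils.rb; the only thing it shows is that a string and a symbol with the same characters are different hash keys, so string input needs a `to_sym` before it can hit a symbol-keyed hash.

```ruby
# Hypothetical option tables, one keyed by symbols and one by strings.
SYMBOL_KEYED = { verbose: true, force: false }.freeze
STRING_KEYED = { "verbose" => true, "force" => false }.freeze

# Input usually arrives as a String, so it must be converted first.
def lookup_symbol_keyed(input)
  SYMBOL_KEYED[input.to_sym]
end

# With string keys the input can be used as-is.
def lookup_string_keyed(input)
  STRING_KEYED[input]
end

# Using the wrong key type silently misses:
# SYMBOL_KEYED["verbose"] and STRING_KEYED[:verbose] are both nil.
```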

That’s the ugly, hand-wavy, implementation-aware answer, I guess. The
more conceptual answer is that symbols are identities and strings are
data.

I suspect strings are used in the example you provided because it is
expected that when dealing with OPT_TABLE one might operate on a string
in some manner before using it as a hash key. I’m really not sure,
though.

Hi all,

First, thanks to Jeremy for the Ruby Koans link.

Now I’m puzzled, Sensei!

I’m trying to write the code for the cases on lines 12 & 13 (see below)
and I don’t understand why they should be expected to raise exceptions.
They appear to be isosceles triangles and pass the isosceles test.

Am I missing something subtle? Or something obvious?

**Leigh

“Things are not as they appear; nor are they otherwise.”


The Master says:
You have not yet reached enlightenment.
I sense frustration. Do not be afraid to ask for help.

The answers you seek…
TriangleError expected but nothing was raised.

Please meditate on the following code:
/Users/leigh/koans/about_triangle_project_2.rb:12:in
`test_illegal_triangles_throw_exceptions'

def test_illegal_triangles_throw_exceptions
  assert_raise(TriangleError) do triangle(0, 0, 0) end
  assert_raise(TriangleError) do triangle(3, 4, -5) end
  assert_raise(TriangleError) do triangle(1, 1, 3) end # line 12
  assert_raise(TriangleError) do triangle(2, 4, 2) end # line 13
end

These tests all pass, including two from the illegal exceptions test:

def test_isosceles_triangles_have_exactly_two_sides_equal
  assert_equal :isosceles, triangle(3, 4, 4)
  assert_equal :isosceles, triangle(4, 3, 4)
  assert_equal :isosceles, triangle(4, 4, 3)
  assert_equal :isosceles, triangle(10, 10, 2)

  # these two are from test_illegal_triangles_throw_exceptions
  assert_equal :isosceles, triangle(1, 1, 3)
  assert_equal :isosceles, triangle(2, 4, 2)
end

Here’s the triangle method:

def triangle(a, b, c)
  if (a <= 0) || (b <= 0) || (c <= 0)
    raise TriangleError
  end
  case
  when (a == b) && (a == c) && (b == c)
    return :equilateral
  when (b == c) || (a == c) || (a == b)
    return :isosceles
  when !(a == b) && !(a == c) && !(b == c)
    return :scalene
  else
    return :oops!
  end
end

I love the pencil!

Thanks, Bill.

**Leigh

On 2011-07-03 1:32 PM, Leigh D. wrote:

> assert_equal :isosceles, triangle(1, 1, 3) # these two are from
> assert_equal :isosceles, triangle(2, 4, 2) # test_illegal_triangles_throw_exceptions

Are these really triangles? Try to draw them.

– Bill

> Strings are mutable; symbols are not.

And, IIRC, strings are GCed, while syms are not. I still use symbols
whenever it seems to make sense though. But in case of enormous transient
data that might become a problem.

Cheers
Robert

On Jul 3, 2:31pm, Robert D. wrote:

> > Strings are mutable; symbols are not.
>
> And, IIRC, strings are GCed, while syms are not. I still use symbols
> whenever it seems to make sense though.
> But in case of enormous transient data that might become a problem.

Yeah, that’s what I was wondering about. In the case of fileutils, I
don’t think they are ever going to be GCed, or don’t need to be. So I
thought, well, maybe there is an upper limit to the number of symbols? And
so symbols were avoided simply to not add to that count? But that
seems unlikely. Probably this fileutils code was originally written
before 1.9 moved all method lists to symbols, and it’s just stayed
that way.

Hi:

A valid triangle must pass the following test:

  • Sum of any two sides must be greater than third side.
    i.e. (a + b) > c && (c + a) > b && (b + c) > a
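
A minimal sketch of that test as a Ruby predicate. The name `valid_triangle?` is made up here and is not part of the koans. Note that the three strict inequalities together also rule out zero and negative sides, so the koan's (0, 0, 0) and (3, 4, -5) cases fail this check too.

```ruby
# Hypothetical helper: true only if every pair of sides sums to more
# than the remaining side.
def valid_triangle?(a, b, c)
  (a + b) > c && (c + a) > b && (b + c) > a
end
```

For the two puzzling cases, `valid_triangle?(1, 1, 3)` and `valid_triangle?(2, 4, 2)` are both false, while `valid_triangle?(3, 4, 4)` is true.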

btw: the last check, (b == c), is not necessary in the equilateral case,
since (a == b) && (a == c) already implies it:

when (a == b) && (a == c) && (b == c)
  return :equilateral

Regards,
UNShetty

On Jul 4, 12:08pm, Robert K. wrote:

> as keys and not Strings since OPT_TABLE is completely used internally
> only (contains valid options of methods).
>
> Btw, if at all conversion between the two should be employed I’d do it
> as part of input processing. This means, typically, if you have a
> configuration file which may select a few items then a conversion from
> String to Symbol would be part of the process which reads the
> configuration file. Internally I would make the program always work
> with Symbols.

Thanks robert (et al.). Makes perfect sense. If my current patch is
accepted I will make a new one using symbols.

On Sun, Jul 3, 2011 at 5:43 PM, John F. wrote:

> > Even so it would be good to know the general criteria to consider.
>
> Conceptually, I like to use these rules of thumb (I think Jim W.
> noted it originally, but I am not sure):
>
> 1.) If the content and exact sequence of characters is the important
> part, use a string.
>
> 2.) If the identity is the important part, use a symbol.

I use a different rule of thumb which has been circulated before as
well - I can’t really remember who came up with this.

  1. If the set is not fixed (typically because it depends on input
    obtained from somewhere) use String.

  2. If the set of values is fixed (typically because the program
    includes just a fixed number of keys) use Symbols.

According to that standard, FileUtils should really be using Symbols
as keys and not Strings, since OPT_TABLE is used purely internally
(it contains the valid options of methods).

Btw, if conversion between the two is needed at all, I’d do it
as part of input processing. Typically, if you have a
configuration file which may select a few items, then the conversion from
String to Symbol would be part of the process which reads the
configuration file. Internally I would make the program always work
with Symbols.
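
A rough sketch of converting at the input boundary, under a few assumptions: the configuration has already been parsed into a Hash with String keys, and the allowed key names (`verbose`, `force`) and the method name `symbolize_config` are hypothetical. Only keys from the fixed set become symbols, so arbitrary input never turns into new symbols.

```ruby
# Hypothetical fixed set of keys the program knows about.
ALLOWED_KEYS = %w[verbose force].freeze

# Converts a String-keyed config hash into a Symbol-keyed one,
# dropping any key that is not in the fixed set.
def symbolize_config(raw)
  raw.each_with_object({}) do |(key, value), result|
    name = key.to_s
    next unless ALLOWED_KEYS.include?(name)
    result[name.to_sym] = value
  end
end
```

For example, `symbolize_config("verbose" => true, "bogus" => 1)` returns `{ verbose: true }`, and everything past this point in the program can use Symbols only.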

Kind regards

robert

The pencil test is great, and now you understand the reasoning to learn the
generalization, the triangle inequality:

Since 1 + 1 < 3, you know that this can’t be a valid triangle. Likewise,
2 + 2 = 4, which is still not good enough. There are exceptions to this rule,
but those exceptions are in non-Euclidean, non-spherical coordinate
systems. You’re probably not going to encounter triangles in such a
coordinate system unless you’re working in advanced math/physics.

TLDR: Any triangle with sides A, B and C must satisfy this property:

A + B > C (you can rearrange any of the letters and this should be true)
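
One possible way to fold this property into the `triangle` method, offered as a sketch rather than the koan's intended answer. Sorting the sides means only one comparison is needed (the two smallest against the largest). The `TriangleError` definition below is an assumption, since the thread does not show how the koans define it.

```ruby
# Assumed definition; the koans supply their own TriangleError.
class TriangleError < StandardError; end

def triangle(a, b, c)
  smallest, middle, largest = [a, b, c].sort

  # Reject non-positive sides and anything that breaks A + B > C.
  raise TriangleError if smallest <= 0 || smallest + middle <= largest

  # Count distinct side lengths to classify the triangle.
  case [a, b, c].uniq.size
  when 1 then :equilateral
  when 2 then :isosceles
  else :scalene
  end
end
```

With this version, `triangle(1, 1, 3)` and `triangle(2, 4, 2)` raise `TriangleError`, while `triangle(3, 4, 4)` still returns `:isosceles`.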