Hash keys

I ran into something I hadn’t realized (a common occurrence): keying a
hash with a symbol is not the same as keying it with a string. I guess
it makes sense, but I’ve seen it done both ways, and I had always been
using symbols for my keys. Then I hit an issue when loading an external
YAML object into a hash: I didn’t realize it was keyed with strings, so
I got errors the first time around.
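
To make the mismatch concrete, here is a minimal sketch. The YAML text and the key names are made up for illustration; the point is only that plain YAML mappings come back keyed with strings, so a symbol lookup quietly returns nil.

```ruby
require "yaml"

# Hypothetical document, standing in for the external YAML file.
doc = <<YAML_TEXT
name: widget
count: 3
YAML_TEXT

config = YAML.load(doc)

by_string = config["name"]  # the key as YAML produced it
by_symbol = config[:name]   # a different key as far as Hash is concerned

p by_string
p by_symbol
```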

So two questions:

  1. What is the preferred method of keying hashes? Symbols, strings,
    other?
  2. Is there a smooth way to handle hashes that may have been keyed in
    either fashion?

Thanks

On 2/11/08, J. Cooper wrote:

So two questions:

  1. What is the preferred method of keying hashes? Symbols, strings,
    other?

Symbol keys
  + :a is one less keystroke than "a"
  +? performance of a Hash with symbol keys MIGHT be slightly
     faster, but probably insignificant.
  - More symbols get interned, and those can’t get garbage
    collected, even after the hash is.

String keys
  + keys can be GCed when removed from the hash, or when the hash is
    GCed.
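
A small sketch of what "interned" means here. The key names are hypothetical, picked only so that they are unlikely to exist already. As far as I know the "can't be collected" part describes the interpreters of the time; MRI 2.2 and later can collect symbols created at run time once nothing refers to them.

```ruby
# Hypothetical key names, unlikely to be in the symbol table already.
names = (1..100).map { |i| "hash_keys_demo_#{i}" }

# Turning a string into a symbol interns it in the interpreter's table.
symbols = names.map { |name| name.to_sym }

# Converting the same text again yields the very same object.
same_object = names.first.to_sym.equal?(symbols.first)

# While we hold references, every one of them is listed in the table.
all_interned = (symbols - Symbol.all_symbols).empty?

puts same_object
puts all_interned
```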

  2. Is there a smooth way to handle hashes that may have been keyed in
    either fashion?

Rails is probably most responsible for popularizing symbol keys.
ActiveSupport implements a HashWithIndifferentAccess which can use
either symbols or strings interchangeably in access methods.
Internally it uses string keys.
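
To illustrate the idea only: the class below is a hypothetical stand-in written in plain Ruby, not ActiveSupport's implementation. It sketches the approach described above, storing string keys internally and converting symbols on the way in and out.

```ruby
# Hypothetical class for illustration; ActiveSupport's real class does more.
class IndifferentHash
  def initialize(source = {})
    @data = {}
    source.each { |key, value| self[key] = value }
  end

  def [](key)
    @data[normalize(key)]
  end

  def []=(key, value)
    @data[normalize(key)] = value
  end

  def key?(key)
    @data.key?(normalize(key))
  end

  private

  # Symbols become strings; every other kind of key is left alone.
  def normalize(key)
    key.is_a?(Symbol) ? key.to_s : key
  end
end

settings = IndifferentHash.new("host" => "localhost", :port => 8080)
p settings[:host]
p settings["port"]
```

The cost of this design is a conversion on every access, which is probably why it lives in a framework rather than in Hash itself.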


Rick DeNatale

My blog on Ruby
http://talklikeaduck.denhaven2.com/

On Feb 11, 2008 8:32 PM, J. Cooper wrote:

So two questions:

  1. What is the preferred method of keying hashes? Symbols, strings,
    other?

Symbols; look at this code:
----------------- 8< -----------------
# Symbol#to_proc is built in from Ruby 1.9 on; define it only on older Rubies.
Symbol.send(:define_method, :to_proc) {
  lambda { |x| x.send self }
} if RUBY_VERSION < "1.9"

string_keys = %w{a b c}
symbol_keys = string_keys.map(&:to_sym)

string_hash = Hash[ *string_keys.zip([42] * 3).to_a.flatten ]
symbol_hash = Hash[ *symbol_keys.zip([42] * 3).to_a.flatten ]

p [:symbol, symbol_hash]
p [:string, string_hash]
puts "So far everything looks fine"
# Raises: a Hash freezes its String keys, so they cannot be changed in place.
string_hash.each_pair { |k, v| k << "..." }
------------------------ 8< ----------------------
Ruby does a fine job by freezing the String keys of hashes, but for that
very reason I prefer to use immutable objects as keys whenever possible.
In our case that favors Symbols.
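
A minimal sketch of that freezing behaviour, with made-up variable names. It assumes the key is an unfrozen String when it is stored; in that case the hash keeps its own frozen copy, so later changes to the original string do not disturb the lookup.

```ruby
key = String.new("name")  # explicitly unfrozen, whatever the literal settings
hash = {}
hash[key] = 42

stored = hash.keys.first   # the copy the hash actually holds

key << "_changed"          # mutating the original leaves the hash alone
still_there = hash["name"]

raised = false
begin
  stored << "..."          # mutating the stored key is refused
rescue StandardError
  raised = true            # the exception class varies with the Ruby version
end

p stored.frozen?, key.frozen?, still_there, raised
```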

  2. Is there a smooth way to handle hashes that may have been keyed in
    either fashion?

Handle them? If I pretended to be even more stupid than I actually
believe I am, I would say yes, sure:
a_hash.clear ;)

But I guess that you want to change from one to the other, let me show
you from String to Symbol
------------------------ 8< ----------------------

# Do not do this at home :)
# On Rubies before 1.9, emulate the blockless each_with_index by returning
# an array of [element, index] pairs.
Array.send :define_method, :each_with_index do
  count = -1
  inject([]) { |iwi, e| iwi << [e, count += 1] }
end if RUBY_VERSION < "1.9"

string_hash = %w{A Brave New World}.each_with_index.inject({}) { |h, (v, i)| h.update v => i }
p string_hash

# Rebuild the hash with every key converted from String to Symbol.
symbol_hash = Hash[ *string_hash.to_a.map { |k, v| [k.to_sym, v] }.flatten ]
p symbol_hash
------------------------ 8< ----------------------
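
For readers on a current Ruby, the same conversion can be sketched without any monkey patching. Hash#transform_keys needs Ruby 2.5 or later; the each_with_object form should work from 1.9 on. The sample data is made up.

```ruby
string_hash = { "A" => 0, "Brave" => 1, "New" => 2, "World" => 3 }

# Ruby 2.5+: convert every key in one call.
symbol_hash = string_hash.transform_keys(&:to_sym)

# Older Rubies: build the new hash by hand.
portable = string_hash.each_with_object({}) { |(k, v), h| h[k.to_sym] = v }

# And back again, from Symbol keys to String keys.
round_trip = symbol_hash.transform_keys(&:to_s)

p symbol_hash
p round_trip
```

Keep in mind that converting keys read from an untrusted source into symbols interns every one of them.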
HTH
Robert


http://ruby-smalltalk.blogspot.com/


Whereof one cannot speak, thereof one must be silent.
Ludwig Wittgenstein

On 11.02.2008 20:32, J. Cooper wrote:

So two questions:

  1. What is the preferred method of keying hashes? Symbols, strings,
    other?

It depends: my personal convention is this: use symbols if the set of
keys is limited and probably known beforehand; use strings if the data
is read from an external resource (e.g. a file) and there could be
arbitrary key values.
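
A rough sketch of that convention; the constant name and the log excerpt are hypothetical. The fixed set of states uses symbols, while the keys taken from external text stay the strings that were read.

```ruby
# Small, known set of keys: symbols.
STATE_LABELS = { :open => "Open", :closed => "Closed" }

# Arbitrary keys read from outside (a made-up log excerpt): strings.
log = "GET /index.html\nGET /about.html\nGET /index.html\n"

hits = Hash.new(0)
log.each_line { |line| hits[line.split[1]] += 1 }

p STATE_LABELS[:open]
p hits
```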

  2. Is there a smooth way to handle hashes that may have been keyed in
    either fashion?

I do not think there is a smooth one-size-fits-all way. You could of
course convert a Hash containing one set of keys to the other one. I
don’t think it is worthwhile though and haven’t seen it so far.

Kind regards

robert

Alright, so in general if the hash is going to interact with the outside
world, I should use string keys, and it’s not worth it particularly to
worry about handling mismatch (unless I’m embarking on a Rails-sized
framework)?

I guess I had figured symbols made more sense, as a key is kinda just an
identifier and there isn’t a reason to perform string functions on it.
But I didn’t realize the deal with the GC

On Feb 11, 2008 9:51 PM, Rick DeNatale wrote:

other?

Symbol keys
  + :a is one less keystroke than "a"
  +? performance of a Hash with symbol keys MIGHT be slightly
     faster, but probably insignificant.
  - More symbols get interned, and those can’t get garbage
    collected, even after the hash is.

Hmm, very interesting, but is this not rather an implementation choice?
That does not make the information less valuable of course; I am just
curious.

Cheers
Robert

Rick DeNatale wrote:

  1. What is the preferred method of keying hashes? Symbols, strings,

String keys
  + keys can be GCed when removed from hash, or when the hash is GCed.

Another thing to look at is symbols are unique, whereas each time you
use a string literal to access the hash, you are creating a new object:

h1 = { :a => 1, :b => 2 }
h2 = { "a" => 1, "b" => 2 }

h2["a"] # Creates a one-time-use string "a"
h1[:a]  # No new object created, :a already exists

But if you are creating lots and lots of symbols, then that’s lots of
unique objects being created which are not going to be garbage
collected.
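
The difference can be checked with equal?, which compares object identity. This is only a sketch: it builds the strings with String.new so the result does not depend on how literals are treated (as far as I know, the frozen_string_literal magic comment in Ruby 2.3 and later lets identical literals share one frozen object).

```ruby
# The same symbol is always the same object.
symbols_identical = :a.equal?(:a)

# Two strings with the same content are separate objects...
first  = String.new("a")
second = String.new("a")
strings_identical = first.equal?(second)

# ...but they are eql? and hash alike, which is what a Hash lookup uses.
strings_match_as_keys = first.eql?(second) && first.hash == second.hash

p symbols_identical, strings_identical, strings_match_as_keys
```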

Of course, strings and symbols are not the only choices for hash keys,
any object can be used. What you want may depend on the circumstances.
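
A short sketch of other kinds of keys; Point is a hypothetical struct. Structs and arrays compare by content, so an equal key finds the entry, while a plain Object only matches itself.

```ruby
Point = Struct.new(:x, :y)

labels = {}
labels[Point.new(1, 2)] = "start"
labels[[3, 4]] = "array key"

# A new but equal Point (or Array) finds the same entry.
by_struct = labels[Point.new(1, 2)]
by_array  = labels[[3, 4]]

# A bare Object is only equal to itself, so a fresh one finds nothing.
marker = Object.new
labels[marker] = "marker"
by_other_object = labels[Object.new]

p by_struct, by_array, by_other_object
```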

-Justin

On 11.02.2008 22:58, J. Cooper wrote:

Alright, so in general if the hash is going to interact with the outside
world, I should use string keys, and it’s not worth it particularly to
worry about handling mismatch (unless I’m embarking on a Rails-sized
framework)?

I am not sure what you mean by “handling mismatch”. If by mismatch you
mean access with symbols and strings: I usually do not worry about this,
because I write the code that puts data into the Hash and reads it - so
I know what happens or can control. Personally I prefer uniform access.

If by “outside world” you mean, a “data source that you do not control”
(e.g. web server logfiles, CSV data) then yes, in those cases I would
use Strings, namely the strings I read from that source.

I guess I had figured symbols made more sense, as a key is kinda just an
identifier and there isn’t a reason to perform string functions on it.
But I didn’t realize the deal with the GC

Well, if there is a limited set (e.g. states of an object like :open and
:closed for an IO stream) then it makes perfect sense to use symbols.

Kind regards

robert

On Feb 11, 2:27 pm, Robert K. wrote:

I do not think there is a smooth one-size-fits-all way. You could of
course convert a Hash containing one set of keys to the other one. I
don’t think it is worthwhile though and haven’t seen it so far.

http://api.rubyonrails.org/classes/HashWithIndifferentAccess.html

http://facets.rubyforge.org/rdoc/core/classes/Hash.html#M000070

(Neither is a refutation of your statements; I am just throwing a few
data points into a discussion that I’m too busy to formally join at the
moment.)