Symbols garbage collector in Ruby 1.9, fixed?

On Thu, Apr 2, 2009 at 12:17 AM, Rob B. [email protected] wrote:

but then you might have to “worry” about having both :to and ‘to’ as keys.

Symbols are only faster because they are immutable and don’t get garbage
collected. But I’d go with Tony and just use String all the time.

Actually, I’m pretty sure that Symbols are faster as hash keys because
Symbol#== is O(1) while String#== is O(n), where n is the length of the
string.
That said, the HashWithIndifferentAccess class in activesupport allows
either strings or symbols to be used interchangeably as the key argument
in methods like [] and []=, but it always USES the string form as the key.

I was quite surprised when I discovered this, since I’d assumed that the
reason for using symbols was for the speed advantage, but it was finally
pointed out to me the problem of “memory leaks” when arbitrary keys get
interned as symbols.
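
To make the “memory leak” concern concrete, here is a minimal sketch of how
interning arbitrary names grows the symbol table. The header names are
made up for illustration. Note that Ruby 2.2 and later can garbage-collect
symbols created dynamically with to_sym once nothing references them; on
1.8 and 1.9 they stay for the life of the process.

```ruby
# Hypothetical header names, as if parsed from untrusted input.
names = (0...1000).map { |i| "x-custom-header-#{i}" }

before  = Symbol.all_symbols.size
symbols = names.map(&:to_sym)      # each distinct string interns a symbol
after   = Symbol.all_symbols.size

# The exact difference depends on the Ruby version and on what else
# has been interned in the meantime, so it is only printed here.
puts "symbol table grew by #{after - before} entries"
```

Symbols that appear literally in the source (like :id) are interned once
when the code is loaded, so they do not grow with the input.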

But I do in general prefer the look in source code of

:id => 3

rather than

'id' => 3

And when the symbols come in the source like this there’s less chance of
arbitrary growth of interned symbols.


Rick DeNatale

Blog: http://talklikeaduck.denhaven2.com/
Twitter: http://twitter.com/RickDeNatale
WWR: http://www.workingwithrails.com/person/9021-rick-denatale
LinkedIn: http://www.linkedin.com/in/rickdenatale

Iñaki Baz C. wrote:

headers = { :from => "alice@qweeq", :to => "bob@qweqwe" }

I’d figure out which headers are very common and make them frozen
constants, like:

FROM = "From".freeze
TO = "To".freeze

and put references to those string “constants” as keys into the Hash. I
assume that this will be as fast as symbols when accessing the hash with
those constants, as equality testing just needs to test for object
identity (object_id) and not for the equality of the content.

headers = {}
headers[FROM] = "alice@qweeq"
headers[TO] = "bob@qweqwe"

p headers[TO]
p headers["To"] # works as well, but should be slower

Would you like to benchmark this against using symbols?
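
One possible way to run that comparison, as a rough sketch using the
standard Benchmark library. The iteration count N and the report labels
are arbitrary choices, and the timings will vary with the Ruby version
and the machine, so no numbers are claimed here.

```ruby
require 'benchmark'

FROM = 'From'.freeze
N = 200_000 # arbitrary iteration count, chosen only for illustration

by_symbol = { :from => 'alice@qweeq' }
by_string = { FROM => 'alice@qweeq' }

Benchmark.bm(16) do |x|
  # Lookup with a symbol key.
  x.report('symbol key')      { N.times { by_symbol[:from] } }
  # Lookup with the very same frozen String object used as the key.
  x.report('frozen constant') { N.times { by_string[FROM] } }
  # Lookup with an equal String written as a literal in the loop.
  x.report('string literal')  { N.times { by_string['From'] } }
end
```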

Btw, this is the approach that for example Mongrel uses.

Regards,

Michael

2009/4/2 Rick DeNatale [email protected]:

Actually, I’m pretty sure that Symbols are faster as hash keys because
Symbol#== is O(1) while String#== is O(n), where n is the length of the string.
That said, the HashWithIndifferentAccess class in activesupport allows
either strings or symbols to be used interchangeably as the key argument in
methods like [] and []=, but it always USES the string form as the key.

Oh, then it’s better just to use strings. I don’t need to support
strings and symbols at the same time, I just need to make a decision.

Thanks for pointing it out.

On Apr 1, 2009, at 8:03 PM, Iñaki Baz C. wrote:

server will be accessing some headers to read their content. But since
it’s just at a very early stage I cannot be sure of it.

Thanks.

Iñaki Baz C. [email protected]

Just key the hash with Strings:
headers = { 'from' => "alice@qweeq", 'to' => "bob@qweeq" }

If you really need to use symbols, perhaps add methods to a subclass
of Hash like the HashWithIndifferentAccess from Rails which mostly
eliminates the need to care whether you actually stored against a
Symbol or a String key.
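
A minimal sketch of that idea follows. StringKeyHash is a hypothetical
name, and this is not the ActiveSupport implementation: it only covers
[], []= and key?, and always stores the String form of the key.

```ruby
# Hypothetical Hash subclass that accepts Symbol or String keys
# but always stores the String form.
class StringKeyHash < Hash
  def [](key)
    super(convert(key))
  end

  def []=(key, value)
    super(convert(key), value)
  end

  def key?(key)
    super(convert(key))
  end

  private

  # Symbols become Strings; every other key is passed through unchanged.
  def convert(key)
    key.is_a?(Symbol) ? key.to_s : key
  end
end

headers = StringKeyHash.new
headers[:from] = 'alice@qweeq'
headers['to']  = 'bob@qweeq'

p headers['from']  # "alice@qweeq"
p headers[:to]     # "bob@qweeq"
p headers.keys     # ["from", "to"]
```

Other methods such as fetch, delete or merge are left untouched in this
sketch and would still treat :to and 'to' as different keys.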

There’s also nothing stopping you from having both kinds of keys at
once:
headers = { :from => "alice@qweeq", :to => "bob@qweeq", 'snack' => "raisins" }

but then you might have to “worry” about having both :to and ‘to’ as
keys.

Symbols are only faster because they are immutable and don’t get
garbage collected. But I’d go with Tony and just use String all the
time.

-Rob

Rob B. http://agileconsultingllc.com
[email protected]

2009/4/2 Michael N. [email protected]:

 headers = {}
 headers[FROM] = "alice@qweeq"
 headers[TO] = "bob@qweqwe"

Very interesting solution, but I would have some issues with it:

a) I receive a request with various headers; most of them are well
known, but others can be custom.
When I extract the header name (after parsing) I get “From” and
“Custom-Header” strings, and I need to check whether these strings belong
to well-known headers before storing them as FROM and
“Custom-Header”. Wouldn’t this check be inefficient?

b) Some well-known headers have a name like “Record-Route”. The “-”
character is of course disallowed in a Ruby constant name. With Symbols
I can write it as :"record-route".

Well, I have to think about it. Thanks a lot for all the received help.
Regards.

Btw, this is the approach that for example Mongrel uses.

Then I must investigate how it handles case b).

2009/4/2 Michael N. [email protected]:

  headers[ KNOWN_HEADERS[key] || key ] = value
 end

This seems a wonderful solution :)

Thanks a lot.

Iñaki Baz C. wrote:

identity (object_id) and not for the equality of the content.
“Custom-Header” strings, and I need to check whether these strings belong
to well-known headers before storing them as FROM and
“Custom-Header”. Wouldn’t this check be inefficient?

No!

time ruby -e "s,t='a'*100,'a'*100;1_000_000.times{s==t}"
0.567u 0.000s 0:00.58 96.5% 5+1563k 0+0io 0pf+0w

Comparing two 100-character strings 1 million times takes about half a
second in the worst case (of which about half is just method-call
overhead!).

If you take more reasonable sized strings (15 characters):

time ruby -e "s,t='a'*15,'a'*15;1_000_000.times{s==t}"
0.326u 0.007s 0:00.33 96.9% 5+1588k 0+0io 0pf+0w

Compared against object id comparison (notice “s == s”):

time ruby -e "s,t='a'*15,'a'*15;1_000_000.times{s==s}"
0.265u 0.000s 0:00.26 100.0% 5+1595k 0+0io 0pf+0w

So I wouldn’t call Ruby strings inefficient. In general, the performance
problem is not the lookup but the memory allocation. Even if string
comparison is worst-case O(n), a key lookup in a hash is in general O(1)
regardless of whether the keys are strings or symbols (especially as the
length of the keys is usually limited).

I don’t think that this lookup will be significant. If it is significant
then you’re probably using the wrong language :).

b) Some well-known headers have a name like “Record-Route”. The “-”
character is of course disallowed in a Ruby constant name. With Symbols
I can write it as :"record-route".

I didn’t mean Ruby constants, but “constant”, i.e. frozen, values.

FROM = 'From'.freeze
RECORD_ROUTE = 'Record-Route'.freeze

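KNOWN_HEADERS = {
  FROM => FROM,
  RECORD_ROUTE => RECORD_ROUTE
}

headers = {}
# h is assumed to hold the parsed header name/value pairs
for key, value in h
  # use the shared frozen string for known names, the parsed string otherwise
  headers[ KNOWN_HEADERS[key] || key ] = value
end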
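
As a self-contained check of the lookup above, here is a small sketch in
which the parsed input is invented for illustration. It tests whether the
key stored for a known header is the very same object as the frozen
constant, while a custom header keeps its own string.

```ruby
FROM = 'From'.freeze
RECORD_ROUTE = 'Record-Route'.freeze

KNOWN_HEADERS = {
  FROM => FROM,
  RECORD_ROUTE => RECORD_ROUTE
}

# Hypothetical parser output: fresh String objects, not the constants.
parsed = [
  [String.new('From'), 'alice@qweeq'],
  [String.new('Custom-Header'), 'raisins']
]

headers = {}
parsed.each do |key, value|
  # Known names map to the shared frozen string; others are kept as parsed.
  headers[KNOWN_HEADERS[key] || key] = value
end

from_key = headers.keys.first
p from_key.equal?(FROM)      # identity check, not content equality
p headers[FROM]
p headers['Custom-Header']
```

The extra cost per header is one hash lookup in KNOWN_HEADERS, which
hashes and compares the parsed string once.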

Regards,

Michael