Dynamically generate a symbol name

cmaxvv · December 11, 2007, 12:51pm

I’m sure this is simple but i can’t work out the syntax - if i want to
use a bunch of symbols that are named :object0, :object2 … :object9,
how can i do something like

10.times do |n|
do_something_with(:object#{n} => “foo”)
end

ie, do the same thing with but pass through a differently named symbol
each time?

thanks
max

cmaxvv · December 12, 2007, 8:34pm

On Dec 11, 6:51 am, Max W. [email protected] wrote:

I’m sure this is simple but i can’t work out the syntax - if i want to
use a bunch of symbols that are named :object0, :object2 … :object9,
how can i do something like

10.times do |n|
do_something_with(:object#{n} => “foo”)
end

10.times do |n|
do_something_with(:“object#{n}” => “foo”)
end

T.

cmaxvv · December 12, 2007, 8:49pm

On Dec 11, 5:51 am, Max W. [email protected] wrote:

thanks
max

Posted viahttp://www.ruby-forum.com/.

:“object#{n}” is the syntax. Or just “object#{n}”.to_sym.

I would avoid doing this though. Symbols aren’t meant to represent
infinite variations. They’re stored in a Hash internally, so the more
symbols you add, the slower they get.

So a small finite set of symbols is good. To represent environments,
or database names, or even columns. Representing filenames as symbols,
or user_ids, or something similarly large and near as infinite as
makes no difference is a bad idea IMO.

At least if I understand the mechanics of Symbol storage. And I may
not.

cmaxvv · December 12, 2007, 9:32pm

On Dec 11, 2007, at 7:00 AM, Sam S. wrote:

:“object#{n}” is the syntax. Or just “object#{n}”.to_sym.

I would avoid doing this though. Symbols aren’t meant to represent
infinite variations. They’re stored in a Hash internally, so the more
symbols you add, the slower they get.

Hash table performance shouldn’t be affected by size, that is the
whole point of using a hash table.

The memory footprint will grow as you add symbols and
since symbols are not garbage collected in MRI you might
end up with a memory leak if you are constantly generating
new symbols.

Gary W.

cmaxvv · December 12, 2007, 9:52pm

cool, thanks guys.

cmaxvv · December 12, 2007, 10:09pm

On Dec 11, 2007 4:45 PM, Gary W. [email protected] wrote:

On Dec 11, 2007, at 7:00 AM, Sam S. wrote:

:“object#{n}” is the syntax. Or just “object#{n}”.to_sym.

I would avoid doing this though. Symbols aren’t meant to represent
infinite variations. They’re stored in a Hash internally, so the more
symbols you add, the slower they get.

Hash table performance shouldn’t be affected by size, that is the
whole point of using a hash table.

The memory footprint will increase. Unless you’re already past the
limit where each cache line on average only contains one accessed
symbol, this memory footprint increase will decrease cache locality
which again will decrease performance. In the general case: Depending
on hash table implementation, you could also have chaining costs as
your hash table grows to be full (which gives conflicts that need
either chaining, a tree, or a rebuild of the table to handle).

“Hash table performance is not affected by size” is a simplification
that hold when you have a pre-90s microprocessor or a microcontroller
and the entire hash table is in memory to start with and you have no
two entries that hash to the same slot. This is a restricted case.
Assuming some speed loss for increased size is a good assumption.

As for the point of a hash table, I tend to think of it as “Fast
access to data where the ordering property don’t matter” (or lookup is
convenient on something that isn’t a number).

Eivind.

cmaxvv · December 12, 2007, 11:22pm

On Dec 11, 9:45 am, Gary W. [email protected] wrote:

Hash table performance shouldn’t be affected by size, that is the
whole point of using a hash table.

The memory footprint will grow as you add symbols and
since symbols are not garbage collected in MRI you might
end up with a memory leak if you are constantly generating
new symbols.

Gary W.

Ahh, thanks Gary. Guess that’s my lack of a CompSci degree showing
through.

cmaxvv · December 12, 2007, 11:40pm

Gary W. wrote:

Hash table performance shouldn’t be affected by size, that is the
whole point of using a hash table.

Besides which any literal symbol references will probably be substituted
by a pointer to the hash-table element at compile time. The hash table
should only be used at compile time to provide a unique memory address.
Obviously not in the case of :"…#{}" symbols though.

Clifford H…

cmaxvv · December 12, 2007, 11:25pm

On Dec 12, 2007, at 2:29 AM, Eivind E. wrote:

The memory footprint will increase.

Yes. I said that.

“Hash table performance is not affected by size” is a simplification
that hold when you have a pre-90s microprocessor or a microcontroller
and the entire hash table is in memory to start with and you have no
two entries that hash to the same slot. This is a restricted case.
Assuming some speed loss for increased size is a good assumption.

Of course. But your observations are true of any data structure that
indexes arbitrary data and so isn’t all that helpful in understanding
what differentiates a Hash with O(1) lookup from a Tree with O(log n)
lookup from an Array with O(n) lookup.

Gary W.

cmaxvv · December 13, 2007, 1:03am

On Dec 12, 2007 9:40 AM, Gary W. [email protected] wrote:

On Dec 12, 2007, at 2:29 AM, Eivind E. wrote:

The memory footprint will increase.

Yes. I said that.

And I repeated it with information about what that memory size
increase meant, because I saw it as in conflict with what the rest of
your text communicated, especially for those that don’t know this
stuff well. More details below.

“Hash table performance is not affected by size” is a simplification
that hold when you have a pre-90s microprocessor or a microcontroller
and the entire hash table is in memory to start with and you have no
two entries that hash to the same slot. This is a restricted case.
Assuming some speed loss for increased size is a good assumption.

Of course. But your observations are true of any data structure that
indexes arbitrary data and so isn’t all that helpful in understanding
what differentiates a Hash with O(1) lookup from a Tree with O(log n)
lookup from an Array with O(n) lookup.

Absolutely agreed; my point was intended to be helpful in
understanding how real world implementations work, as I understood it
as if we were discussing a real world implementation (in particular,
how the Ruby interpreter works). The abstractions are also useful;
it’s just that they are not the whole world, and it is easy to miss
how the underlying details can make the conclusion invalid because the
abstraction breaks.

In particular, most analysis abstracts memory as if memory had
constant access time. Nowadays, that abstraction breaks a lot. An
anecdotal comparison to put it in perspective: Today, main memory has
a speed compared to the CPU that’s slower than the tape drive on a VAX
compared to the CPU of the VAX.

Another way to get some perspective: Today, a CPU generally executes
several instructions per clock cycle. Ten years ago (last time I
worked on doing cycle-by-cycle optimization outside embedded systems),
a main memory fetch ran about 70 cycles. I couldn’t find any numbers
for today - but CPU speed increase at a rate of over 1.5x per year,
while RAM speed increase at about 1.2x per year. Very conservatively,
a cache miss take the same time as executing 150 non-cache-missing
pipelined instructions.

At large scales, you’ll miss the cache anyway or not get incidental
caching, so looking at the O-time of the algorithm is sort of
sufficient. At very small scales, you’ll always hit the cache, so the
O-time is sufficient. At the intermediate scale - and I would guess
Ruby symbol table lookups often would fit in there - you get very
important cache interaction effects, including effects from whether
cache lines incidentally grab more than one item. This means that the
“hash lookup time is O(1)” claim does not hold for that case.

This was my point. I hope the longer and more elaborate variant help
somebody understand something they otherwise wouldn’t.

Eivind.

Dynamically generate a symbol name

thanks max

thanks
max