Ruby :symbols and C *pointers are related?


#1

Hi,

I’m still trying to grasp exactly what symbols are. I am getting a
feeling that they are related to my old friends the C pointers. Am I
getting closer?

Thanks,
Peter


#2

On 12/5/05, removed_email_address@domain.invalid wrote:

I’m still trying to grasp exactly what symbols are. I am getting a
feeling that they are related to my old friends the C pointers. Am I
getting closer?

No. Symbols are just names. They can be considered immutable immediate
strings, but they are best considered names. There is no C
equivalent to a Symbol, as such. They can be used by certain methods
in various ways. The most common use, aside from keys on a hash, is as
the argument or arguments to a method that affects the definition of a
class or object. What is special here is not the Symbol, but the
method.
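
To make those two uses concrete, here is a small sketch. The Person class and the hash contents are made up for illustration; attr_accessor is the standard method doing the real work, and the symbols are only the names handed to it.

```ruby
# Symbols as hash keys: the symbol is just a name for the slot.
settings = { :color => "red", :size => 10 }
puts settings[:color]

# Symbols as arguments to a method that changes a class definition.
# attr_accessor receives the names :name and :email and defines
# reader and writer methods with those names.
class Person
  attr_accessor :name, :email
end

peter = Person.new
peter.name = "Peter"
puts peter.name

# The generated methods are ordinary methods; list them to check.
puts Person.instance_methods(false).sort.inspect
```

Nothing about :name itself creates the methods. Passing the same symbol to a different method would do something else entirely.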

-austin


#3

On 12/5/05, removed_email_address@domain.invalid wrote:

I’m still trying to grasp exactly what symbols are. I am getting a
feeling that they are related to my old friends the C pointers. Am I
getting closer?

Not at all. The truth can hardly be further from your understanding.
Symbols are nothing but static strings. They have the advantage that
unlike normal strings, for any given interned symbol you have only one
particular instance stored in the interpreter, so provided you don’t
have too many of them, they can save you a ton of memory. They also
look better and are a lot easier to type than actual Strings, and
they’re generally appropriate for use where you’d usually make use of
a defined constant.
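
A quick way to see the "only one instance" point for yourself is the sketch below. The traffic_action method is a made-up example of using symbols where another language might use defined constants.

```ruby
# Two equal strings built separately are two distinct objects...
a = String.new("foo")
b = String.new("foo")
puts a == b          # true: same contents
puts a.equal?(b)     # false: not the same object

# ...while the same symbol name always refers to one object.
x = :foo
y = "foo".to_sym
puts x.equal?(y)     # true: one shared instance

# Used like a constant: a fixed set of names, checked in a case.
def traffic_action(light)
  case light
  when :red    then "stop"
  when :yellow then "slow down"
  when :green  then "go"
  else raise ArgumentError, "unknown light: #{light.inspect}"
  end
end

puts traffic_action(:green)
```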


#4

From: “Dido S.” removed_email_address@domain.invalid

On 12/5/05, removed_email_address@domain.invalid wrote:

I’m still trying to grasp exactly what symbols are. I am getting a
feeling that they are related to my old friends the C pointers. Am I
getting closer?

Not at all. The truth can hardly be further from your understanding.

Except that, when testing for equality, comparing two symbols
essentially reduces to a pointer comparison.

So in that sense, symbols are indeed like pointers to strings, with
the added stipulation that lexically identical strings are KNOWN to
reside at the same memory address.

So, pseudocode:

const char *a = SYMBOL("foo");
const char *b = SYMBOL("foo");
assert( a == b ); // symbols "foo" and "foo" KNOWN to be at same address

Hope this helps rather than further confuses… :)

Regards,

Bill


#5

Peter asked:

I’m still trying to grasp exactly what symbols are. I am getting a
feeling that they are related to my old friends the C pointers. Am I
getting closer?

I think the closest link between C pointers and symbols is that C
strings are C pointers, and symbols are (immutable) strings.

A symbol is just an immutable string. Ruby uses symbols in a few places
where they are faster than strings, for instance, to look up methods.
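
As a rough illustration of symbols naming methods, here is a small sketch. The apply_all helper is hypothetical; respond_to? and send are standard Ruby methods that take a method name as a symbol.

```ruby
# Method names are symbols, so a symbol can be used to ask about
# a method or to call it by name.
greeting = "hello"

puts greeting.respond_to?(:upcase)   # is there a method with this name?
puts greeting.send(:upcase)          # call it by name

# A hypothetical helper: call each named method in turn, feeding
# the result of one call into the next.
def apply_all(value, method_names)
  method_names.inject(value) { |result, name| result.send(name) }
end

puts apply_all("  ruby  ", [:strip, :capitalize])
```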

Kevin C. wrote a short article entitled Understanding Ruby Symbols:
http://glu.ttono.us/articles/2005/08/19/understanding-ruby-symbols

Cheers,
Dave


#6

Hi Peter,

On 12/8/05, removed_email_address@domain.invalid wrote:

Could someone describe how symbols are stored in memory?

Symbols are immediate values, so they are stored in a 32-bit integer
(or a 64-bit integer on a 64-bit system).

Wayne


Wayne V.
No Bugs Software
“Ruby and C++ Agile Contract Programming in Silicon Valley”


#7

peter michaux wrote:

Could someone describe how symbols are stored in memory?

Something like (I don’t know if any of these are correct)

int 32 bytes
float 64 bytes
string 8 bytes for the pointer and a byte for each character including
one for “\0”

An int is the key into a table which holds the name.
The script below shows that keys are allocated in a "previous + 1" sequence.

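# Written for Ruby 1.8: Symbol#to_i was removed in Ruby 1.9, and the
# object_id bit layout checked below is specific to that era.
puts ' :name object_id >>11 (i>>3)'
puts '-' * 30
s = 's_00'
5.times do |n|
  sym = s.next!.to_sym
  syid = sym.object_id
  # The script assumes the low 11 bits are a constant tag (0x10e).
  raise if syid & 0x7ff != 0x10e
  puts ' :%s %08x %d (%d)' % [sym, syid, syid >> 11, sym.to_i >> 3]
end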

=begin
:name object_id >>11 (i>>3)
------------------------------
:s_01 0026710e 1230 (1230)
:s_02 0026790e 1231 (1231)
:s_03 0026810e 1232 (1232)
:s_04 0026890e 1233 (1233)
:s_05 0026910e 1234 (1234)
=end

daz


#8

On Fri, 2005-12-09 at 14:52 +0900, removed_email_address@domain.invalid wrote:

Thanks for all the responses.

Could someone describe how symbols are stored in memory?

Integers.

They’re indices into an array of unique string values, which is
accompanied by a hash table of string values to indices.

When a string is converted to a symbol, the string is looked up in the
hash table. If no entry is found, the string value is appended to the
array, and its index is recorded to the hash table. If an entry already
exists, that index (the symbol) is simply returned.

When a symbol is converted to a string, its index is looked up in the
array and that string value returned.

The values of two symbols can also be compared directly to determine
whether their associated strings are equal. This is much faster than
comparing two strings.
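
The scheme described above can be sketched in Ruby itself. This is only a toy model, not the interpreter's actual code; the class name ToySymbolTable and its method names are invented for the example.

```ruby
# A toy model of a symbol table: an array of unique strings plus a
# hash from string to array index. NOT the interpreter's real code.
class ToySymbolTable
  def initialize
    @names = []      # index  -> string
    @index_of = {}   # string -> index
  end

  # String -> "symbol" (an integer index in this model).
  def intern(string)
    existing = @index_of[string]
    return existing if existing

    # Not seen before: append it and record its index.
    @names << string.dup.freeze
    @index_of[@names.last] = @names.size - 1
  end

  # "Symbol" -> string: a plain array lookup.
  def name(index)
    @names.fetch(index)
  end

  def size
    @names.size
  end
end

table = ToySymbolTable.new
foo1 = table.intern("foo")
bar  = table.intern("bar")
foo2 = table.intern("foo")

puts foo1 == foo2        # true: comparing two integers
puts foo1 == bar         # false
puts table.name(bar)     # bar
puts table.size          # 2: "foo" was stored only once
```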

If you are storing the same string value in multiple places, it may also
save memory to represent it with a symbol. Be careful, however. That
array never shrinks, so every unique string value you turn into a symbol
stays around until your program exits.

It’s best to reserve symbols for situations where you expect a bounded
number of unique string values.
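
One way to keep the set bounded, sketched under the assumption that the strings come from outside the program (user input, say), is to map them through a fixed table instead of calling to_sym on whatever arrives. KNOWN_STATUSES and status_for are hypothetical names for this example.

```ruby
# Hypothetical example: turn outside text into a symbol without
# letting arbitrary input create new symbols.
KNOWN_STATUSES = {
  "open"   => :open,
  "closed" => :closed,
  "held"   => :held
}.freeze

def status_for(text)
  # fetch runs the block only when the key is missing.
  KNOWN_STATUSES.fetch(text.to_s.strip.downcase) do
    raise ArgumentError, "unknown status: #{text.inspect}"
  end
end

puts status_for(" Open ").inspect

begin
  status_for("whatever-the-user-typed")
rescue ArgumentError => e
  puts e.message
end
```

Only the three symbols written in the source ever exist, no matter how many different strings are passed in.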

-mental


#9

Thanks for all the responses.

Could someone describe how symbols are stored in memory?

Something like (I don’t know if any of these are correct)

int 32 bytes
float 64 bytes
string 8 bytes for the pointer and a byte for each character including
one for “\0”

Maybe something like this would make it more concrete in my brain.

Thanks,
Peter