Demystifying Symbols


#1

I was actually doing really well until the strange discussion involving
throwing constants into the mix showed up, so I’m ignoring that.

Steve’s Newby Guide was excellent, if overly-complicated.

Extremely useful item (I think from Gregory, but I’m not sure):

attr_accessor "liquids", "solids"

is as effective and functional as

attr_accessor :liquids, :solids

So Steve, or anybody, I’ve learned that I can :mysymbol.to_i and I get
an integer back. OK, I give up. What possible use do I have for this
zany parlor trick? I’ll hazard a guess that, as a “normal” programmer,
not doing system-level stuff, not extending Ruby, not trying to do
something dense and clever and incomprehensible…I don’t.

In fact, somebody correct me if I’m wrong, but I could actually never
use a Symbol in a single line of Ruby code and I’d still be able to get
everything done that I might reasonably want to.

Anyway, I’m quite confident I know what a Symbol is now. It’s an
immutable string.

Stop!! Put your geek away! I didn’t say it was a String. It’s a string.
It’s a Merriam-Webster “string1” 5b(2) ‘series of like objects (e.g.
characters, bits, words)’. It’s just a series of characters cemented
together, bless its little stable self. What Ruby calls a String is
actually some kind of pandimensional quasi-magic method-possessing
self-modifying Object, and not an ordinary string at all.
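The "immutable string" reading can be checked directly. What follows is only a minimal sketch for current Ruby versions; the variable names are made up for illustration. It contrasts a String, which can be changed in place, with a Symbol, which has no in-place modifiers:

```ruby
name  = String.new("dave")   # an ordinary, mutable String
label = :dave                # a Symbol

name << "!"                  # Strings can be changed in place
puts name                    # dave!

# Symbols offer no in-place modifiers at all.
puts label.respond_to?(:<<)        # false
puts label.respond_to?(:upcase!)   # false
puts label.frozen?                 # true

# Non-mutating methods hand back a different Symbol and leave the original alone.
shouted = label.upcase
puts shouted.inspect               # :DAVE
puts label.inspect                 # :dave
```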

Note that the neologism
:symbol is to “symbol” as 1 is to “1”
makes not one iota of sense unless one already understands all the
complex ramifications of those quote marks. And given the extra magic
powers of double quotes vs. single quotes, well, ick.

A couple of people have tried to advance the idea that a Symbol is a
name. I think that’s a terrible mistake. On the one hand, here in the
real world, people will say something like “Dave is a boy’s name,” but
what that sentence really means is “The word ‘dave’ is normally used
only as a name for males.”

For me, :dave doesn’t become a name until I decide what it’s the name
of. :dave is just a boring old string, and when I say
attr_accessor :dave
what I’m actually saying is “Dear routine/module/function named
‘attr_accessor,’ please create a couple of new methods for my
Class/Module/whatever. Use that string I sent you as the name of one
which will return the value of a variable also named that, and use it
yet again to build a setter method.” That’ll get me a method named
“dave”, a method named “dave=”, and, hmm. I don’t know if “dave=” will
work if I don’t explicitly set @dave to a value at some point, although
I think it will. I’d have to check that to find out.
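The open question about "dave=" is easy to check. A minimal sketch, assuming a hypothetical Person class that exists only for this illustration:

```ruby
class Person
  attr_accessor :dave   # defines Person#dave and Person#dave=
end

someone = Person.new
p someone.dave            # nil, because @dave has never been assigned
someone.dave = "a name"   # the setter works without any prior setup
p someone.dave            # "a name"

# The two generated methods, listed by name.
p Person.public_instance_methods(false).sort   # [:dave, :dave=]
```

So the setter does not need @dave to exist beforehand; the getter simply returns nil until something is assigned.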

Anyway, that’s not the important point. Nor is the potential
“efficiency” of Symbols vs. Strings. Good grief, have you people looked
at the millisecond differences that I’ve been seeing thrown around? We
beginners couldn’t care less; we just want it to not make lots of
errors when it runs, or doesn’t run.

What I’ve learned:
A Symbol is just a modest, simple string. It’s optional; I could
always use a String instead. It’s good for naming things that are
‘inside’ my program, and that won’t need to be modified, input, output,
or generally fiddled with.

or

A Symbol is a non-variable variable (aka a constant) that always and
only contains its own name.

Seriously, as a newbie, that last sentence is all I need to know. And,
quite honestly, probably all I’ll ever need to know about Symbols.


#2

On Thu, Jan 05, 2006 at 04:25:31PM +0900, Dave H. wrote:

So Steve, or anybody, I’ve learned that I can :mysymbol.to_i and I get
an integer back. OK, I give up. What possible use do I have for this
zany parlor trick? I’ll hazard a guess that, as a “normal” programmer,
not doing system-level stuff, not extending Ruby, not trying to do
something dense and clever and incomprehensible…I don’t.

The integer value can be regarded as a side effect of the way symbols
are managed. They’re stored as values in a hash table with the integers
as keys, basically. It’s the atomic number-like value of the symbol,
and the “it has always been there” behavior[1] of it when first defined,
that the language feature is after – not the integers associated with
symbols in the hash table. If you find a good use for the .to_i
behavior, though, have at it.

At least, that’s my understanding.

In fact, somebody correct me if I’m wrong, but I could actually never
use a Symbol in a single line of Ruby code and I’d still be able to get
everything done that I might reasonably want to.

Symbols are useful for metaprogramming techniques, and can be used as
well for some minor performance increases. For most of the programming
people do, however, they’re far from necessary.

Note: My comment about symbols being useful for metaprogramming
techniques is based on my understanding of symbols in a Lisp context,
not a Ruby context. I’m still trying to get a handle on how analogous
the two are to each other – a lot, clearly, but the devil’s in the
details (so to speak). I’m nothing like a Ruby expert, and there’s a
lot of context to learn before I can be sure about those details.

Stop!! Put your geek away! I didn’t say it was a String. It’s a string.
It’s a Merriam-Webster “string1” 5b(2) ‘series of like objects (e.g.
characters, bits, words)’. It’s just a series of characters cemented
together, bless its little stable self. What Ruby calls a String is
actually some kind of pandimensional quasi-magic method-possessing
self-modifying Object, and not an ordinary string at all.

Note that the neologism
:symbol is to “symbol” as 1 is to “1”

For purposes of my analogy to that effect a few days ago, that could as
easily have been single quotes as double quotes, in case there’s any
confusion on the matter. My point was the relative characteristics of a
Ruby string literal as compared with numbers and symbols.

For me, :dave doesn’t become a name until I decide what it’s the name
of. :dave is just a boring old string, and when I say
attr_accessor :dave

This makes me want to draw analogies between Ruby symbols and spoken
language phonemes, which while interesting and in some respects accurate
would probably not be helpful to many people.

A Symbol is a non-variable variable (aka a constant) that always and
only contains its own name.

Seriously, as a newbie, that last sentence is all I need to know. And,
quite honestly, probably all I’ll ever need to know about Symbols.

That depends on just how advanced your programming gets. Programs that
write themselves are nothing to sneeze at. One of these days, I’ll be
at a point where I can actually take advantage of that sort of thing –
but I’m not there yet.


Chad P. [ CCD CopyWrite | http://ccd.apotheon.org ]

This sig for rent: a Signify v1.14 production from
http://www.debian.org/


#3

Hi –

On Thu, 5 Jan 2006, Dave H. wrote:

A Symbol is just a modest, simple string. It’s optional; I could
always use a String instead.

That depends. If you’re calling a method that insists on a Symbol
argument, you have to use a Symbol. Many methods are written to
accept either, though.

It’s good for naming things that are ‘inside’ my program, and that
won’t need to be modified, input, output, or generally fiddled with.

Since the “is a string” thing has resonances with classes and
hierarchies and so on, it’s also possibly helpful to think of it as:
Ruby has two ways of representing text: String and Symbol. Or
something like that.

or

A Symbol is a non-variable variable (aka a constant) that always and only
contains its own name.

Well, Ruby has variables and constants, and symbols aren’t either of
them :) I guess they’re constant, informally, but they’re not
constants, as defined by the language. I wouldn’t call them variables
in any sense.

David


David A. Black
removed_email_address@domain.invalid

“Ruby for Rails”, from Manning Publications, coming April 2006!


#4

On Thu, Jan 05, 2006 at 09:54:35PM +0900, removed_email_address@domain.invalid wrote:

Well, Ruby has variables and constants, and symbols aren’t either of
them :) I guess they’re constant, informally, but they’re not
constants, as defined by the language. I wouldn’t call them variables
in any sense.

I guess that, if you want to compare them to “variables” (things that
you create, define, and change at whim) and “constants” (things that you
create, define, and – hopefully – never change), you could think of
symbols as “forevers”: from the point of view of the programmer, you
never really create or define the things, you just discover them waiting
there for you, already in the form they will always have. Heh.

There are data types, and there are data archetypes, I suppose.

That’s just another broken analogy, though, and ultimately I keep
wanting to come back to some technical details related to internal hash
table behavior.


Chad P. [ CCD CopyWrite | http://ccd.apotheon.org ]



#5

On Thursday 05 January 2006 02:25 am, Dave H. wrote:

attr_accessor :liquids, :solids

Yeah, that was quite a breakthrough for me too. I coded it to prove it to
myself, and from that moment on I found that I knew when to use Symbols, and
what I could accomplish by using them.

So Steve, or anybody, I’ve learned that I can :mysymbol.to_i and I get
an integer back. OK, I give up. What possible use do I have for this
zany parlor trick? I’ll hazard a guess that, as a “normal” programmer,
not doing system-level stuff, not extending Ruby, not trying to do
something dense and clever and incomprehensible…I don’t.

Hi Dave,

I personally know of no use, in my application as opposed to Ruby internals,
for the integer representation of a symbol.

In fact, somebody correct me if I’m wrong, but I could actually never
use a Symbol in a single line of Ruby code and I’d still be able to get
everything done that I might reasonably want to.

Except for giving up symbols’ memory advantages and (probably slight)
performance advantages, that’s my understanding also.

self-modifying Object, and not an ordinary string at all.

I don’t think it would hurt to think of it as an immutable string (not a
String), in your own personal life. However, that probably would not go over
well on a mailing list :)

Note that the neologism

:symbol is to “symbol” as 1 is to “1”

makes not one iota of sense unless one already understands all the
complex ramifications of those quote marks. And given the extra magic
powers of double quotes vs. single quotes, well, ick.

I never understood that analogy.

[clip]

A Symbol is a non-variable variable (aka a constant) that always and
only contains its own name.

That’s an interesting and concise statement. I’ll have to think about
that.

SteveT

Steve L.
http://www.troubleshooters.com
removed_email_address@domain.invalid


#6

Yet another try …

A symbol is an object with a string name. No two symbols (with different
object_id) can have the same name string.

The literal
:x
evaluates to a symbol with name “x”, creating the symbol if necessary.
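A short sketch of that "one name, one object" rule, using only core methods; the variable names are illustrative:

```ruby
a = :x
b = "x".to_sym          # looks up (or creates) the symbol named "x"

puts a.equal?(b)        # true: one name, one object
puts a.to_s             # x

# A symbol built from a computed string is the same object as the matching literal.
dynamic = "made_up_#{1 + 1}".to_sym
puts dynamic.equal?(:made_up_2)   # true
```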



#7

On Thursday 05 January 2006 03:55 pm, Steve L. wrote:

as keys, basically. It’s the atomic number-like value of the symbol,
and the “it has always been there” behavior[1] of it when first defined,
that the language feature is after – not the integers associated with
symbols in the hash table. If you find a good use for the .to_i
behavior, though, have at it.

At least, that’s my understanding.

Internals question: Curious – why don’t they hash with the object id
instead?

Oh never mind – a more careful reading of Evan’s internals explanation
makes it clear.

SteveT

Steve L.
http://www.troubleshooters.com
removed_email_address@domain.invalid


#8

On Thursday 05 January 2006 03:13 am, Chad P. wrote:

and the “it has always been there” behavior[1] of it when first defined,
that the language feature is after – not the integers associated with
symbols in the hash table. If you find a good use for the .to_i
behavior, though, have at it.

At least, that’s my understanding.

Internals question: Curious – why don’t they hash with the object id
instead?

SteveT

Steve L.
http://www.troubleshooters.com
removed_email_address@domain.invalid


#9

Dave H. wrote:
[…]

A Symbol is a non-variable variable (aka a constant) that always and
only contains its own name.

Heh. That reminds me of the quote:

“The problem with programming is that
variables don’t and constants aren’t.”

Perhaps the word you are looking for is “literal”. I.e. the sequence of
characters (‘:’, ‘x’, ‘y’, ‘z’) is the literal representation of the
symbol :xyz. (Just as you would say the sequence of characters (‘1’,
‘2’, ‘3’) is the literal representation of the number 123.)
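One way to see the "literal" idea in practice is to ask each object for its source-code form with inspect. This is just an illustration, not something from the thread:

```ruby
# inspect returns the text you would type to get the same value back.
puts :xyz.inspect       # :xyz
puts 123.inspect        # 123
puts "xyz".inspect      # "xyz"

# The literals evaluate to objects of different classes.
puts :xyz.class         # Symbol
puts "xyz".class        # String
```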


– Jim W.


#10

On Jan 5, 2006, at 12:54, Steve L. wrote:

I personally know of no use, in my application as opposed to Ruby internals,
for the integer representation of a symbol.

Exactly. So if I’m going to try explaining Symbols to somebody who’s
still trying to learn Ruby, I would completely omit any mention of the
fact you could .to_i a Symbol.

In fact, somebody correct me if I’m wrong, but I could actually never
use a Symbol in a single line of Ruby code and I’d still be able to get
everything done that I might reasonably want to.

Except for giving up symbols’ memory advantages and (probably slight)
performance advantages, that’s my understanding also.

I’m a newbie; I’m not to the point that I care about memory advantages.
I have a gigabyte to burn. Performance advantages appear to be nearly
undetectable. Long before Symbols make a difference, I probably ought
to learn many other far more relevant programming tricks to get more
speed.

I don’t think it would hurt to think of it as an immutable string (not a
String), in your own personal life. However, that probably would not go over
well on a mailing list :)

Yes, well, the fact that this mailing list was full of the most
horrific dis-explanations of Symbols is one of the things that prompted
me to not keep my thoughts to myself.

[clip]

A Symbol is a non-variable variable (aka a constant) that always and
only contains its own name.

That’s an interesting and concise statement. I’ll have to think about
that.

Somebody else suggested thinking of them as “forevers,” which is pretty
cute.

A variable is a box. You can put a sign on the box, like “current_user”
or “amountPaid,” and then put something in the box, put something else
in, and on and on. (Yes, in Ruby one actually puts references in the
box, not things. That is also a layer of complication inappropriate in
an explanation to a newbie, although it can’t be put off for long.)

A constant is a wooden crate. You take something, build the crate
around it, then stick the sign on the outside. Put a watermelon in a
crate. Label it “oranges” if you like, but you can’t put something else
in once you’ve built it.

A symbol is a crate with no label, but a clear cover. What you see is
what you get; the label is also the contents.

To restate the one-liner above:

A Symbol is an unalterable (*) that always and only contains its own
name.

Damn. I can’t find any word besides “variable” to go in that spot.
“Container” is the functional description, and in programming, the
first kind of container you meet (in every language I know) is called a
variable, unless you learned Assembly first, and then it’d be
“register.” I can’t support “literal,” since as a programming term, I
think it’s quite obscure, which means I have to default to the general
English meaning, and it’s not remotely literal in that sense.
Otherwise, I’d be able to sit down on a :chair. :)

P. S. With duck-typing as a Virtue in Rubyville, when am I ever going
to be presented with a method that won’t accept a String as well as a
Symbol? Would there ever be a reason besides “bad programming” to be
that restrictive?

P. P. S. The “contains itself” aspect of Symbols reminds me of Rexx. In
Rexx, you can just use a variable without declaring it first, like
Ruby. Unlike Ruby, a “new” Rexx variable doesn’t contain “nil”. It’s
initialized to its own name as a string. This would occasionally result
in the most peculiar error messages.


#11

Dave H. wrote:

A variable is a box.

Hmmm … see http://onestepback.org/index.cgi/Tech/Ruby/Shoeboxes.rdoc
for an alternate view. I think the mental model of “boxes” as variables
doesn’t work so well for Ruby.


#12

On Fri, 06 Jan 2006 14:40:36 -0000, Dave H.
removed_email_address@domain.invalid
wrote:

P. S. With duck-typing as a Virtue in Rubyville, when am I ever going to
be presented with a method that won’t accept a String as well as a
Symbol? Would there ever be a reason besides “bad programming” to be
that restrictive?

Here’s one:

f = File.read(:"/etc/passwd")

TypeError: can't convert Symbol into String
        from (irb):1:in `read'
        from (irb):1

I guess it doesn’t always ‘make sense’, and that it’s just about not
bothering to coerce arguments to strings when that’s the case.
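The same behaviour can be reproduced without touching /etc/passwd. This is a sketch only: read_or_explain is a hypothetical helper invented for this example, and the exact wording of the error message differs between Ruby versions:

```ruby
require "tempfile"

# Hypothetical helper: returns the file's contents, or a description of the
# TypeError raised when the argument cannot be used as a path.
def read_or_explain(path_like)
  File.read(path_like)
rescue TypeError => e
  "TypeError: #{e.message}"
end

Tempfile.create("symbols") do |file|
  file.write("hello")
  file.flush

  as_symbol = file.path.to_sym
  puts read_or_explain(as_symbol)        # the TypeError text
  puts read_or_explain(as_symbol.to_s)   # hello
end
```

File.read makes no attempt to coerce the Symbol, so the caller has to convert explicitly with to_s.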


#13

On 1/5/06, Dave H. removed_email_address@domain.invalid wrote:

Note that the neologism
:symbol is to “symbol” as 1 is to “1”
makes not one iota of sense unless one already understands all the
complex ramifications of those quote marks.

This is why writing a decent tutorial is so hard. Unless you really
talk to lots and lots of newbies, it’s so easy to say totally useless
(well, useless in this context) things like this.

(Don’t get me wrong: I actually found it a delightful analogy!
Brilliant, in fact! But I’ve been using symbols happily for years.
An analogy like this feels more like an inside joke than an
explanation. :)

Chris


#14

I personally know of no use, in my application as opposed to Ruby internals,
for the integer representation of a symbol.

How about logging method names in a human-readable manner using id2name?

 method_missing(id, *args)

   id: method symbol
   *args: method arguments
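Fleshed out, that suggestion might look something like the sketch below. CallLogger and the method names called on it are hypothetical; method_missing, respond_to_missing? and Symbol#id2name are standard Ruby:

```ruby
class CallLogger
  attr_reader :log

  def initialize
    @log = []
  end

  # Ruby passes the name of the missing method as a Symbol in `id`.
  def method_missing(id, *args)
    @log << "#{id.id2name}(#{args.map(&:inspect).join(', ')})"
    nil
  end

  # Keep respond_to? consistent with the catch-all above.
  def respond_to_missing?(_id, _include_private = false)
    true
  end
end

logger = CallLogger.new
logger.pour(:water, 2)
logger.freeze_solid
puts logger.log
# pour(:water, 2)
# freeze_solid()
```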

#15

Steve L. wrote:

as effective and functional as

attr_accessor :liquids, :solids

That’s true, but only because attr_accessor can convert the Strings it
is passed to Symbols. Internally, it’s using Symbols.
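A quick way to see this, sketched with a hypothetical Matter class: pass one name as a String and one as a Symbol, then ask Ruby what it defined. On current Ruby versions the method names come back as Symbols either way:

```ruby
class Matter
  attr_accessor "liquids", :solids   # one String, one Symbol
end

# Method names are reported as Symbols regardless of how they were supplied.
p Matter.public_instance_methods(false).sort
# [:liquids, :liquids=, :solids, :solids=]

m = Matter.new
m.liquids = ["water"]
p m.liquids   # ["water"]
```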

Pretty much true.

The point is that Symbols exist in pretty much any language - they’re
part of the normal process of parsing a program. Any C compiler will
contain a definition of a type very like Symbol, and will use it for
indexing variables, functions and so on. It doesn’t make it into the
compiled program, though.

In Ruby, you can access the data structures created by the interpreter,
and it is natural to do so via the Symbol type it uses in them. It is
not essential - it can all be implicit as with

attr_accessor "liquids"

but the symbols exist, so why hide them from the programmer?

As to why you would want to use Symbol#to_i, there is basically one
reason for using it - as an array index.

def initialize; @values_array = []; end
def get_value( sym )
  @values_array[sym.to_i]
end
def set_value( sym, obj )
  @values_array[sym.to_i] = obj
end

That’s only an optimisation over

def initialize; @values_map = {}; end
def get_value( str )
  @values_map[str]
end
def set_value( str, obj )
  @values_map[str] = obj
end

but it is an optimisation, and potentially a significant one, which is
why Symbols are used internally instead of strings in the first place.

Of course, if you wanted to do the above, and symbols didn’t exist, you
could implement them yourself in 30 lines. Again, Symbol is not there
because you need it - it’s there because Ruby needs it, and you can use
it if you want.

class Cymbal
  @@byname = Hash.new
  @@nexti = 0

  def initialize( name )
    # not thread-safe - needs a mutex.
    @name = name
    @c_id = @@nexti
    @@nexti += 1
    @@byname[name] = self
  end

  def Cymbal.get( name )
    @@byname[ name ] || new( name )
  end

  def Cymbal.all_cymbals
    @@byname.values
  end

  def to_s ; @name ; end

  def to_i ; @c_id ; end

  private_class_method :new
end

The only difference between

cym1 = :Gas

and

cym2 = Cymbal.get("Gas")

is that the Symbol object :Gas is created as ruby parses the cym1 line,
but the Cymbal object is only created when ruby executes the cym2 line.
This doesn’t change the real semantics, though you can prove it by
playing with Symbol.all_symbols:

s1 = Symbol.all_symbols
if nil
  :aardvark
end
s2 = Symbol.all_symbols
s2 - s1
=> [:aardvark, :s2]
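The Cymbal class above flags itself as not thread-safe. One way the missing mutex could be added is sketched below. This is a variation written for illustration, not the poster's code: SafeCymbal is a made-up name, and it swaps the class variables for class-level instance variables guarded by a Mutex.

```ruby
class SafeCymbal
  @byname = {}          # name => SafeCymbal, shared registry
  @lock   = Mutex.new   # guards lookups and insertions

  class << self
    # Return the one SafeCymbal for this name, creating it if necessary.
    def get(name)
      @lock.synchronize do
        @byname[name] ||= new(name, @byname.size)
      end
    end

    def all_cymbals
      @lock.synchronize { @byname.values }
    end
  end

  attr_reader :to_i     # the integer assigned at creation time

  def initialize(name, id)
    @name = name.dup.freeze
    @to_i = id
  end

  def to_s
    @name
  end

  private_class_method :new
end

gas1 = SafeCymbal.get("Gas")
gas2 = SafeCymbal.get("Gas")
puts gas1.equal?(gas2)                                     # true
puts SafeCymbal.get("Liquid").to_i == gas1.to_i            # false
```

Doing the lookup and the creation inside one synchronize block is what prevents two threads from each creating their own "Gas".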


#16

On Jan 6, 2006, at 8:21 AM, Jim W. wrote:

Dave H. wrote:

A variable is a box.

Hmmm … see http://onestepback.org/index.cgi/Tech/Ruby/Shoeboxes.rdoc
for an alternate view. I think the mental model of “boxes” as variables
doesn’t work so well for Ruby.

Good point in your article - shoeboxes (or pigeonholes, as I
originally learned variables) are bad metaphors in a dynamically-typed
language.

Saying “bound” or “named” is right, but doesn’t provide the same
visual mental model for beginners.

I like to explain it that variables are nametags that you hang on
things.

Assignment places a nametag on an object.

person = 'Gavin'
# Create a new box with the letters 'Gavin' in it;
# hang on it a nametag with 'person' scrawled on it

me = person
# Find the box with the 'person' nametag on it,
# hang another nametag on it with 'me' scrawled thereon

person = nil
# Take the person nametag off the box,
# and put it in the big black hole of tags without boxes

me = nil
# Do the same thing with the me nametag

Periodically, the Cosmic Janitor comes through with his rolling
garbage can. Any boxes that don’t have name tags get thrown away,
leaving more room on the floor for you.
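The nametag picture can be tested directly: if two names really hang on the same box, a change made through one name must be visible through the other. A minimal sketch (String.new is used so the string is mutable whatever the frozen-literal settings are):

```ruby
person = String.new("Gavin")   # one box, one nametag
me = person                    # a second nametag on the same box

puts me.equal?(person)         # true: both names refer to one object
me << " K."                    # change the box's contents via one tag...
puts person                    # Gavin K.  ...and the other tag sees it

person = nil                   # take the 'person' nametag off the box
puts me                        # Gavin K.  (the box is still tagged, so it stays)
```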

This metaphor makes it a little bit confusing explaining Arrays and
Hashes to new users of a dynamic language. At that point, I usually
ditch the nametag metaphor and start using fishing line that goes
between the nametag and the boxes. Arrays, then, are boxes with a
bunch of nails on them, with 0…n drawn on the heads. The fishing
line gets tied between a nail and a certain box, and the Janitor’s
broom sweeps away anything that isn’t tied to something else.

The benefit of the fishing line example is that it also helps with
scope. When I’m standing in one spot of the room, I can only see the
nametags on the floor near me, even though the fishing line on them
runs over to another box.


#17

Hi –

On Sun, 8 Jan 2006, Dave H. wrote:

that I’d be really perplexed.

A number of methods in the various Rails libraries only work with
symbol arguments. For example, you cannot do:

SomeModel.find("all")

or

class SomeModel < ActiveRecord::Base
  has_many "others"
end
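How a method ends up Symbol-only can be sketched in plain Ruby. The code below is an assumption for illustration, not ActiveRecord's implementation: strict_find and lenient_find are made-up names. A case statement that matches against Symbols simply never matches a String:

```ruby
# Hypothetical finder that compares its argument against Symbols only.
def strict_find(which)
  case which
  when :all   then "every record"
  when :first then "the first record"
  else raise ArgumentError, "unknown option: #{which.inspect}"
  end
end

# A more forgiving variant normalises its argument first.
def lenient_find(which)
  strict_find(which.to_sym)
end

puts strict_find(:all)       # every record
puts lenient_find("all")     # every record

begin
  strict_find("all")
rescue ArgumentError => e
  puts e.message             # unknown option: "all"
end
```

Whether a library accepts both is a design choice; the strict version is not so much hostile to Strings as written without a conversion step.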

David


David A. Black
removed_email_address@domain.invalid



#18

On Jan 6, 2006, at 8:48, Ross B. wrote:

f = File.read(:"/etc/passwd")

TypeError: can't convert Symbol into String

Hmm. But that’s the opposite of what I said. I’m not too dismayed if
something says “no, no, I don’t want a Symbol. I want an honest-to-gosh
String.” It’s if it says “No, a String is unacceptable. I only take
Symbols” that I’d be really perplexed.


#19

On Sat, Jan 07, 2006 at 12:57:21PM +0900, Chris P. wrote:

(Don’t get me wrong: I actually found it a delightful analogy!
Brilliant, in fact! But I’ve been using symbols happily for years.
An analogy like this feels more like an inside joke than an
explanation. :)

I tend to assume one knows a little something about strings and variable
assignments and arithmetic operators and integers before one gets around
to learning symbols. As such, it’s probably not quite that useless.

At least, that’s what I’d guess would be the case if you’re learning in
anything approaching a structured manner, from tutorials, books,
classes, howtos, or whatever. Maybe a mailing list doesn’t count.


Chad P. [ CCD CopyWrite | http://ccd.apotheon.org ]



#20

I think the keychain analogy has some possibilities.

On Jan 7, 2006, at 7:40 PM, Dave H. wrote:

Ruby make a new keychain tagged “sillyNum.” There isn’t a key on
this yet. That’s what “nil” is; a keychain without a key.

No. The new keychain has a single key on it to the mailbox
containing the object known as nil.

sillyNum = population + 14 / "5".to_i

How about something a little simpler:

a = b + 1

Ruby finds the mailbox that matches the key on the keychain labeled b.

There is a slot on the side of the object inside the mailbox.
Ruby makes a copy of the key on the :+ keychain and inserts it in the slot.
Ruby makes a copy of the key on the 1 keychain and inserts it in the slot.
Ruby then presses a button next to the slot. The button is labeled ‘send’.
Ruby waits a bit and then a new key clanks as it falls into a bin
  labeled ‘return value’.
Ruby attaches the new key to the keychain labeled a, discarding any
  key that was there before.
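The 'send' button in this story corresponds to a real method. As a rough illustration (the value 41 is arbitrary), the expression b + 1 can be written as an explicit message send, with the Symbol :+ naming the method:

```ruby
b = 41

a = b.send(:+, 1)            # insert the :+ key and the 1 key, press 'send'
puts a                       # 42
puts a == b + 1              # true: the operator form does the same thing

puts b.public_send(:+, 1)    # 42 (public_send refuses private methods)
puts b.respond_to?(:+)       # true
```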

Gary W.