Why was the "Symbol is a String"-idea dropped?

Robert K. wrote:

On 15.05.2007 03:07, enduro wrote:

Or, in other words: if the decision to unify Symbol and String would
have been taken at early stages of Ruby development, then the
general usage would have adapted to this, and …
we might be happier with the result today.

I am in no way unhappy with the way it is today.
Strings and symbols serve different purposes although there is some
overlap.
I rarely feel the need to convert between the two.

I see.
And I am quite surprised. Because judging from your online activity
you seem to have some experience.
Perhaps it is also my programming style: I may use symbols where one
normally would use strings.

I am not aware of a situation where you would need to mix them as hash
keys.
And to make the distinction is pretty easy most of the time IMHO.

Not aware? I mean Rails mixes them, right?

Frankly, I believe there is an inherent advantage that you can use
symbols vs. strings in code.
And I mean not only performance wise but also readability wise.

Readability-wise: precisely what advantage?
The only thing that comes to my mind just now, is
that a separated Symbol class easily provides
distinct special values for a parameter that would normally carry a
String.

Note though, that all these issues have nothing to do with the question
whether String and Symbol should be connected inheritance wise.
IMHO that’s mostly an implementation decision in Ruby.

Yes, I agree.
I am actually interested in the implications for the programmer.
My original question just arose out of the notion
that this implementation decision could have been a move
in a (to my mind) favourable direction.

Yes, I sometimes think of that separation of Symbol from String
as a tiny impurity in the Ruby crystal.

Personally I believe it creates more expressiveness.
If you view this as impurity, there are a lot of them in Ruby because
Ruby’s focus
has always been on pragmatism and not purity

  1. The core structure must of course be large enough, and a large
    structure may look impure.
  2. But regarding this particular question: My original notion was that
    keeping
    Symbol and String too separate is not pragmatic.
    (I may change my mind on that, if I read more posts like yours,
    though.)

So, I’ll just have to come to terms with it. :slight_smile:
(And I will, of course – there are enough other fascinating
issues… :slight_smile: )

The capability to adjust to reality is a useful one IMHO. :slight_smile:

Well, yes, sometimes I’m glad someone tells me that. :slight_smile:

create a class hierarchy similar to the Float/Integer hierarchy?
String < Stringlike
Symbol < Stringlike

Why not? StringLike could even be a module that relies solely on []
and length to do all the non mutating stuff.

Ah, interesting. Can’t follow the implications right now.

Given the fact that I don’t mix symbols and strings as Hash keys I
wouldn’t benefit -
but it would not hurt me either. :slight_smile: YMMV

Yes that was the idea behind it: to benefit some and not to hurt the
others.

Credits also go to the community that is still among the most
civilized online communities I know so far!

Indeed, I’m experiencing it right now!
Thanks a lot!

Sven

Ooops!

sorry if I came across rude in any way.

I don’t want to “own” the thread.
But I am interested in my question,
so I was glad that someone repeated it,
at a time when all the answers up to that point had not yet answered it.

Robert D. wrote:

The fact that the original idea is a big paradigm shift does not
answer your question?

Sorry, no. If someone had told me that this fact was the basis for the
decision of the core team,
that would have answered my question.
(Because the fact alone is not compelling: If a paradigm shift is
possible and good then why not shift?)

And also, I thought that this was the right place for posting the
question.
(Actually, until yesterday I didn’t know that I could post on ruby-core,
I thought it was just for “cracks”, because it’s read-only on
ruby-forum.com)

Kind Regards
Sven

And here again, Robert D.'s full text:

only you might post to ruby-core only.
I honestly do not understand that.

Threads just evolve I do not feel that they belong to OP :).
They do not belong to me either of course ;).
Cheers
Robert

Another question:
Who is

YHS

?

Regards, Sven

On 15.05.2007 12:31, enduro (Sven S.) wrote:

serve different purposes although there is some overlap. I rarely feel
the need to convert between the two.

I see.
And I am quite surprised. Because judging from your online activity
you seem to have some experience.
Perhaps it is also my programming style: I may use symbols where one
normally would use strings.

Yeah, maybe. So where are you using symbols where one normally would
use strings?

I am not aware of a situation where you would need to mix them as hash
keys. And to make the distinction is pretty easy most of the time IMHO.

Not aware? I mean Rails mixes them, right?

I don’t use Rails. :-)))

Frankly, I believe there is an inherent advantage that you can use
symbols vs. strings in code. And I mean not only performance wise but
also readability wise.

Readability-wise: precisely what advantage?

If I see a symbol being used as a Hash key I immediately know (or rather
guess) that there is only a limited amount of them and they are known
beforehand, like with options.

# silly example
opts = {
  :length => 12,
  :width => 30,
}

# other code
resize( opts[:length] )

Whereas when strings are used it’s typically stuff that is read from
somewhere, like (another silly example):

ruby -aF: -ne 'BEGIN { $c=Hash.new(0) }; $c[$F[1]]+=1; END { $c.each {|k,v| print k, "=", v, "\n"}}' /etc/passwd

The only thing that comes to my mind just now, is
that a separated Symbol class easily provides
distinct special values for a parameter that would normally carry a String.

Don’t forget the optical distinction between using ‘string’, “string”
and :symbol.

Note though, that all these issues have nothing to do with the question
whether String and Symbol should be connected inheritance wise. IMHO
that’s mostly an implementation decision in Ruby.

Yes, I agree.
I am actually interested in the implications for the programmer.
My original question just arose out of the notion
that this implementation decision could have been a move
in a (to my mind) favourable direction.

As we all have different habits what may be favorable for one may be
regrettable for the other. :slight_smile:

Yes, I sometimes think of that separation of Symbol from String
as a tiny impurity in the Ruby crystal.

Personally I believe it creates more expressiveness. If you view this
as impurity, there are a lot of them in Ruby because Ruby’s focus
has always been on pragmatism and not purity

  1. The core structure must of course be large enough, and a large
    structure may look impure.

This somehow reminds me of

  1. But regarding this particular question: My original notion was that
    keeping
    Symbol and String too separate is not pragmatic.
    (I may change my mind on that, if I read more posts like yours,
    though.)

Just reread mine a few times - then you don’t need the other postings
any more. That’s more efficient - you’ll save bandwidth and reading is
actually faster if you know the text already. :-))

So, I’ll just have to come to terms with it. :slight_smile:
(And I will, of course – there are enough other fascinating
issues… :slight_smile: )

The capability to adjust to reality is a useful one IMHO. :slight_smile:

Well, yes, sometimes I’m glad someone tells me that. :slight_smile:

:-)) No sweat - following visions is useful as well. As always it’s
the mix…

create a class hierarchy similar to the Float/Integer hierarchy?
String < Stringlike
Symbol < Stringlike

Why not? StringLike could even be a module that relies solely on []
and length to do all the non mutating stuff.

Ah, interesting. Can’t follow the implications right now.

For example regexp matching might be implemented similarly for both
(i.e. just in one place). But then again, since RX functionality is
highly integrated into the language that might not be a good idea - or
the C code needs to become more complex to react differently if it sees
a String or Symbol vs. some custom class that includes this module.
Hm…
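As a rough illustration of the StringLike idea, here is a minimal sketch of a module whose non-mutating helpers rely only on [] and length. The method names (each_character, begins_with?, position_of) are made up for this example and are not part of Ruby; the sketch also assumes a Ruby version in which Symbol responds to [] and length.

```ruby
# Hypothetical sketch: non-mutating helpers built only on #[] and #length.
module StringLike
  # Yield each one-character string in turn.
  def each_character
    return enum_for(:each_character) unless block_given?
    length.times { |i| yield self[i, 1] }
    self
  end

  # True if the receiver begins with the characters of +prefix+.
  def begins_with?(prefix)
    prefix = prefix.to_s
    self[0, prefix.length] == prefix
  end

  # Index of the first occurrence of +needle+, or nil if absent.
  def position_of(needle)
    needle = needle.to_s
    (0..(length - needle.length)).find { |i| self[i, needle.length] == needle }
  end
end

# Reopening core classes is only done here to demonstrate the idea.
class String; include StringLike; end
class Symbol; include StringLike; end

"foobar".begins_with?(:foo)
:foobar.position_of("bar")
```

Because the helpers never call a mutating method, the same module body serves both classes without Symbol having to inherit from String.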

Given the fact that I don’t mix symbols and strings as Hash keys I
wouldn’t benefit -
but it would not hurt me either. :slight_smile: YMMV

Yes that was the idea behind it: to benefit some and not to hurt the
others.

The next best thing to a win win situation. :-))

Credits also go to the community that is still among the most
civilized online communities I know so far!

Indeed, I’m experiencing it right now!
Thanks a lot!

You’re welcome. Thank /you/!

Kind regards

robert

Hello again,

Robert K. wrote:

On 15.05.2007 12:31, enduro (Sven S.) wrote:

Perhaps it is also my programming style: I may use symbols where one
normally would use strings.

Yeah, maybe. So where are you using symbols where one normally would
use strings?

Let me guess, because I don’t know if I am really the only one:

  1. Multipurpose-names:
    Like option-names, used as hash keys but also as names and labels for
    the corresponding graphics control etc.
  2. Logging:
    Giving a brief hint in the form of a symbol (not the log level), well
    just because it is easier to type and looks nice

Not aware? I mean Rails mixes them, right?

I don’t use Rails. :-)))

Oops :-), offending again, am I? :slight_smile:
:slight_smile:

like with options.

# silly example
opts = {
  :length => 12,
  :width => 30,
}

# other code
resize( opts[:length] )

Sorry, don’t get me wrong:

I DID NOT MEAN TO REMOVE the Symbol class.
Nor Symbol literals.

Thus, your examples would be valid and semantically equivalent code
after a “unification” of the classes (regardless if Symbol < String or
not).
Or I’d better not call it “unification”, I don’t have a good word,
perhaps “joining” would be better.

Don’t forget the optical distinction between using ‘string’, “string”
and :symbol.

Also, this won’t be affected, see above.

[…] on pragmatism and not purity

  1. The core structure must of course be large enough, and a large
    structure may look impure.

This somehow reminds me of
Gödel's incompleteness theorems - Wikipedia

… mystery will always remain …

  1. But regarding this particular question: My original notion was
    that keeping
    Symbol and String too separate is not pragmatic.
    (I may change my mind on that, if I read more posts like yours,
    though.)

Just reread mine a few times - then you don’t need the other postings
any more.
That’s more efficient - you’ll save bandwidth and reading is actually
faster if you know the text already. :-))

Well,
as Ruby-users,
we don’t sacrifice our fun to the god of efficiency, do we… :slight_smile:

Cheers,
Sven

On 5/15/07, enduro (Sven S.) wrote:

The fact that the original idea is a big paradigm shift does not
answer your question?

Sorry, no. If someone had told me that this fact was the basis for the
decision of the core team,
that would have answered my question.
(Because the fact alone is not compelling: If a paradigm shift is
possible and good then why not shift?)

Sure that was exactly the thing I wanted to discuss and suddenly
someone told me hey stay On Topic. That was strange but not rude at
all. I mean neither Xavier nor you, you are very civilized and polite
people- maybe much more than YHS :wink:
I just had the feeling that the answers you will get on this list will
never correspond to your exact question, and I was wrong as Matz
stepped by.

I admit that personally I have a big problem with “A symbol is a
string”, but brighter people than me like Tom and Matz have not or did
not have, so maybe indeed I am making too much noise while thinking
:(.

But please remember too that there are only complicated answers to
simple questions ;).

And also, I thought that this was the right place for posting the question.
(Actually, until yesterday I didn’t know that I could post on ruby-core,
I thought it was just for “cracks”, because it’s read-only on
ruby-forum.com)

I definitely should have pointed that out first and then I could have
taken all the time to rant/argue/discuss the technical points, oh boy
how difficult communication can be sometimes!

Kind Regards
Sven

Cheers
Robert

On Tue, May 15, 2007 at 06:42:04PM +0900, enduro (Sven S.) wrote:

And you also open yourself to a symbol exhaustion denial-of-service.

Yes, of course.
But my point is: Let the system take care of that.
I want a Ruby that just works - crystal-clear, transparently, reliably.
:slight_smile:
And it already does in most cases. And there is a lot that can be improved.
And one such improvements could be a garbage collection for symbols. (I
think.)

But then what you want are not symbols, but true immutable strings. By that
I mean: some object where I can write 10MB of binary dump. If I want to add
one character to the end of it, then I create another object containing
10MB+1byte of binary dump, and the old 10MB object is garbage-collected.

Now, there have been arguments that all strings in Ruby should have been
immutable in the first place, and I can sympathise with them. After all,
numbers are immutable, and so are certain other classes. But pragmatically,
there are cases where it is just so useful to append to a string. Besides,
maintaining the singleton property is hard for large binary objects - i.e.
when I create another 10MB binary dump, I have to check whether it’s the
same as any other object which already exists.

(And of course, very large numbers are Bignums, which are not singletons)

That is, as far as I know, the symbol table is never garbage collected. Once
a symbol, always a symbol.

I’m not a core programmer, maybe I am asking too much,
but I think it should be possible without slowing anything down.
One very simple idea I can think of is the following:
Set a limit to the number of symbols, and if it is reached
the GC will be invoked in a special symbol mode, marking all symbols that are
still in use and completely regenerating the symbol table from scratch.

Yes, but why??? In real life, real world programs, only a few hundred unique
method names are used. So let them be symbols.

If you are going to create a million different symbols, or symbols which are
millions of bytes long, then use a String. That’s what they are there for!
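One way to follow this advice when names arrive from outside the program is to map them through a fixed table instead of calling to_sym on arbitrary input, so the set of symbols stays bounded. A minimal sketch; KNOWN_OPTIONS and option_key are hypothetical names, and the option names are borrowed from the opts example earlier in the thread.

```ruby
# Hypothetical whitelist: the only symbols ever produced are the ones
# listed here, no matter what strings come in from outside.
KNOWN_OPTIONS = {
  "length" => :length,
  "width"  => :width,
}.freeze

# Look up the symbol for an externally supplied name, or fail loudly.
def option_key(name)
  KNOWN_OPTIONS.fetch(name.to_s) do
    raise ArgumentError, "unknown option: #{name.inspect}"
  end
end

option_key("width")
```

Anything that is not in the table stays a String (or is rejected), which is the division of labour described above.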

“Doctor, it hurts when I do this” – “Then don’t do that!”

What you seem to be saying is “I don’t want there to be two different types
of object, one for method names and one for holding blobs of data”, but I
don’t understand this. Symbols work, are fast, and personally I find them
aesthetically pleasing: one is a sort of tag for method names, and one is a
holder of blobs of data which may come from the outside world or from my own
computations.

Yes, I really must admit, I also like the cleanness of current Symbols.
But then, my experience is that this clearness is not worth a lot,
because the border towards “dirty” strings must be crossed often.
(That’s why I called sticking to the clearness “tempting” in my last post.)

I don’t think so. The examples I’ve seen so far are:

(1) Method names which are created algorithmically. That is, you know you
have a method called “foo” and you want to call another method called
“foo=”. It works, where’s the problem?

send("#{mname}=")

Yes, you’ve made a conversion to a string, and back again. Big deal. The
only way to improve this would be to have symbol algebra, e.g.
(:foo + :=) == :foo=

But internally it would almost certainly be implemented the same way,
because you’d have to look up the symbol ID to convert it into its character
representation, manipulate the characters, and then lookup back into a
symbol.
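For completeness, here is what the string round trip looks like in practice. The Point struct and the setter_name helper are hypothetical and exist only for this sketch.

```ruby
# Hypothetical class with an accessor pair x / x=.
Point = Struct.new(:x)

# Build the setter's name from the getter's name. The interpolation
# converts the symbol to characters and back into a symbol.
def setter_name(mname)
  :"#{mname}="
end

pt = Point.new(1)
pt.send(setter_name(:x), 5)   # same effect as pt.x = 5
```

Passing the string "#{mname}=" straight to send, as in the line above, works just as well; building the symbol explicitly only makes the conversion visible.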

Or, you’d have to drop symbols entirely and make every method call use a
string of characters as the method name - which would be very expensive.

Or, you’d have to make all Strings immutable, so that the string ID
could be used as a method call tag. See above for reasons why that is
undesirable.

(2) Rails, which allows you to be inconsistent between :foo=>:bar and
:foo=>"bar" and "foo"=>:bar and "foo"=>"bar" (at least sometimes - not
always). IMO it would have been better if Rails had stuck to one or the
other, but that’s too late to undo.

Rails has introduced its own bast^H^H^H^Hextensions to the language anyway.

Ruby is not yet good in many other aspects:
speed, threads, documentation.

There is really excellent documentation for Ruby. You have to pay for it,
but the books I am thinking of are well worth the money.

You may not like the idea that the language designer and contributors are
not getting any money directly for their work, whilst book publishers are. I
can live with that.

I find that speed is good enough, and threads are better than most (have you
tried writing threaded programs in Perl?)

The language is the crystal. It must be good in the beginning,
it becomes more solid with every project written in that language.

Many people don’t seem to realise that Ruby is, what, 15 years old now?

Regards,

Brian.

On 15.05.2007 15:54, Robert D. wrote:

class IString < String
  def initialize str
    super(str)
    freeze
  end
end

What advantages does this have over using “freeze” directly?

str = "foo".freeze

It seems using a new class will increase the likelihood of things to
break.

HTIOI (Hope this is of interest :wink:

LOL

You see things; and you say Why?
But I dream things that never were; and I say Why not?
– George Bernard Shaw

Greetings to George, btw. :slight_smile:

robert

On 5/15/07, Brian C. wrote:

But then what you want are not symbols, but true immutable strings. By that
I mean: some object where I can write 10MB of binary dump. If I want to add
one character to the end of it, then I create another object containing
10MB+1byte of binary dump, and the old 10MB object is garbage-collected.

But of course we have immutable strings already :)))

class IString < String
  def initialize str
    super(str)
    freeze
  end
end

HTIOI (Hope this is of interest :wink:

Cheers
Robert

You see things; and you say Why?
But I dream things that never were; and I say Why not?
– George Bernard Shaw

Not responding to any particular posting.

One of the false memes that some folks on this thread seem to hold is
that Symbols are integers.

They aren’t.

Any more than they are strings.

A given ruby symbol has both a string and an integer representation,
which can be obtained by using to_s and to_i. But one wouldn’t say
that the object 1.2 is a string because it has a string
representation, or that the object “123” was an integer because it has
an integer representation.

The essential fact about symbols is that if two symbols have the same
string representation they are the same object, and that two different
symbols have two different integer representations. Or more formally

 sym1.to_s == sym2.to_s  iff sym1.object_id == sym2.object_id
 sym1.to_i == sym2.to_i iff sym1.object_id == sym2.object_id

One way to implement this is to keep internal tables which map the
string and integer representations of symbols to each other, and to
have functional mappings between the object_ids and integer
representations of symbols. This is how ruby does it. Creating a
symbol from a string consists of looking for the string in the mapping
from strings to integer representations, and if it’s not found
assigning the next integer rep and adding the string and integer rep
to the internal tables. This operation, called interning, happens
either at parse time when :foo is encountered, or later when an
expression like ‘foo’.to_sym is executed.
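The identity property can be checked directly. This is a small sketch, not taken from the thread; String.new is used so that every string is guaranteed to be a fresh object.

```ruby
# A symbol written literally and one interned at run time from a string.
literal  = :foo
interned = String.new("foo").to_sym

# Two separately created strings with the same characters.
a = String.new("foo")
b = String.new("foo")

same_symbol_object  = literal.equal?(interned)  # identity of the symbols
strings_equal       = (a == b)                  # same characters
strings_same_object = a.equal?(b)               # but two distinct objects
```

Here same_symbol_object and strings_equal come out true while strings_same_object is false: interning gives every spelling exactly one symbol, whereas strings are only equal by content.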

The meme that “Symbols are Integers” probably lingers from an earlier
version of Ruby before there was an actual Symbol class. Back then,
symbols really were instances of Fixnum, but no more. This lives on
vestigially in that Symbol does have a to_int method as well as to_i,
but to_int is deprecated, using it produces a warning :

rick@frodo:~$ ruby -w -e"p :sym.to_int"
-e:1: warning: treating Symbol as an integer
10409

while to_i does not.
rick@frodo:~$ ruby -w -e"p :sym.to_i"
10409

Other languages, like Smalltalk, with similar concepts don’t associate
integer representations with Symbols, in these languages the internal
mapping simply maps string representations to object id’s, or to the
symbol objects themselves. I suspect that this feature of Ruby symbols
is simply due to the earlier implementation.

Now what are the useful properties of Symbols:

1. Detecting whether or not two symbols are equal is as fast as
   comparing their object_ids. This is an O(1) operation.
   Detecting whether or not two strings are equal requires a scan of
   both strings until either an unequal character is found or the end of
   both strings is reached. This is an O(n) operation.
2. Having 1000 ‘instances’ of a symbol with a particular string
   representation takes no more space than having 1.

Property 1 means that things like hashes with symbol keys are somewhat
faster than hashes with string keys. This is why symbols are used as
method selectors, since dispatching a method call requires repeated
lookup in the method tables going up the inheritance chain. This is a
win if the key is looked up multiple times, there is an initial cost
of interning the symbol (which essentially consists of looking for the
string representation in an internal global symbol table) but this
cost is amortized over subsequent lookups.
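The space property is easy to observe by counting distinct objects. A minimal sketch with made-up variable names:

```ruby
# Intern the same spelling 1000 times, and build 1000 separate strings.
symbols = Array.new(1000) { String.new("name").to_sym }
strings = Array.new(1000) { String.new("name") }

# Count how many distinct objects each array actually refers to.
distinct_symbols = symbols.map(&:object_id).uniq.size
distinct_strings = strings.map(&:object_id).uniq.size
```

distinct_symbols is 1 and distinct_strings is 1000: every interning after the first is just a table lookup that hands back the existing symbol.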

It seems that the HashWithIndifferentAccess class added by Rails in
ActiveSupport, which allows symbols and strings to be used
interchangeably as keys, doesn’t actually take advantage of this since
it uses symbols converted to strings as the actual keys rather than
the other way around. This provides a bit of syntactic sugar, without
getting either the performance or space advantages of using symbols.
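To make the trade-off concrete, here is a toy hash that normalises symbol keys to strings, in the spirit of what is described above. It is a hypothetical sketch and not the ActiveSupport implementation; the class name IndifferentHash and its methods are invented for the example.

```ruby
# Toy sketch: symbols and strings are interchangeable as keys because
# every symbol key is converted to a string before it touches the store.
class IndifferentHash
  def initialize
    @store = {}
  end

  def []=(key, value)
    @store[normalize(key)] = value
  end

  def [](key)
    @store[normalize(key)]
  end

  # Exposes what is really stored, to show that the keys are strings.
  def keys
    @store.keys
  end

  private

  def normalize(key)
    key.is_a?(Symbol) ? key.to_s : key
  end
end

h = IndifferentHash.new
h[:length] = 12
h["length"]
```

Every lookup with a symbol pays for a to_s conversion and a string hash, which is why this style gains convenience but none of the symbol advantages.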

As for incompatibilities caused by the experiment, I’m not sure exactly
what Matz and the core team ran into but certainly this would break
code like:

case arg
when String
  # do something
when Symbol
  # do something else
end

Code like this exhibits the fragility of doing discrimination based on
classes in the face of refactoring.
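The breakage can be simulated without touching the real classes. In this sketch Text stands in for String and Tag for a Symbol that inherits from it; both classes and both methods are hypothetical.

```ruby
class Text; end         # stands in for String
class Tag < Text; end   # stands in for a Symbol that is a kind of String

# Same shape as the case statement above: the superclass is tested first.
def classify(arg)
  case arg
  when Text then :string_branch
  when Tag  then :symbol_branch
  end
end

# Testing the more specific class first restores the old behaviour.
def classify_specific_first(arg)
  case arg
  when Tag  then :symbol_branch
  when Text then :string_branch
  end
end

classify(Tag.new)                 # lands in the string branch
classify_specific_first(Tag.new)  # lands in the symbol branch
```

With the inheritance in place the second branch of classify can never be reached, so existing code would silently change behaviour rather than fail.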


Rick DeNatale

My blog on Ruby
http://talklikeaduck.denhaven2.com/

On 5/15/07, Robert K. wrote:

class IString < String
  def initialize str
    super(str)
    freeze
  end
end

What advantages does this have over using “freeze” directly?

Dunno :slight_smile:

x = IString.new("Hello World") # Not even tested yet
vs.
x = "HelloWorld".freeze

Well the first one has the advantage that I thought about it :wink:

Now I reckon that the subclass stuff is baaad

def blah str
  raise ArgumentError unless IString === str

end

but now someone does
class MString < IString
  # get rid of the freeze (by calling superclass.superclass.new in
  # self.class.new e.g)
end

and my code is broken, while in

def blah str
  raise ArgumentError unless str.respond_to?(:frozen?) && str.frozen?

end

frozen is frozen forever.

So do what Robert told you and beware of what Robert told you;)

But I dream things that never were; and I say Why not?
– George Bernard Shaw

Greetings to George, btw. :slight_smile:
Well last time I met him he was admiring your posts to the list :stuck_out_tongue:

    robert

idem

On 5/15/07, Rick DeNatale wrote:

It seems that the HashWithIndifferentAccess class added by Rails in
ActiveSupport, which allows symbols and strings to be used
interchangeably as keys, doesn’t actually take advantage of this since
it uses symbols converted to strings as the actual keys rather than
the other way around. This provides a bit of syntactic sugar, without
getting either the performance or space advantages of using symbols.

Is this whole String vs. Symbol idea motivated by Rails stuff?
I just do not know Rails but I would guess it is a dangerous thing if
paradigms that are useful in an application framework - even if it is
such a Great One as Rails - are to be applied to a General Purpose
Language.

I will rephrase OP’s question now, why the h[ae]ck did the Core team
think about unifying Strings and Symbols in the first place ???
That is for sure something very interesting.

Robert

On Tue, May 15, 2007 at 10:54:05PM +0900, Robert D. wrote:

super(str)
freeze

end
end

Yes, but it’s not a singleton.

It would only be of interest as a Symbol replacement if IString.new("foo")
always returned the same object. You could implement this using the Multiton
pattern I think.

Then you could safely use IString#object_id as a method name key.
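A minimal sketch of that multiton variant, assuming a class-level pool keyed by the string's contents. The POOL constant and the overridden IString.new are illustrative choices; the sketch is not thread-safe, and like the symbol table the pool never shrinks.

```ruby
# Hypothetical multiton: IString.new returns one shared frozen object
# per distinct character sequence.
class IString < String
  POOL = {}

  def self.new(str)
    key = str.to_s
    # Create and freeze the instance only the first time a spelling is seen.
    POOL[key] ||= super(key).freeze
  end
end

a = IString.new("foo")
b = IString.new("foo")
c = IString.new("bar")
a.equal?(b)   # the same object both times
```

With this in place a.object_id is stable for a given spelling, which is the property needed to use it as a lookup key.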

Regards,

Brian.

On Tue, May 15, 2007 at 11:53:08PM +0900, Brian C. wrote:

Yes, but it’s not a singleton.

It would only be of interest as a Symbol replacement if IString.new("foo")
always returned the same object. You could implement this using the Multiton
pattern I think.

Then you could safely use IString#object_id as a method name key.

P.S. I’m aware of Symbol#to_i, but to_i and object_id appear to be
intimately related:

irb(main):001:0> :foo.to_i
=> 14817
irb(main):002:0> :foo.object_id
=> 148178
irb(main):003:0> :bar.to_i
=> 16081
irb(main):004:0> :bar.object_id
=> 160818
irb(main):005:0> :zzzzzzzzzzzzzzzz.to_i
=> 16089
irb(main):006:0> :zzzzzzzzzzzzzzzz.object_id
=> 160898
irb(main):007:0> :puts.to_i
=> 7345
irb(main):008:0> :puts.object_id
=> 73458
irb(main):009:0>

i.e. I don’t think the symbol table maintains an explicit integer key for
each symbol.

On May 15, 2007, at 10:53 AM, Brian C. wrote:

collected.
But of course we have immutable strings already :)))

class IString < String
  def initialize str
    super(str)
    freeze
  end
end

Yes, but it’s not a singleton.

You’ve stated or implied a couple of times in this discussion that
symbols are ‘singletons’, but I thought the conventional definition
of ‘singleton’ was of a class with only a single instance, where the
instance is called a singleton. That doesn’t describe Ruby’s symbols.

I think what you are getting at is the idea that identity and
equality are one and the same for symbols. Fixnum instances also
have this property but floats don’t. Is there a standard term for
that characteristic? I think in mathematics it would be an equivalence
relation ~ such that If x ~ y then x = y for all x, y in the set.
In this case ~ represents Ruby’s == and = represents Ruby’s equal?.

On 15.05.2007 16:34, Robert D. wrote:

10MB+1byte of binary dump, and the old 10MB object is
What advantages does this have over using “freeze” directly?

and my code is broken, while in

def blah str
  raise ArgumentError unless str.respond_to?(:frozen?) && str.frozen?

end

frozen is frozen forever.

Correct. And since #frozen? is defined in Kernel you can skip the first
test.

So do what Robert told you and beware of what Robert told you;)

:slight_smile:

You see things; and you say Why?
But I dream things that never were; and I say Why not?
– George Bernard Shaw

Greetings to George, btw. :slight_smile:
Well last time I met him he was admiring your posts to the list :stuck_out_tongue:

Wow! So he didn’t die but just went home like this other guy who
invented a vi clone (or at least provided his name for the operation)…
:slight_smile:

    robert

idem

:slight_smile:

While we’re at it: if you want to define something (and are a fan of
C++) you can do this:

irb(main):001:0> module Kernel
irb(main):002:1> private
irb(main):003:1> def const(*a) a.each {|x| x.freeze } end
irb(main):004:1> end
=> nil
irb(main):005:0> nil
=> nil
irb(main):006:0> foo, bar = const "foo", "bar"
=> ["foo", "bar"]
irb(main):007:0> ["foo", "bar"]
=> ["foo", "bar"]
irb(main):008:0> foo << bar
TypeError: can't modify frozen string
from (irb):8:in `<<'
from (irb):8
from :0
irb(main):009:0> bar << foo
TypeError: can't modify frozen string
from (irb):9:in `<<'
from (irb):9
from :0
irb(main):010:0>

Hihi…

Kind regards

robert

On 5/15/07, Robert K. wrote:

frozen is frozen forever.

Correct. And since #frozen? is defined in Kernel you can skip the first
test.

No, you are an optimist Robert :wink:

irb(main):003:0> Kernel.send :remove_method, :frozen?
=> Kernel
irb(main):004:0> "a".frozen?
NoMethodError: undefined method `frozen?' for "a":String
from (irb):4
from :0

But maybe we should not worry too much about that kind of meta-hackery
in our design, because one could trick us anyway, e.g.

class String; def frozen?; true end end

So you are right after all :wink:

Cheers
Robert

On 5/15/07, Robert D. wrote:

On 5/15/07, Rick DeNatale wrote:

It seems that the HashWithIndifferentAccess class added by Rails in
ActiveSupport, which allows symbols and strings to be used
interchangeably as keys,

Is this whole String vs. Symbol idea motivated by Rails stuff?

I will rephrase OP’s question now, why the h[ae]ck did the Core team
think about unifying Strings and Symbols in the first place ???

I don’t know. Probably not motivated, but on the other hand it no
doubt stimulated a reconsideration of the relationship between String
and Symbol.

Whether or not Strings and Symbols have an inheritance relationship is
a bit of an accidental design choice. Keeping in mind that in a
language like Ruby or Smalltalk, the class hierarchy is really about
implementation factoring and not type specification, as a first
approximation, it doesn’t matter that much. In Smalltalk-80 Symbol is
a subclass of String, but I believe that Symbol overrode the methods
which mutate the instance to cause errors.

But once the decision was made, secondary effects ensue. If
programmers write code which depends on a particular inheritance
relationship like the case statement in my earlier post, then changes
to the decision will break things. It’s like the story about how
Stuart Feldman decided to use tab as a lexical element in makefiles
and treat them differently from the equivalent whitespace. He
realized that this was a bad decision, but too late.
From: http://www.faqs.org/docs/artu/ch15s04.html

"No discussion of make(1) would be complete without an
acknowledgement that it includes one of the worst design botches
in the history of Unix. The use of tab characters as a required leader
for command lines associated with a production means that the
interpretation of a makefile can change drastically on the basis of
invisible differences in whitespace.

Why the tab in column 1? Yacc was new, Lex was brand new. I hadn’t
tried either, so I figured this would be a good excuse to learn. After
getting myself snarled up with my first stab at Lex, I just did something
simple with the pattern newline-tab. It worked, it stayed. And then a
few weeks later I had a user population of about a dozen, most of them
friends, and I didn’t want to screw up my embedded base. The rest,
sadly, is history.
– Stuart Feldman

Not that I’m saying that Matz’s decision on Symbol not being a
subclass of String was a bad one, I’m not, and it’s certainly not in
the class of the tab/whitespace ‘decision’ in make. What I am saying
is that once made these decisions can quickly generate their own
requirements to exist once a user base has been established.


Rick DeNatale


On 5/15/07, Robert K. wrote:

You see things; and you say Why?
But I dream things that never were; and I say Why not?
– George Bernard Shaw

Greetings to George, btw. :slight_smile:
Well last time I met him he was admiring your posts to the list :stuck_out_tongue:

Wow! So he didn’t die but just went home like this other guy who
invented a vi clone (or at least provided his name for the operation)… :slight_smile:

Your conclusions are jumped :wink:
But sure would have liked to talk to this guy. As to Gödel or
Hemingway, well maybe I am OT now.

While we’re at it: if you want to define something (and are a fan of
C++) you can do this:

irb(main):001:0> module Kernel
irb(main):002:1> private
irb(main):003:1> def const(*a) a.each {|x| x.freeze } end
irb(main):004:1> end

hey that is quite nice!!!

On 5/15/07, Brian C. wrote:

=> 160818
irb(main):005:0> :zzzzzzzzzzzzzzzz.to_i
=> 16089
irb(main):006:0> :zzzzzzzzzzzzzzzz.object_id
=> 160898

Here’s part of the ruby1.8.5 code which computes an object’s object_id
from its reference value.

if (TYPE(obj) == T_SYMBOL) {
    return (SYM2ID(obj) * sizeof(RVALUE) + (4 << 2)) | FIXNUM_FLAG;
}

where SYM2ID is a C macro which shifts the value right 8 bits.

And here’s the code for Symbol#to_i:

static VALUE
sym_to_i(sym)
    VALUE sym;
{
    ID id = SYM2ID(sym);

    return LONG2FIX(id);
}
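Plugging Brian’s irb numbers into the object_id formula quoted above shows where the pattern comes from. This is a worked sketch under two assumptions that are not stated in the thread: sizeof(RVALUE) is taken to be 20 bytes (which fits a 32-bit build and reproduces the numbers), and the result is read as a Fixnum, whose tagged VALUE encodes n as 2n+1. The constant and method names are made up.

```ruby
SIZEOF_RVALUE = 20   # assumption: 32-bit Ruby 1.8 build
FIXNUM_FLAG   = 1    # low tag bit of a Fixnum VALUE

# Mirror of the C expression, followed by decoding the tagged Fixnum.
def symbol_object_id(id)
  value = (id * SIZEOF_RVALUE + (4 << 2)) | FIXNUM_FLAG
  value >> 1   # strip the tag bit to get the integer Ruby would print
end

symbol_object_id(14817)   # :foo.to_i  from the irb session
symbol_object_id(7345)    # :puts.to_i from the irb session
```

Under these assumptions object_id works out to 10 * to_i + 8, so 14817 gives 148178 and 7345 gives 73458, matching the session. The two values look intimately related because one is computed from the other.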

i.e. I don’t think the symbol table maintains an explicit integer key for
each symbol.

Actually it does, based on having recently read the ruby 1.8.5 code.

It keeps two internal hashes, one maps the string representation to
the integer representation, and the other maps the other way around.

The code for String#to_sym basically does this:

it calls rb_intern to get the integer representation called id, and returns
ID2SYM(id) which just returns id shifted left 8 bits, in other
words it’s the inverse of SYM2ID.

rb_intern searches for the string in the symbol table and returns
the id found there if it finds it.

otherwise, it calculates the integer representation by shifting the
next available id left by 3 bits and oring in some flag bits which
depend on the contents of the string, for example if the string starts
with a single “@” it’s flagged as an instance variable name.

It then makes a copy of the string and does the equivalent of
sym_table[stringcopy] = newly_computed_id
sym_rev_table[newly_computed_id] = stringcopy

Although these two aren’t ruby hash objects but C hash tables.

FWIW, Ruby hash objects use the same C hash code internally.
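The interning steps described above can be modelled in a few lines of Ruby. This is a toy sketch only: the real tables live in C, and the class name, the flag values and the name_of method are hypothetical.

```ruby
# Toy model of the two tables: string -> id and id -> string.
class ToySymbolTable
  ID_SHIFT      = 3   # ids are shifted left by 3 bits, as described above
  FLAG_PLAIN    = 0   # hypothetical flag values
  FLAG_INSTANCE = 1

  def initialize
    @sym_table     = {}
    @sym_rev_table = {}
    @last_id       = 0
  end

  # Return the existing id for +string+, or assign the next one.
  def intern(string)
    existing = @sym_table[string]
    return existing if existing

    @last_id += 1
    single_at = string.start_with?("@") && !string.start_with?("@@")
    flag = single_at ? FLAG_INSTANCE : FLAG_PLAIN
    id = (@last_id << ID_SHIFT) | flag

    copy = string.dup.freeze          # the table keeps its own copy
    @sym_table[copy] = id
    @sym_rev_table[id] = copy
    id
  end

  # Reverse lookup: the string representation for an id.
  def name_of(id)
    @sym_rev_table[id]
  end
end

table = ToySymbolTable.new
foo_id = table.intern("foo")
table.intern("foo") == foo_id   # interning twice yields the same id
```

The reverse table is what makes to_s possible: given only the integer, the original spelling can be recovered.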

What’s interesting is that a reference to a symbol doesn’t actually
point to an allocated object.


Rick DeNatale


On Wed, May 16, 2007 at 12:23:09AM +0900, Gary W. wrote:

one character to the end of it, then I create another object
end
equality are one and the same for symbols.
No, that’s not exactly what I meant, but sorry for not being more precise.
What I meant was: there is only ever one symbol object in existence for a
particular sequence of characters. :foo.object_id in one part of the program
is always the same as :foo.object_id elsewhere.

If it were Symbol.new("foo") always returning the same object then I guess
it would probably be called the multiton pattern.

Regards,

Brian.