Why was the "Symbol is a String"-idea dropped?


#1

Hello,

I was excited when I heard that
Symbol was made a subclass of String
in Ruby 1.9, September last year.

But then I heard that the experiment
was stopped after only two months.

And recently I have started to think about this
topic again and I’ve tried to collect the reasons
why the idea was not pursued any longer.

I have not been very lucky searching the net
for that, that’s why I am asking you:

Could someone give me a summary of the reasons
why the approach to make Symbol a subclass of String
is not considered for future Ruby versions anymore?
Or point me towards some information explaining that?

Thank you very much

Sven


#2

On Sat, May 12, 2007 at 04:20:10PM +0900, enduro wrote:

I have not been very lucky searching the net
for that, that’s why I am asking you:

Could someone give me a summary of the reasons
why the approach to make Symbol a subclass of String
is not considered for future Ruby versions anymore?
Or point me towards some information explaining that?

The two objects have very different behaviours, so why should one be a
subclass of the other?

  • Symbols are immutable, Strings are mutable
  • Symbols are singletons, Strings are not
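The two bullet points can be checked directly in irb. This is only a minimal sketch; the variable names are made up for illustration, and `String.new` is used so the strings are separate, unfrozen objects whatever the interpreter settings are.

```ruby
# Two strings with the same content are still two distinct objects.
first  = String.new("ruby")
second = String.new("ruby")
strings_equal     = (first == second)      # compares content
strings_identical = first.equal?(second)   # compares object identity

# Every occurrence of the same symbol is one and the same object.
symbols_identical = :ruby.equal?(:ruby)

# A string can be changed in place; a symbol is always frozen.
first << "!"
symbol_frozen = :ruby.frozen?

puts strings_equal, strings_identical, symbols_identical, symbol_frozen
```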

I think this is an example of the traditional OO dilemma: “is Circle a
subclass of Oval, or is Oval a subclass of Circle?” One argument says: a
Circle is a subclass of Oval because you can use an Oval to draw a Circle -
you just need to constrain its parameters. Another argument says: an Oval is
a subclass of Circle because it extends the behaviour of Circle.

Ruby says: we don’t care. Make a Circle class, and make an Oval class. Make
them both respond to whatever methods make sense (e.g. all shapes may be
expected to have a ‘draw’ method). If you want to share implementation code
between them, then use a mixin.
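As a rough sketch of that approach: two unrelated classes that both respond to `draw`, with the shared code in a mixin. The module name `Describable` and the method bodies are invented for this example, not taken from any library.

```ruby
# Shared implementation lives in a mixin; it only assumes that the
# including class provides #draw and #area.
module Describable
  def describe
    format("%s has area %.2f", draw, area)
  end
end

class Circle
  include Describable

  def initialize(radius)
    @radius = radius
  end

  def area
    Math::PI * @radius**2
  end

  def draw
    "circle(r=#{@radius})"
  end
end

class Oval
  include Describable

  def initialize(a, b)
    @a = a
    @b = b
  end

  def area
    Math::PI * @a * @b
  end

  def draw
    "oval(a=#{@a}, b=#{@b})"
  end
end

# Neither class inherits from the other, yet both can be used the same way.
[Circle.new(1), Oval.new(2, 1)].each { |shape| puts shape.describe }
```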

Regards,

Brian.


#3

On 5/12/07, Brian C. removed_email_address@domain.invalid wrote:

On Sat, May 12, 2007 at 04:20:10PM +0900, enduro wrote:

I was excited when I heard that
Symbol was made a subclass of String
in Ruby 1.9, September last year.

But then I heard that the experiment
was stopped after only two months.

you just need to constrain its parameters. Another argument says: an Oval is
a subclass of Circle because it extends the behaviour of Circle.

More a dilemma with languages which force implementation inheritance
to track a notion of type inheritance.

Such languages assume that somehow type hierarchies are natural and
objective. In reality they aren’t.

Years ago I was discussing this with Brad Cox, and he came up with
another example. In a strongly typed OO language you might have a
hierarchy like this:

class Object
class Vehicle < Object
class Automobile < Vehicle
class Car < Automobile
class Truck < Automobile
class Ambulance < Truck

So an ambulance is a specialized truck.

But then in a new context you might want to model a ski resort and now
an ambulance can be either a toboggan, or a helicopter.

These are the kind of things which tie strongly typed frameworks in
knots of implementation tangled with type.

Ruby says: we don’t care. Make a Circle class, and make an Oval class. Make
them both respond to whatever methods make sense (e.g. all shapes may be
expected to have a ‘draw’ method). If you want to share implementation code
between them, then use a mixin.

Or in other words, languages like Ruby provide fairly rich mechanisms
for sharing implementation, and don’t tangle this up with policy
decisions about how objects should be classified, which in the real
world can become messy, or at least context/requirements dependent.

If anyone wants to ponder the difficulties of making an objective
type/classification hierarchy in more depth, I’d recommend reading the
essay “What, if Anything, is a Zebra?” by Stephen Jay Gould, or for a
more in-depth and challenging read, “Women, Fire, and Dangerous
Things” by George Lakoff.

Rick DeNatale

My blog on Ruby
http://talklikeaduck.denhaven2.com/


#4

On May 12, 9:27 am, Brian C. removed_email_address@domain.invalid wrote:

why the idea was not pursued any longer.
subclass of the other?
Ruby says: we don’t care. Make a Circle class, and make an Oval class. Make
them both respond to whatever methods make sense (e.g. all shapes may be
expected to have a ‘draw’ method). If you want to share implementation code
between them, then use a mixin.

There are a number of advantages to sub-classing that I can think of:

  1. No need to do x.to_s.some_string_method.to_sym

  2. Hash keys could efficiently equate symbol and string keys (it’s
    the distinction that should be optional)

  3. It’s conceptually simpler: a Symbol is an immutable String.

I’m sure there are a few more. On the downside, Symbols might not be
as efficient in general, and there could be some back-compatibility
issues.

Would be interesting to know what effectively killed the official
attempt at this.

T.


#5

On 5/12/07, Trans removed_email_address@domain.invalid wrote:

There are a number of advantages to sub-classing that I can think of:

  1. No need to do x.to_s.some_string_method.to_sym

Well, let’s see. Why do we do symbol.to_s ?

1) When we want a string representation of the symbol so that we
can, say, mutate it. Subclassing won’t help here.
2) If we want to compare a string with a symbol. Making Symbol a
subclass of String alone won’t do this, and if we change Symbol#== so
that :a == "a" is true we destroy one of the big advantages of Symbols,
which is the speed of determining that two symbols are equal based on
their identity. This is why, for example, symbols rather than strings
are used as method selectors.

And why do we do string.to_sym? Primarily because we want the speed
advantages of symbols in comparisons.

  2. Hash keys could efficiently equate symbol and string keys (it’s
    the distinction that should be optional)

No, I think that we’d actually get the worst here; it falls out of 2)
above. Symbol hash keys are more efficient than String hash keys
because of identity.
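For reference, this is how the current separation behaves; a small sketch with made-up variable names:

```ruby
# A symbol and a string with the same text are not equal...
symbol_vs_string = (:a == "a")        # false
# ...unless one of them is converted explicitly.
same_text = (:a.to_s == "a")          # true

# The same distinction applies to hash keys.
options   = { a: 1 }
by_symbol = options[:a]               # 1
by_string = options["a"]              # nil, "a" is a different key

p symbol_vs_string, same_text, by_symbol, by_string
```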

  3. It’s conceptually simpler: a Symbol is an immutable String.

No it isn’t. A symbol is an object with an immutable string as a
name, and which is the sole instance with that name.

Now an interesting idea might be to add more string methods to Symbol
so that for example one could do

:a + :b #=> :ab

So that there was more of a Symbol algebra which was still closed so
that the results were still Symbols.
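Ruby does not define `Symbol#+`, so the following is only a sketch of what such a closed operation could look like, written as a plain helper rather than as a change to Symbol itself. The name `symbol_concat` is hypothetical.

```ruby
# Hypothetical helper: joins two symbols and returns a Symbol again,
# so the result stays inside "Symbol space".
def symbol_concat(left, right)
  :"#{left}#{right}"
end

joined = symbol_concat(:a, :b)
p joined          # :ab
p joined.class    # Symbol
```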


Rick DeNatale

My blog on Ruby
http://talklikeaduck.denhaven2.com/


#6

On 5/12/07, Rick DeNatale removed_email_address@domain.invalid wrote:

On 5/12/07, Trans removed_email_address@domain.invalid wrote:

Now an interesting idea might be to add more string methods to Symbol
so that for example one could do

:a <=> :b and including Comparable automatically.
I think that would be the most useful.
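Ruby releases from 1.9 onward do define `Symbol#<=>` and include Comparable in Symbol, so on a current interpreter this suggestion can be tried as is. A short sketch:

```ruby
# Symbols are compared by their names.
ordering   = (:a <=> :b)                     # -1
sorted     = [:pear, :apple, :fig].sort      # [:apple, :fig, :pear]

# Comparable supplies <, >, between? and friends on top of <=>.
comparable = Symbol.include?(Comparable)     # true
in_range   = :b.between?(:a, :c)             # true

p ordering, sorted, comparable, in_range
```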

For the rest I rather agree with Rick’s POV.
If one subclasses a class X with a class Y, one conceptually says
“an instance of Y” is “an instance of X”.
Could you say a Symbol is a String? No, you cannot unless a Symbol
responds to all messages of a String. In other words, subclasses
extend the behavior of base classes; they never restrict it.
Well, that holds for the old OO stuff I have learnt. Maybe I have to
change paradigm, but right now I am not convinced.
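One way to see the gap is to ask both kinds of object whether they respond to some of String's mutating methods. This is a sketch against a current Ruby; the list of method names is just a sample chosen for illustration.

```ruby
# A few String methods that change the receiver in place.
mutating_methods = %i[<< concat replace upcase!]

# A String responds to all of them, a Symbol to none of them.
string_has_all = mutating_methods.all? { |name| "text".respond_to?(name) }
symbol_has_any = mutating_methods.any? { |name| :text.respond_to?(name) }

p string_has_all   # true
p symbol_has_any   # false
```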

Cheers
Robert


#7

On 5/13/07, Xavier N. removed_email_address@domain.invalid wrote:

Just wanted to point out that the original question is why Ruby core
changed their mind, not what people think in general about relating
String and Symbol. Perhaps the question could be sent to ruby-core as
well.

– fxn

That is indeed a good idea.

Nevertheless, we are spot on the thread, are we not? And even if we were
drifting to a related topic, that sometimes gives the best
discussions.

But maybe our arguments are not convincing?
What would you want to discuss then?

I do not feel one should be that rigid about OnTopic OffTopic.
Well just my 0.02 whatever money you worship.

Cheers
Robert


#8

On May 13, 2007, at 10:32 AM, Robert D. wrote:

Nevertheless, we are spot on the thread, are we not? And even if we were
drifting to a related topic, that sometimes gives the best
discussions.

But maybe our arguments are not convincing?

I think that if a couple of simple arguments made it clear that both classes
should be unrelated, the core team wouldn’t even have bothered to start
relating them. So I guess it’s likely that there’s more to it, and I
would like to know about it.

What would you want to discuss then?

I do not feel one should be that rigid about OnTopic OffTopic.
Well just my 0.02 whatever money you worship.

The discussion itself is OK for me. I just wanted to point out that
the original question has not been answered, otherwise the thread
could engage in talking about what people think in general and forget
it.

– fxn


#9

On Saturday 12 May 2007 07:20, enduro wrote:

topic again and I’ve tried to collect the reasons
Thank you very much

Sven

The basic reason, as stated in the Ruby Hacking Guide, is that a Symbol is
internally just an integer!

That makes hashes keyed on symbols much, much faster. As a consequence of the
above, a Symbol is “read-only”, while you can modify a String as much as you want.

So to sum up: Symbols are smaller and faster but “read-only”, good for indexing
hashes, passing method names, etc.
Strings are heavy and slow, but offer greater flexibility.

If you want more in-depth explanations, read the Ruby internals/hacking guide.


#10

Just wanted to point out that the original question is why Ruby core
changed their mind, not what people think in general about relating
String and Symbol. Perhaps the question could be sent to ruby-core as
well.

– fxn


#11

Hi,

In message “Re: Why was the “Symbol is a String”-idea dropped?”
on Sun, 13 May 2007 17:20:49 +0900, Xavier N. removed_email_address@domain.invalid
writes:

|Just wanted to point out that the original question is why Ruby core
|changed their mind, not what people think in general about relating
|String and Symbol. Perhaps the question could be sent to ruby-core as
|well.

We once changed Symbol as subclass of String to see how it goes, and
met many compatibility problems. People distinguish Symbols and
Strings far more than we expected. So we just abandoned the idea.

          matz.

#12

On 5/13/07, Xavier N. removed_email_address@domain.invalid wrote:

Nevertheless, we are spot on the thread, are we not? And even if we were
drifting to a related topic, that sometimes gives the best
discussions.

But maybe our arguments are not convincing?

I think that if a couple of simple arguments make clear both classes
should be unrelated the core team wouldn’t even bothered to start
relating them.

I have the highest respect for the community that works on Ruby 2.0.
That however does not make them gods, and they can therefore err.
On the one hand, I do not bother with the consideration of why they have
thought about it when we discuss technical issues, for right or
wrong.

However, and I thank you for pointing this out (and re-explaining it,
because I can be quite stubborn: why do you think I am married to
a Breton woman ;)), they might indeed have had some conceptual ideas
that might be interesting.
This would kill the idea of symbols in the general sense (Smalltalk,
Lisp and Ruby1.8), maybe this was what made them back off?

Sorry if I was slightly aggressive, but I still feel that your post was
a little bit too severe with us ;).
No, the slight misunderstanding came from my failure to understand what
you wanted to say, my fault without doubt.

would like to know about it.

What would you want to discuss then?

That was a stupid question of YHS; I know now what you wanted to talk
about :)

I do not feel one should be that rigid about OnTopic OffTopic.
Well just my 0.02 whatever money you worship.

The discussion itself is OK for me. I just wanted to point out that
the original question has not been answered, otherwise the thread
could engage in talking about what people think in general and forget
it.

Sure, but I still have a much more relaxed POV about this. Please
believe me, I respect yours too.

– fxn

Cheers
Robert


#13

On May 13, 12:07 pm, Yukihiro M. removed_email_address@domain.invalid wrote:

We once changed Symbol as subclass of String to see how it goes, and
met many compatibility problems. People distinguish Symbols and
Strings far more than we expected. So we just abandoned the idea.

That’s unfortunate. IMHO it’s generally bad practice to functionally
differentiate between them. But this being the official status now, I
don’t see any reason to accept string hash keys for method options
anymore. It’s just not worth the extra code and computation to do so.
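The “extra code and computation” in question is typically a normalisation step like the one below. This is only a sketch; `normalize_options` is a hypothetical helper name, not part of Ruby.

```ruby
# Hypothetical helper: accepts option hashes with String or Symbol keys
# and returns a new hash that uses Symbol keys only.
def normalize_options(options)
  options.each_with_object({}) do |(key, value), normalized|
    normalized[key.to_sym] = value
  end
end

mixed      = { "host" => "example.org", :port => 23 }
normalized = normalize_options(mixed)

p normalized[:host]   # "example.org"
p normalized[:port]   # 23
```

Dropping support for string keys means this step (and the extra hash it allocates on every call) can go away.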

T.


#14

Rick,

Aren’t we confusing symbol with operator in this discussion? If I am dealing
with a program as a string or group of strings, as any compiler initially must,
not having symbols as a part of strings makes my initial task almost impossible.

Everett L.(Rett) Williams II


#15

Thank you all for your replies.

And thank you, Xavier, for keeping the focus on my original intention.
Yes, I was not asking about general arguments for designing a class
hierarchy, but for the reasons for this particular decision of the
ruby-core team.

And I was indeed enlightened by matz’s answer:

|String and Symbol. Perhaps the question could be sent to ruby-core as
|well.

We once changed Symbol as subclass of String to see how it goes, and
met many compatibility problems. People distinguish Symbols and
Strings far more than we expected. So we just abandoned the idea.

This tells me, that it was mainly the weight of the existing ruby usage,
that flipped the balance towards the conservative side.

Or, in other words: if the decision to unify Symbol and String would
have been taken at early stages of Ruby development, then the
general usage would have adapted to this, and …
we might be happier with the result today.

At least, that is my private opinion on this question:

It is tempting to say: “Symbols are just integers internally,
they are just isolated points in ‘Symbol-space’,
so it is not suitable to give them string methods.”
But I think in practice this is not true:

  • Symbols are a standard data type for meta-programming
    (and immediately, there will be a need to append a “?” here and there,
    or test for some regexp-condition…)
  • Symbols are fast as Hash keys,
    but the “real-world” keys often are strings, or even can be both,
    and then the current situation creates the well-known dilemma
    to decide for a Symbol/String interface (and implement it).
    Yes, this gives us programmers the freedom to optimize the code…
    (… but I think a standard solution would serve better in most cases.)

Yes, I sometimes think of that separation of Symbol from String
as a tiny impurity in the Ruby crystal.

I thought Ruby 2.0 could have been a chance to iron this out.
But it seems that now only small changes are still possible.

So, I’ll just have to come to terms with it. :)
(And I will, of course – there are enough other fascinating issues… :) )

Along the lines of Trans:

That’s unfortunate. IMHO it’s generally bad practice to functionally
differentiate between them. But this being the official status now, I
don’t see any reason to accept string hash keys for method options
anymore. It’s just not worth the extra code and computation to do so.

T.

Before I close, just a small thought regarding the issue that
subclasses are usually extended from their superclass, and not
restricted.
I don’t know if that had been discussed before: would it perhaps be good to
create a class hierarchy similar to the Float/Integer hierarchy?
String < Stringlike
Symbol < Stringlike
Of course with everything set up such that hash[:a] is the same as
hash["a"] etc.
(Just a thought, probably this already has been rejected.)

Anyway, I’d like to thank the core programmers for all the work
they have put into Ruby to make it shine.
Kind regards,
Sven


#16

On Tue, May 15, 2007 at 10:07:24AM +0900, enduro wrote:

to decide for a Symbol/String interface (and implement it).

The programs for which it makes sense to convert strings (received from some
external source, e.g. a database) to symbols for optimisation purposes, i.e.
where the benefits are measurable, will be pretty few. And you also open
yourself to a symbol exhaustion denial-of-service.

That is, as far as I know, the symbol table is never garbage collected. Once
a symbol, always a symbol.

So using literal symbols as hash keys makes sense:

{ :foo=>1, :bar=>2 }

but using

h = {}
h[a.to_sym] = 1

is risky, and unlikely to yield measurable benefit. If ‘a’ is already a
String, then there is no benefit from avoiding object creation, since it’s
been already done. So you may as well leave it as a String.
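If some conversion is wanted anyway, one cautious pattern is to convert only names that are known in advance and leave everything else as a String. The sketch below assumes a hypothetical whitelist `KNOWN_KEYS` and helper `key_for`; neither comes from the thread.

```ruby
# Hypothetical whitelist of names the program itself uses as symbols.
KNOWN_KEYS = %w[foo bar].freeze

# Convert only whitelisted names; arbitrary external input stays a String,
# so it can never add new entries to the symbol table.
def key_for(name)
  KNOWN_KEYS.include?(name) ? name.to_sym : name
end

counts = {}
["foo", "bar", "anything-from-outside"].each do |name|
  counts[key_for(name)] = name.length
end

p counts.keys   # [:foo, :bar, "anything-from-outside"]
```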

Yes, I sometimes think of that separation of Symbol from String
as a tiny impurity in the Ruby crystal.

I would disagree with you there, because Symbols are clean and easy to
understand.

There are other “impurities” I can think of - like the seven or so different
flavours of Proc object which have subtly different semantics. This I find
more difficult, because it’s really hard to remember the rules for how they
differ. But things like this are here to make the language “do the right
thing” in most practical cases. And, once you’ve used Ruby for a while, you
find that actually it does.

I thought Ruby 2.0 could have been a chance to iron this out.
But it seems that now only small changes are still possible.

I’d vote strongly against anyway. I like Symbols as they are. I also don’t
feel a dichotomy. Use a symbol where necessary (i.e. for method names) and
for literal hash keys, e.g. named arguments. For anything else a string is
just fine.

I agree it’s a bit annoying when you come across a bit of code which
violates the standard practice: e.g. net/telnet uses
{ “Prompt” => /foo/ }
instead of
{ :prompt => /foo/ }

But then even :Prompt would have been annoying, because generally people
don’t use the capitalisation either.

Do you think that hash['a'] and hash['A'] should be the same?

Regards,

Brian.


#17

Hello Brian,

Brian C. wrote:

external source, e.g. a database) to symbols for optimisation purposes, i.e.
where the benefits are measurable, will be pretty few.

Yes, I agree.
(That’s what I tried to address by the two lines after the quote above,
perhaps I should have put a smiley in there :) )

And you also open yourself to a symbol exhaustion denial-of-service.

Yes, of course.
But my point is: Let the system take care of that.
I want a Ruby that just works: crystal-clear, transparently, reliably. :)
And it already does in most cases. And there is a lot that can be improved.
And one such improvement could be a garbage collection for symbols. (I think.)

That is, as far as I know, the symbol table is never garbage collected. Once
a symbol, always a symbol.

I’m not a core programmer, maybe I am asking too much,
but I think it should be possible without slowing anything down.
One very simple idea I can think of is the following:
set a limit on the number of symbols, and if it is reached
the GC will be invoked in a special symbol-mode, marking all symbols that are
still in use and completely regenerating the symbol table from scratch.

Yes, I sometimes think of that separation of Symbol from String
as a tiny impurity in the Ruby crystal.

I would disagree with you there, because Symbols are clean and easy to
understand.

Yes, I really must admit, I also like the cleanness of current Symbols.
But then, my experience is that this cleanness is not worth a lot,
because the border towards “dirty” strings must be crossed often.
(That’s why I called sticking to the cleanness “tempting” in my last post.)

There are other “impurities” I can think of - like the seven or so different
flavours of Proc object which have subtle different semantics. This I find
more difficult, because it’s really hard to remember the rules for how they
differ.

Fully agree! But that must be a different thread.

But things like this are here to make the language “do the right
thing” in most practical cases. And, once you’ve used Ruby for a while, you
find that actually it does.

OK. But that can be said for most programming languages.
We are dealing with Ruby here,
and the appealing thing of Ruby is: the language!
I mean: concise syntax, flexibility, expressiveness,
allowing one to express things naturally.
Ruby is not yet good in many other aspects:
speed, threads, documentation.
But the runtime engine can be improved with time,
documentation can grow.
The language is the crystal. It must be good in the beginning,
it becomes more solid with every project written in that language.

So, I’d like to use the time we still have before Ruby 2 is born,
to contribute to a really good language.

Do you think that hash['a'] and hash['A'] should be the same?

No, not for the builtin Hash#[].

So long

Sven


#18

On 15.05.2007 03:07, enduro wrote:

We once changed Symbol as subclass of String to see how it goes, and
met many compatibility problems. People distinguish Symbols and
Strings far more than we expected. So we just abandoned the idea.

This tells me, that it was mainly the weight of the existing ruby usage,
that flipped the balance towards the conservative side.

Which is not a bad thing in itself.

Or, in other words: if the decision to unify Symbol and String would
have been taken at early stages of Ruby development, then the
general usage would have adapted to this, and …
we might be happier with the result today.

I am in no way unhappy with the way it is today. Strings and symbols
serve different purposes although there is some overlap. I rarely feel
the need to convert between the two.

  • Symbols are fast as Hash keys,
    but the “real-world” keys often are strings, or even can be both,
    and then the current situation creates the well-known dilemma
    to decide for a Symbol/String interface (and implement it).

I am not aware of a situation where you would need to mix them as hash
keys. And to make the distinction is pretty easy most of the time IMHO.

Yes, this gives us programmers the freedom to optimize the code…
(… but I think a standard solution would serve better in most cases.)

Frankly, I believe there is an inherent advantage that you can use
symbols vs. strings in code. And I mean not only performance wise but
also readability wise.

Note though, that all these issues have nothing to do with the question
whether String and Symbol should be connected inheritance wise. IMHO
that’s mostly an implementation decision in Ruby.

Yes, I sometimes think of that separation of Symbol from String
as a tiny impurity in the Ruby crystal.

Personally I believe it creates more expressiveness. If you view this
as impurity, there are a lot of them in Ruby because Ruby’s focus has
always been on pragmatism and not purity (although it goes pretty far in
some areas, for example it avoids the POD vs. object distinction that
Java has (I would say this is a pragmatic decision because it makes
things easier if you have a common base class for all types)).

I thought Ruby 2.0 could have been a chance to iron this out.
But it seems that now only small changes are still possible.

From what I gather Ruby 2.0 will have some major changes, for example
in the area of scoping. Though it’s probably done in a way that it will
reduce the impact on existing programs.

So, I’ll just have to come to terms with it. :)
(And I will, of course – there are enough other fascinating issues… :) )

The capability to adjust to reality is a useful one IMHO. :)

Before I close, just a small thought regarding the issue that
subclasses are usually extended from their superclass, and not restricted.
I don’t know if that had been discussed before: would it perhaps be good to
create a class hierarchy similar to the Float/Integer hierarchy?
String < Stringlike
Symbol < Stringlike

Why not? StringLike could even be a module that relies solely on [] and
length to do all the non mutating stuff.
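A rough sketch of that idea follows. The module body, the method names `each_piece`, `starts_with?` and `reversed_text`, and the `Label` class are all invented for illustration; the only thing taken from the suggestion is that the module relies on nothing but `[]` and `length`.

```ruby
# Hypothetical StringLike module: every method here is non-mutating and
# is built only on the including class's #length and #[].
module StringLike
  # Yields each one-character piece in order.
  def each_piece
    return enum_for(:each_piece) unless block_given?
    length.times { |index| yield self[index] }
  end

  def starts_with?(prefix)
    prefix.length <= length &&
      (0...prefix.length).all? { |index| self[index] == prefix[index] }
  end

  def reversed_text
    each_piece.to_a.reverse.join
  end
end

# A small immutable class that opts in by providing #length and #[].
class Label
  include StringLike

  def initialize(text)
    @text = text.dup.freeze
  end

  def length
    @text.length
  end

  def [](index)
    @text[index]
  end
end

label = Label.new("ruby")
p label.starts_with?("ru")   # true
p label.reversed_text        # "ybur"
```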

Of course with everything set up such that hash[:a] is the same as
hash[“a”] etc.
(Just a thought, probably this already has been rejected.)

I’m not sure whether this is a good idea. Given the fact that I don’t
mix symbols and strings as Hash keys I wouldn’t benefit - but it would
not hurt me either. :) YMMV

Anyway, I’d like to thank the core programmers for all the work
they have put into Ruby to make it shine.

Definitely! Credits also go to the community that is still among the
most civilized online communities I know so far!

Kind regards

robert


#19

On 5/15/07, enduro removed_email_address@domain.invalid wrote:

Thank you all for your replies.

And thank you, Xavier, for keeping the focus on my original intention.
Yes, I was not asking about general arguments for designing a class
hierarchy, but for the reasons for this particular decision of the
ruby-core team.

I really have not taken offense. However, if you are interested in that
only, you might post to ruby-core only.
I am kind of surprised that the considerations of Rick and YHS are
considered as OT.
If you do not like them maybe it would be polite to ignore them. But
talking about the topic on this list and ignoring all background
information about what symbols are and have been is kind of weird.
Please remember that Ruby has its inheritance in other languages
owning symbols as I believe to have pointed out.
The fact that the original idea is a big paradigm shift does not
answer your question?

I honestly do not understand that.

Threads just evolve I do not feel that they belong to OP :).
They do not belong to me either of course ;).
Cheers
Robert


#20

On 5/14/07, Trans removed_email_address@domain.invalid wrote:

|String and Symbol. Perhaps the question could be sent to ruby-core as

T.

I really like the idea of using symbols as parameter keys exclusively,
I think we would get closer to named parameters instead of emulating
them.
And the interesting stuff is, I always hated String keys in parameter
hashes.
Does this go together with the fact that I really like the good old
Symbols are not Strings paradigm? Probably.

But up to now I fail to see what would be the gain from making
symbols mutable.
I still maintain the POV that immutable Symbols must not be a subclass
of String.

Any more thoughts?

Cheers
Robert