Symbols and Strings

Hello, all…

A modest proposal here… I think (tentatively) that Symbol should
inherit from String, and a symbol should basically be an immutable string.

Reply here and/or on my blog as you see fit: http://rubyhacker.wordpress.com

Cheers,
Hal F.

On Jun 18, 2012, at 15:09, Hal F. wrote:

Hello, all…

A modest proposal here… I think (tentatively) that Symbol should inherit
from String, and a symbol should basically be an immutable string.

Reply here and/or on my blog as you see fit: http://rubyhacker.wordpress.com

This happened (see thread at [ruby-core:9188]) but was reverted. I
can’t think of the proper keywords to find the reason for the reversion.

Silly of me. I don’t recall this at all, and even believed it was an
original idea…

Those who cannot remember history are condemned to Google it…

Hal

For the record, this:
http://ruby.11.n6.nabble.com/Bikeshed-No-more-Symbol-lt-String-td3558662.html

On 19 June 2012 09:00, Hal F. [email protected] wrote:

This happened (see thread at [ruby-core:9188]) but was reverted. I can’t
think of the proper keywords to find the reason for the reversion.


Matthew K., B.Sc (CompSci) (Hons)
http://matthew.kerwin.net.au/
ABN: 59-013-727-651

“You’ll never find a programming language that frees
you from the burden of clarifying your ideas.” - xkcd

On Tue, Jun 19, 2012 at 12:31 AM, Eric H. [email protected]
wrote:

This happened (see thread at [ruby-core:9188]) but was reverted. I can’t think
of the proper keywords to find the reason for the reversion.

If one has Symbol inherit from String then the “is a” relationship is
violated. You cannot use the subclass everywhere you can use the
superclass (e.g. try to append to a frozen String). See David’s remark
http://ruby.11.n6.nabble.com/Bikeshed-No-more-Symbol-String-tp3558662p3558699.html
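
A minimal sketch of the substitutability problem (the method name
shout! is invented for the example):

def shout!(s)  # documented to take a String; << is part of that contract
  s << "!"
end

shout!("hello")         # => "hello!"
shout!("hello".freeze)  # RuntimeError: can't modify frozen String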

Cheers

robert

Personally, I don’t have a problem with “reducing the contract”
of a String. Freezing an object also reduces its contract.

Hal

On Tue, Jun 19, 2012 at 8:35 AM, Robert K. wrote:

I’ve liked the distinction between symbols and strings since I first
encountered it in LISP. Symbols to me are abstract entities which
happen to be typically represented with characters, while strings are
explicitly series of characters. With this perspective, it makes sense
that "b" > "a" works, but :b > :a gives an error, and :a + :b gives an error.
If symbols became immutable strings, this distinction would be lost.

But in the end, the proof is in the code. I suspect functionally
there’d not be much difference, but I’m just a beginner.
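
For what it's worth, a quick check in irb (note that symbol comparison
is version-dependent: since 1.9 Symbol is Comparable, so :b > :a
actually returns true there, while :a + :b fails on every version):

"a" + "b"  # => "ab"; strings are sequences, so concatenation makes sense
"b" > "a"  # => true; lexicographic comparison
:a + :b    # NoMethodError: undefined method `+' for :a:Symbol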

On Tue, Jun 19, 2012 at 5:17 PM, Hal F. [email protected]
wrote:

Personally, I don’t have a problem with “reducing the contract”
of a String.

It’s just that Ruby is not particularly suited to using inheritance in
more flexible ways. In Eiffel you can do all this, and do it visibly. In
Ruby there is just mixing in modules and class inheritance.

Freezing an object also reduces its contract.

Kind of. But I consider that a special case because freeze prevents
all mutations but not other operations. It also does not restrict the
range of valid state - it just freezes it.

Kind regards

robert

On Mon, Jun 18, 2012 at 5:09 PM, Hal F. [email protected]
wrote:

Hello, all…

A modest proposal here… I think (tentatively) that Symbol should inherit
from String, and a symbol should basically be an immutable string.

Reply here and/or on my blog as you see fit: http://rubyhacker.wordpress.com

Cheers,
Hal F.

A discussion on this just came up on Stack Overflow:

Freezing an object also reduces its contract.

Kind of. But I consider that a special case because freeze prevents
all mutations but not other operations. It also does not restrict the
range of valid state - it just freezes it.

Hmm. If a Symbol is-a frozen String, how does that reduce the range
of valid state any more than a “real” frozen String? What operations
other than mutations are prohibited?

Hal

On 20 June 2012 02:58, Dan C. [email protected] wrote:

I’ve liked the distinction between symbols and strings since I first
encountered it in LISP. Symbols to me are abstract entities which
happen to be typically represented with characters, while strings are
explicitly series of characters. With this perspective, it makes sense
that "b" > "a" works, but :b > :a gives an error, and :a + :b gives an error.
If symbols became immutable strings, this distinction would be lost.

But in the end, the proof is in the code. I suspect functionally
there’d not be much difference, but I’m just a beginner.

I’m pretty sure I’m not a beginner (although programming changes so
fast it’s almost impossible to be old-hat in anything relevant) but I
agree pretty much in whole with what you’re saying. In a
compiled/pre-parsed/JIT environment there could be a dramatic
difference between symbols and immutable strings: symbols could
easily be replaced with some enumerator type, or integers, or even
completely optimised away, depending on context.

To my way of thinking symbols, when they’re called symbols, are
“valueless data”: their worth is in their name. Immutable strings
do contain valuable data, even if you’re not allowed to modify it.
(If that wasn’t the case, Java would be an even harder place to get
anything useful done.)

By extension I’d argue that a symbol is atomic; the whole name is
valuable, but no part of it is. As such, an .each_char iterator could
make perfect sense for an immutable string, but not for a symbol.

I always thought the :"foo" syntax was handy (for :"foo#{bar}baz"
cases), but it only ever served to confuse the issue for me.
Personally I tend to use String#to_sym for those cases where I want to
dynamically generate a symbol; it’s an explicit cast, and provides a
clear boundary between “creating the name” and then “using it”. Note:
it may be more or less optimal at runtime to do it this way, but I
don’t care; I’m after maintainability and easing my own understanding
here.
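
For example (bar is just a stand-in variable):

bar = "42"
:"foo#{bar}baz"          # interpolated symbol literal => :foo42baz
("foo#{bar}baz").to_sym  # explicit cast, same result  => :foo42baz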

I should get back to work.

Matthew K., B.Sc (CompSci) (Hons)
http://matthew.kerwin.net.au/
ABN: 59-013-727-651

“You’ll never find a programming language that frees
you from the burden of clarifying your ideas.” - xkcd

On Tue, Jun 19, 2012 at 10:12 PM, Hal F. [email protected]
wrote:

Freezing an object also reduces its contract.

Kind of. But I consider that a special case because freeze prevents
all mutations but not other operations. It also does not restrict the
range of valid state - it just freezes it.

Hmm. If a Symbol is-a frozen String, how does that reduce the range
of valid state any more than a “real” frozen String?

Freezing does not reduce the range of valid state, which is what I said
above. With inheritance Symbol is-a String - not “frozen String”. If at
all, the inheritance would be the other way round: a String is-a Symbol
and it would extend the contract by mutating methods.

What operations
other than mutations are prohibited?

For example: String has the method << which is part of the public
contract, and that will be broken by all Symbol instances. So you give
someone something and say “this is a String” but in reality it is not,
because all instances lack much of the functionality of String.

With actually frozen Strings, on the other hand, << works most of the
time, and there are just some instances which are in a state (frozen)
that does not allow successful execution of the method.
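
In today’s Ruby, where Symbol does not inherit from String, the
difference is visible with respond_to?:

"abc".respond_to?(:<<)         # => true
"abc".freeze.respond_to?(:<<)  # => true; it only fails when called
:abc.respond_to?(:<<)          # => false; Symbol never had << at all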

Kind regards

robert

On 20/06/2012, at 7:12 PM, Robert K. wrote:

If at all,
the inheritance would be the other way round: a String is-a Symbol and
it would extend the contract by mutating methods.

This is a very good point. String should be a subclass of Symbol. I
wonder if this was considered.

Henry

On 21/06/2012, at 8:50 AM, Bartosz Dziewoński wrote:

This is an extremely bad idea.

Well, a lot of people, including Matz, disagree with you.

Symbol is just that – a symbol. Internally it is stored as a number
and in C code passed by-value, like Fixnum, instead of by-ref, like
all other objects. It is only shown as a bit of text for display.

The internal representation is of no concern to the programmer, only
its utility.
There seems to be a desire to be able to use String and Symbol
interchangeably, hence this discussion.

Henry

2012/6/20 Henry M. [email protected]:

On 21/06/2012, at 8:50 AM, Bartosz Dziewoński wrote:

This is an extremely bad idea.

Well, a lot of people, including Matz, disagree with you.

I’d like to see where he explicitly disagrees with this.

The internal representation is of no concern to the programmer, only its
utility.

Wrong. It’s as if you said that a linked list and an array are the
same thing and the difference in the implementation is of no concern
to the programmer, since both can support the same interface. Ruby
Strings and Symbols are fundamentally different on every level (as I
explained), and should be used in different contexts, for both code
clarity and performance.

There seems to be a desire to be able to use String and Symbol interchangeably,
hence this discussion.

If Symbol and String have the same function, then one of them should
probably be removed. (Except they don’t, in most cases; in the few
cases where they do, the programmer should probably suck it up, choose
one representation, stick to it and convert input data to it himself.)

– Matma R.

2012/6/20 Henry M. [email protected]:

On 20/06/2012, at 7:12 PM, Robert K. wrote:

If at all,
the inheritance would be the other way round: a String is-a Symbol and
it would extend the contract by mutating methods.

This is a very good point. String should be a subclass of Symbol. I wonder if
this was considered.

This is an extremely bad idea.

Symbol is just that – a symbol. Internally it is stored as a number
and in C code passed by-value, like Fixnum, instead of by-ref, like
all other objects. It is only shown as a bit of text for display.

Symbols are not immutable strings, and Strings are not mutable
symbols. You should not confuse the two, and they should not inherit
from each other. They are extremely different entities. Think of
symbols as a crossover between enums, magic numeric constants and, as
a last resort, strings.
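
In that spirit, a small sketch of symbols used the way other languages
use enums (the state names are invented):

state = :pending           # reads like an enum constant
case state
when :pending then puts "still waiting"
when :done    then puts "finished"
end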

– Matma R.

On Wed, Jun 20, 2012 at 5:06 PM, Jeremy B. [email protected] wrote:

I suppose that’s the biggest area of concern for me.

I do sometimes want to perform stringlike operations, e.g., on the
parameter passed into method_missing, but that merely requires a single
to_s. Likewise I sometimes want to add an equal sign onto a symbol in
metaprogramming:

(name.to_s << "=").to_sym
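
Fleshed out a little, a sketch of the sort of thing I mean (the Config
class and the attribute names are invented for the example):

class Config
  def initialize
    @data = {}
  end

  # Handles both reader (c.host) and writer (c.host = ...) messages.
  def method_missing(name, *args)
    n = name.to_s                     # the single stringlike step
    if n.end_with?("=")
      @data[n.chomp("=").to_sym] = args.first
    else
      @data[name]
    end
  end
end

c = Config.new
c.host = "example.com"
c.host  # => "example.com"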

Hal

On Thu, Jun 21, 2012 at 12:25 AM, Hal F. [email protected]
wrote:

use apparently equivalent symbols and strings as fully equivalent hash
keys.

Even that would not be solved by inheritance because instances of
different classes can never be equivalent.
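
Concretely:

:a == "a"          # => false
:a.eql?("a")       # => false, so Hash files them under different keys
{ :a => 1 }["a"]   # => nil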

Kind regards

robert

PS: I said “If at all…” - so I am not a fan of String as subclass of
Symbol either.

On 06/20/2012 03:58 PM, Henry M. wrote:

On 21/06/2012, at 8:50 AM, Bartosz Dziewoński wrote:

Symbol is just that – a symbol. Internally it is stored as a number
and in C code passed by-value, like Fixnum, instead of by-ref, like
all other objects. It is only shown as a bit of text for display.

The internal representation is of no concern to the programmer, only its
utility.
There seems to be a desire to be able to use String and Symbol interchangeably,
hence this discussion.

Given that Strings and Symbols are different entities (as has been
discussed at length here and other places), it’s not possible to expect
them to be interchangeable in all cases. This even holds true if one
class inherits from the other. After all, the point of descending one
class from another is to make something related but at least a little
different. Therefore, interchangeability must be limited to some extent
as long as Symbol and String are different things.

So then, in what cases would it be helpful to use them interchangeably?
Is it possible to define a common, useful interface that both Symbol
and String could implement? If so, perhaps a class or module could be
defined for that interface and incorporated into both classes as
appropriate.
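
As a rough sketch of what that might look like (the module name
NameLike and the method to_key are hypothetical):

module NameLike
  # A shared, read-only notion of "the name", usable e.g. as a hash key.
  def to_key
    to_s.to_sym
  end
end

class String; include NameLike; end
class Symbol; include NameLike; end

"foo".to_key  # => :foo
:foo.to_key   # => :foo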

From what I’ve seen, the most frequently voiced desire is to be able to
use apparently equivalent symbols and strings as fully equivalent hash
keys. What else is there?

-Jeremy

On 06/20/2012 05:43 PM, Robert K. wrote:

From what I’ve seen, the most frequently voiced desire is to be able to
use apparently equivalent symbols and strings as fully equivalent hash
keys.

Even that would not be solved by inheritance because instances of
different classes can never be equivalent.

I didn’t mean to imply that it would. Personally, I prefer having Hash
be picky about these things. That generality allows it to be used as
the basis for something like HashWithIndifferentAccess without much
fuss. Doing things the other way around would be much more difficult.

To be honest, the best way to handle this particular issue would
probably be to add some syntactic sugar that makes it as easy to
instantiate HashWithIndifferentAccess as it is to instantiate Hash.
Assuming HashWithIndifferentAccess is available, maybe something like
this:

def h(regular_hash)
  HashWithIndifferentAccess.new(regular_hash)
end

hash = h :a => 1, 'b' => 2
puts hash['a'] #=> 1
puts hash[:b]  #=> 2
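
Lacking ActiveSupport, a minimal (and deliberately incomplete) stand-in
could normalize keys to strings; IndifferentHash is invented for this
sketch:

class IndifferentHash < Hash
  # Build from a regular hash, normalizing every key to a String.
  def self.[](regular_hash)
    regular_hash.each_with_object(new) { |(k, v), ih| ih[k.to_s] = v }
  end

  def [](key)
    super(key.to_s)
  end

  def []=(key, value)
    super(key.to_s, value)
  end
end

hash = IndifferentHash[:a => 1, 'b' => 2]
hash['a']  # => 1
hash[:b]   # => 2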

PS: I said “If at all…” - so I am not a fan of String as subclass of
Symbol either.

To me, Symbol and String really are very different things. Symbol
effectively has a frozen String instance associated with it but is not
itself the String instance. The String#to_sym method is just a
convenience method to perform a reverse lookup on such associations,
creating them if they don’t already exist.

In other words, there is an implicit hash that maps Symbol instances to
String instances (a… ahem… symbol table). Saying that Symbol and
String should have some sort of direct ancestral class relationship with
one another based on that fact alone is like saying that the keys and
values in your own hashes should always have such a relationship as
well. Clearly, or perhaps not so clearly, that’s nonsense. :-)
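
The interning behaviour behind that implicit table is easy to observe:

"foo".to_sym.equal?(:foo)    # => true; both routes reach the one Symbol
"foo".equal?("foo")          # => false; every literal is a fresh String
:foo.to_s.equal?(:foo.to_s)  # => false; to_s returns a new String copy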

-Jeremy