Language design question and strings and object IDs

dubstep · January 14, 2012, 3:38pm

a = "string"
b = "string"

causes three objects with different object id’s to be created.

irb(main):001:0> a = "string"
=> "string"
irb(main):002:0> b = "string"
=> "string"
irb(main):003:0> "string".object_id
=> 19087248
irb(main):004:0> a.object_id
=> 21660552
irb(main):005:0> b.object_id
=> 22419972

Yet
a = "string"
b = a
only causes two objects to be created.

irb(main):001:0> a = "string"
=> "string"
irb(main):002:0> b = a
=> "string"
irb(main):003:0> "string".object_id
=> 22106544
irb(main):004:0> a.object_id
=> 22650984
irb(main):005:0> b.object_id
=> 22650984

Yes, I know that one can do

a = "string"
b = a.clone

I understand what’s happening. I just don’t know why the language
designer(s) decided on – what is to me – surprising behavior.

So I have a Car Talk Puzzler (for which I do not have an answer): How
would I initialize a bunch of strings so that they are clones rather
than the same object?

In other words, how would I do the following, which is invalid Ruby?

a = b.clone = c.clone = "How to clone?"

ralphshnelvar · January 14, 2012, 3:57pm

On Sat, Jan 14, 2012 at 3:38 PM, Ralph S. wrote:

a = "string"
b = "string"

causes three objects with different object id’s to be created.

It actually causes just two objects to be created.

irb(main):001:0> a = "string"
=> "string"
irb(main):002:0> b = "string"
=> "string"
irb(main):003:0> "string".object_id
=> 19087248

This one is a new and different thing that is not related to your
above two lines of code at all.

irb(main):004:0> a.object_id
=> 21660552
irb(main):005:0> b.object_id
=> 22419972

I think what you are missing is that a String literal creates a new
String object every time it is evaluated. The fact that the characters
are the same doesn’t mean it’s the same object:

1.9.2p290 :001 > "string".object_id
=> 14124100
1.9.2p290 :002 > "string".object_id
=> 14101660
1.9.2p290 :003 > "string".object_id
=> 14081660
1.9.2p290 :004 > "string".object_id
=> 14068260

Yet
a = "string"
b = a
only causes two objects to be created.

It actually only causes one object to be created, see above.

irb(main):001:0> a = "string"
=> "string"
irb(main):002:0> b = a
=> "string"
irb(main):003:0> "string".object_id
=> 22106544
irb(main):004:0> a.object_id
=> 22650984
irb(main):005:0> b.object_id
=> 22650984

A possible misunderstanding here would be to miss the difference
between an object and a variable. A variable is a reference to an
object. Several variables can reference the same object. Assigning
variables only changes what they reference, not the objects
themselves.
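
The difference can be checked directly. Here is a minimal sketch (the method name shared_reference_demo is made up for this illustration): two variables reference one object, a change made through one is visible through the other, and reassigning a variable leaves the object alone.

```ruby
# Hypothetical helper, named only for this illustration.
def shared_reference_demo
  a = String.new("string")  # one String object, referenced by a
  b = a                     # b references the same object; nothing is copied

  same_before = a.equal?(b) # equal? compares identity, not contents
  b << "!"                  # mutates the single shared object
  seen_through_a = a        # the change is visible through a as well

  b = "other"               # rebinding b does not touch the object a references
  [same_before, seen_through_a, a.equal?(b)]
end

p shared_reference_demo  # prints [true, "string!", false]
```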

Yes, I know that one can do

a = "string"
b = a.clone

I understand what’s happening. I just don’t know why the language designer(s)
decided on – what is to me – surprising behavior.

I think that having objects be a separate concept from the variables
that reference them is pretty common, and provides very nice language
semantics.
Just keep in mind that a string literal creates a new object every
time it is evaluated; the same goes for hash literals, for example.
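
As a rough check of that claim, the sketch below evaluates a string literal and a hash literal several times and counts the distinct objects. The helper name fresh_literals is invented here, and the sketch assumes no frozen_string_literal setting is in effect (that setting makes identical string literals share one frozen object).

```ruby
# Hypothetical helper: the block runs once per element, so both literals
# are evaluated again on every pass.
def fresh_literals(count)
  Array.new(count) { ["string", { key: 1 }] }
end

strings, hashes = fresh_literals(3).transpose
p strings.map(&:object_id).uniq.size  # prints 3: three separate Strings
p hashes.map(&:object_id).uniq.size   # prints 3: three separate Hashes
p strings.uniq                        # prints ["string"]: equal contents
```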

So I have a Car Talk Puzzler (for which I do not have an answer): How would I
initialize a bunch of strings so that they are clones rather than the same object?

In other words, how would I do the following, which is invalid Ruby?

a = b.clone = c.clone = "How to clone?"

1.9.2p290 :005 > a = (b = (c = "How to clone?").clone).clone
=> "How to clone?"
1.9.2p290 :006 > [a.object_id, b.object_id, c.object_id]
=> [14045460, 14045480, 14045520]
1.9.2p290 :007 > [a,b,c]
=> ["How to clone?", "How to clone?", "How to clone?"]
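
For more than a few variables, the nested clone calls get hard to read. One alternative, sketched here with a made-up helper name (independent_copies), is to build the copies in a block and use multiple assignment. It uses dup rather than clone, which is enough for plain strings.

```ruby
# Hypothetical helper: returns `count` separate String objects with the
# same contents as `text`.
def independent_copies(text, count)
  Array.new(count) { text.dup }  # the block runs once per element
end

a, b, c = independent_copies("How to clone?", 3)
p [a, b, c].map(&:object_id).uniq.size  # prints 3: three separate objects
p a == b && b == c                      # prints true: same contents

# By contrast, the two-argument form reuses one object for every slot.
x, y, z = Array.new(3, "How to clone?")
p x.equal?(y) && y.equal?(z)            # prints true: one shared object
```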

Jesus.

ralphshnelvar · January 14, 2012, 3:59pm

Hi,

In message “Re: Language design question and strings and object IDs”
on Sat, 14 Jan 2012 23:38:00 +0900, Ralph S. writes:

|I understand what’s happening. I just don’t know why the language designer(s)
decided on – what is to me – surprising behavior.

String literals create a new copy of the string each time they are
evaluated. Strings in Ruby are mutable, and I wanted to avoid literal
modification bugs. The other option was to make string literals
immutable, but that is not what I chose.

Strings created from literals share their body, so they don’t cost much
in memory or copying.
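
To make the trade-off concrete, here is a small sketch of the kind of bug this design avoids, followed by the immutable alternative. The names greeting and FROZEN_GREETING are invented for the example, and it assumes no frozen_string_literal setting is in effect.

```ruby
def greeting
  "hello"  # evaluated on every call, so each call returns a new String
end

first = greeting
first << ", world"  # modifies only the object that this call returned
puts greeting       # prints hello: later calls are unaffected

# The option not chosen, emulated by hand: an immutable string.
FROZEN_GREETING = "hello".freeze
begin
  FROZEN_GREETING << ", world"
rescue => e
  # FrozenError on Ruby 2.5 and later, a plain RuntimeError before that
  puts "cannot modify: #{e.class}"
end
```

If every evaluation of the literal returned the same mutable object, the append to first would silently change what greeting returns from then on.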

          matz.

ralphshnelvar · January 14, 2012, 4:17pm

JGyG> On Sat, Jan 14, 2012 at 3:38 PM, Ralph S. wrote:

ralphshnelvar · January 14, 2012, 5:04pm

On Sat, Jan 14, 2012 at 4:16 PM, Ralph S. wrote:

In other words, is the first “string” object created and then cloned?

Is there a way to see the object id’s of the variable “a” and “string” in the
first statement?

I think that’s exactly what is causing your misunderstanding. The
variable a is not an object. It doesn’t have an object_id. It’s just a
reference, a pointer to an object. The only object in a = “string”, is
the object created by Ruby when it evaluates the string literal. When
you then say a.object_id, what you are doing is sending the message
“object_id” to the object referenced by a.
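
A short sketch of that point, with message_demo as a made-up method name: defined? reports that a is a local variable, and asking for object_id through a gives the same answer as sending the message explicitly to the object a references.

```ruby
# Hypothetical helper, named only for this illustration.
def message_demo
  a = "string"
  kind = defined?(a)  # describes the expression a: a variable, not an object
  same_id = a.object_id == a.public_send(:object_id)  # same message, same receiver
  [kind, same_id]
end

p message_demo  # prints ["local-variable", true]
```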

Jesus.

ralphshnelvar · January 15, 2012, 2:41am

Ralph S. wrote:

Thank you. It is a good explanation.

Is there a way to see the object id’s of the variable “a” and “string” in
the first statement?

You said it was a good explanation, but your question makes it obvious
that you did not understand it.

Allow me to explain the way I think about these things. This is a
simplification that ignores several key points, but it is a useful
mental model.

We deal with two things in Ruby: names and objects. Objects have IDs
and, in this simplified picture, reference counts, but no names, and
live in a fuzzy cloud out in memory somewhere. Names do not have IDs,
and live in dictionaries.

A name is bound to exactly one object. An object can be bound to many
names, and it doesn’t know which names they are. The object simply
exists without a name.

When you say this in Ruby:
"hello"
that creates a nameless string object in the fuzzy cloud. That object
has no references, so it becomes garbage right away and the garbage
collector is free to remove it.

When you say this:
a = "hello"
that also creates a nameless string object in the fuzzy cloud. That
object is then bound to the name “a”. “a” is not an object. It is a
name that is bound to an object. In that example, there is only ONE
object. Since the object itself is nameless, the only way to get to it
is through its bindings. So, to find the object ID of “hello”, we have
to go through the name “a”.

Now, say I do this:
b = "hello"
This creates a brand new nameless string object in the fuzzy cloud, and
binds that object to the name “b”.

Now, say I do this:
c = b
This does NOT create any new objects. Instead, it takes the object
bound to the name “b”, and binds it to the name “c”. That nameless
string object now has two bindings. It doesn’t know the names of its
bindings; in this simplified picture it knows only how many there are.

So, when you ask this:

Is there a way to see the object id’s of the variable “a” and “string” in
the first statement?

the question doesn’t make sense. When you say “a.object_id”, you are
asking the name “a” to tell you the object ID of the object to which it
is bound. That IS the object ID of the string. You cannot ask for the
object ID of “string” except to go through one of the names to which it
is bound. When you say
"string".object_id
that creates a brand new object without a name, and tells you its ID.
That object has no references, so it is garbage as soon as the
expression has been evaluated.
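
The names-and-objects picture can be poked at from inside Ruby. The sketch below (binding_demo is a made-up method name) lists the names known in a scope and looks objects up by name; it is only an illustration of the mental model, not a description of how the interpreter stores variables.

```ruby
# Hypothetical helper, named only for this illustration.
def binding_demo
  b = "hello"  # one new String object, bound to the name b
  c = b        # no new object; the name c is bound to the same one

  # Look the object up by the name c and compare identities with b.
  same = binding.local_variable_get(:c).equal?(b)

  # local_variables lists the names in this scope as Symbols.
  [local_variables, same]
end

p binding_demo  # prints [[:b, :c, :same], true]
```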

ralphshnelvar · January 14, 2012, 5:29pm

Jes

ralphshnelvar · January 15, 2012, 3:02am

On Sat, Jan 14, 2012 at 20:41, Tim R. wrote:

Ralph S. wrote:

Thank you. It is a good explanation.

Is there a way to see the object id’s of the variable “a” and “string” in
the first statement?

You said it was a good explanation, but your question makes it obvious that
you did not understand it.

To stick in my oar: Ralph, if you’re familiar with pointers in
languages like C and C++, you can think of these “references” Tim
speaks of, as being like pointers, and the variables as being like,
well, variables. You can assign a pointer value to any number of
variables, and assign one variable to another, which just copies the
pointer… or you can assign a variable a new pointer value. (If
you’re not familiar with pointers, then, well, never mind…)

-Dave