Hi,

I have programmed in various languages previously, but am new to Ruby.

So far am very impressed with it; but there is one behaviour I find quite alarming, which stems from the fact that Ruby treats strings as objects rather than as primitives. For instance:

# a) Number

myNum1 = 5
myNum2 = myNum1
myNum2 = 3

=> myNum1 = 5
=> myNum2 = 3

which is what I would expect. However,

# b) String

myString3 = "Fred Nerk"
myString4 = myString3
myString4[0,4] = "Bert"

=> myString3 = "Bert Nerk"
=> myString4 = "Bert Nerk"

myString3 has been "corrupted", presumably because setting myString4 to it actually set myString4's pointer, not its value, in the standard OO fashion.

But

# c) String literal

myString1 = "Fred Bloggs"
myString2 = myString1
myString2[0,4] = "Bert"

puts "Fred Bloggs = " + "Fred Bloggs"
puts "myString2 = " + myString2

(Output)

=> Fred Bloggs = Fred Bloggs
=> myString2 = Bert Bloggs

Has the "Fred Bloggs" literal not been corrupted? Or did puts just use another (uncorrupted) instance of it?

So let's make the first string a constant. That produces the expected behaviour in this case:

# d) String constant 1

MyString5 = "Fred Potts"
MyString5 = "John Potts"

=> warning: already initialized constant MyString5

but not in this one: the constant gets "corrupted".

# e) String constant 2

MyString6 = "Fred Winterbotham"
myString7 = MyString6
myString7[0,4] = "Bert"

=> MyString6 = "Bert Winterbotham"
=> myString7 = "Bert Winterbotham"

So how to get around this? The following appears to do it:

# f) String constant 3

MyString8 = "Fred Shufflebotham"
myString9 = MyString8.clone
myString9[0,4] = "Bert"

=> MyString8 = "Fred Shufflebotham"
=> myString9 = "Bert Shufflebotham"

but doesn't it cause a memory leak?

Sorry if this question is too elementary.
on 2012-12-23 00:15
on 2012-12-23 02:50
On Sat, Dec 22, 2012 at 5:15 PM, Paul Magnussen <lists@ruby-forum.com> wrote:
> # a) Number
>
> myString3 has been "corrupted", presumably because setting myString4 to
> it actually set myString4's pointer, not its value, in the standard OO
> fashion.

In Ruby parlance, it's a reference; but essentially correct.

> puts "Fred Bloggs = " + "Fred Bloggs"
> puts "myString2 = " + myString2
>
> (Output)
>
> => Fred Bloggs = Fred Bloggs
> => myString2 = Bert Bloggs
>
> Has the "Fred Bloggs" literal not been corrupted? Or did puts just use
> another (uncorrupted) instance of it?

The literal in your source is never changed; only the object it created was. Conceptually at least, every time you have "Fred Bloggs" in quotes, you're referring to a different object, so puts did use another instance.

>
> => myString7 = "Bert Winterbotham"

Yes. Note that a "constant" in Ruby is something that will trigger a warning if it's reassigned, but it's still possible to reassign it. Mutating the object a constant refers to triggers no warning at all. A reference to a constant works the same as a reference to a non-constant object.

> => MyString8 = "Fred Shufflebotham"
> => myString9 = "Bert Shufflebotham"
>
> but doesn't it cause a memory leak?

Well, the memory used by those objects will stay allocated only as long as the garbage collector can reach them somehow, i.e. there are variables that point to them or they belong to a collection like an array or hash.
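
To make the point about constants concrete, here is a minimal sketch. The constant names are made up for illustration, and String.new is used only so the strings are mutable whatever frozen-string-literal setting is in effect. It assumes that what you want is a constant whose contents cannot be changed through another reference, which is what freeze provides:

```ruby
# Hypothetical names, for illustration only.
GREETING = String.new("Fred Winterbotham")

alias_ref = GREETING            # a second reference to the same object
alias_ref[0, 4] = "Bert"        # mutates the shared object; Ruby issues no warning

puts GREETING                   # prints Bert Winterbotham
puts alias_ref.equal?(GREETING) # prints true

# Freezing the object makes mutation raise instead of silently succeeding.
FROZEN_NAME = String.new("Fred Potts").freeze
copy = FROZEN_NAME.dup          # dup returns an unfrozen, independent copy
copy[0, 4] = "John"

begin
  FROZEN_NAME[0, 4] = "John"
rescue => e                     # FrozenError on recent Rubies, RuntimeError on older ones
  puts "mutation rejected (#{e.class})"
end

puts FROZEN_NAME                # prints Fred Potts
puts copy                       # prints John Potts
```

Note that freeze protects the object, not the constant name: reassigning FROZEN_NAME would still only produce the "already initialized constant" warning.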
on 2012-12-23 04:27
On Sat, Dec 22, 2012 at 5:15 PM, Paul Magnussen <lists@ruby-forum.com> wrote:
> So far am very impressed with it; but there is one behaviour I find
> quite alarming, which stems from the fact that Ruby treats strings as
> objects rather than as primitives.

Better get used to it. Ruby treats *everything* as an object, or perhaps even more literally, a reference.
on 2012-12-23 04:38
On Sat, Dec 22, 2012 at 3:15 PM, Paul Magnussen <lists@ruby-forum.com> wrote:
> myString3 has been "corrupted", presumably because setting myString4 to
> it actually set myString4's pointer, not its value, in the standard OO
> fashion.

There's a subtle point here:

> myString3 = "Fred Nerk"

= is a built-in piece of syntax to bind a variable to an object. Variables are not themselves objects, they are transparent references to objects. The only time a variable cannot be transparently replaced by the object it refers to is when it's on the left hand side of an equal sign.

So here, "Fred Nerk" uses the string literal to create the string object #<String:0x01234567 "Fred Nerk"> [that is, an object with type String, object_id 0x01234567 and value "Fred Nerk"] on the heap, and binds the variable myString3 to it.

> myString4 = myString3

Here, myString3 is transparently replaced by the object it refers to, #<String:0x01234567 "Fred Nerk">, and myString4 is bound to the same object.

> myString4[0,4] = "Bert"

This is the subtle bit. Despite the syntactic sugar, this is *not* an = sign. There is no variable binding = involved here, it is just ruby syntax sugar that gets rewritten to myString4.[]=(0, 4, "Bert"). That is, it calls the "[]=" method on the string object, passing it values (0, 4, "Bert").

Again, since myString4 is not on the left hand side of an =, it gets transparently replaced by #<String:0x01234567 "Fred Nerk">, which then gets sent the message []= with arguments (0, 4, "Bert"), and obligingly updates its value. So our object is now #<String:0x01234567 "Bert Nerk"> (note that the object hasn't changed, just its value).

> => myString3 = "Bert Nerk"
> => myString4 = "Bert Nerk"

Again, these are both transparently replaced by the object they refer to, now #<String:0x01234567 "Bert Nerk">.

> But
>
> # c) String literal
>
> myString1 = "Fred Bloggs"

Creates #<String:0x98765432 "Fred Bloggs"> on the heap, binds myString1 to it.
> myString2 = myString1

Binds myString2 to #<String:0x98765432 "Fred Bloggs">.

> myString2[0,4] = "Bert"

Sends []=, (0, 4, "Bert") to #<String:0x98765432 "Fred Bloggs">, which updates itself to #<String:0x98765432 "Bert Bloggs">.

> puts "Fred Bloggs = " + "Fred Bloggs"

Creates two *new* string objects, #<String:0x00001111 "Fred Bloggs = "> and #<String:0x00001112 "Fred Bloggs">, and passes the second one as an argument to the + method of the first, which returns yet another string object, #<String:0x00001113 "Fred Bloggs = Fred Bloggs">, which it passes to "puts", which prints it out.

> puts "myString2 = " + myString2

Creates *one* new string object, #<String:0x00001114 "myString2 = ">, and calls its + method with #<String:0x98765432 "Bert Bloggs"> (the transparent replacement for myString2) as an argument. This creates yet another string object, #<String:0x00001115 "myString2 = Bert Bloggs">, which gets passed to puts and printed out.

[Note that all the string objects that got created but never had variables bound to them are temporary objects that the garbage collector will take care of at some point.]

> So how to get around this? The following appears to do it:
>
> # f) String constant 3
>
> MyString8 = "Fred Shufflebotham"
> myString9 = MyString8.clone

Clone creates a new string object, and sets its value equal to that of the first one. = then binds myString9 to this new object.

> myString9[0,4] = "Bert"

The []=, 0, 4, "Bert" message is getting sent to the new object.

> => MyString8 = "Fred Shufflebotham"

MyString8 is still bound to the first object, which never got sent a message.

> => myString9 = "Bert Shufflebotham"

myString9 is still bound to the new object, which *did* get sent the []= message and updated its value.

> but doesn't it cause a memory leak?

No, the garbage collector takes care of it.

martin
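
The difference between sending []= and rebinding with a plain = can be checked directly. This is only a sketch: overwrite_prefix is a made-up helper name, the variables are snake_case versions of the ones in the question, and String.new is used just to guarantee a mutable string.

```ruby
# Hypothetical helper: spells out the method call that
# str[0, n] = replacement is sugar for.
def overwrite_prefix(str, replacement)
  str.[]=(0, replacement.length, replacement)
  str   # the very same object, now holding a new value
end

my_string3 = String.new("Fred Nerk")
my_string4 = my_string3                   # second name for the same object

overwrite_prefix(my_string4, "Bert")
puts my_string3                           # prints Bert Nerk
puts my_string3.equal?(my_string4)        # prints true: one object, two names

my_string4 = String.new("Someone Else")   # plain = only rebinds my_string4
puts my_string3                           # still prints Bert Nerk
puts my_string3.equal?(my_string4)        # prints false
```

equal? compares object identity, so it is a convenient way to see whether two variables are bound to the same object.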
on 2012-12-23 05:04
Paul Magnussen wrote in post #1089966:
> Hi,
>
> I have programmed in various languages previously, but am new to Ruby.
>
> So far am very impressed with it; but there is one behaviour I find
> quite alarming, which stems from the fact that Ruby treats strings as
> objects rather than as primitives.

Hi,

Well, you called strings "primitives"; did you use JavaScript? :)

Anyway, in Python strings are indeed immutable, but not so in Ruby. That's why you got all these results.

As for why strings are mutable in Ruby (with all the consequences), I will let somebody else explain it.

Regards,

Bill
on 2012-12-23 07:40
x = "hello"
y = "hello"

puts x.object_id
puts y.object_id
puts "hello".object_id

--output:--
2151871380
2151871280   #Not the same as the previous id
2151871220

Quote marks are a String object constructor in ruby.

x[0] = "Y"
puts x
puts y

--output:--
Yello
hello
on 2012-12-23 08:00
Paul Magnussen wrote in post #1089966:
>
> Ruby treats strings as
> objects rather than as primitives.
>

Check this out:

result = 9.5426.round 3
puts result

--output:--
9.543
on 2012-12-24 00:54
Thanks for all the replies. I notice also that I can force changing of the value (as opposed to the reference) by substituting a trivial expression for the right-hand side of the assignment, e.g.

# g) Expression

myStringA = "Fred Shufflebotham"
myStringB = myStringA + ""
myStringB[0,4] = "Bert"

=> myStringA = "Fred Shufflebotham"
=> myStringB = "Bert Shufflebotham"

But of course it's an utter kludge. Is there really no more elegant way?
on 2012-12-24 01:17
On Dec 24, 2012, at 12:55 AM, Paul Magnussen <lists@ruby-forum.com> wrote:
> => myStringA = "Fred Shufflebotham"
> => myStringB = "Bert Shufflebotham"
>
> But of course it's an utter kludge. Is there really no more elegant
> way?

Sure. Either use a method that doesn't mutate the string but returns a new one instead, like #sub and #gsub, e.g.:

myStringA.sub("Fred", "Bert")
myStringA.sub(/.{4}/, "Bert")

Or properly clone the string before mutating it:

myStringB = myStringA.clone
myStringB[0,4] = "Bert"

Regards,
Florian
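
A small sketch of the two approaches side by side. The helper names are invented for this example; sub and sub! are the standard String methods, the bang version being the one that mutates its receiver.

```ruby
# Hypothetical helpers, for illustration only.
def renamed_copy(name)
  name.sub("Fred", "Bert")    # sub returns a new String; the receiver is untouched
end

def rename_in_place(name)
  name.sub!("Fred", "Bert")   # sub! mutates the receiver (and returns nil if nothing matched)
  name
end

a = String.new("Fred Shufflebotham")
b = renamed_copy(a)
puts a    # prints Fred Shufflebotham
puts b    # prints Bert Shufflebotham

rename_in_place(a)
puts a    # prints Bert Shufflebotham
```

Many String methods come in such pairs, and the trailing ! is the usual hint that the receiver itself gets modified.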
on 2012-12-24 11:19
Paul Magnussen wrote in post #1090050:
> Thanks for all the replies. I notice also that I can force changing of
> the value (as opposed to the reference) by substituting a trivial
> expression for the right-hand side of the assignment
>

Incorrect. You are not changing a value there; x + "" creates a new String object:

x = "hello"
y = x + ""

puts x.object_id
puts y.object_id

--output:--
2152313980
2152313940

The + operator up there is the name of a String method in ruby:

x = "hello"
y = x.+("")

puts x.object_id
puts y.object_id

--output:--
2152313980
2152313940

You are going to have to get used to the fact that:

1) Strings are mutable in ruby.

2) Some methods in the String class mutate their "receiver" (i.e. the object that called the method), and other methods in the String class return a new String object.

If you are not sure what a method returns, then check the docs:

http://www.ruby-doc.org/core-1.9.3/String.html#method-i-2B

Writing something like the following to create a new String object:

y = x + ""

works, but it is code obfuscation. Ruby methods usually have names that are descriptive and alert the reader what they do--use them.
on 2012-12-24 13:02
On Sun, Dec 23, 2012 at 7:40 AM, 7stud -- <lists@ruby-forum.com> wrote:
> Quote marks are a String object constructor in ruby.
Maybe a bit more illustrative: executing the _same_ string literal
results in multiple different instances:
irb(main):001:0> 4.times { puts "foo".object_id }
73580460
73580430
73580410
73580370
=> 4
Kind regards
robert
on 2012-12-24 13:13
On Mon, Dec 24, 2012 at 12:55 AM, Paul Magnussen <lists@ruby-forum.com> wrote:
> => myStringA = "Fred Shufflebotham"
> => myStringB = "Bert Shufflebotham"
>
> But of course it's an utter kludge. Is there really no more elegant
> way?

my_string_a = "Fred Shufflebotham"
my_string_b = my_string_a.dup

Note that the Ruby naming convention for local variables and method names is snake_case, not camelCase.

Paul, what you should take away from this discussion (I'll try to summarize what others have said already):

- All variables hold _references_ to objects.*
- Assignment copies an object reference and stores it in a variable.
- String literals are really object constructors, i.e. they create a new object whenever evaluated. (Don't worry, behind the scenes this is made efficient.)
- There are immutable classes (most numeric classes, NilClass, TrueClass...) and mutable classes (all others including String).
- Arithmetic operators return a reference to a new instance in order to make math work properly (a + b + a would return wrong results if the first + changed state of a and returned a reference to the mutated a).

* Note that this is not completely true in terms of the _implementation_ of MRI but it is true from the perspective of the _language user_.

Kind regards

robert
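
The last bullet can be tried out with strings, using String#<< as a stand-in for a hypothetical + that mutated its receiver. The method names below are made up for this sketch, and String.new only ensures the string is mutable.

```ruby
# Illustration only; these helper names are not from the thread.
def concat_with_plus(a, b)
  a + b + a          # each + returns a new String; a is never modified
end

def concat_with_append(a, b)
  (a << b) << a      # << mutates a, so the last operand is the already-changed a
end

a = String.new("x")
puts concat_with_plus(a, "y")     # prints xyx
puts a                            # prints x

puts concat_with_append(a, "y")   # prints xyxy
puts a                            # prints xyxy
```

With a non-mutating + the expression behaves like ordinary arithmetic; with the mutating << the first step changes a before it is used again.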
on 2012-12-24 14:01
Robert Klemme wrote in post #1090092:
> - All variables hold _references_ to objects.*

And this is a huge breath of fresh air compared to, say, Perl, where arrays and arrayrefs are two different types of value, similarly hashes and hashrefs, and a whole bunch of other special cases.

In ruby, *all* values are references to objects. Even integers.

>> a = -3
=> -3
>> a.to_s
=> "-3"
>> a.abs
=> 3

So consistently:

- everything is pass-by-value
- every value is a reference to an object

But as you have discovered, many objects are mutable, including strings.
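
A short sketch of what "pass-by-value, where every value is a reference" means in practice. The method names are hypothetical and String.new is used only to guarantee a mutable string.

```ruby
# Hypothetical methods, for illustration only.
def rebind(str)
  str = "Bert Nerk"     # rebinds the local parameter; the caller's variable is unaffected
  str
end

def mutate(str)
  str[0, 4] = "Bert"    # sends []= to the object the caller also refers to
  str
end

name = String.new("Fred Nerk")

rebind(name)
puts name    # prints Fred Nerk

mutate(name)
puts name    # prints Bert Nerk
```

The method receives a copy of the reference, so it can change the object the reference points at but cannot make the caller's variable point somewhere else.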
on 2012-12-24 15:43
Wow, everybody has been so kind and helpful to a newbie. I shall save all this stuff off. Meantime, thank you all and Merry Christmas!