Re: how to avoid passing by reference and how to copy objects

stickstone · August 28, 2008, 9:40am

It’s basically like (probably IS under the hood) pointers.
So for some arrays a and b

a = b

that’s basically copying the pointers themselves. So whatever changes I
make in a will be reflected in b.

Roughly. The Array itself exists on the heap independently of a and b;
a and b are local variables, basically just slots on the stack which
contain a reference to the Array.
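
A minimal sketch of that aliasing, with made-up array contents, in case
it helps to see it run:

```ruby
b = [1, 2, 3]
a = b              # copies the reference, not the Array

a << 4             # mutate the single Array through a
puts b.inspect     # [1, 2, 3, 4] - the change is visible through b too
puts a.equal?(b)   # true - equal? compares object identity

a = [9, 9]         # rebinding a leaves b and the original Array alone
puts b.inspect     # [1, 2, 3, 4]
```

The last step is the important one: assignment only changes which
object a local variable refers to, never the object itself.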

The difference is that local variables themselves are not objects, and
you cannot take a “reference” to a or b.

So whereas in C you could write

void *p = malloc(50);
void *q = p;     /* q is another pointer to the malloc'd space */
void **r = &p;   /* r points to the pointer p   <<<< NOTE */

in Ruby you can only do

p = "x" * 50
q = p

This actually makes life simpler. If you call

foo(p)

then the method foo can mutate the object p, but on return p will
definitely be unchanged, i.e. it is still a reference to the same
object. You cannot change what p references, unless you explicitly
return a new object and assign it, e.g.

p = foo(p)
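
Here is a small sketch of both cases. The method names mutate! and
rebind are invented for this example, and I’ve called the variable text
rather than p so it doesn’t shadow Kernel#p:

```ruby
# mutate! and rebind are hypothetical names used only for this sketch.
def mutate!(s)
  s << " world"    # changes the object the caller also references
end

def rebind(s)
  s = "replaced"   # only changes this method's local variable
  s                # ...and returns the new object
end

text = "hello".dup             # .dup so the String is mutable even if
                               # string literals are frozen
id_before = text.object_id

mutate!(text)
puts text                          # hello world
rebind(text)
puts text                          # hello world (caller's variable untouched)
puts text.object_id == id_before   # true - still the same object

text = rebind(text)            # explicit reassignment by the caller
puts text                      # replaced
```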

Of course, the biggest simplification over C is garbage collection. The
Array will be garbage collected once nothing else contains a reference
to it (either a local variable or another live object).

a = b.dup

create some NEW pointers but which point to the same memory space.

No. It allocates new memory space and copies the contents of the space
referenced by ‘b’ into the new space.

However, that space may in turn contain pointers (references) to other
memory spaces (objects). Since this is a direct copy, the new object
contains the same references as the old object.

obj1 = "foo"
obj2 = "bar"
a = [obj1, obj2]   # a points to an Array which contains &obj1, &obj2
b = a.dup          # b points to a different Array which contains
                   # &obj1, &obj2

In the latter case you could have written b = [a[0], a[1]], or
b = [obj1, obj2], and got the same result: a new array, which contains
the same pointers as the old array.

You can modify the array referenced by b - e.g. by adding or replacing
an element - and this won’t affect the other array referenced by a. But
if you follow b[0] to find obj1, and modify obj1, then obj1 is changed.
Subsequently accessing the object referred to by a[0] or b[0] will find
the same changed obj1.
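
To make the shallow copy concrete, here is a sketch you can paste into
irb. The final part is my own addition rather than something from the
question: if you really want independent elements, round-tripping
through Marshal is one common way to get a deep copy (it only works for
objects that can be marshalled).

```ruby
obj1 = "foo".dup   # .dup gives a mutable String even if literals are frozen
obj2 = "bar".dup
a = [obj1, obj2]
b = a.dup          # shallow copy: new Array, same element references

b << "baz"         # changes only the Array referenced by b
puts a.inspect     # ["foo", "bar"]

b[0] << "!"        # mutates obj1 itself, which both Arrays reference
puts a[0]          # foo!

# Assumption: a deep copy is wanted. Marshal round-tripping is one way.
c = Marshal.load(Marshal.dump(a))
c[0] << "?"        # changes only the copy's own String
puts a[0]          # foo!
puts c[0]          # foo!?
```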

Use the “object_id” method to see if you’re pointing to the same “memory
space”.

irb(main):001:0> a = ["abc","def"]
=> ["abc", "def"]
irb(main):002:0> a.object_id
=> -605609816
irb(main):003:0> b = a.dup
=> ["abc", "def"]
irb(main):004:0> b.object_id
=> -605638770

You can think of object_id as a kind of encoded pointer.
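
The actual numbers differ from run to run, so in a script it is easier
to compare ids than to print them. A short sketch along those lines:

```ruby
a = ["abc", "def"]
b = a.dup

puts a.object_id == b.object_id         # false - two distinct Arrays
puts a[0].object_id == b[0].object_id   # true  - one String, referenced twice
puts a.equal?(b)                        # false - equal? is identity
puts a == b                             # true  - == compares contents
```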

HTH,

Brian.

stickstone · August 28, 2008, 10:06am

When I learnt C (10 years ago, so it’s rusty) I was told to use pass by
reference only when necessary

True. But in Ruby you’re not passing values by reference; you’re passing
references by value!

Remember in C it’s common to do

struct foo *bigthing = malloc(sizeof(struct foo));

From that point onwards, you pass around the pointer ‘bigthing’:

void print_foo(struct foo *t)
{
…
}

…
print_foo(bigthing);

Here you are passing a pointer by value. That’s exactly what you’re
doing in Ruby. The called function cannot modify the caller’s variable
‘bigthing’ so that it points to something else, but it could modify the
structure which it points to. (*)

Think of Ruby objects as ‘structs’ if you like. K&R C didn’t allow
passing structs by value at all. ANSI C does, but it’s inefficient and
usually frowned upon:

void print_foo(struct foo t) /* could be a big stack frame! */

…
print_foo(*bigthing);

In any case, passing a struct by value will have the same problem you
describe, if it in turn contains pointers to other objects. e.g.

struct foo {
    struct bar *header;
    struct baz *footer;
};

Even if you pass *bigthing (an instance of struct foo) by value, which
means that your ‘header’ and ‘footer’ pointers can’t be changed, the
header and footer objects themselves can still be modified in-place.

So switching to ruby where pass by reference is the only choice seems to
be making my programming a little less smooth.

On the contrary - in Ruby where everything is a reference passed by
value, this choice no longer has to be made, and everything becomes
wonderfully regular.

I’m just used to knowing
that the original value won’t be changed.

Only in certain limited cases (i.e. passing simple integers, and structs
which contain nothing but integers). Most useful programs need data
structures which include pointers.

In any case, the same applies to Ruby numeric types, as they are
immutable objects.

def foo(my_int)   # my_int is a local copy of the object reference
  my_int = 4      # now my_int points to a different object
end

a = 1
foo(a)
puts a   # a still contains 1 (i.e. a reference to the Fixnum object "1")

That is: you can’t “change the number 1”. You can only change a local
variable to reference a different number. And because references are
copied (passed by value), the caller is still referencing the original
number.
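
The same thing shows up without any method call. A rough sketch,
contrasting an Integer with a String (which does have in-place
mutators):

```ruby
a = 1
b = a
b += 1          # shorthand for b = b + 1: a new number, and b is rebound
puts a          # 1
puts b          # 2

s = "one".dup   # .dup so the String is mutable even if literals are frozen
t = s
t << "!"        # String#<< changes the object in place
puts s          # one!
```

Numbers simply have no methods like << that change the receiver, which
is why they behave like the values you are used to from C.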

When I first came across Ruby, things like this used to worry me (also
things like types and variable names not being checked at “compile”
time). My advice is: go with the flow. Try doing things differently.
Then you may find out that the benefits of doing it this way in practice
outweigh the risks you perceive initially when coming from a different
way of thinking and working.

If after trying it for real this still worries you, use Erlang.

HTH,

Brian.

(*) I’m ignoring the possibility of “const” declarations here. The same
limitations apply: even if you declare

void print_foo(const struct foo *t)

then this only protects you for one level. If *t contains pointers to
other objects, then you can’t change the pointers, but you can change
the things pointed to.

In any case, all bets are off if the function you’re calling uses casts.
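
For what it’s worth, the nearest Ruby analogue I can think of to that
one-level “const” protection is Object#freeze, which is likewise
shallow. This is my own comparison, and the Foo struct below is a
hypothetical stand-in for struct foo:

```ruby
Foo = Struct.new(:header, :footer)   # hypothetical stand-in for struct foo

t = Foo.new("head".dup, "foot".dup).freeze

begin
  t.header = "other"   # changing a slot of the frozen object fails
rescue => e            # FrozenError on current Rubies
  puts e.class
end

t.header << "er"       # but the String it references is not frozen
puts t.header          # header
```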