Problem with weak references on OS X 10.3

I am having problems with weak references. The program below exhibits
the problem:

100_000.times{|n|
  o=Object.new;
  i=o.id;
  o2=ObjectSpace._id2ref(i);
  o.equal? o2 or raise "o=#{o}, i=#{"%x"%i}, o2=#{o2.inspect}, n=#{n}"
}

The exception should never be raised. On my OS X 10.3.9 system (and at
least 1 other) it does get eventually raised after a few hundred
iterations using ruby 1.8 and 1.9. With the (apple-supplied) ruby 1.6,
it does not happen. Tests on several Windows and Linux systems have
never observed a problem, using ruby 1.8 and 1.9. I don’t know if it’s
a problem on OS X 10.4; I don’t have access to any 10.4 systems.

The problem seems to be in the call to id. Usually, it works
correctly, but every once in a while it returns the id of some random
symbol. Does anyone know why this is happening?

Caleb C. wrote:

The exception should never be raised. On my OS X 10.3.9 system (and at
least 1 other) it does get eventually raised after a few hundred
iterations using ruby 1.8 and 1.9. With the (apple-supplied) ruby 1.6,
it does not happen. Tests on several Windows and Linux systems have
never observed a problem, using ruby 1.8 and 1.9. I don’t know if it’s
a problem on OS X 10.4; I don’t have access to any 10.4 systems.

The problem seems to be in the call to id. Usually, it works
correctly, but every once in a while it returns the id of some random
symbol. Does anyone know why this is happening?

I’m a bit confused: where are the WeakReferences your subject mentions?
Also, on my 1.8.3 on cygwin this runs without a problem. If the code
throws, then I presume there is a problem with the Ruby interpreter you
use (platform-induced int overflow?).

Kind regards

robert

On 2/4/06, Robert K. wrote:

I’m a bit confused: where are the WeakReferences your subject mentions?
Also, on my 1.8.3 on cygwin this runs without a problem. If the code throws
then I presume there is a problem with the Ruby interpreter you use
(platform induced int overflow?).

The call to id creates the weak reference. Anyway, I consider it a
weak reference, even though there’s no WeakRef involved; perhaps you
don’t. (id is what WeakRef uses internally.)

I now see that I also get the problem with my ruby 1.6 if I run the
test program within irb; without irb, it runs without problems.

I’ve also tried a variant that creates an actual WeakRef (calling
WeakRef.new and #__getobj__ instead of id and
ObjectSpace._id2ref); it does not (AFAICT) get the same error, but
instead a different one, which also seems like it shouldn’t happen.
Here’s the modified script:

require 'weakref'
100_000.times{|n|
  o=Object.new;
  i=WeakRef.new o;
  o2=ObjectSpace._id2ref(i.__getobj__);
  o.equal? o2 or raise "o=#{o}, i=#{"%x"%i}, o2=#{o2.inspect}, n=#{n}"
}

And the error I get:
weakref_bug.rb:5:in `_id2ref': cannot convert Object into Integer (TypeError)
        from weakref_bug.rb:5
        from weakref_bug.rb:2:in `times'
        from weakref_bug.rb:2

I agree that it does seem to be a problem with the interpreter.

2006/2/4, Caleb C.:

On 2/4/06, Robert K. wrote:

I’m a bit confused: where are the WeakReferences your subject mentions?
Also, on my 1.8.3 on cygwin this runs without a problem. If the code throws
then I presume there is a problem with the Ruby interpreter you use
(platform induced int overflow?).

The call to id creates the weak reference. Anyway, I consider it a
weak reference, even though there’s no WeakRef involved; perhaps you
don’t. (id is what WeakRef uses internally.)

You’re right - I don’t. Object#id returns an object id.

I now see that I also get the problem with my ruby 1.6 if I run the
test program within irb; without irb, it runs without problems.

I would not count on IRB in such circumstances - especially if local
variables are involved. IRB does certain things differently there. Did
you only test in IRB or also in a Ruby script?

i=WeakRef.new o;

I agree that it does seem to be a problem with the interpreter.

Not so fast. This error you are seeing is absolutely expected:
i.__getobj__ returns the original instance. If that is not an object
id (which it isn’t in your case) it’s not a legal argument for
ObjectSpace._id2ref().

You probably wanted o2=i.__getobj__
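
A minimal sketch of the corrected loop, assuming the accessor is WeakRef's delegator method `__getobj__`. The helper name `check_weakref_roundtrip` and the iteration count are made up for illustration; the point is only that the referent comes back directly and never passes through ObjectSpace._id2ref.

```ruby
require 'weakref'

# Runs the thread's identity check through a real WeakRef.
# Returns the number of iterations that passed.
# (check_weakref_roundtrip is a hypothetical helper name.)
def check_weakref_roundtrip(iterations)
  iterations.times do |n|
    o = Object.new
    ref = WeakRef.new(o)
    # __getobj__ returns the referent itself, not an object id,
    # so it must not be fed to ObjectSpace._id2ref.
    o2 = ref.__getobj__
    o.equal?(o2) or raise "o=#{o}, o2=#{o2.inspect}, n=#{n}"
  end
  iterations
end

check_weakref_roundtrip(1_000)
```

Because `o` stays strongly referenced inside the block, the weak reference cannot be collected before it is read back.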

Since you keep a reference to o all the time in the block,
ObjectSpace._id2ref must always return the same instance. If you
actually see the error you claimed you saw initially then there’s
something seriously broken. At the moment I rather suspect it’s some
other issue (such as testing in IRB). I’d also try to use brackets
around the equality test - just to be sure that precedence doesn’t
come into play.

robert

Caleb C. wrote:


something seriously broken. At the moment I rather suspect it’s some
other issue (such as testing in IRB). I’d also try to use brackets
around the equality test - just to be sure that precedence doesn’t
come into play.

I’m pretty sure about the precedence of or, but just in case, I tried
it with more parens. It’s still broken.

I was too, but sometimes it’s better to explicitly rule out potential
sources of error.

I have to admit I still cannot believe that you actually saw the results
you claimed to see initially. Can anybody verify this on Mac OS please?
I don’t have a Mac around, otherwise I’d do it. I’ve attached an
equivalent version of the script.

Kind regards

robert

Caleb C. writes:

it does not happen. Tests on several Windows and Linux systems have
never observed a problem, using ruby 1.8 and 1.9. I don’t know if it’s
a problem on OS X 10.4; I don’t have access to any 10.4 systems.

The problem seems to be in the call to id. Usually, it works
correctly, but every once in a while it returns the id of some random
symbol. Does anyone know why this is happening?

I can reproduce on ruby 1.8.4 (2005-12-24) [powerpc-darwin7.9.0]:

o=#<Object:0x1d421c>, i=ea10e, o2=:reject, n=448 (RuntimeError)

It looks like the object id wrapped in some way and now points to a
symbol? Clearly looks like a bug.

On 2/4/06, Robert K. wrote:

I now see that I also get the problem with my ruby 1.6 if I run the
test program within irb; without irb, it runs without problems.

I would not count on IRB in such circumstances - especially if local
variables are involved. IRB does certain things differently there. Did
you only test in IRB or also in a Ruby script?

It happens running it with plain ruby (no irb) on my ruby 1.8 (and
1.9). I only mentioned it because irb does seem to be required to
create the problem on my ruby 1.6. Irb is not the problem; it doesn’t
treat local variables that differently.

Not so fast. This error you are seeing is absolutely expected:
i.__getobj__ returns the original instance. If that is not an object
id (which it isn’t in your case) it’s not a legal argument for
ObjectSpace._id2ref().

You probably wanted o2=i.__getobj__

Uh-oh. You’re right. Too much monkey code and hack, not enough look and
think.

(After hurriedly fixing my test…) Ok, so if I correctly use
WeakRefs, there is no problem. That is interesting, and I’d sure like
to know why, because it’s not obvious to me. I’m going to investigate
this further and see if I can isolate the difference that lets WeakRef
work.

Since you keep a reference to o all the time in the block,
ObjectSpace._id2ref must always return the same instance. If you
actually see the error you claimed you saw initially then there’s
something seriously broken. At the moment I rather suspect it’s some
other issue (such as testing in IRB). I’d also try to use brackets
around the equality test - just to be sure that precedence doesn’t
come into play.

I’m pretty sure about the precedence of or, but just in case, I tried
it with more parens. It’s still broken.

Christian N. wrote:

I can reproduce on ruby 1.8.4 (2005-12-24) [powerpc-darwin7.9.0]:

o=#<Object:0x1d421c>, i=ea10e, o2=:reject, n=448 (RuntimeError)

It looks like the object id wrapped in some way and now points to a
symbol? Clearly looks like a bug.

Wow! Does it exhibit the same behavior with #object_id instead of #id?
Guess so… Now we just have to figure out whether the bug is in #id or
_id2ref. Somehow I suspect it’s the former…

robert

On Sun, Feb 05, 2006 at 08:33:40PM +0900, Christian N. wrote:


o=#<Object:0x1d421c>, i=ea10e, o2=:reject, n=448 (RuntimeError)

It looks like the object id wrapped in some way and now points to a
symbol? Clearly looks like a bug.

0x1d421c.to_s(2)       # => "111010100001000011100"
0xea10e.to_s(2)        # => "11101010000100001110"
0xea10e.class          # => Fixnum
(2 * 0xea10e).to_s(2)  # => "111010100001000011100"

So far so good.

Now, in gc.c:

static VALUE
id2ref(obj, id)
    VALUE obj, id;
{
    unsigned long ptr, p0;

    rb_secure(4);
    p0 = ptr = NUM2ULONG(id);
    if (ptr == Qtrue) return Qtrue;
    if (ptr == Qfalse) return Qfalse;
    if (ptr == Qnil) return Qnil;
    if (FIXNUM_P(ptr)) return (VALUE)ptr;
    if (SYMBOL_P(ptr) && rb_id2name(SYM2ID((VALUE)ptr)) != 0) {
        return (VALUE)ptr;
    }

(SYMBOL_FLAG == 0x0e)

NUM2ULONG is rb_num2ulong, which calls rb_num2long, which uses FIX2LONG.
id was 111010100001000011101b and ptr becomes 11101010000100001110b,
whose low byte matches the SYMBOL_FLAG.
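
The tag tests can be replayed with plain integer arithmetic. This is only a sketch: SYMBOL_FLAG is the value quoted above, FIXNUM_FLAG = 0x01 is assumed for a 32-bit 1.8 build, the Qtrue/Qfalse/Qnil comparisons are left out, and the method names are invented for illustration.

```ruby
# SYMBOL_FLAG is quoted from the thread; FIXNUM_FLAG is an assumption
# for a 32-bit Ruby 1.8 build.
FIXNUM_FLAG = 0x01
SYMBOL_FLAG = 0x0e

# Integer value of the object id for a given object address, matching
# the numbers in the thread (0x1d421c -> 0xea10e).
def id_from_address(address)
  address >> 1
end

# Mirrors the order of the tag tests in the id2ref excerpt
# (special constants omitted). A symbol match is only a candidate:
# the real code also requires rb_id2name to find a name for it.
def classify_id(id)
  return :fixnum if (id & FIXNUM_FLAG) != 0
  return :symbol_candidate if (id & 0xff) == SYMBOL_FLAG
  :heap_pointer_candidate
end

id = id_from_address(0x1d421c)
puts id.to_s(16)      # prints "ea10e"
puts classify_id(id)  # prints "symbol_candidate"
```

The id reported in the failing run takes the symbol branch before id2ref ever tries to treat it as a heap pointer.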

I’d conjecture that the above works on Linux because glibc’s malloc()
always returns 8-byte aligned memory addresses, which doesn’t seem to be
the case in OSX:

0x1d421c % 8 # => 4

Another possibility would be that the address space for the data segment
used in OSX is lower than on Linux, so the SYM2ID matches an existing
symbol:

RUBY_PLATFORM       # => "i686-linux"
Object.new.inspect  # => "#<Object:0xb7d44d7c>"
0xb7d44d7c >> 9     # => 6023718

we shouldn’t have 6 million symbols

0x1d421c >> 9 # => 3745

but 4000 are indeed possible
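
The comparison above can be written as a small check. This is a rough sketch: the shift by 9 combines the one bit lost when the address becomes an id with the 8 tag bits dropped by SYM2ID, and `ASSUMED_SYMBOL_COUNT` is a hypothetical bound chosen only to illustrate the argument, not a measured symbol table size.

```ruby
# Symbol ID that a misread object address would decode to
# (hypothetical helper name).
def decoded_symbol_id(address)
  address >> 9
end

# HYPOTHETICAL upper bound on interned symbols, for illustration only.
ASSUMED_SYMBOL_COUNT = 5_000

# True when the decoded ID falls inside the assumed symbol table.
def plausible_symbol?(address, symbol_count = ASSUMED_SYMBOL_COUNT)
  decoded_symbol_id(address) < symbol_count
end

{ "OS X 0x1d421c" => 0x1d421c, "Linux 0xb7d44d7c" => 0xb7d44d7c }.each do |label, addr|
  puts "#{label}: symbol id #{decoded_symbol_id(addr)}, plausible=#{plausible_symbol?(addr)}"
end
```

With these two addresses the low OS X address decodes to an ID a running interpreter could plausibly have interned, while the high Linux address does not.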

The relevant code hasn’t changed between 1.6 and 1.8; could it be that
the Apple-supplied 1.6 binary was built specially to use 8-byte
alignment, or that the memory layout has changed in the meantime?

If so, possible fixes would include:

  • modifying the configure to use the magic options
  • using posix_memalign or such

On Feb 5, 2006, at 6:08 AM, Robert K. wrote:

I have to admit I still cannot believe that you actually saw the
results you claimed to see initially. Can anybody verify this on
Mac OS please? I don’t have a Mac around otherwise I’d do it.
I’ve attached an equivalent version of the script.

logan:/Users/logan/Projects/Ruby Experiments% ruby idref.rb
idref.rb:7: 152: #<Object:0x1e861c> :$@ - 1000206 1000206 (RuntimeError)
        from idref.rb:3
logan:/Users/logan/Projects/Ruby Experiments% ruby -v
ruby 1.8.4 (2005-12-24) [powerpc-darwin8.4.0]
logan:/Users/logan/Projects/Ruby Experiments% uname -a
Darwin Logan-Capaldos-Computer.local 8.4.0 Darwin Kernel Version 8.4.0: Tue Jan 3 18:22:10 PST 2006; root:xnu-792.6.56.obj~1/RELEASE_PPC Power Macintosh powerpc