Forum: Ruby Problem with weak references on OS X 10.3

Caleb C. (Guest)
on 2006-02-04 18:38
(Received via mailing list)
I am having problems with weak references. The program below exhibits
the problem:

100_000.times{|n|
  o=Object.new;
  i=o.__id__;
  o2=ObjectSpace._id2ref(i);
  o.equal? o2 or raise "o=#{o}, i=#{"%x"%i}, o2=#{o2.inspect}, n=#{n}"
}

The exception should never be raised. On my OS X 10.3.9 system (and at
least 1 other) it does get eventually raised after a few hundred
iterations using ruby 1.8 and 1.9. With the (apple-supplied) ruby 1.6,
it does not happen. Tests on several Windows and Linux systems have
never observed a problem, using ruby 1.8 and 1.9. I don't know if it's
a problem on OS X 10.4; I don't have access to any 10.4 systems.

The problem seems to be in the call to __id__. Usually, it works
correctly, but every once in a while it returns the id of some random
symbol. Does anyone know why this is happening?
Robert K. (Guest)
on 2006-02-04 19:39
(Received via mailing list)
Caleb C. <removed_email_address@domain.invalid> wrote:
> The exception should never be raised. On my OS X 10.3.9 system (and at
> least 1 other) it does get eventually raised after a few hundred
> iterations using ruby 1.8 and 1.9. With the (apple-supplied) ruby 1.6,
> it does not happen. Tests on several Windows and Linux systems have
> never observed a problem, using ruby 1.8 and 1.9. I don't know if it's
> a problem on OS X 10.4; I don't have access to any 10.4 systems.
>
> The problem seems to be in the call to __id__. Usually, it works
> correctly, but every once in a while it returns the id of some random
> symbol. Does anyone know why this is happening?

I'm a bit confused: where are the WeakReferences your subject mentions?
Also, on my 1.8.3 on cygwin this runs without a problem.  If the code throws
then I presume there is a problem with the Ruby interpreter you use
(platform induced int overflow?).

Kind regards

    robert
Caleb C. (Guest)
on 2006-02-04 22:14
(Received via mailing list)
On 2/4/06, Robert K. <removed_email_address@domain.invalid> wrote:
> I'm a bit confused: where are the WeakReferences your subject mentions?
> Also, on my 1.8.3 on cygwin this runs without a problem.  If the code throws
> then I presume there is a problem with the Ruby interpreter you use
> (platform induced int overflow?).

The call to __id__ creates the weak reference. Anyway, I consider it a
weak reference, even though there's no WeakRef involved; perhaps you
don't. (__id__ is what WeakRef uses internally.)

I now see that I also get the problem with my ruby 1.6 _if_ I run the
test program within irb; without irb, it runs without problems.

I've also tried a variant that creates an actual WeakRef (calling
WeakRef.new and #__getobj__ instead of __id__ and
ObjectSpace._id2ref); it does not (AFAICT) get the same error, but
instead a different one, which also seems like it shouldn't happen.
Here's the modified script:


require 'weakref'
100_000.times{|n|
  o=Object.new;
  i=WeakRef.new o;
  o2=ObjectSpace._id2ref(i.__getobj__);
  o.equal? o2 or raise "o=#{o}, i=#{"%x"%i}, o2=#{o2.inspect}, n=#{n}"
}

And the error I get:
weakref_bug.rb:5:in `_id2ref': cannot convert Object into Integer (TypeError)
        from weakref_bug.rb:5
        from weakref_bug.rb:2:in `times'
        from weakref_bug.rb:2

I agree that it does seem to be a problem with the interpreter.
Robert K. (Guest)
on 2006-02-04 23:56
(Received via mailing list)
2006/2/4, Caleb C. <removed_email_address@domain.invalid>:
> On 2/4/06, Robert K. <removed_email_address@domain.invalid> wrote:
> > I'm a bit confused: where are the WeakReferences your subject mentions?
> > Also, on my 1.8.3 on cygwin this runs without a problem.  If the code throws
> > then I presume there is a problem with the Ruby interpreter you use
> > (platform induced int overflow?).
>
> The call to __id__ creates the weak reference. Anyway, I consider it a
> weak reference, even though there's no WeakRef involved; perhaps you
> don't. (__id__ is what WeakRef uses internally.)

You're right - I don't. Object#__id__ returns an object id.

> I now see that I also get the problem with my ruby 1.6 _if_ I run the
> test program within irb; without irb, it runs without problems.

I would not count on IRB in such circumstances - especially if local
variables are involved. IRB does certain things differently there. Did
you only test in IRB or also in a Ruby script?

>   i=WeakRef.new o;
> I agree that it does seem to be a problem with the interpreter.
Not so fast. This error you are seeing is absolutely expected:
i.__getobj__ returns the original instance. If that is not an object
id (which it isn't in your case) it's not a legal argument for
ObjectSpace._id2ref().

You probably wanted o2=i.__getobj__
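
For reference, a minimal sketch of the corrected WeakRef variant, assuming the intent is only to check that the WeakRef hands back the original object while `o` is still referenced. The helper name `count_weakref_mismatches` is made up for illustration and is not part of the original script.

```ruby
require 'weakref'

# Counts the iterations in which WeakRef#__getobj__ does not return the
# very object the WeakRef was created from. `o` stays referenced for the
# whole iteration, so the object cannot be collected in between.
def count_weakref_mismatches(iterations)
  mismatches = 0
  iterations.times do
    o   = Object.new
    ref = WeakRef.new(o)
    o2  = ref.__getobj__          # the object itself, not an object id
    mismatches += 1 unless o.equal?(o2)
  end
  mismatches
end

puts count_weakref_mismatches(100_000)
```

Note that `__getobj__` is not passed to `ObjectSpace._id2ref` here, since it already returns the referenced object.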

Since you keep a reference to o all the time in the block,
ObjectSpace._id2ref must always return the same instance. *If* you
actually see the error you claimed you saw initially then there's
something seriously broken.  At the moment I rather suspect it's some
other issue (such as testing in IRB).  I'd also try to use brackets
around the equality test - just to be sure that precedence doesn't
come into play.

robert
Caleb C. (Guest)
on 2006-02-05 07:49
(Received via mailing list)
On 2/4/06, Robert K. <removed_email_address@domain.invalid> wrote:
> > I now see that I also get the problem with my ruby 1.6 _if_ I run the
> > test program within irb; without irb, it runs without problems.
>
> I would not count on IRB in such circumstances - especially if local
> variables are involved. IRB does certain things differently there. Did
> you only test in IRB or also in a Ruby script?

It happens running it with plain ruby (no irb) on my ruby 1.8 (and
1.9). I only mentioned it because irb does seem to be required to
create the problem on my ruby 1.6. Irb is not the problem; it doesn't
treat local variables that differently.

> Not so fast. This error you are seeing is absolutely expected:
> i.__getobj__ returns the original instance. If that is not an object
> id (which it isn't in your case) it's not a legal argument for
> ObjectSpace._id2ref().
>
> You probably wanted o2=i.__getobj__

Uh-oh. You're right. Too much monkey code and hack, not enough look and
think.

(After hurriedly fixing my test...) Ok, so if I _correctly_ use
WeakRefs, there is no problem. That is interesting, and I'd sure like
to know why, because it's not obvious to me. I'm going to investigate
this deeper, and see if I isolate the difference that lets WeakRef
work.

> Since you keep a reference to o all the time in the block,
> ObjectSpace._id2ref must always return the same instance. *If* you
> actually see the error you claimed you saw initially then there's
> something seriously broken.  At the moment I rather suspect it's some
> other issue (such as testing in IRB).  I'd also try to use brackets
> around the equality test - just to be sure that precedence doesn't
> come into play.

I'm pretty sure about the precedence of or, but just in case, I tried
it with more parens. It's still broken.
Robert K. (Guest)
on 2006-02-05 13:09
(Received via mailing list)
Caleb C. <removed_email_address@domain.invalid> wrote:
> create the problem on my ruby 1.6. Irb is not the problem; it doesn't
> and think.
>> something seriously broken.  At the moment I rather suspect it's some
>> other issue (such as testing in IRB).  I'd also try to use brackets
>> around the equality test - just to be sure that precedence doesn't
>> come into play.
>
> I'm pretty sure about the precedence of or, but just in case, I tried
> it with more parens. It's still broken.

I was also, but sometimes it's better to explicitly rule out potential
sources of error.

I have to admit I still cannot believe that you actually saw the results
you claimed to see initially.  Can anybody verify this on Mac OS, please?  I
don't have a Mac around, otherwise I'd do it.  I've attached an equivalent
version of the script.

Kind regards

    robert
Christian N. (Guest)
on 2006-02-05 13:33
(Received via mailing list)
Caleb C. <removed_email_address@domain.invalid> writes:

> it does not happen. Tests on several Windows and Linux systems have
> never observed a problem, using ruby 1.8 and 1.9. I don't know if it's
> a problem on OS X 10.4; I don't have access to any 10.4 systems.
>
> The problem seems to be in the call to __id__. Usually, it works
> correctly, but every once in a while it returns the id of some random
> symbol. Does anyone know why this is happening?

I can reproduce on ruby 1.8.4 (2005-12-24) [powerpc-darwin7.9.0]:

o=#<Object:0x1d421c>, i=ea10e, o2=:reject, n=448 (RuntimeError)

It looks like the object id wrapped in some way and now points to a
symbol?  Clearly looks like a bug.
Robert K. (Guest)
on 2006-02-05 14:16
(Received via mailing list)
Christian N. <removed_email_address@domain.invalid> wrote:
>> at least 1 other) it does get eventually raised after a few hundred
> I can reproduce on ruby 1.8.4 (2005-12-24) [powerpc-darwin7.9.0]:
>
> o=#<Object:0x1d421c>, i=ea10e, o2=:reject, n=448 (RuntimeError)
>
> It looks like the object id wrapped in some way and now points to a
> symbol?  Clearly looks like a bug.

Wow!  Does it exhibit the same behavior with #object_id instead of #__id__?
Guess so...  Now we just have to figure out whether the bug is in #__id__ or
_id2ref.  Somehow I suspect it's the former...

    robert
Mauricio F. (Guest)
on 2006-02-05 15:07
(Received via mailing list)
On Sun, Feb 05, 2006 at 08:33:40PM +0900, Christian N. wrote:
> > least 1 other) it does get eventually raised after a few hundred
>
> o=#<Object:0x1d421c>, i=ea10e, o2=:reject, n=448 (RuntimeError)
>
> It looks like the object id wrapped in some way and now points to a
> symbol?  Clearly looks like a bug.

0x1d421c.to_s(2)                                   # => "111010100001000011100"
0xea10e.to_s(2)                                    # => "11101010000100001110"
0xea10e.class                                      # => Fixnum
(2 * 0xea10e).to_s(2)                              # => "111010100001000011100"

So far so good.

Now, in gc.c:

static VALUE
id2ref(obj, id)
    VALUE obj, id;
{
    unsigned long ptr, p0;

    rb_secure(4);
    p0 = ptr = NUM2ULONG(id);
    if (ptr == Qtrue) return Qtrue;
    if (ptr == Qfalse) return Qfalse;
    if (ptr == Qnil) return Qnil;
    if (FIXNUM_P(ptr)) return (VALUE)ptr;
    if (SYMBOL_P(ptr) && rb_id2name(SYM2ID((VALUE)ptr)) != 0) {
	return (VALUE)ptr;
    }

(SYMBOL_FLAG == 0x0e)

NUM2ULONG is rb_num2ulong, which calls rb_num2long, which uses FIX2LONG.
id was 111010100001000011101b and ptr becomes 11101010000100001110b, whose
low byte (0x0e) matches the SYMBOL_FLAG.
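
The arithmetic above can be replayed in plain Ruby. This is only a sketch of the tagging checks, not the interpreter's code: SYMBOL_FLAG is taken from the post, while the 0xff mask and the shift by 8 are assumptions about how SYMBOL_P and SYM2ID are defined in ruby.h. The helper names are made up for illustration.

```ruby
# Model of the Ruby 1.8 tagged-value checks discussed above.
# Assumption: SYMBOL_P tests the low byte and SYM2ID shifts right by 8.
SYMBOL_FLAG = 0x0e

# Integer value of __id__ for a heap object at addr (the address halved,
# as the to_s(2) output above shows).
def id_for_address(addr)
  addr >> 1
end

# Would id2ref take this value for a symbol?
def looks_like_symbol?(ptr)
  (ptr & 0xff) == SYMBOL_FLAG
end

# Symbol table index that the value would be decoded to.
def symbol_table_index(ptr)
  ptr >> 8
end

addr = 0x1d421c
id   = id_for_address(addr)
printf("id=%x symbol?=%s index=%d\n",
       id, looks_like_symbol?(id), symbol_table_index(id))

# With 8-byte alignment the halved address is a multiple of 4, so its low
# byte can never equal 0x0e.
aligned_hits = (0...0x10000).step(8).count { |a| looks_like_symbol?(id_for_address(a)) }
puts "8-byte aligned addresses mistaken for symbols: #{aligned_hits}"
```

Under these assumptions only addresses whose low nine bits are 0x01c can collide, which is the case for 0x1d421c.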

I'd conjecture that the above works on Linux because glibc's malloc() always
returns 8-byte aligned memory addresses, which doesn't seem to be the case in
OSX:

 0x1d421c % 8                                      # => 4

Another possibility would be that the address space for the data segment
used in OSX is lower than on Linux, so the SYM2ID matches an existent
symbol:

RUBY_PLATFORM                                      # => "i686-linux"
Object.new.inspect                                 # => "#<Object:0xb7d44d7c>"
0xb7d44d7c >> 9                                    # => 6023718
# we shouldn't have 6 million symbols
0x1d421c >> 9                                      # => 3745
# but 4000 are indeed possible

The relevant code hasn't changed between 1.6 and 1.8; could it be that the
Apple-supplied 1.6 binary was built specially to use 8-byte alignment, or
that the memory layout has changed in the meantime?

If so, possible fixes would include:
* modifying the configure to use the magic options
* using posix_memalign or such
Logan C. (Guest)
on 2006-02-06 07:29
(Received via mailing list)
On Feb 5, 2006, at 6:08 AM, Robert K. wrote:

> I have to admit I still cannot believe that you actually saw the
> results you claimed to see initially.  Can anybody verify this on
> Mac OS please?  I don't have a Mac around otherwise I'd do it.
> I've attached an equivalent version of the script.

logan:/Users/logan/Projects/Ruby Experiments% ruby idref.rb
idref.rb:7: 152: #<Object:0x1e861c> :$@ - 1000206 1000206 (RuntimeError)
         from idref.rb:3
logan:/Users/logan/Projects/Ruby Experiments% ruby -v
ruby 1.8.4 (2005-12-24) [powerpc-darwin8.4.0]
logan:/Users/logan/Projects/Ruby Experiments% uname -a
Darwin Logan-Capaldos-Computer.local 8.4.0 Darwin Kernel Version 8.4.0: Tue Jan  3 18:22:10 PST 2006; root:xnu-792.6.56.obj~1/RELEASE_PPC Power Macintosh powerpc