ObjectSpace._id2ref is another of those peculiar methods, an artifact
of a particular implementation which, due to its lack of a
copying/compacting garbage collector, can always locate in memory an
object given its “id”. This is typically hard to do on
other VMs, where objects move around. There it can even be difficult to
get a unique “id” for a given object at all: memory locations keep
changing, and storing a numeric ID would increase the size of every
object or object handle.
On JRuby, _id2ref is implemented as a pair with Object#object_id/id.
The latter, when called on an object, atomically constructs a numeric
ID for the object in question. It then asks our ObjectSpace
implementation to insert a weak reference to the object into a table
keyed on numeric ID. This allows the resulting ID to be used later for
_id2ref to retrieve the object.
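To make the pairing concrete, here is a rough sketch in plain Ruby of an ID table of this kind. It is not JRuby's actual code: the IdTable class, its method names, and the @sketch_id instance variable are all made up for illustration, and a Mutex stands in for whatever atomic mechanism the real implementation uses.

```ruby
require 'weakref'

# Hypothetical stand-in for the ID table described above (not JRuby internals).
class IdTable
  def initialize
    @lock = Mutex.new
    @next_id = 0
    @refs = {} # numeric ID => WeakRef wrapping the object
  end

  # Plays the role of object_id: assign an ID once, then register a
  # weak reference to the object under that ID.
  def id_for(obj)
    @lock.synchronize do
      id = obj.instance_variable_get(:@sketch_id)
      return id if id

      id = (@next_id += 1)
      obj.instance_variable_set(:@sketch_id, id)
      @refs[id] = WeakRef.new(obj) # this allocation and insert is the extra cost
      id
    end
  end

  # Plays the role of _id2ref: look the object up again by numeric ID.
  def id2ref(id)
    @lock.synchronize do
      ref = @refs[id]
      raise RangeError, "#{id} is not a known id" unless ref

      begin
        ref.__getobj__ # raises WeakRef::RefError once the object is collected
      rescue WeakRef::RefError
        @refs.delete(id)
        raise RangeError, "#{id} refers to a collected object"
      end
    end
  end
end

table = IdTable.new
obj = Object.new
id = table.id_for(obj)
found = table.id2ref(id) # the same object, for as long as obj is still alive
```

The point of the sketch is that every first call to id_for allocates a WeakRef and grows the table, whether or not anyone ever calls id2ref.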
Unfortunately object_id, in its #id form, is often used to get a
unique non-#hash key for an object for purposes entirely unrelated to
_id2ref. As a result, any code using object_id or id on JRuby pays for a
weak reference and a table insertion per object, a significantly higher
cost than you might expect.
If we no longer supported _id2ref, the only cost would be in producing
an ID, probably with a strictly-increasing atomic 64-bit value. There
would be no weakref map and no cost of constructing and managing the
weakrefs within that map.
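A counter-only scheme might look roughly like the following. Again this is only a sketch under assumptions: CounterIds and @sketch_id are hypothetical names, a Mutex approximates the atomic increment, and Ruby integers are not limited to 64 bits, so the width is not modeled here.

```ruby
# Hypothetical sketch: IDs come from a strictly increasing counter and
# nothing is registered anywhere, so there is no way back from ID to object.
class CounterIds
  def initialize
    @lock = Mutex.new
    @last = 0
  end

  # Returns the object's ID, assigning the next counter value on first use.
  def id_for(obj)
    @lock.synchronize do
      id = obj.instance_variable_get(:@sketch_id)
      unless id
        id = (@last += 1)
        obj.instance_variable_set(:@sketch_id, id)
      end
      id
    end
  end
end

ids = CounterIds.new
a = Object.new
b = Object.new
first = ids.id_for(a)
second = ids.id_for(b) # always greater than first
```

Compared with a weak-reference table, the only per-object work left is the increment and storing the value on the object.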
So I am asking you Rubyists… would dropping _id2ref be a problem? In the
1.8/1.9 stdlib, the only reference to _id2ref is one in drb.rb, which
could be replaced with a “better way”. None of the gems I have
installed use _id2ref. Originally, weakref.rb used _id2ref, but we
have a native impl of weakref that uses Java’s built-in weakrefs.
Google code search only brings up about 353 hits for “lang:ruby
_id2ref”, most of them the already-mentioned cases.
One last demonstration of the perf difference between the current
Object#object_id and one that does not use the ObjectSpace weak map:
Current:
                                       user     system      total        real
1M calls to obj.object_id          0.658000   0.000000   0.658000 (  0.658000)
1M calls to Object.new.object_id   6.636000   0.000000   6.636000 (  6.636000)

Using object’s “identity hash”:
                                       user     system      total        real
1M calls to obj.object_id          0.356000   0.000000   0.356000 (  0.356000)
1M calls to Object.new.object_id   0.636000   0.000000   0.636000 (  0.636000)
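The script that produced these numbers is not included here, but a benchmark of the same shape can be written with the standard Benchmark library. The bench_object_id helper below is a hypothetical reconstruction, and the timings it prints will depend on the Ruby implementation and machine it runs on.

```ruby
require 'benchmark'

# Hypothetical reconstruction of the measurement; not the original script.
def bench_object_id(n = 1_000_000)
  obj = Object.new
  Benchmark.bm(36) do |bm|
    # Repeated calls on one object: the ID is assigned only once.
    bm.report("#{n} calls to obj.object_id") do
      n.times { obj.object_id }
    end
    # A fresh object each time: every call has to produce a new ID.
    bm.report("#{n} calls to Object.new.object_id") do
      n.times { Object.new.object_id }
    end
  end
end

bench_object_id
```

The second report is the interesting one, since it is the case where each call pays the full cost of creating an ID.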
It’s also interesting to note that even maintaining the contract of
object_id being unique is hard. On the JVM, for example, it is not
possible to get a unique numeric id or pointer for a given object
unless you manage a weak map of objects on your own…
- Charlie