Could initialize return an existing object instead of a new instance?

aris · September 25, 2012, 8:13pm

Is it possible for initialize to return an existing object instead of a
new instance?

Specifically curious about whether it would be possible to implement a
StringPool in Ruby such that the same object with same object_id would
be returned if you tried to construct with the same value, unless a bang
method (like chomp!) were called on the object in which case instead of
returning the same object instance from the bang method, it would return
a different object instance. I told someone that this would be a massive
change since it would break these two assumptions:

s = "Hello"
s2 = "Hello"

s.object_id != s2.object_id

s = "Hello"
original_object_id = s.object_id
s.chomp!('o')

original_object_id == s.object_id

But, maybe if it were possible to develop this as a gem, people could
use at their own risk.

Also, it just seemed like a neat idea (not that it would work in the
String case?) if you could do something in the initializer to abandon
the instance that was being created and instead return a different
object from the initializer, like one that already exists. Even if this
isn’t possible currently, what issues would one run into trying to
implement support for this? Thanks!

pfharlock · September 25, 2012, 8:38pm

Is it possible for initialize to return an existing object instead
of a new instance?

class Dog
  def initialize
    return 'hello'
  end
end

d = Dog.new
p d

--output:--
#<Dog:0x00000100925648>

pfharlock · September 25, 2012, 11:59pm

Bartosz Dziewoński wrote in post #1077524:

Is that an academic discussion or an attempt for practical
optimization? Because Ruby sorta kinda does this internally.

http://patshaughnessy.net/2012/1/4/never-create-ruby-strings-longer-than-23-characters

http://patshaughnessy.net/2012/1/18/seeing-double-how-ruby-shares-string-values

Thanks! Wasn’t academic. There was a discussion on the rails-core list
where someone was complaining about the massive numbers of objects
allocated:
https://groups.google.com/forum/?fromgroups=#!topic/rubyonrails-core/jFlXnFA4rP8

I just posted those links there.

The context was attribute names which primarily were < 23 characters I
guess, so didn’t benefit from that optimization. I think they merged
Jeremy E.'s patch in after much discussion:
https://github.com/rails/rails/pull/7631

I was just considering something that could pool all strings, even
though it looks like that would slow things down a good bit.

Also, I was trying to learn more about it. Mission accomplished! Thanks!

pfharlock · September 25, 2012, 10:07pm

2012/9/25 Gary W.:

Specifically curious about whether it would be possible to implement a
StringPool in Ruby such that the same object with same object_id would
be returned if you tried to construct with the same value, unless a bang
method (like chomp!) were called on the object in which case instead of
returning the same object instance from the bang method, it would return
a different object instance. I told someone that this would be a massive
change since it would break these two assumptions:

Is that an academic discussion or an attempt for practical
optimization? Because Ruby sorta kinda does this internally.

http://patshaughnessy.net/2012/1/4/never-create-ruby-strings-longer-than-23-characters
http://patshaughnessy.net/2012/1/18/seeing-double-how-ruby-shares-string-values

– Matma R.

pfharlock · September 27, 2012, 3:26pm

2012/9/26 Gary W.:

The context was attribute names which primarily were < 23 characters I
guess, so didn’t benefit from that optimization. I think they merged
Jeremy E.'s patch in after much discussion:
https://github.com/rails/rails/pull/7631

Oh wow. This is off-topic, but I’ve just gotta say that apparently
most people coding up Rails hardly know Ruby (or maybe they just code
and comment while intoxicated). It’s like nobody except for Jeremy and
that wlipa guy know what they are talking about, even after being
presented a good rationale for this change. Man.

– Matma R.

pfharlock · September 25, 2012, 8:57pm

7stud – wrote in post #1077511:

Is it possible for initialize to return an existing object instead
of a new instance?

class Dog
  def initialize
    return 'hello'
  end
end

d = Dog.new
p d

--output:--
#<Dog:0x00000100925648>

Thanks, but figured it out:

Can override the new method on the class to get the behavior I was
talking about, e.g.

class Dog
  BARK = 'bark'
  def self.new(arg=nil)
    return BARK
  end
end
=> nil
Dog.new.object_id
=> 2046
Dog.new.object_id
=> 2046

nice!

pfharlock · September 27, 2012, 4:53pm

Bartosz Dziewoński wrote in post #1077768:

2012/9/26 Gary W.:

The context was attribute names which primarily were < 23 characters I
guess, so didn’t benefit from that optimization. I think they merged
Jeremy E.'s patch in after much discussion:
https://github.com/rails/rails/pull/7631

Oh wow. This is off-topic, but I’ve just gotta say that apparently
most people coding up Rails hardly know Ruby (or maybe they just code
and comment while intoxicated). It’s like nobody except for Jeremy and
that wlipa guy know what they are talking about, even after being
presented a good rationale for this change. Man.

– Matma R.

Sorry, but don’t count me in the group that is coding up Rails. Not
saying that you aren’t right, but I think there are a number of people
that post to the core list that aren’t on the core team. From what I
read, the purpose of the core list is for people to post concerns, bugs,
new features, news, etc. about Rails or “how does X work in Rails?” and
the “how do I do X in Rails” goes to the Rails user list, vs. the
typical “*-dev” list. OK, I’m defending them now, so feel free to go off on
them. Apparently there are some smart guys there though, so don’t let them
get confused with the others. Thanks a ton for helping me
understand this, though. Ruby is cooler every day, and I need all the
help I can get!

pfharlock · September 27, 2012, 6:39pm

On Tue, Sep 25, 2012 at 8:13 PM, Gary W. wrote:

Is it possible for initialize to return an existing object instead of a
new instance?

First of all, what #initialize returns is irrelevant; the return value is
discarded. Object creation is done inside the class’s method #new. #initialize
is just an instance method which happens to be invoked on a newly
allocated object (#allocate exists as a method as well). So basically you can
imagine it to work like this

class Class
  def new(*a, &b)
    x = allocate
    x.send(:initialize, *a, &b)
    x
  end
end

So, yes, you can modify a class’s method #new to return anything you
like.

Side note, if you implement that method on your own I find this a tad
more elegant:

class Class
  def new(*a, &b)
    allocate.tap {|x| x.send(:initialize, *a, &b)}
  end
end
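To see the allocate-then-initialize sequence without reopening Class, here is a minimal sketch. The class Puppy and the method name build are made up for illustration; build only mirrors what #new is described as doing above.

```ruby
# Hypothetical class; `build` is an illustrative stand-in for Class#new.
class Puppy
  attr_reader :name

  def self.build(*args, &block)
    obj = allocate                        # bare instance, #initialize not yet run
    obj.send(:initialize, *args, &block)  # private method, so use send; result is discarded
    obj                                   # always hand back the allocated object
  end

  def initialize(name)
    @name = name
    "ignored"                             # whatever #initialize returns is thrown away
  end
end

d = Puppy.build("Rex")
puts d.class   # Puppy
puts d.name    # Rex
puts Puppy.new("Rex").class   # Puppy, same behaviour as the built-in #new
```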

Specifically curious about whether it would be possible to implement a
StringPool in Ruby such that the same object with same object_id would
be returned if you tried to construct with the same value,

Easy:

string_pool = Hash.new {|h,s| h[s.freeze] = s}

irb(main):002:0> 5.times { puts string_pool["foo"].object_id}
-1072382248
-1072382248
-1072382248
-1072382248
-1072382248
=> 5
irb(main):003:0> 5.times.map { string_pool["foo"].object_id }.uniq
=> [-1072382248]

Note: the call of #freeze is necessary since a Hash will dup a mutable
String key to avoid aliasing effects. That’s a feature of Hash.
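A small sketch of that Hash behaviour, for anyone who wants to check it. The variable names are illustrative only, and .dup is used just to make sure the strings start out mutable.

```ruby
# Illustrative names only; .dup guarantees a mutable String to start with.
plain = "foo".dup
h = {}
h[plain] = 1
stored = h.keys.first
puts stored.equal?(plain)   # false: the Hash stored its own frozen copy
puts stored.frozen?         # true

frozen = "bar".dup.freeze
h[frozen] = 2
puts h.keys.last.equal?(frozen)   # true: a frozen key is stored as-is

# The pool from above: freezing first means key and value are the same object.
pool = Hash.new { |hash, s| hash[s.freeze] = s }
a = pool["baz".dup]
b = pool["baz".dup]
puts a.equal?(b)            # true: the second lookup hits the pooled object
```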

unless a bang
method (like chomp!) were called on the object in which case instead of
returning the same object instance from the bang method, it would return
a different object instance. I told someone that this would be a massive
change since it would break these two assumptions:

Right. That’ll be difficult since behavior of String instances would
need to be changed if they are in the pool. Hard to do and probably
not efficient.
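To illustrate with the frozen pool from above (a sketch, not a solution): a pooled string refuses bang methods outright, so the closest you get to "bang returns a different object" is an explicit dup by the caller.

```ruby
# Same pool idea as above; names are illustrative only.
pool = Hash.new { |hash, s| hash[s.freeze] = s }
pooled = pool["hello".dup]

begin
  pooled.chomp!("o")
  mutated = true
rescue RuntimeError   # FrozenError on newer Rubies is a RuntimeError subclass
  mutated = false
end
puts mutated               # false: the pooled instance cannot be changed in place

copy = pooled.dup          # dup returns an unfrozen copy
copy.chomp!("o")
puts copy                  # hell
puts pooled                # hello
puts copy.equal?(pooled)   # false: the modified string is a different object
```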

use at their own risk.

Also, it just seemed like a neat idea (not that it would work in the
String case?) if you could do something in the initializer to abandon
the instance that was being created and instead return a different
object from the initializer, like one that already exists. Even if this
isn’t possible currently, what issues would one run into trying to
implement support for this? Thanks!

See above, you need to modify your class’s #new method. Silly example:

irb(main):006:0> class George
irb(main):007:1> attr_reader :num
irb(main):008:1> def initialize(x) @num = x.to_int end
irb(main):009:1> end
=> nil
irb(main):010:0> 3.times.map { George.new(1).object_id }
=> [-1072296698, -1072296708, -1072296718]
irb(main):011:0> 3.times.map { George.new(1).object_id }.uniq
=> [-1072308808, -1072308818, -1072308828]

OK, regular case: we get a new George on every call, even for the same int. Now we create the
pool:

irb(main):013:0> class <<George
irb(main):014:1> alias _new new
irb(main):015:1> def new(x)
irb(main):016:2> @objects[x.to_int] ||= _new(x)
irb(main):017:2> end
irb(main):018:1> end
=> nil

Create the Array which is the pool:

irb(main):019:0> class George
irb(main):020:1> @objects=[]
irb(main):021:1> end
=> []

Now let’s see:

irb(main):022:0> 3.times.map { George.new(1).object_id }
=> [-1072359978, -1072359978, -1072359978]
irb(main):023:0> 3.times.map { George.new(1).object_id }.uniq
=> [-1072359978]

Ah!

Note: all sorts of issues have to be considered which might be
introduced by this, e.g. memory leaks, concurrency issues etc. This
approach works best for immutable objects of course. Even then the
pool needs to be properly synchronized.
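A rough sketch of what a synchronized variant might look like, assuming a plain Mutex around the lookup is good enough. PooledGeorge is a made-up name, and nothing is ever evicted here, so the memory concern still applies.

```ruby
# Hypothetical class; a Mutex guards the pool, and instances are frozen
# so that sharing them between callers is safe.
class PooledGeorge
  attr_reader :num

  @pool = {}
  @lock = Mutex.new

  class << self
    alias_method :_new, :new   # keep the original Class#new reachable

    def new(x)
      key = x.to_int
      # Only one thread at a time may check or fill the pool.
      @lock.synchronize { @pool[key] ||= _new(key).freeze }
    end
  end

  def initialize(x)
    @num = x
  end
end

ids = 4.times.map { Thread.new { PooledGeorge.new(1).object_id } }.map(&:value)
puts ids.uniq.size                                       # 1
puts PooledGeorge.new(1).equal?(PooledGeorge.new(2))     # false
```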

Kind regards

robert

pfharlock · September 27, 2012, 7:57pm

2012/9/27 Gary W.:

Sorry, but don’t count me in group that is coding up Rails.

Oh no, of course I’m not

Not
saying that you aren’t right, but I think there are a number of people
that post to the core list that aren’t on the core team.

I was only referring to the discussion under that pull request.

– Matma R.