Should Array.new(n, obj) be deprecated?

jasonmorrison · August 29, 2006, 3:01am

It seems well known that the Array constructor form

Array.new(n, obj)

causes a common gotcha for programmers new to ruby (see * below for
details). My question: is there ever any reason to use this form in
favor of

Array.new(n) {block}

One time I can imagine someone using the first form is if they wanted
to fill an array with some default value, such as 0, but generally,
default or null values are singletons, or have immediate value. There
is only one nil, and only one 0, and these tend to be immutable. I
can’t do anything to change an instance of 0 or nil that would have
surprising side effects. This also means that I could just as easily
write

Array.new(n) {0}

-or-

Array.new(n) { NullObject.instance }
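A minimal sketch of why the shared-reference form is harmless for immutable defaults such as 0. The names `counts` and `zeros` are only illustrative.

```ruby
# Value form with an immutable default: all three slots refer to the same 0.
counts = Array.new(3, 0)

# += does not mutate the Integer; it stores a new object in slot 0 only.
counts[0] += 1

# Block form gives an equal starting array for immutable defaults.
zeros = Array.new(3) { 0 }

p counts  # [1, 0, 0]
p zeros   # [0, 0, 0]
```

Because an Integer cannot be changed in place, there is nothing to do to one slot that could leak into the others.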

The only way I can see preferring the first method is if you want to
be able to do something to one element of an array that affects the
other elements of the array. Is there ever a good reason to do this?
If, for some reason I can’t imagine, you wanted to get the gotcha
behavior out of the second form–maybe you want an array to initially
be filled with the string ‘a’, but you want the flexibility to easily
decide later that the default really should have been ‘b’–you can get
this behavior from the second form without too much difficulty using a
local variable. Concretely, I might use the first form to do

arr = Array.new(6, 'a')
arr[0].gsub!(/a/, 'b')

so my array went from being filled with 6 references to the same
instance of ‘a’ to being filled with 6 references to the same instance
of ‘b’. I could duplicate this strange behavior using the second form
by writing

default = 'a'
arr = Array.new(6) { default }
arr[0].gsub!(/a/, 'b')

Now that’s a little bit harder than with the first form, but that’s
the point. It seems like a weird and dangerous thing to do, so it
should be hard, and you should have to be explicit. In light of the
above, maybe Array.new(n, obj) should be deprecated, since it is both
confusing, and can be replicated using the block form with only slight
gymnastics.

When I first started experimenting with ruby about one year ago, the
first thing I did was try to port a version of Conway’s Game of Life
into ruby, which I had written in Java for a comp. sci. class. I
wanted a grid of Cells, and tried to implement it as an array of rows,
each containing an array of Cells. This gotcha got me–in a big way.
“Ruby looks so simple on paper,” I thought, “but if I get tripped up
trying to write the game of life–what’s the point.” I didn’t bother
reading about ruby any more for a few months. Then I needed to do
some heavy regex lifting, and thankfully decided to give Ruby another
try. I’ve since fallen in love with the language, and figured out
what I was doing wrong, but I might not have been so lucky.

Cheers,

Jason M.

*Common Array Gotcha:

If I want to make an array of 5 hashes, or of containers in general,
if I only skimmed the documentation, I might try to do

menus = Array.new(6, Hash.new)

Of course, the array is filled with a single instance of Hash, so if I
do

menus[0]['dessert'] = 'cake'
menus[1]['dessert'] = 'ice cream'

puts menus[0]['dessert']
# prints: ice cream

This is because menus[0] and menus[1] actually refer to the same
object, and it’s being overwritten in the second statement.

What I meant to do was

menus = Array.new(6) { Hash.new }

which executes the block once for each slot in the array, thus
producing 6 separate instances of Hash.
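A small sketch comparing the two forms side by side, using `equal?` to check object identity. The variable names `shared` and `separate` are made up for illustration.

```ruby
# Value form: every slot holds a reference to the one Hash passed in.
shared = Array.new(6, Hash.new)
shared[0]['dessert'] = 'cake'
shared[1]['dessert'] = 'ice cream'

# Block form: the block runs once per slot, so each slot gets its own Hash.
separate = Array.new(6) { Hash.new }
separate[0]['dessert'] = 'cake'
separate[1]['dessert'] = 'ice cream'

puts shared[0].equal?(shared[1])      # true: same object
puts shared[0]['dessert']             # ice cream
puts separate[0].equal?(separate[1])  # false: distinct objects
puts separate[0]['dessert']           # cake
```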

jasonmorrison · August 29, 2006, 3:47am

Hi –

On Tue, 29 Aug 2006, Jason M. wrote:

menus[1]['dessert'] = 'ice cream'
[...]
each slot in the array, thus producing 6 new instances of hash.

I think everyone does this once or twice. It’s one of the nuby rites
of passage.

I would put it in the category of “things that you might not think of
spontaneously, but that make sense in relation to how the rest of Ruby
works, once you see what they do.” Those things should not, I think,
be removed from the language; it’s OK for Ruby to have some learning
curve, and to be optimized for people who are planning to spend some
time with it.

The things I don’t like are the ones in the category of “things that
work a certain way without any evident basis in how the rest of Ruby
works or in common-sense semantics.” It’s a small category (since
Ruby is well-designed), but it includes, for example,
instance_methods(true/false).

David

jasonmorrison · August 29, 2006, 4:15am

“Jason M.” wrote:

It seems well known that the Array constructor form

Array.new(n, obj)

causes a common gotcha for programmers new to ruby (see * below for
details). My question: is there ever any reason to use this form in
favor of

Array.new(n) {block}

From what I’ve been told, there are performance trade-offs. The former
is faster than the latter, so there are times when it’s preferable if
it does the job. Otherwise, no, there’s no other reason that I can
tell… Does that mean we should deprecate it? I think the pros outweigh
the cons, so I don’t think so…

jasonmorrison · August 29, 2006, 4:25am

it’s OK for Ruby to have some learning
curve, and to be optimized for people who are planning to spend some
time with it.

I agree with this statement completely, and I think you’re right that
many of the things new users complain about tend to be the way they
are because it makes life better for experienced users; however, in
this particular case, I don’t think the form of the constructor that I
was complaining about is optimized for experienced users. I’d be
interested to hear about any time someone wants to fill an array with
references to a single instance of a mutable object–I have a feeling
that there aren’t many.

And to correct a minor typo, in my explanation of the common gotcha,
when I said

If I want to make an array of 5 hashes, …

I should have said

If I want to make an array of 6 hashes, …

in order to be consistent with the rest of the example.

jasonmorrison · August 29, 2006, 5:23am

From what I’ve been told, there are performance trade-offs.

Point taken:

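require 'benchmark'

# Compare filling a large array via the value form and the block form.
Benchmark.bm(12) do |test|
  test.report('firstway:') do
    a1 = Array.new(10000000, 0)      # one shared object, no block calls
  end
  test.report('secondway:') do
    a2 = Array.new(10000000) { 0 }   # block invoked once per slot
  end
end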

=>
                   user     system      total        real
firstway:      0.110000   0.020000   0.130000 (  0.130000)
secondway:     6.199000   0.030000   6.229000 (  6.289000)

So there probably is a reason to have a hook to the fast way (that is,
the first way). I wish it were harder to do it the fast way and
easier to do it the slow way with more obvious behavior, but I
certainly wouldn’t want to change the block syntax, since it is nicely
consistent with the language, and I don’t know how I would change the
syntax of the fast and confusing way.

Anyways, thanks for helping me understand this design decision.
Status quo it is. I guess it’s pretty likely that I’ll make the
newbie mistake again myself, and in fairness, the behavior of the
constructors is pretty clearly documented.

What wrong with

[obj] * n

I didn’t know that was possible, but it produces the same strange
behavior that the first constructor form produces. Namely:

arr = ['a'] * 6
arr[0].gsub!(/a/, 'b')

Now arr contains six references to the same instance of ‘b’.
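A quick sketch confirming this, with the block form shown for contrast. The `dup` calls are an addition here so the strings are mutable even where string literals are frozen; `fresh` is an illustrative name.

```ruby
# [obj] * n repeats the same reference, just like Array.new(n, obj).
arr = ['a'.dup] * 6
arr[0].gsub!(/a/, 'b')
p arr                              # ["b", "b", "b", "b", "b", "b"]
p arr.map(&:object_id).uniq.size   # 1

# The block form builds a separate string per slot.
fresh = Array.new(6) { 'a'.dup }
fresh[0].gsub!(/a/, 'b')
p fresh                            # ["b", "a", "a", "a", "a", "a"]
```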

Cheers,

Jason

jasonmorrison · August 29, 2006, 5:27am

Status quo it is. I guess it’s pretty likely that I’ll make the
newbie mistake again myself, and in fairness, the behavior of the
constructors is pretty clearly documented.

Pretty unlikely that I’ll make the newbie mistake again, I mean.

jasonmorrison · August 29, 2006, 4:27am

Just Another Victim of the Ambient M. wrote:

Array.new(n) {block}
From what I’ve been told, there are performance trade-offs. The former
is faster than the latter, so there are times when it’s preferable if
it does the job. Otherwise, no, there’s no other reason that I can tell…
Does that mean we should deprecate it? I think the pros outweigh the
cons, so I don’t think so…

What wrong with

[obj] * n

T.

jasonmorrison · August 29, 2006, 6:47am

E. Mark Ping wrote:

Trans wrote:

What wrong with

[obj] * n

It has the same behavior as in the first post. Try it with a hash.

That’s my point. It does the same thing in a much more commonly used
idiom, because it is symbolically recognizable. The constructor, on the
other hand, has no inherent sense.

T.

jasonmorrison · August 29, 2006, 6:51am

Trans wrote:

idiom, because it is symbolically recognizable. The constructor, on the
other hand, has no inherent sense.

Let me complete that thought. What does make sense of course is:

Array.new( obj0, obj1, obj2, … )

T.

jasonmorrison · September 1, 2006, 7:23pm

Jason M. wrote:

It seems well known that the Array constructor form

Array.new(n, obj)

causes a common gotcha for programmers new to ruby (see * below for
details). My question: is there ever any reason to use this form in
favor of

Array.new(n) {block}

If them silly functional languages are to be trusted, constants, and
pure functions of no arguments are interchangeable.

There’s no language-wise reason to use a constant instead of a lambda
that always returns that constant. Except that it’s adding epicycles -
the former is easier to read in most cases.

And oh yes. Overgeneralising horribly, blocks are slow(er than the
alternative in any case).

David V.

jasonmorrison · August 29, 2006, 5:27am

Trans wrote:

What wrong with

[obj] * n

It has the same behavior as in the first post. Try it with a hash.