Forum: Ruby-core [ruby-trunk - Feature #8478][Open] The hash returned by Enumerable#group_by should have an empty arr

0d8600ea9d02baf8c84503b9410598ac?d=identicon&s=25 Pete Higgins (phiggy)
on 2013-06-02 21:38
(Received via mailing list)
Issue #8478 has been reported by phiggins (Pete Higgins).

----------------------------------------
Feature #8478: The hash returned by Enumerable#group_by should have an
empty array for its default value
https://bugs.ruby-lang.org/issues/8478

Author: phiggins (Pete Higgins)
Status: Open
Priority: Normal
Assignee:
Category:
Target version:


Without this patch, nil checks might need to be done on the return value
of Enumerable#group_by:

$ cat test_group_by.rb
a = [1, 2, 3, "a", "b"]
g = a.group_by {|o| o.class }

puts "Fixnums: #{g[Fixnum].size}"
puts "Strings: #{g[String].size}"
puts "Arrays: #{g[Array].size}"

$ ruby test_group_by.rb
Fixnums: 3
Strings: 2
test_group_by.rb:6:in `<main>': undefined method `size' for nil:NilClass
(NoMethodError)

This patch adds a default value of an empty array to the hash returned
by Enumerable#group_by, so the script above will work:

$ ./ruby -I.:lib test_group_by.rb
Fixnums: 3
Strings: 2
Arrays: 0
0e610136db92027148906c92d57fdb36?d=identicon&s=25 marcandre (Marc-Andre Lafortune) (Guest)
on 2013-06-02 22:09
(Received via mailing list)
Issue #8478 has been updated by marcandre (Marc-Andre Lafortune).


I understand the idea, but there are problems with this.

First of, it's a really bad idea to set the default of a hash to a
mutable object:

    a = [1, 2, 3, "a", "b"]
    g = a.group_by {|o| o.class }
    g[Array] << [:foo]
    g[Hash] # => [[:foo]], you expected []

Assuming now that your request was instead to change the default proc to
{|hash, key| hash[key] = []}, which would not cause the problem above,
you still have the problem of incompatibility.
----------------------------------------
Feature #8478: The hash returned by Enumerable#group_by should have an
empty array for its default value
https://bugs.ruby-lang.org/issues/8478#change-39655

Author: phiggins (Pete Higgins)
Status: Open
Priority: Normal
Assignee:
Category:
Target version:


Without this patch, nil checks might need to be done on the return value
of Enumerable#group_by:

$ cat test_group_by.rb
a = [1, 2, 3, "a", "b"]
g = a.group_by {|o| o.class }

puts "Fixnums: #{g[Fixnum].size}"
puts "Strings: #{g[String].size}"
puts "Arrays: #{g[Array].size}"

$ ruby test_group_by.rb
Fixnums: 3
Strings: 2
test_group_by.rb:6:in `<main>': undefined method `size' for nil:NilClass
(NoMethodError)

This patch adds a default value of an empty array to the hash returned
by Enumerable#group_by, so the script above will work:

$ ./ruby -I.:lib test_group_by.rb
Fixnums: 3
Strings: 2
Arrays: 0
Df9d48b6e2ae6cdedf4c39c2e58df851?d=identicon&s=25 boris_stitnicky (Boris Stitnicky) (Guest)
on 2013-06-03 05:19
(Received via mailing list)
Issue #8478 has been updated by boris_stitnicky (Boris Stitnicky).


@marcandre: What do you mean by incompatibility? That the behavior is
not the same as before, or something deeper?
----------------------------------------
Feature #8478: The hash returned by Enumerable#group_by should have an
empty array for its default value
https://bugs.ruby-lang.org/issues/8478#change-39660

Author: phiggins (Pete Higgins)
Status: Open
Priority: Normal
Assignee:
Category:
Target version:


Without this patch, nil checks might need to be done on the return value
of Enumerable#group_by:

$ cat test_group_by.rb
a = [1, 2, 3, "a", "b"]
g = a.group_by {|o| o.class }

puts "Fixnums: #{g[Fixnum].size}"
puts "Strings: #{g[String].size}"
puts "Arrays: #{g[Array].size}"

$ ruby test_group_by.rb
Fixnums: 3
Strings: 2
test_group_by.rb:6:in `<main>': undefined method `size' for nil:NilClass
(NoMethodError)

This patch adds a default value of an empty array to the hash returned
by Enumerable#group_by, so the script above will work:

$ ./ruby -I.:lib test_group_by.rb
Fixnums: 3
Strings: 2
Arrays: 0
0e610136db92027148906c92d57fdb36?d=identicon&s=25 marcandre (Marc-Andre Lafortune) (Guest)
on 2013-06-03 06:41
(Received via mailing list)
Issue #8478 has been updated by marcandre (Marc-Andre Lafortune).

Assignee set to matz (Yukihiro Matsumoto)

Right, that the behavior is changing and that this could break existing
code. It might have been an interesting idea at the time `group_by` was
introduced, but I doubt that Matz will feel the change is compelling
enough to break compatibility today.
----------------------------------------
Feature #8478: The hash returned by Enumerable#group_by should have an
empty array for its default value
https://bugs.ruby-lang.org/issues/8478#change-39661

Author: phiggins (Pete Higgins)
Status: Open
Priority: Normal
Assignee: matz (Yukihiro Matsumoto)
Category:
Target version:


Without this patch, nil checks might need to be done on the return value
of Enumerable#group_by:

$ cat test_group_by.rb
a = [1, 2, 3, "a", "b"]
g = a.group_by {|o| o.class }

puts "Fixnums: #{g[Fixnum].size}"
puts "Strings: #{g[String].size}"
puts "Arrays: #{g[Array].size}"

$ ruby test_group_by.rb
Fixnums: 3
Strings: 2
test_group_by.rb:6:in `<main>': undefined method `size' for nil:NilClass
(NoMethodError)

This patch adds a default value of an empty array to the hash returned
by Enumerable#group_by, so the script above will work:

$ ./ruby -I.:lib test_group_by.rb
Fixnums: 3
Strings: 2
Arrays: 0
F52e87b92cafb1e8c6d155076b56ecff?d=identicon&s=25 "duerst (Martin Dürst)" <duerst@it.aoyama.ac.jp> (Guest)
on 2013-06-03 08:57
(Received via mailing list)
Issue #8478 has been updated by duerst (Martin Dürst).


I don't think I agree with the proposer. The example looks good, but
what about something like:

a = [1, 2, 3, "a", "b"]
g = a.group_by {|o| o.class }

puts "Fixnums: #{g[Fixnum].size}"
puts "Strings: #{g[String].size}"
puts "Zeroes: #{g[0].size}"

In other words, it might be desirable to return an empty array for a
group key that is part of the collection that we group over (classes in
the example), but it'd be better to cause an error for group keys that
are outside of the collection (i.e., not classes in the example).

That would mean that the default proc should (for this example) be
something like
    {|hash, key| hash[key] = [] if key.class==Class }
Of course, that cannot be part of Ruby, unless maybe as a third argument
of some form to group_by.


----------------------------------------
Feature #8478: The hash returned by Enumerable#group_by should have an
empty array for its default value
https://bugs.ruby-lang.org/issues/8478#change-39663

Author: phiggins (Pete Higgins)
Status: Open
Priority: Normal
Assignee: matz (Yukihiro Matsumoto)
Category:
Target version:


Without this patch, nil checks might need to be done on the return value
of Enumerable#group_by:

$ cat test_group_by.rb
a = [1, 2, 3, "a", "b"]
g = a.group_by {|o| o.class }

puts "Fixnums: #{g[Fixnum].size}"
puts "Strings: #{g[String].size}"
puts "Arrays: #{g[Array].size}"

$ ruby test_group_by.rb
Fixnums: 3
Strings: 2
test_group_by.rb:6:in `<main>': undefined method `size' for nil:NilClass
(NoMethodError)

This patch adds a default value of an empty array to the hash returned
by Enumerable#group_by, so the script above will work:

$ ./ruby -I.:lib test_group_by.rb
Fixnums: 3
Strings: 2
Arrays: 0
419b54e4b0c805f8ed671451ea536e19?d=identicon&s=25 Yura Sokolov (funny_falcon)
on 2013-06-03 11:30
(Received via mailing list)
Issue #8478 has been updated by funny_falcon (Yura Sokolov).


There are always different ways to work with hash:

    counters = Hash.new{|h,k| 0}
    list.each{|e| counters[e]+=1}
    some_thing_count = counters[some_thing]

compared to

    counters = {}
    list.each{|e| counters[e] = (counters[e]||0) + 1}
    raise "THERE IS NO SOME-THING" unless counters[some_thing]

So that, group_by should not assume user's needs

So, one could assign default proc if he really wants:

    a = [1, 2, 3, "a", "b"]
    g = a.group_by {|o| o.class }
    g.default_proc = proc{|hash, key| hash[key] = [] if key.class==Class
}

    puts "Fixnums: #{g[Fixnum].size}"
    puts "Strings: #{g[String].size}"
    puts "Arrays: #{g[Array].size}"
    if g[0]
      puts "Zeroes: #{g[0].size}"
    else
      raise "NO ZEROES!!!!"
    end
----------------------------------------
Feature #8478: The hash returned by Enumerable#group_by should have an
empty array for its default value
https://bugs.ruby-lang.org/issues/8478#change-39668

Author: phiggins (Pete Higgins)
Status: Open
Priority: Normal
Assignee: matz (Yukihiro Matsumoto)
Category:
Target version:


Without this patch, nil checks might need to be done on the return value
of Enumerable#group_by:

$ cat test_group_by.rb
a = [1, 2, 3, "a", "b"]
g = a.group_by {|o| o.class }

puts "Fixnums: #{g[Fixnum].size}"
puts "Strings: #{g[String].size}"
puts "Arrays: #{g[Array].size}"

$ ruby test_group_by.rb
Fixnums: 3
Strings: 2
test_group_by.rb:6:in `<main>': undefined method `size' for nil:NilClass
(NoMethodError)

This patch adds a default value of an empty array to the hash returned
by Enumerable#group_by, so the script above will work:

$ ./ruby -I.:lib test_group_by.rb
Fixnums: 3
Strings: 2
Arrays: 0
0d8600ea9d02baf8c84503b9410598ac?d=identicon&s=25 Pete Higgins (phiggy)
on 2013-06-04 08:51
(Received via mailing list)
Issue #8478 has been updated by phiggins (Pete Higgins).


marcandre (Marc-Andre Lafortune) wrote:

> First of, it's a really bad idea to set the default of a hash to a mutable
object:

Yes, I didn't consider the problem of returning the same array for every
key, and that would have to be changed if this were to be seriously
considered.

> Assuming now that your request was instead to change the default proc to {|hash,
key| hash[key] = []}, which would not cause the problem above, you still have 
the
problem of incompatibility.

It seems like some appropriate milestone like 2.1 might be an acceptable
place to make such an incompatible change.
----------------------------------------
Feature #8478: The hash returned by Enumerable#group_by should have an
empty array for its default value
https://bugs.ruby-lang.org/issues/8478#change-39687

Author: phiggins (Pete Higgins)
Status: Open
Priority: Normal
Assignee: matz (Yukihiro Matsumoto)
Category:
Target version:


Without this patch, nil checks might need to be done on the return value
of Enumerable#group_by:

$ cat test_group_by.rb
a = [1, 2, 3, "a", "b"]
g = a.group_by {|o| o.class }

puts "Fixnums: #{g[Fixnum].size}"
puts "Strings: #{g[String].size}"
puts "Arrays: #{g[Array].size}"

$ ruby test_group_by.rb
Fixnums: 3
Strings: 2
test_group_by.rb:6:in `<main>': undefined method `size' for nil:NilClass
(NoMethodError)

This patch adds a default value of an empty array to the hash returned
by Enumerable#group_by, so the script above will work:

$ ./ruby -I.:lib test_group_by.rb
Fixnums: 3
Strings: 2
Arrays: 0
0d8600ea9d02baf8c84503b9410598ac?d=identicon&s=25 Pete Higgins (phiggy)
on 2013-06-04 08:56
(Received via mailing list)
Issue #8478 has been updated by phiggins (Pete Higgins).


duerst (Martin Dürst) wrote:
>
> That would mean that the default proc should (for this example) be something
like
>     {|hash, key| hash[key] = [] if key.class==Class }
> Of course, that cannot be part of Ruby, unless maybe as a third argument of some
form to group_by.

I certainly didn't intend for the default value to have some idea of the
type of comparison done by the call to group_by, that seems to go well
outside the scope of group_by and just general ruby semantics. It seems
counter intuitive to me that while the point of the method is to return
a hash of arrays, the return value would produce either an array with
some values in it or nil. It returning a hash that produced arrays with
some values in them or an empty arrays means you're only dealing with
one type of thing, an array.
----------------------------------------
Feature #8478: The hash returned by Enumerable#group_by should have an
empty array for its default value
https://bugs.ruby-lang.org/issues/8478#change-39688

Author: phiggins (Pete Higgins)
Status: Open
Priority: Normal
Assignee: matz (Yukihiro Matsumoto)
Category:
Target version:


Without this patch, nil checks might need to be done on the return value
of Enumerable#group_by:

$ cat test_group_by.rb
a = [1, 2, 3, "a", "b"]
g = a.group_by {|o| o.class }

puts "Fixnums: #{g[Fixnum].size}"
puts "Strings: #{g[String].size}"
puts "Arrays: #{g[Array].size}"

$ ruby test_group_by.rb
Fixnums: 3
Strings: 2
test_group_by.rb:6:in `<main>': undefined method `size' for nil:NilClass
(NoMethodError)

This patch adds a default value of an empty array to the hash returned
by Enumerable#group_by, so the script above will work:

$ ./ruby -I.:lib test_group_by.rb
Fixnums: 3
Strings: 2
Arrays: 0
906b670a0917ec10217b5c6c926ad5c4?d=identicon&s=25 Michael Westbom (totallymike)
on 2013-09-27 00:19
I disagree with the modification.  An empty array is truthy, and that
can cause surprises.

Surprises are bad.

If you absolutely need an empty array to come of non-existent keys, use
g.fetch(Array, [])
This topic is locked and can not be replied to.