# Enumerable#categorize

This is Endoh.

On 2010-11-27 at 18:45, Tanaka A. [email protected] wrote:

> How about adding Enumerable#categorize as a method that
> generates a hash from an enumerable?

I have the same impression as everyone else: it still crams too much into
one method. Of "element extraction", "classification and aggregation", and
"post-processing", wouldn't it be better to narrow it down to the second?

[[1, 2], [1, 3], [2, 3]].aggregate #=> {1 => [2, 3], 2 => [3]}
[[1, 2], [1, 3], [2, 3]].aggregate(:op=>:+) #=> {1 => 5, 2 => 3 }
[[1, 2, 3]].aggregate #=> {1 => {2 => [3]} }

# Other candidate names might be to_hash or hashtree.

The corner cases would be something like this:

[[1, 2], [1, 2, 3]].aggregate #=> exception
[].aggregate #=> {}
[[]].aggregate #=> exception
[[1]].aggregate #=> exception
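
To make the proposed behaviour concrete, here is a minimal sketch in plain Ruby. `aggregate` is only a name proposed in this message, not an existing Ruby method; the standalone function form, the `op:` keyword, and the rule that all rows must have the same length are assumptions made for illustration.

```ruby
# Hypothetical sketch of the proposed `aggregate`; not a real Ruby method.
# Each row is [*keys, value]; rows of differing length are rejected (an assumption).
def aggregate(rows, op: nil)
  result = {}
  width = nil
  rows.each do |row|
    raise ArgumentError, "array too short" if row.size < 2
    width ||= row.size
    raise ArgumentError, "rows differ in length" unless row.size == width

    *keys, last_key, value = row
    # Walk (and create) one nested hash per leading key.
    node = keys.inject(result) { |h, k| h[k] ||= {} }
    if op.nil?
      (node[last_key] ||= []) << value          # default: collect values in an array
    elsif node.key?(last_key)
      node[last_key] = node[last_key].public_send(op, value)
    else
      node[last_key] = value                    # first value seeds the fold
    end
  end
  result
end

pairs  = aggregate([[1, 2], [1, 3], [2, 3]])          #=> {1 => [2, 3], 2 => [3]}
sums   = aggregate([[1, 2], [1, 3], [2, 3]], op: :+)  #=> {1 => 5, 2 => 3}
nested = aggregate([[1, 2, 3]])                       #=> {1 => {2 => [3]}}
```

With these rules the corner cases listed above fall out directly: an empty input gives `{}`, while `[[]]`, `[[1]]`, and rows of mixed length raise.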

This gets somewhat longer than categorize, but I think it becomes much
easier to see what is going on.
Below, I have rewritten akr's examples.

ary = [["matz", "Yukihiro M."],
       ["akr", "Tanaka A."],
       ["usa", "Usaku NAKAMURA"],
       ["naruse", "NARUSE, Yui"],

# h = ary.categorize(1, 0)

h = ary.map {|k, v| [v, k] }.aggregate

#=> {"Yukihiro M."=>["matz"],

# ary.categorize(lambda {|elt| elt[1] }, lambda {|elt| elt[0] })

h = ary.map {|elt| [elt[1], elt[0]] }.aggregate

#=> {"Yukihiro M."=>["matz"],

# h = ary.categorize(lambda {|e| e[0][0] }, lambda {|e| e[0][1]}, 0)

h = ary.map {|e| [e[0][0], e[0][1], e[0]] }.aggregate

#=> {"m"=>{"a"=>["matz"]},

# h = ary.categorize(lambda {|e| e[0][0] }, 1) {|ks, vs| vs.sort }

h = ary.map {|elt| [elt[0][0], elt[1]] }.aggregate
h.each {|ks, vs| h[ks] = vs.sort } # or vs.sort!

#=> {"m"=>["Yukihiro M."],

# }

h = ary.map {|k, v| [v, k] }.aggregate
h.each {|ks, vs|
  raise "duplicate keys: #{ks.inspect}" if vs.length != 1
  h[ks] = vs[0]
}

#=> {"Yukihiro M."=>"matz",

# h = ary.categorize(lambda {|e| e[0][0] }, lambda {|e| 1 }, :op=>:+)

h = ary.map {|e| [e[0][0], 1] }.aggregate(:op=>:+)

#=> {"m"=>1, "n"=>2, "a"=>1, "u"=>1, "k"=>1}

# pp committers.categorize("account", ["name", "nick"]) {|ks, vs| vs[0] }

h = committers.map {|e| [e["account"], [e["name"], e["nick"]]] }.aggregate
h.each {|ks, vs| h[ks] = vs[0] }

You may object that this creates an intermediate array, but that is not a
problem specific to categorize, so I think it should be solved by providing
a version of map that returns an Enumerator.
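
As an illustration of that point, later Ruby versions (2.0 and up) provide `Enumerable#lazy`, which postdates this thread. The sketch below uses it to swap the pairs without materializing a second array. The sample data is a shortened version of `ary` from above, and the grouping step is written out by hand since `aggregate` is only a proposal.

```ruby
ary = [["matz", "Yukihiro M."], ["akr", "Tanaka A."]]

# lazy.map hands over pairs one at a time instead of building a second array.
swapped = ary.lazy.map { |account, name| [name, account] }

by_name = swapped.each_with_object({}) do |(name, account), h|
  (h[name] ||= []) << account
end
#=> {"Yukihiro M."=>["matz"], "Tanaka A."=>["akr"]}
```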

# Blowing my own horn here, but see http://d.hatena.ne.jp/ku-ma-me/20091111/p2

Also, a map that takes a Hash and returns a Hash might be handy.
I believe this is requested every so often. Why was it never provided, again?

As for the similarity to group_by, it does not bother me much personally.
If anything, I think the problem is that group_by is just too awkward to use.

One more thing: I think :seed should accept a Proc that generates the value,
not the value itself. Otherwise it ends up with the same design flaw as
Hash.new([]).

p [[1, 2], [1, 3], [2, 5]].categorize(0, 1, seed: [], op: proc {|x, e| x << e })
#=> {1=>[2, 3, 5], 2=>[2, 3, 5]}
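
The sharing problem can be reproduced in plain Ruby without the patch. The sketch below is only an illustration: `fold_by_key` is a hypothetical helper, not part of the proposal, and it takes the seed as a callable so that both behaviours can be compared side by side.

```ruby
# The Hash.new([]) pitfall: every missing key returns the same array object.
h = Hash.new([])
h[:a] << 1          # mutates the shared default; h itself stays empty
leaked = h[:b]      #=> [1]

# Hypothetical helper: fold values per key, asking seed_for for the first accumulator.
def fold_by_key(pairs, seed_for)
  pairs.each_with_object({}) do |(k, v), acc|
    acc[k] = (acc.key?(k) ? acc[k] : seed_for.call) << v
  end
end

pairs = [[1, 2], [1, 3], [2, 5]]

shared_seed = []
shared = fold_by_key(pairs, -> { shared_seed })  # one array shared by all keys
#=> {1=>[2, 3, 5], 2=>[2, 3, 5]}

fresh = fold_by_key(pairs, -> { [] })            # a new array per key
#=> {1=>[2, 3], 2=>[5]}
```

The `shared` result matches the output shown above, because both keys end up pointing at the single seed array.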

This is Endoh.

On 2010-12-06 at 22:58, Tanaka A. [email protected] wrote:

> On 2010-12-06 at 20:22, Yusuke ENDOH [email protected] wrote:
>
>> I have the same impression as everyone else: it still crams too much
>> into one method. Of "element extraction", "classification and
>> aggregation", and "post-processing", wouldn't it be better to narrow
>> it down to the second?
>
> For now, the specification has changed following Matsumoto-san's idea,
> so I would appreciate opinions on [ruby-dev:42659].
> "Post-processing" is gone there.

Hmm, sorry. I had not been following the discussion properly.

> Also, I could not tell what "element extraction" refers to.

It is the part that builds the [*keys, value] array (such as
[elt[1], elt[0]]).
I am starting to feel that this may not be a problem with the new API.

It looks rather like a method for building a trie.
I tried the patch, but it crashed with SEGV on the following:

p [[1, 2], [1, 2, 3]].categorize {|e| e }

It is hard to say what this should return.
Personally, I think raising an exception when the lengths differ would be
the easiest to understand.

On 2010-12-07 at 0:56, Yusuke ENDOH [email protected] wrote:

> I tried the patch, but it crashed with SEGV on the following:
>
> p [[1, 2], [1, 2, 3]].categorize {|e| e }
>
> It is hard to say what this should return.
> Personally, I think raising an exception when the lengths differ would
> be the easiest to understand.

Oops, I had forgotten to handle that case.

Index: enum.c
===================================================================
--- enum.c (revision 30062)
+++ enum.c (working copy)
@@ -15,7 +15,7 @@
 #include "id.h"
 
-static ID id_next;
+static ID id_next, id_call, id_seed, id_op, id_update;
 #define id_each idEach
 #define id_eqq idEqq
 #define id_cmp idCmp
@@ -2595,6 +2595,162 @@ enum_slice_before(int argc, VALUE *argv,
     return enumerator;
 }

+struct categorize_arg {
+    VALUE seed;
+    VALUE op;
+    VALUE update;
+    VALUE result;
+};
+
+static VALUE
+categorize_update(struct categorize_arg *argp, VALUE ary, VALUE acc, VALUE val)
+{
+    if (argp->op != Qundef) {
+        if (SYMBOL_P(argp->op))
+            return rb_funcall(acc, SYM2ID(argp->op), 1, val);
+        else
+            return rb_funcall(argp->op, id_call, 2, acc, val);
+    }
+    else if (argp->update != Qundef) {
+        if (SYMBOL_P(argp->update))
+            return rb_funcall(acc, SYM2ID(argp->update), 1, ary);
+        else
+            return rb_funcall(argp->update, id_call, 2, acc, ary);
+    }
+    else {
+        if (NIL_P(acc))
+            return rb_ary_new3(1, val);
+        else {
+            Check_Type(acc, T_ARRAY);
+            rb_ary_push(acc, val);
+            return acc;
+        }
+    }
+}
+
+static VALUE
+categorize_i(VALUE i, VALUE _arg, int argc, VALUE *argv)
+{
+    struct categorize_arg *argp;
+    VALUE ary, h;
+    VALUE lastk, val, acc;
+    long j;
+    ENUM_WANT_SVALUE();
+    argp = (struct categorize_arg *)_arg;
+    ary = rb_yield(i);
+    ary = rb_convert_type(ary, T_ARRAY, "Array", "to_ary");
+    if (RARRAY_LEN(ary) < 2) {
+        rb_raise(rb_eArgError, "array too short");
+    }
+    lastk = RARRAY_PTR(ary)[RARRAY_LEN(ary)-2];
+    val = RARRAY_PTR(ary)[RARRAY_LEN(ary)-1];
+    h = argp->result;
+    for (j = 0; j < RARRAY_LEN(ary) - 2; j++) {
+        VALUE k = RARRAY_PTR(ary)[j];
+        VALUE h2;
+        h2 = rb_hash_lookup2(h, k, Qundef);
+        if (h2 == Qundef) {
+            h2 = rb_hash_new();
+            rb_hash_aset(h, k, h2);
+        }
+        else {
+            Check_Type(h2, T_HASH);
+        }
+        h = h2;
+    }
+    acc = rb_hash_lookup2(h, lastk, Qundef);
+    if (acc == Qundef) {
+        if (argp->seed == Qundef)
+            acc = val;
+        else
+            acc = categorize_update(argp, ary, argp->seed, val);
+    }
+    else {
+        acc = categorize_update(argp, ary, acc, val);
+    }
+    rb_hash_aset(h, lastk, acc);
+    return Qnil;
+}

+/*
+ *  call-seq:
+ *    enum.categorize([opts]) {|elt| [key1, ..., val] } -> hash
+ *
+ *  categorizes the elements in enum and returns a hash.
+ *
+ *  The block is called for each elements in enum.
+ *  The block should return an array which contains
+ *  one or more keys and one value.
+ *  The keys and value are used to construct the result hash.
+ *  If two or more keys are provided
+ *  (i.e. the length of the array is longer than 2),
+ *  the result hash will be nested.
+ *
+ *  The value of innermost hash is an array which contains values for
+ *  corresponding keys.
+ *  (This behavior can be customized by :seed, :op and :update option.)
+ *
+ *    a = [{:fruit => "banana", :color => "yellow", :taste => "sweet"},
+ *         {:fruit => "melon", :color => "green", :taste => "sweet"},
+ *         {:fruit => "grapefruit", :color => "yellow", :taste => "tart"}]
+ *    p a.categorize {|h| h.values_at(:color, :fruit) }
+ *    #=> {"yellow"=>["banana", "grapefruit"], "green"=>["melon"]}
+ *
+ *    pp a.categorize {|h| h.values_at(:taste, :color, :fruit) }
+ *    #=> {"sweet"=>{"yellow"=>["banana"], "green"=>["melon"]},
+ *    #    "tart"=>{"yellow"=>["grapefruit"]}}
+ *
+ *  This method can take an option hash.
+ *  Available options are follows:
+ *
+ *  :seed specifies seed value.
+ *  :op specifies a procedure from seed and value to next seed.
+ *  :update specifies a procedure from seed and block value to next seed.
+ *
+ *  The default behavior, array construction, can be implemented as follows.
+ *    :seed => nil
+ *    :op => lambda {|s, v| !s ? [v] : (s << v) }
+ */
+static VALUE
+enum_categorize(int argc, VALUE *argv, VALUE enumerable)
+{
+    VALUE opts;
+    struct categorize_arg arg;
+    RETURN_ENUMERATOR(enumerable, 0, 0);
+    rb_scan_args(argc, argv, "0:", &opts);
+    if (NIL_P(opts)) {
+        arg.seed = Qnil;
+        arg.op = Qundef;
+        arg.update = Qundef;
+    }
+    else {
+        arg.seed = rb_hash_lookup2(opts, ID2SYM(id_seed), Qundef);
+        arg.op = rb_hash_lookup2(opts, ID2SYM(id_op), Qundef);
+        arg.update = rb_hash_lookup2(opts, ID2SYM(id_update), Qundef);
+        if (arg.op != Qundef && arg.update != Qundef) {
+            rb_raise(rb_eArgError, "both :update and :op specified");
+        }
+        if (arg.op != Qundef && !SYMBOL_P(arg.op))
+            arg.op = rb_convert_type(arg.op, T_DATA, "Proc", "to_proc");
+        if (arg.update != Qundef && !SYMBOL_P(arg.update))
+            arg.update = rb_convert_type(arg.update, T_DATA, "Proc", "to_proc");
+    }
+    arg.result = rb_hash_new();
+    rb_block_call(enumerable, id_each, 0, 0, categorize_i, (VALUE)&arg);
+    return arg.result;
+}
+
 /*
  *  The <code>Enumerable</code> mixin provides collection classes with
  *  several traversal and searching methods, and with the ability to
@@ -2662,6 +2818,11 @@ Init_Enumerable(void)
-1);
-1);

     id_next = rb_intern("next");
+    id_call = rb_intern("call");
+    id_seed = rb_intern("seed");
+    id_op = rb_intern("op");
+    id_update = rb_intern("update");
 }
Index: test/ruby/test_enum.rb
===================================================================
--- test/ruby/test_enum.rb (revision 30062)
+++ test/ruby/test_enum.rb (working copy)
@@ -384,4 +384,33 @@ class TestEnumerable < Test::Unit::TestC
                  ss.slice_before(/\A...\z/).to_a)
   end
 
+  def test_categorize
+    assert_equal((1..6).group_by {|i| i % 3 },
+                 (1..6).categorize {|e| [e % 3, e] })
+    assert_equal(Hash[ [ ["a", 100], ["b", 200] ] ],
+                 [ ["a", 100], ["b", 200] ].categorize(:op=>lambda {|x,y| y }) {|e| e })
+    h = { "n" => 100, "m" => 100, "y" => 300, "d" => 200, "a" => 0 }
+    assert_equal(h.invert,
+                 h.categorize(:op=>lambda {|x,y| y }) {|k, v| [v, k] })
+    assert_equal({"f"=>1, "o"=>2, "b"=>2, "a"=>2, "r"=>1, "z"=>1},
+                 "foobarbaz".split(//).categorize(:op=>:+) {|ch| [ch, 1] })
+    assert_equal({"f"=>1, "o"=>2, "b"=>2, "a"=>2, "r"=>1, "z"=>1},
+                 "foobarbaz".split(//).categorize(:update=>lambda {|s, a| s + a.last }) {|ch| [ch, 1] })
+    assert_equal({"f"=>["f", 1],
+                  "o"=>["o", 1, "o", 1],
+                  "b"=>["b", 1, "b", 1],
+                  "a"=>["a", 1, "a", 1],
+                  "r"=>["r", 1],
+                  "z"=>["z", 1]},
+                 "foobarbaz".split(//).categorize(:seed=>[], :update=>:+) {|ch| [ch, 1] })
+    assert_raise(ArgumentError) { [0].categorize {|e| [] } }
+    assert_raise(ArgumentError) { [0].categorize {|e| [1] } }
+    assert_equal(
+      {"f"=>{"o"=>{"o"=>{:c=>1}}},
+       "b"=>{"a"=>{"r"=>{:c=>1},
+                   "z"=>{:c=>1}}}},
+      %w[foo bar baz].categorize(:op=>:+) {|s| s.split(//) + [:c, 1] })
+    #assert_raise(TypeError) { [[1, 2], [1, 2, 3]].categorize {|e| e } }
+  end
+
 end
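
For readers who would rather follow the logic in Ruby than in C, the following is a rough sketch of what the patch appears to do, written as a standalone function. It is an emulation under assumptions, not the patch itself: `UNSET` stands in for `Qundef`, the function takes the enumerable as its first argument instead of being defined on Enumerable, and the error messages are made up.

```ruby
UNSET = Object.new.freeze # stands in for Qundef in the C code

def categorize(enum, opts = nil)
  seed, op, update =
    if opts.nil?
      [nil, UNSET, UNSET]
    else
      [opts.fetch(:seed, UNSET), opts.fetch(:op, UNSET), opts.fetch(:update, UNSET)]
    end
  if !op.equal?(UNSET) && !update.equal?(UNSET)
    raise ArgumentError, "both :update and :op specified"
  end

  # Mirrors categorize_update: combine the accumulator with the new value.
  step = lambda do |acc, row, val|
    if !op.equal?(UNSET)
      op.is_a?(Symbol) ? acc.public_send(op, val) : op.call(acc, val)
    elsif !update.equal?(UNSET)
      update.is_a?(Symbol) ? acc.public_send(update, row) : update.call(acc, row)
    elsif acc.nil?
      [val]
    else
      raise TypeError, "accumulator is not an Array" unless acc.is_a?(Array)
      acc << val
    end
  end

  result = {}
  enum.each do |elt|
    row = Array.try_convert(yield(elt))
    raise TypeError, "block must return an array" if row.nil?
    raise ArgumentError, "array too short" if row.size < 2

    *keys, last_key, val = row
    # Descend through one nested hash per leading key, creating them as needed.
    node = keys.inject(result) do |h, k|
      child = h.fetch(k) { h[k] = {} }
      raise TypeError, "expected a Hash at key #{k.inspect}" unless child.is_a?(Hash)
      child
    end
    node[last_key] =
      if node.key?(last_key)
        step.call(node[last_key], row, val)
      elsif seed.equal?(UNSET)
        val                         # no seed given: the first value is the accumulator
      else
        step.call(seed, row, val)
      end
  end
  result
end

counts = categorize("foobarbaz".chars, op: :+) { |ch| [ch, 1] }
#=> {"f"=>1, "o"=>2, "b"=>2, "a"=>2, "r"=>1, "z"=>1}
```

In this sketch the mixed-length input `[[1, 2], [1, 2, 3]]` raises TypeError, since the second row finds an Array where it expects a nested Hash. That corresponds to the `Check_Type(h2, T_HASH)` call in the patch.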

This is Keiju Ishitsuka.

This has already been announced on ruby-talk, but,

In [ruby-dev:42698] the message: "[ruby-dev:42698] Re:
Enumerable#categorize", on Dec/06 22:58(JST) Tanaka A. writes:

> For now, the specification has changed following Matsumoto-san's idea,
> so I would appreciate opinions on [ruby-dev:42659].
> "Post-processing" is gone there.

There is still something that looks like post-processing, isn't there?
When you specify :op, that is...

In fairy, I am thinking of introducing something called inject_by, and it
occurred to me that it may be functionally similar to categorize with :op.

Expressed in Ruby, fairy's inject_by does a group_by and then an inject on
each of the groups. The 'by' in inject_by is meant to convey "per group".

The differences from categorize are that the equivalent of :op is mandatory,
and that the elements of the input enumerable are not assumed to be arrays.

As for why I want to introduce something like this: MapReduce has map,
shuffle, and reduce phases, and the shuffle-reduce part very often runs a
reduce on each partition, which struck me as exactly what inject_by is.
In addition, MapReduce frequently uses a combiner as a preprocessing step
for optimization, and with inject_by that optimization can be achieved
without exposing the combiner, which is not very elegant in principle.

__
Keiju Ishitsuka
e-mail: [email protected]

On 2010-12-07 at 0:00, Keiju Ishitsuka [email protected] wrote:

> There is still something that looks like post-processing, isn't there?
> When you specify :op, that is...

The "post-processing" I have in mind is processing that runs once at the end.
:op (and :update) run on every element, so they are different.

As a concrete example, think of AVG in SQL.
AVG computes a mean, and one way to do it is to update a running sum and
count each time an element is found, then divide at the end to get the mean.

That final division is an example of post-processing.
(Of course the sum and the count could also be computed in post-processing,
but those can be done with :op and do not need post-processing.)

My first proposal could compute the mean of each category, but I do not
think the current proposal can.
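
The AVG example can be spelled out in plain Ruby to show where the two stages sit. This is a sketch with made-up sample data, not code from either proposal: the per-element update plays the role of :op, and the final division is the post-processing step.

```ruby
rows = [["a", 10], ["b", 4], ["a", 20], ["b", 6], ["a", 30]]

# Per-element step (what :op does): update [sum, count] for the category.
acc = Hash.new { |h, k| h[k] = [0, 0] }
rows.each do |category, x|
  state = acc[category]
  state[0] += x
  state[1] += 1
end

# Post-processing: runs once per category, after all elements have been seen.
averages = acc.transform_values { |sum, count| sum.fdiv(count) }
#=> {"a"=>20.0, "b"=>5.0}
```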

On 2010-12-06 at 20:22, Yusuke ENDOH [email protected] wrote:

> Also, a map that takes a Hash and returns a Hash might be handy.
> I believe this is requested every so often. Why was it never provided, again?

If you mean the equivalent of Hash[enum.map {|e| [k, v] }], Matsumoto-san
has just requested exactly that in [ruby-dev:42643] and [ruby-dev:42645].
A good name has not been found yet, though.
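
For reference, the idiom mentioned above looks like this with concrete data. The sample words are made up. The second form, `to_h` with a block, is available in Ruby 2.6 and later and so postdates this thread.

```ruby
words = %w[apple banana cherry]

# The idiom from the message: build [key, value] pairs, then wrap them in a Hash.
lengths = Hash[words.map { |w| [w, w.length] }]
#=> {"apple"=>5, "banana"=>6, "cherry"=>6}

# Later Ruby versions (2.6+) offer the same thing in one call.
same = words.to_h { |w| [w, w.length] }
```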

> One more thing: I think :seed should accept a Proc that generates the
> value, not the value itself. Otherwise it ends up with the same design
> flaw as Hash.new([]).
>
> p [[1, 2], [1, 3], [2, 5]].categorize(0, 1, seed: [], op: proc {|x, e| x << e })
> #=> {1=>[2, 3, 5], 2=>[2, 3, 5]}

This is less of a problem than Hash.new([]).
That is because the seed never appears in the result as it is; it always
goes through :op (or :update).

In fact, as the documentation says, the default array-building behavior can
be implemented as
:seed => nil
:op => lambda {|s, v| !s ? [v] : (s << v) }
so it is possible to create an array per category and append to it
destructively.

Also, not every seed value causes trouble, so I am against making :seed a
Proc. For example, when you want to specify an integer, having to wrap it in
a Proc would be wasteful.

An option other than :seed (say, :seed_proc) that specifies a Proc which is
called when the first element of a category is found and generates the seed
is conceivable.

However, nothing becomes impossible without it, and I am not sure whether it
would even be convenient, so for now I am not very attracted to it.

This looks astounding. Quick nit: is it #categorize or #associate? I like
#categorize more as a name for this, but you've given code samples with
#associate as the working title of the method.

I really like what you've presented here.

I have one idea for how to handle varying key lengths: I think I'd like an
option (or even the default case, but at least an option) to have a
replacing, mixed-mode result: both values and mixed nesting are allowed, and
you can replace a value with a nesting level if key duplication occurs.
Example where the categories are fruits:

[ [:aaa, "plum"],
  [:aaa, :bbb, "banana"],
  [:aaa, :ccc, "lemon"],
  [:foo, :bar, "pear"],
  [:foo, "apple"],
  [:zzz, "orange" ] ].categorize { |a| a }

should give:

{:aaa => {:bbb => "banana", :ccc => "lemon" },
 :foo => "apple",
 :zzz => "orange" }
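
One way to read that suggestion is "the last write to a key wins, whether it is a value or a nesting level". The sketch below implements that reading in plain Ruby. `categorize_replacing` is a hypothetical name, and the exact rules are an assumption inferred from the example rather than a specification.

```ruby
# Hypothetical "replacing, mixed-mode" variant: the last write to a key wins.
def categorize_replacing(rows)
  rows.each_with_object({}) do |row, result|
    raise ArgumentError, "array too short" if row.size < 2

    *keys, last_key, value = row
    node = keys.inject(result) do |h, k|
      h[k] = {} unless h[k].is_a?(Hash)   # a plain value gives way to a nesting level
      h[k]
    end
    node[last_key] = value                # a nesting level gives way to a plain value
  end
end

fruits = categorize_replacing([
  [:aaa, "plum"],
  [:aaa, :bbb, "banana"],
  [:aaa, :ccc, "lemon"],
  [:foo, :bar, "pear"],
  [:foo, "apple"],
  [:zzz, "orange"]
])
#=> {:aaa => {:bbb => "banana", :ccc => "lemon"}, :foo => "apple", :zzz => "orange"}
```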

It’s a neat way to provide a useful result, as well as to avoid the