# Forum: Ruby Distinct Sets (#225)

Announcement (2017-05-07): www.ruby-forum.com is now read-only since I unfortunately do not have the time to support and maintain the forum any more. Please see rubyonrails.org/community and ruby-lang.org/en/community for other Rails- und Ruby-related community platforms.
on 2009-11-21 03:24
```-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-

The three rules of Ruby Quiz:

1.  Please do not post any solutions or spoiler discussion for this
quiz until 48 hours have elapsed from the time this message was
sent.

2.  Support Ruby Quiz by submitting ideas and responses
as often as you can.

3.  Enjoy!

Suggestion:  A [QUIZ] in the subject of emails about the problem
the original quiz message, if you can.

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Suggestions?: http://rubyquiz.strd6.com/suggestions

-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-

## Distinct Sets (#225)

Aloha Rubyists,

This week's quiz comes from Ruby Quiz Suggestions MVP Martin DeMello[1].

[based on a surprisingly tricky stackoverflow problem]

You have an list of sets, which you want to transform by the following
method: if any two sets have a common element, merge them into a
single set. You will be left with a reduced list partitioning all the
elements into sets where every set is disjoint from every other.

For example:

Start: 0:[D E G]
1:[C J K M]
2:[K M]
3:[H]
4:[D H L R]
5:[G L]

merging 1 and 2 since they have K and M in common:
=> [D E G] [C J K M] [H] [D H L R] [G L]

merging 2 and 3 since they have H in common:
=> [D E G] [C J K M] [D H L R] [G L]

merging 0 and 2 (D)
=> [D E G H L R] [C J K M] [G L]

merging 0 and 2 (G, L)
=> [D E G H L R] [C J K M]

The tricky bit is to do it as efficiently as possible (in an
algorithmic sense; in actual ruby code the efficiency depends a lot on
which methods run in ruby and which in C), but even without that it's
a fun problem to solve.

Here are a few input/output pairs to help test your program:

[["G", "J", "N"], ["D", "F", "G", "K"], ["E", "H"], ["B", "C", "J",
"L", "Q"], ["C", "M"]]
=> [["B", "C", "D", "F", "G", "J", "K", "L", "M", "N", "Q"], ["E", "H"]]

[["A", "C", "E", "G", "H"], ["B", "I", "M"], ["E", "M", "O"]]
=> [["A", "B", "C", "E", "G", "H", "I", "M", "O"]]

[["D", "E", "J", "L"], ["F", "K"], ["L", "M"], ["I", "K"], ["I", "K"]]
=> [["D", "E", "J", "L", "M"], ["F", "I", "K"]]

[["B", "E", "L", "M"], ["B", "I", "L", "O", "P"], ["A", "J", "O",
"P"], ["A", "D", "F", "L"]]
=> [["A", "B", "D", "E", "F", "I", "J", "L", "M", "O", "P"]]

[["E", "G", "K"], ["A", "C", "I", "J", "N"], ["C", "J", "M", "N"]]
=> [["E", "G", "K"], ["A", "C", "I", "J", "M", "N"]]

[["A", "D", "E", "H"], ["D", "N", "P"], ["D", "I", "L", "P"]]
=> [["A", "D", "E", "H", "I", "L", "N", "P"]]

[["E", "F", "K", "N", "O"], ["A", "B", "C", "J", "P"]]
=> [["E", "F", "K", "N", "O"], ["A", "B", "C", "J", "P"]]

[["C", "H", "M"], ["D", "F", "L"], ["A", "E", "J", "O"], ["C", "H"],
["J", "K", "M"], ["A", "N", "Q", "T"]]
=> [["A", "C", "E", "H", "J", "K", "M", "N", "O", "Q", "T"], ["D", "F",
"L"]]

Have fun!

[1]: http://zem.novylen.net

P.S. I'm pretty swamped right now which is causing delay on
summarizing the recent quizzes. If anyone would like to write a guest
summary I would much appreciate it! Sometimes you never know where
help will come from until you ask for it. Also, special thanks to
Martin and everyone else who helps by submitting ideas and quizzes!```
on 2009-11-21 17:51
```On Fri, Nov 20, 2009 at 9:23 PM, Daniel Moore <yahivin@gmail.com> wrote:
> method: if any two sets have a common element, merge them into a
> single set. You will be left with a reduced list partitioning all the
> elements into sets where every set is disjoint from every other.

\$ ruby -v 225.rb
ruby 1.8.2 (2004-12-25) [powerpc-darwin8.0]
Started
.
Finished in 0.059557 seconds.

1 tests, 10 assertions, 0 failures, 0 errors

But, I cheated a bit in my assertions; sorting the expected and actual
values before asserting them equal:

class TestCase225 < Test::Unit::TestCase
def tests
assert_equal(
[
%w{ D E G H L R },
%w{ C J K M }
].sort,
distinct_sets(
[
%w{ D E G },
%w{ C J K M },
%w{ K M },
%w{ H },
%w{ D H L R },
%w{ G L }
]
).sort
)

assert_equal(
[
["B", "C", "D", "F", "G", "J", "K", "L", "M", "N", "Q"],
["E", "H"]
].sort,
distinct_sets(
[
["G", "J", "N"],
["D", "F", "G", "K"],
["E", "H"],
["B", "C", "J", "L", "Q"],
["C", "M"]
]
).sort
)

assert_equal(
[
["A", "B", "C", "E", "G", "H", "I", "M", "O"]
].sort,
distinct_sets(
[
["A", "C", "E", "G", "H"],
["B", "I", "M"],
["E", "M", "O"]
]
).sort
)

assert_equal(
[
["D", "E", "J", "L", "M"],
["F", "I", "K"]
].sort,
distinct_sets(
[
["D", "E", "J", "L"],
["F", "K"],
["L", "M"],
["I", "K"],
["I", "K"]
]
).sort
)

assert_equal(
[
["A", "B", "D", "E", "F", "I", "J", "L", "M", "O", "P"]
].sort,
distinct_sets(
[
["B", "E", "L", "M"],
["B", "I", "L", "O", "P"],
["A", "J", "O", "P"],
["A", "D", "F", "L"]
]
).sort
)

assert_equal(
[
["E", "G", "K"],
["A", "C", "I", "J", "M", "N"]
].sort,
distinct_sets(
[
["E", "G", "K"],
["A", "C", "I", "J", "N"],
["C", "J", "M", "N"]
]
).sort
)

assert_equal(
[
["A", "D", "E", "H", "I", "L", "N", "P"]
].sort,
distinct_sets(
[
["A", "D", "E", "H"],
["D", "N", "P"],
["D", "I", "L", "P"]
]
).sort
)

assert_equal(
[
["E", "F", "K", "N", "O"],
["A", "B", "C", "J", "P"]
].sort,
distinct_sets(
[
["E", "F", "K", "N", "O"],
["A", "B", "C", "J", "P"]
]
).sort
)

assert_equal(
[
["A", "C", "E", "H", "J", "K", "M", "N", "O", "Q", "T"],
["D", "F", "L"]
].sort,
distinct_sets(
[
["C", "H", "M"],
["D", "F", "L"],
["A", "E", "J", "O"],
["C", "H"],
["J", "K", "M"],
["A", "N", "Q", "T"]
]
).sort
)

assert_equal(
[
].sort,
distinct_sets(
[
]
).sort
)
end
end

Without the forced sorting:

\$ ruby -v 225-nocheat.rb
ruby 1.8.2 (2004-12-25) [powerpc-darwin8.0]
Started
F
Finished in 0.095481 seconds.

1) Failure:
tests(TestCase225) [225-nocheat.rb:21]:
<[["D", "E", "G", "H", "L", "R"], ["C", "J", "K", "M"]]> expected but
was
<[["C", "J", "K", "M"], ["D", "E", "G", "H", "L", "R"]]>.

1 tests, 1 assertions, 1 failures, 0 errors

So, I changed my method to reverse sort by set size:

\$ ruby -v 225-nocheat.rb
ruby 1.8.2 (2004-12-25) [powerpc-darwin8.0]
Started
F
Finished in 0.098535 seconds.

1) Failure:
tests(TestCase225) [225-nocheat.rb:97]:
<[["E", "G", "K"], ["A", "C", "I", "J", "M", "N"]]> expected but was
<[["A", "C", "I", "J", "M", "N"], ["E", "G", "K"]]>.

1 tests, 6 assertions, 1 failures, 0 errors

But, that doesn't exactly match all of the sample input/output.  So,
then I "cheated" and sorted both before asserting.

:-)```
on 2009-11-22 07:03
```On Nov 21, 2009, at 11:50 AM, brabuhr@gmail.com wrote:

>> [based on a surprisingly tricky stackoverflow problem]
> Started
> .
> Finished in 0.059557 seconds.
>
> 1 tests, 10 assertions, 0 failures, 0 errors
>
> But, I cheated a bit in my assertions; sorting the expected and actual
> values before asserting them equal:

I'm sure that there's a better way than mine, but it seems to work well.

ruby -v -rubygems distinct_sets_test.rb
ruby 1.8.6 (2008-08-11 patchlevel 287) [universal-darwin9.0]
/Library/Ruby/Gems/1.8/gems/thoughtbot-shoulda-2.9.1/lib/shoulda/
context.rb:4: warning: method redefined; discarding old contexts
Started
....................
Finished in 0.018326 seconds.

20 tests, 148 assertions, 0 failures, 0 errors

This includes all the tests given in the original quiz description and
all the tests from brabuhr@gmail.com, but using Shoulda and splitting
the one "tests" method with 10 assertions into 4 separate methods with
one assertion each and eliminating the duplicates.

http://gist.github.com/240457

I'm guessing the speed difference is due more to 1.8.2 v. 1.8.6 and
the actual machines than any significant difference in our algorithms.

-Rob

Rob Biedenharn    http://agileconsultingllc.com
Rob@AgileConsultingLLC.com```
on 2009-11-22 07:51
```> http://gist.github.com/240457

Both of your tests use rather small input sets. It would be
interesting to know how the solutions deal with input that contains
many (10, 50, 100, ....) sets and/or many different signs (not just
letters).```
on 2009-11-23 17:10
```On Nov 22, 2009, at 1:51 AM, lith wrote:

>> http://gist.github.com/240457
>
> Both of your tests use rather small input sets. It would be
> interesting to know how the solutions deal with input that contains
> many (10, 50, 100, ....) sets and/or many different signs (not just
> letters).

I accept your challenge!  The gist has been updated with sets that use
numbers and symbols as well as strings.  There are also some tests of
large sets (which worked fine, but getting the test setup by hand was
nasty).

ruby -rubygems distinct_sets_test.rb
Started
..........................
Finished in 1.237836 seconds.

26 tests, 202 assertions, 0 failures, 0 errors

The only change that I has to make was in how I sorted the final array
to account for symbols or mixed contents: Numerics compare "naturally"
with <=> and non-numeric or mixed are compared using the #to_s
representation.

-Rob

P.S. I could add my solution to the gist, too, but I'll give everyone
a chance to try the new tests first.

Rob Biedenharn    http://agileconsultingllc.com
Rob@AgileConsultingLLC.com```
on 2009-11-23 18:34
```On Mon, Nov 23, 2009 at 11:10 AM, Rob Biedenharn
<Rob@agileconsultingllc.com> wrote:
> sets (which worked fine, but getting the test setup by hand was nasty).
Thanks.

> ruby1.8 -v -rubygems distinct_sets_test.rb
ruby 1.8.7 (2009-06-12 patchlevel 174) [i486-linux]
/var/lib/gems/1.8/gems/shoulda-2.10.2/lib/shoulda/context.rb:4:
warning: method redefined; discarding old contexts
Started
..................EEE.....
Finished in 3.424045 seconds.

1) Error:
test: non-uniform contents should handle matching on symbols.
(DistinctSetsTest):
ArgumentError: comparison of String with :bill failed
./distinct_sets.rb:7:in `sort'

2) Error:
test: non-uniform contents should handle mix of strings and symbols
(matching on string). (DistinctSetsTest):
ArgumentError: comparison of String with :bill failed
./distinct_sets.rb:7:in `sort'

3) Error:
test: non-uniform contents should handle mix of strings, numbers, and
symbols. (DistinctSetsTest):
ArgumentError: comparison of Fixnum with :emergency failed
./distinct_sets.rb:7:in `sort'

26 tests, 175 assertions, 0 failures, 3 errors

> The only change that I has to make was in how I sorted the final array to
> account for symbols or mixed contents: Numerics compare "naturally" with <=>
> and non-numeric or mixed are compared using the #to_s representation.

Simply not sorting:

26 tests, 202 assertions, 17 failures, 0 errors

Simply sort_by{to_s}:

26 tests, 202 assertions, 3 failures, 0 errors

More complex sort{}:

26 tests, 202 assertions, 0 failures, 0 errors```
on 2009-11-23 21:32
```Hey Rubyists!

I think I got quite a fast solution, using 1.9.

Here is my test file : http://pastie.org/711737
You just need to change METHOD and require your own file

So here is my best result for the 19tests:

Finished in 0.222688 seconds.
1 tests, 19 assertions, 0 failures, 0 errors, 0 skips

Without sorting(I just had to change the order of some tests of Rob)

Maye I should also say I'm using a 64bit ruby 1.9.2 ? Anyway I think
this
method is far faster than my others(about 100 times) and probably some
of
yours.

Enjoy the quiz,
Benoit

2009/11/23 <brabuhr@gmail.com>```
on 2009-11-23 23:26
```Here's my not-fast solution:

require 'set'

class Set
def intersect?(other)
other.each { |o| return true if include?(o) }
false
end
end

def distinct_sets(array_of_arrays)
set_of_sets = array_of_arrays.map{|a|
a.to_set
}.to_set

set_of_sets.divide{|i, j|
i.intersect?(j)
}.map{|s|
s.flatten.to_a
}
end

Adding the intersect? method to Set was primarily motivated by
readability, but also provided a noticeable speed improvement over my
original alternative (intersection.size.>).```
on 2009-11-24 00:08
```I must admit is a very elegant solution.
And it's not so slow, 0.36s with my tests and adding .map { |s|
s.flatten.to_a*.sort*
}, it pass all the tests without sorting.
Awesome for a Ruby-based class! A nice exemple of using forgot methods
like
divide.

2009/11/23 <brabuhr@gmail.com>```
on 2009-11-24 07:20
```> And it's not so slow,

I wrote a small approximative benchmark for that runs the script with
different set configurations.
http://pastie.org/711915

It might be interesting to compare the runtime behaviour of your
script with other solutions.```
on 2009-11-24 15:56
```On Tue, Nov 24, 2009 at 1:20 AM, lith <minilith@gmail.com> wrote:
>> And it's not so slow,
>
> I wrote a small approximative benchmark for that runs the script with
> different set configurations.
> http://pastie.org/711915

> cat /proc/cpuinfo | fgrep name
model name      : Intel(R) Pentium(R) M processor 1.60GHz
> uname -a
Linux eXist 2.6.31-14-generic #48-Ubuntu SMP Fri Oct 16 14:04:26 UTC
2009 i686 GNU/Linux
> ruby1.8 -v 711915.rb 225.rb
ruby 1.8.7 (2009-06-12 patchlevel 174) [i486-linux]
user     system
total        real
elements * sets: 1 * 5 => 1 * 1           0.010000   0.000000
0.010000 (  0.010348)
elements * sets: 1 * 5 => 2 * 1           0.010000   0.000000
0.010000 (  0.008285)
elements * sets: 2 * 5 => 3 * 1           0.010000   0.000000
0.010000 (  0.015534)
elements * sets: 2 * 5 => 4 * 1           0.020000   0.000000
0.020000 (  0.020882)
elements * sets: 2 * 5 => 5 * 1           0.020000   0.010000
0.030000 (  0.023482)
elements * sets: 3 * 5 => 6 * 1           0.020000   0.000000
0.020000 (  0.021767)
elements * sets: 3 * 5 => 7 * 1           0.040000   0.000000
0.040000 (  0.035285)
elements * sets: 3 * 5 => 8 * 1           0.020000   0.010000
0.030000 (  0.028141)
elements * sets: 4 * 5 => 9 * 1           0.020000   0.000000
0.020000 (  0.028562)
elements * sets: 4 * 5 => 10 * 1          0.020000   0.010000
0.030000 (  0.029465)
elements * sets: 18 * 255 => 1 * 51       0.780000   0.140000
0.920000 (  0.922092)
elements * sets: 35 * 255 => 2 * 51       1.330000   0.190000
1.520000 (  1.526110)
elements * sets: 52 * 255 => 3 * 51       1.110000   0.220000
1.330000 (  1.338236)
elements * sets: 69 * 255 => 4 * 51       1.670000   0.280000
1.950000 (  1.942574)
elements * sets: 86 * 255 => 5 * 51       1.440000   0.320000
1.760000 (  1.763389)
elements * sets: 103 * 255 => 6 * 51      1.820000   0.330000
2.150000 (  2.147160)
elements * sets: 120 * 255 => 7 * 51      1.720000   0.430000
2.150000 (  2.165951)
elements * sets: 137 * 255 => 8 * 51      2.300000   0.480000
2.780000 (  2.774613)
elements * sets: 154 * 255 => 9 * 51      2.180000   0.390000
2.570000 (  2.571221)
elements * sets: 171 * 255 => 10 * 51     2.430000   0.500000
2.930000 (  2.921943)
elements * sets: 34 * 505 => 1 * 101      2.550000   0.650000
3.200000 (  3.199823)
elements * sets: 68 * 505 => 2 * 101      4.540000   1.000000
5.540000 (  5.547267)
elements * sets: 102 * 505 => 3 * 101     3.980000   0.740000
4.720000 (  4.744038)
elements * sets: 135 * 505 => 4 * 101     5.890000   1.210000
7.100000 (  7.099683)
elements * sets: 169 * 505 => 5 * 101     5.170000   1.120000
6.290000 (  6.288145)
elements * sets: 203 * 505 => 6 * 101     6.660000   1.510000
8.170000 ( 11.120129)
elements * sets: 236 * 505 => 7 * 101     6.470000   1.360000
7.830000 (  7.843291)
elements * sets: 270 * 505 => 8 * 101     8.390000   1.830000
10.220000 ( 10.233364)
elements * sets: 304 * 505 => 9 * 101     7.750000   1.630000
9.380000 (  9.377075)
elements * sets: 337 * 505 => 10 * 101    8.730000   2.000000
10.730000 ( 10.752132)

And another one-off benchmark I had run:

per
sets     set      user     system      total        real
10,      1  0.000000   0.000000   0.000000 (  0.005441)
10,     10  0.010000   0.000000   0.010000 (  0.006944)
10,    100  0.040000   0.010000   0.050000 (  0.044136)
10,   1000  0.310000   0.060000   0.370000 (  0.372266)
10,  10000  1.620000   0.090000   1.710000 (  1.718613)
10, 100000 18.700000   1.020000  19.720000 ( 19.869159)
user     system      total        real
1,     10  0.000000   0.000000   0.000000 (  0.002515)
10,     10  0.010000   0.000000   0.010000 (  0.006701)
100,     10  0.370000   0.080000   0.450000 (  0.454811)
1000,     10 36.210000   8.340000  44.550000 ( 44.798496)
user     system      total        real
1,      1  0.000000   0.000000   0.000000 (  0.000435)
10,     10  0.010000   0.000000   0.010000 (  0.006239)
100,    100  2.900000   0.750000   3.650000 (  3.762373)

(Sets were filled with random integers.)```
on 2009-11-25 17:59
```>   set_of_sets.divide{|i, j|
>     i.intersect?(j)
>   }

I didn't know the Set#divide method. Interesting.

Here is a graph-based approach:
http://pastie.org/714759

Regards,
Tom```
on 2009-11-25 20:34
```On Wed, Nov 25, 2009 at 11:59 AM, lith <minilith@gmail.com> wrote:
> Here is a graph-based approach:
> http://pastie.org/714759

Running the small approximative benchmark:

> fgrep name /proc/cpuinfo
model name      : Intel(R) Pentium(R) M processor 1.60GHz
> uname -a
Linux eXist 2.6.31-14-generic #48-Ubuntu SMP Fri Oct 16 14:04:26 UTC
2009 i686 GNU/Linux
> ruby1.8 -v 711915.rb 714759.rb
ruby 1.8.7 (2009-06-12 patchlevel 174) [i486-linux]
user     system
total        real
elements * sets: 1 * 5 => 1 * 1           0.000000   0.000000
0.000000 (  0.003731)
elements * sets: 1 * 5 => 2 * 1           0.000000   0.000000
0.000000 (  0.002467)
elements * sets: 2 * 5 => 3 * 1           0.000000   0.000000
0.000000 (  0.003154)
elements * sets: 2 * 5 => 4 * 1           0.010000   0.000000
0.010000 (  0.004540)
elements * sets: 2 * 5 => 5 * 1           0.000000   0.000000
0.000000 (  0.005251)
elements * sets: 3 * 5 => 6 * 1           0.010000   0.000000
0.010000 (  0.004601)
elements * sets: 3 * 5 => 7 * 1           0.000000   0.000000
0.000000 (  0.006505)
elements * sets: 3 * 5 => 8 * 1           0.000000   0.010000
0.010000 (  0.008302)
elements * sets: 4 * 5 => 9 * 1           0.000000   0.000000
0.000000 (  0.008121)
elements * sets: 4 * 5 => 10 * 1          0.010000   0.010000
0.020000 (  0.008174)
elements * sets: 18 * 255 => 1 * 51       0.070000   0.000000
0.070000 (  0.068440)
elements * sets: 35 * 255 => 2 * 51       0.120000   0.020000
0.140000 (  0.145098)
elements * sets: 52 * 255 => 3 * 51       0.220000   0.030000
0.250000 (  0.253006)
elements * sets: 69 * 255 => 4 * 51       0.310000   0.060000
0.370000 (  0.375845)
elements * sets: 86 * 255 => 5 * 51       0.450000   0.070000
0.520000 (  0.516071)
elements * sets: 103 * 255 => 6 * 51      0.580000   0.090000
0.670000 (  0.680203)
elements * sets: 120 * 255 => 7 * 51      0.740000   0.130000
0.870000 (  0.864627)
elements * sets: 137 * 255 => 8 * 51      0.920000   0.160000
1.080000 (  1.082008)
elements * sets: 154 * 255 => 9 * 51      1.190000   0.130000
1.320000 (  1.319013)
elements * sets: 171 * 255 => 10 * 51     1.380000   0.190000
1.570000 (  1.569024)
elements * sets: 34 * 505 => 1 * 101      0.140000   0.010000
0.150000 (  0.151603)
elements * sets: 68 * 505 => 2 * 101      0.260000   0.060000
0.320000 (  0.336318)
elements * sets: 102 * 505 => 3 * 101     0.490000   0.060000
0.550000 (  0.537813)
elements * sets: 135 * 505 => 4 * 101     0.700000   0.090000
0.790000 (  0.802585)
elements * sets: 169 * 505 => 5 * 101     0.950000   0.160000
1.110000 (  1.106535)
elements * sets: 203 * 505 => 6 * 101     1.240000   0.220000
1.460000 (  1.456670)
elements * sets: 236 * 505 => 7 * 101     1.590000   0.260000
1.850000 (  1.850735)
elements * sets: 270 * 505 => 8 * 101     2.030000   0.260000
2.290000 (  2.294973)
elements * sets: 304 * 505 => 9 * 101     2.350000   0.430000
2.780000 (  2.784864)
elements * sets: 337 * 505 => 10 * 101    2.860000   0.470000
3.330000 (  3.335518)

And a little one-off benchmark (sets of random integers):

per
sets   set      user     system      total        real
10,    1  0.000000   0.000000   0.000000 (  0.000776)
10,   10  0.000000   0.000000   0.000000 (  0.005667)
10,  100  0.350000   0.030000   0.380000 (  0.384304)
10, 1000        Timeout::Error - execution expired (> 3 minutes)
user     system      total        real
1,   10  0.000000   0.000000   0.000000 (  0.000743)
10,   10  0.000000   0.000000   0.000000 (  0.009329)
100,   10  0.040000   0.010000   0.050000 (  0.056932)
1000,   10  5.180000   0.070000   5.250000 (  7.455180)
10000,   10  8.460000   0.510000   8.970000 ( 12.744708)
100000,   10      Timeout::Error - execution expired (> 3 minutes)
user     system      total        real
1,    1  0.010000   0.000000   0.010000 (  0.003573)
10,   10  0.010000   0.000000   0.010000 (  0.008709)
100,  100 28.750000   0.350000  29.100000 ( 42.588412)
1000, 1000      Timeout::Error - execution expired (> 3 minutes)```
on 2009-11-25 22:24
```<brabuhr@gmail.com>:
>> Here's my not-fast solution:
Benoit Daloze <eregontp@gmail.com>:
> I must admit is a very elegant solution.

Here's a fresh non-elegant solution:

def distinct_sets(sets)
sets = sets.dup
h1 = {}; h2 = {}
sets.each{|s| h1[s.object_id] = s.dup; s.each{|e| (h2[e] ||= []) <<
s.object_id}}
merges = h2.select{|_, ids| ids.size > 1}.map{|_, ids| ids}
return sets.sort.uniq if merges.size == 0
flag = true
while flag
flag = false
merges = h1.keys.map{|id|
merges.select{|m| m.include?(id)}.tap{|m| flag = true if m.size
> 1}.flatten.uniq
}.uniq
end
result = []
merges.each{|m| result << m.map{|id| s = h1[id]; h1.delete(id);
s}.flatten.sort.uniq }
(result + h1.values).sort.uniq
end

Slight bug in this version, (sometimes) adds an empty set to "result",
e.g.:
[ [], ["B", "C", "D", "F", "G", "J", "K", "L", "M", "N", "Q"], ["E",
"H"] ]

ruby 1.8.7 (2009-06-12 patchlevel 174) [i486-linux]
user     system
total        real
elements * sets: 1 * 5 => 1 * 1           0.010000   0.000000
0.010000 (  0.008809)
elements * sets: 1 * 5 => 2 * 1           0.000000   0.000000
0.000000 (  0.008136)
elements * sets: 2 * 5 => 3 * 1           0.010000   0.000000
0.010000 (  0.007661)
elements * sets: 2 * 5 => 4 * 1           0.010000   0.000000
0.010000 (  0.007797)
elements * sets: 2 * 5 => 5 * 1           0.010000   0.000000
0.010000 (  0.008351)
elements * sets: 3 * 5 => 6 * 1           0.010000   0.000000
0.010000 (  0.008181)
elements * sets: 3 * 5 => 7 * 1           0.010000   0.000000
0.010000 (  0.011584)
elements * sets: 3 * 5 => 8 * 1           0.010000   0.000000
0.010000 (  0.011224)
elements * sets: 4 * 5 => 9 * 1           0.010000   0.000000
0.010000 (  0.011744)
elements * sets: 4 * 5 => 10 * 1          0.020000   0.000000
0.020000 (  0.014110)
elements * sets: 18 * 255 => 1 * 51       0.630000   0.100000
0.730000 (  0.752032)
elements * sets: 35 * 255 => 2 * 51       1.760000   0.230000
1.990000 (  2.104564)
elements * sets: 52 * 255 => 3 * 51       2.230000   0.330000
2.560000 (  2.578473)
elements * sets: 69 * 255 => 4 * 51       2.760000   0.400000
3.160000 (  3.179041)
elements * sets: 86 * 255 => 5 * 51       3.230000   0.520000
3.750000 (  3.770260)
elements * sets: 103 * 255 => 6 * 51      3.740000   0.560000
4.300000 (  4.326179)
elements * sets: 120 * 255 => 7 * 51      4.250000   0.630000
4.880000 (  4.919036)
elements * sets: 137 * 255 => 8 * 51      4.870000   0.700000
5.570000 (  5.998519)
elements * sets: 154 * 255 => 9 * 51      5.350000   0.710000
6.060000 (  6.334980)
elements * sets: 171 * 255 => 10 * 51     5.960000   0.700000
6.660000 (  6.852846)
elements * sets: 34 * 505 => 1 * 101      2.200000   0.310000
2.510000 (  2.528937)
elements * sets: 68 * 505 => 2 * 101      6.230000   0.860000
7.090000 (  7.158239)
elements * sets: 102 * 505 => 3 * 101     8.130000   1.290000
9.420000 ( 10.024654)
elements * sets: 135 * 505 => 4 * 101    10.190000   1.370000
11.560000 ( 12.229713)
elements * sets: 169 * 505 => 5 * 101    12.060000   1.710000
13.770000 ( 14.657392)
elements * sets: 203 * 505 => 6 * 101    13.870000   2.060000
15.930000 ( 16.317533)
elements * sets: 236 * 505 => 7 * 101    15.860000   2.310000
18.170000 ( 18.707348)
elements * sets: 270 * 505 => 8 * 101    17.610000   2.750000
20.360000 ( 20.495151)
elements * sets: 304 * 505 => 9 * 101    19.780000   2.880000
22.660000 ( 23.630379)
elements * sets: 337 * 505 => 10 * 101   22.020000   2.980000
25.000000 ( 26.895880)

per
set set user     system      total        real
10, 1  0.000000   0.000000   0.000000 (  0.000205)
10, 10  0.000000   0.000000   0.000000 (  0.000565)
10, 100  0.010000   0.000000   0.010000 (  0.004842)
10, 1000  0.070000   0.000000   0.070000 (  0.069138)
10, 10000  0.770000   0.100000   0.870000 (  0.872998)
10, 100000 18.600000   1.130000  19.730000 ( 19.776977)
user     system      total        real
1, 10  0.000000   0.000000   0.000000 (  0.000126)
10, 10  0.000000   0.000000   0.000000 (  0.000569)
100, 10  0.010000   0.000000   0.010000 (  0.007341)
1000, 10  1.360000   0.040000   1.400000 (  1.419874)
10000, 10 (> 3 minutes)
user     system      total        real
1, 1  0.000000   0.000000   0.000000 (  0.000044)
10, 10  0.000000   0.000000   0.000000 (  0.000587)
100, 100  0.080000   0.010000   0.090000 (  0.077950)
1000, 1000 (> 3 minutes)```
on 2009-11-25 23:02
```<brabuhr@gmail.com>:
>>> Here's my not-fast solution:
Benoit Daloze <eregontp@gmail.com>:
>> I must admit is a very elegant solution.
<brabuhr@gmail.com> wrote:
> Here's a fresh non-elegant solution:

The last half of the previous one can be applied directly to the input
array of arrays of items instead of to the intermediate array of
arrays of object ids:

def distinct_sets(sets)
sets = sets.dup
values = sets.flatten.sort.uniq
flag = true
while flag
flag = false
sets = values.map{|v|
sets.select{|s|
s.include?(v)
}.tap{|s|
flag = true if s.size > 1
}.flatten.sort.uniq
}.uniq
end
sets
end

Easier code, but generally scales more poorly than the previous more
complex version:

ruby 1.8.7 (2009-06-12 patchlevel 174) [i486-linux]
user     system
total        real
elements * sets: 1 * 5 => 1 * 1           0.000000   0.000000
0.000000 (  0.001928)
elements * sets: 1 * 5 => 2 * 1           0.000000   0.000000
0.000000 (  0.002136)
elements * sets: 2 * 5 => 3 * 1           0.000000   0.000000
0.000000 (  0.003218)
elements * sets: 2 * 5 => 4 * 1           0.000000   0.000000
0.000000 (  0.003926)
elements * sets: 2 * 5 => 5 * 1           0.010000   0.000000
0.010000 (  0.007761)
elements * sets: 3 * 5 => 6 * 1           0.010000   0.000000
0.010000 (  0.012438)
elements * sets: 3 * 5 => 7 * 1           0.010000   0.000000
0.010000 (  0.009863)
elements * sets: 3 * 5 => 8 * 1           0.010000   0.000000
0.010000 (  0.015234)
elements * sets: 4 * 5 => 9 * 1           0.020000   0.000000
0.020000 (  0.022202)
elements * sets: 4 * 5 => 10 * 1          0.020000   0.000000
0.020000 (  0.025353)
elements * sets: 18 * 255 => 1 * 51       0.400000   0.120000
0.520000 (  0.514576)
elements * sets: 35 * 255 => 2 * 51       0.970000   0.180000
1.150000 (  1.280102)
elements * sets: 52 * 255 => 3 * 51       1.690000   0.210000
1.900000 (  2.260107)
elements * sets: 69 * 255 => 4 * 51       2.310000   0.390000
2.700000 (  2.888354)
elements * sets: 86 * 255 => 5 * 51       3.220000   0.460000
3.680000 (  4.250457)
elements * sets: 103 * 255 => 6 * 51      4.090000   0.590000
4.680000 (  4.723892)
elements * sets: 120 * 255 => 7 * 51      5.200000   0.600000
5.800000 (  6.258614)
elements * sets: 137 * 255 => 8 * 51      6.330000   0.700000
7.030000 (  7.694748)
elements * sets: 154 * 255 => 9 * 51      7.470000   0.870000
8.340000 (  8.730147)
elements * sets: 171 * 255 => 10 * 51     8.970000   0.820000
9.790000 (  9.871180)
elements * sets: 34 * 505 => 1 * 101      1.720000   0.250000
1.970000 (  2.331276)
elements * sets: 68 * 505 => 2 * 101      3.620000   0.750000
4.370000 (  4.983125)
elements * sets: 102 * 505 => 3 * 101     6.110000   0.920000
7.030000 (  7.258356)
elements * sets: 135 * 505 => 4 * 101     8.730000   1.460000
10.190000 ( 10.912860)
elements * sets: 169 * 505 => 5 * 101    12.130000   1.630000
13.760000 ( 14.714311)
elements * sets: 203 * 505 => 6 * 101    15.280000   2.040000
17.320000 ( 17.556731)
elements * sets: 236 * 505 => 7 * 101    19.090000   2.490000
21.580000 ( 22.005613)
elements * sets: 270 * 505 => 8 * 101    23.430000   2.620000
26.050000 ( 26.366511)
elements * sets: 304 * 505 => 9 * 101    28.090000   3.020000
31.110000 ( 32.421622)
elements * sets: 337 * 505 => 10 * 101   33.270000   3.550000
36.820000 ( 39.281699)

per
set set user     system      total        real
10, 1  0.000000   0.000000   0.000000 (  0.000338)
10, 10  0.010000   0.000000   0.010000 (  0.006647)
10, 100  0.320000   0.000000   0.320000 (  0.326307)
10, 1000 (>3 minutes)
user     system      total        real
1, 10  0.000000   0.000000   0.000000 (  0.000311)
10, 10  0.010000   0.000000   0.010000 (  0.006137)
100, 10  0.290000   0.020000   0.310000 (  0.305198)
1000, 10 77.240000   8.000000  85.240000 ( 90.868828)
10000, 10 (>3 minutes)
user     system      total        real
1, 1  0.000000   0.000000   0.000000 (  0.000052)
10, 10  0.000000   0.000000   0.000000 (  0.006212)
100, 100 84.440000   1.080000  85.520000 ( 91.856640)
1000, 1000 (>3 minutes)```
on 2009-11-25 23:32
```Sorry for all the posts :-) I think I'm done now.  I refined the
complex, non-elegant solution a bit more; so, to make it easy to
reference later, here are my three main solutions:

First:

require 'set'

class Set
def intersect?(other)
other.each { |o| return true if include?(o) }
false
end
end

def distinct_sets(array_of_arrays)
set_of_sets = array_of_arrays.map{|a|
a.to_set
}.to_set

set_of_sets.divide{|i, j|
i.intersect?(j)
}.map{|s|
s.flatten.to_a.sort
}
end

Third:

def distinct_sets(sets)
sets = sets.dup
values = sets.flatten.sort.uniq
flag = true
while flag
flag = false
sets = values.map{|v|
sets.select{|s|
s.include?(v)
}.tap{|s|
flag = true if s.size > 1
}.flatten.sort.uniq
}.uniq
end
sets
end

Second (from the earlier post):

def distinct_sets(sets)
sets = sets.dup
h1 = {}; h2 = {}
sets.each{|s|
h1[s.object_id] = s.dup
s.each{|e| (h2[e] ||= []) << s.object_id}
}
merges = h2.select{|_, ids|
ids.size > 1
}.map{|_, ids| ids}
return sets.sort.uniq if merges.size == 0
flag = true
while flag
flag = false
merges = h1.keys.map{|id|
merges.select{|m|
m.include?(id)
}.tap{|m|
flag = true if m.size > 1
}.flatten.uniq
}.uniq
end
result = []
merges.each{|m|
result << m.map{|id|
s = h1[id]; h1.delete(id); s
}.flatten.sort.uniq
}
(result + h1.values).sort.uniq
end

Second (refined version):

def distinct_sets(sets)
sets = sets.dup
h1 = {}; h2 = {}
sets.each{|s|
h1[s.object_id] = s.dup
s.each{|e| (h2[e] ||= []) << s.object_id}
}
merges = h2.values.sort.uniq
flag = true
while flag
flag = false
merges = h1.keys.map{|id|
merges.select{|m|
m.include?(id)
}.tap{|m|
flag = true if m.size > 1
}.flatten.sort.uniq
}.sort.uniq
end
merges.map{|m|
m.map{|id|
s = h1[id]; h1.delete(id); s
}.flatten.sort.uniq
}
end```
on 2009-11-26 08:08
```> > Here is a graph-based approach:
> >http://pastie.org/714759
>
> Running the small approximative benchmark:

Here is a modified version that should have slightly improved runtime
characteristics:
http://pastie.org/715755

Regards,
Tom```
on 2009-11-26 12:22
```Here is my array-based combination solution:

def multiple(start)
sets = start.uniq
while (f=sets.flatten) && f != f.uniq
sets.combination(2) { |(a, b)|
if sets.include?(a) && sets.include?(b) && a != b && (a &
b).length > 0
# include? ensure the set has not been mixed with
another
# a != b ensure we are not playing with a == b, what
would
delete a (and b) or not find the index
sets[sets.index(a)] = (a | b)
sets.delete(b)
end
}
end
sets.map { |s| s.sort }
end

It just combinate by 2 sets, and look if they can merge.
The while loop is then rarely met, because sets merge 2 by 2.

This solution is quite fast for small sets(as I said before, 0.22 for
the
first test), but is completely out for larger sets.

This is another, using Array#partition to modify itself, while merging
all
the elements with the common value in one iteration

def better(start)
sets = start.dup
f = sets.flatten
(f.uniq.select { |e| f.count(e)>1 }).each { |reducing_on|
i = sets.index { |set| set.include? reducing_on }
sets2merge, sets = sets.partition { |set| set.include?
reducing_on }
sets.insert( i, sets2merge.inject(:|) )
}
sets.map { |s| s.sort }
end

2009/11/26 lith <minilith@gmail.com>```
on 2009-12-10 15:35
```Long time listener, first time caller.

Here's my solution:
http://github.com/mustmodify/ruby-quiz/tree/master...

E:\projects\ruby-quiz\225_distinct_sets>ruby test.rb
Started
.......
Finished in 0.008 seconds.

7 tests, 14 assertions, 0 failures, 0 errors

Of course, that's just my few tests and the original examples. I
haven't worked out the sorting issue yet, but here's what I get for
the extended tests:

E:\projects\ruby-quiz\225_distinct_sets>ruby extended_tests.rb
Started
..................EEE.....
Finished in 0.386 seconds.
26 tests, 175 assertions, 0 failures, 3 errors```
on 2010-04-08 23:16
```Many members of the Ruby Community contributed solutions to this quiz.
Some
long time veterans as well as first time contributors. Thanks everyone
for
the great turnout!

`Set#divide` is an interesting method that came up during the
discussion. I
was not previously familiar with it, time to learn.

>     numbers = Set[1, 3, 4, 6, 9, 10, 11]
>     set = numbers.divide { |i,j| (i - j).abs == 1 }
>     p set   # => #<Set: {#<Set: {1}>,
>             #            #<Set: {11, 9, 10}>,
>             #            #<Set: {3, 4}>,
>             #            #<Set: {6}>}>

I didn't quite get it at first so I went to the console and tried some
other
examples.

set = numbers.divide { |i,j| (i - j).abs == 2 }
=> #<Set: {#<Set: {10}>, #<Set: {1, 3}>, #<Set: {6, 4}>, #<Set: {11,
9}>}>

Ok, so the first example gets contiguous runs (numbers that are 1
apart),
and the second example gets contiguous skip runs (runs of numbers that
are 2
apart).

Now to test out the single argument version:

set = numbers.divide { |i| i%2 }
=> #<Set: {#<Set: {11, 1, 3, 9}>, #<Set: {6, 4, 10}>}>

Dividing a set into odds and evens. A core component of this quiz is
grouping sets; this may come in handy.

brabuhr's first solution uses this method and is a good illustration of
the
principle behind the problem.

require 'set'

class Set
def intersect?(other)
other.each { |o| return true if include?(o) }
false
end
end

def distinct_sets(array_of_arrays)
set_of_sets = array_of_arrays.map{|a|
a.to_set
}.to_set

set_of_sets.divide{|i, j|
i.intersect?(j)
}.map{|s|
s.flatten.to_a
}
end

In this solution an instance method `intersect?` is added to `Set`. This
allows us to `divide` all the sets that share an element into groups.
Then
all that is left is to merge the groups of sets (`Set#flatten` takes
care of
that) and to present the result as an array of arrays to match how the
output was specified in the quiz.

During the quiz discussion a full set of test cases was developed. This
enabled everyone to check and verify the accuracy of their solutions.
The
test suite was provided by Rob Biedenharn and uses Shoulda[1], a testing
framework that provides additional helpers, macros, and assertions to
the
Test::Unit framework.

Another benefit the testing provided was the ability to focus on the
speed
at which the solutions run. When you have a full test suite you can
modify
code without fear of breaking things in order to optimize and squeeze
out
that last bit of speed, or conversely, to clean things up to improve
code
readability, knowing that you have a safety net of tests to catch any
errors
introduced.

There were many, many more solutions to this week's quiz. The principle
of
grouping and merging the sets is followed by all solutions, with varying
time
correspondent Johnathon Wright had one. Please be sure to take a look
inside
the archived files, there are lots of good solutions in there.

Special thanks to everyone who participated in the quiz!

Distinct Sets (#225) - Solutions[3]

[1]: http://github.com/thoughtbot/shoulda
[2]: http://ruby-doc.org/core/classes/Set.html#M001626
[3]: http://rubyquiz.strd6.com/quizzes/225.tar.gz```
on 2010-04-12 04:22
```Daniel,
Â
I would like to know how to install ruby and rails on my computer.
but the trouble that I am having is that all the files are zipped up
and can use them. if you have a sloution for this email me back.
Â
JamesÂ

--- On Thu, 4/8/10, Daniel Moore <yahivin@gmail.com> wrote:

From: Daniel Moore <yahivin@gmail.com>
Subject: [QUIZ][SUMMARY] Distinct Sets (#225)
To: "ruby-talk ML" <ruby-talk@ruby-lang.org>
Date: Thursday, April 8, 2010, 3:16 PM

Many members of the Ruby Community contributed solutions to this quiz.
Some
long time veterans as well as first time contributors. Thanks everyone
for
the great turnout!

`Set#divide` is an interesting method that came up during the
discussion. I
was not previously familiar with it, time to learn.

>Â  Â Â Â numbers = Set[1, 3, 4, 6, 9, 10, 11]
>Â  Â Â Â set = numbers.divide { |i,j| (i - j).abs == 1 }
>Â  Â Â Â p setÂ Â Â # => #<Set: {#<Set: {1}>,
>Â  Â  Â  Â  Â  Â Â Â #Â  Â  Â  Â  Â  Â  #<Set: {11, 9, 10}>,
>Â  Â  Â  Â  Â  Â Â Â #Â  Â  Â  Â  Â  Â  #<Set: {3, 4}>,
>Â  Â  Â  Â  Â  Â Â Â #Â  Â  Â  Â  Â  Â  #<Set: {6}>}>

I didn't quite get it at first so I went to the console and tried some
other
examples.

Â  Â  set = numbers.divide { |i,j| (i - j).abs == 2 }
Â  Â  => #<Set: {#<Set: {10}>, #<Set: {1, 3}>, #<Set: {6, 4}>, #<Set: {11,
9}>}>

Ok, so the first example gets contiguous runs (numbers that are 1
apart),
and the second example gets contiguous skip runs (runs of numbers that
are 2
apart).

Now to test out the single argument version:

Â  Â  set = numbers.divide { |i| i%2 }
Â  Â  => #<Set: {#<Set: {11, 1, 3, 9}>, #<Set: {6, 4, 10}>}>

Dividing a set into odds and evens. A core component of this quiz is
grouping sets; this may come in handy.

brabuhr's first solution uses this method and is a good illustration of
the
principle behind the problem.

Â  Â  require 'set'

Â  Â  class Set
Â  Â  Â  def intersect?(other)
Â  Â  Â  Â  other.each { |o| return true if include?(o) }
Â  Â  Â  Â  false
Â  Â  Â  end
Â  Â  end

Â  Â  def distinct_sets(array_of_arrays)
Â  Â  Â  set_of_sets = array_of_arrays.map{|a|
Â  Â  Â  Â  a.to_set
Â  Â  Â  }.to_set

Â  Â  Â  set_of_sets.divide{|i, j|
Â  Â  Â  Â  i.intersect?(j)
Â  Â  Â  }.map{|s|
Â  Â  Â  Â  s.flatten.to_a
Â  Â  Â  }
Â  Â  end

In this solution an instance method `intersect?` is added to `Set`. This
allows us to `divide` all the sets that share an element into groups.
Then
all that is left is to merge the groups of sets (`Set#flatten` takes
care of
that) and to present the result as an array of arrays to match how the
output was specified in the quiz.

During the quiz discussion a full set of test cases was developed. This
enabled everyone to check and verify the accuracy of their solutions.
The
test suite was provided by Rob Biedenharn and uses Shoulda[1], a testing
framework that provides additional helpers, macros, and assertions to
the
Test::Unit framework.

Another benefit the testing provided was the ability to focus on the
speed
at which the solutions run. When you have a full test suite you can
modify
code without fear of breaking things in order to optimize and squeeze
out
that last bit of speed, or conversely, to clean things up to improve
code
readability, knowing that you have a safety net of tests to catch any
errors
introduced.

There were many, many more solutions to this week's quiz. The principle
of
grouping and merging the sets is followed by all solutions, with varying