# Forum: Ruby Getting the smallest Items of an Array

on 2012-11-27 18:04
```Hello guys,

I have been working with ranking algorithm, wich collects the smallest
numbers chronologically. but i couldn't make it work.

Here is the code.

module Enumerable
def min_n(n, &block)
block = Proc.new { |x,y| x <=> y} if block == nil
stable = StortedArray.new(&block)
each do |x|
stable << x if stable.size < n or block.call(x, stable[-1]) == -1
stable.pop until stable.size <=n
end
return stable

end

end

m = [1,5,2,8,7,9,10,100,50]
m.min_n(4)

I would be more than glad to get your feedback/suggestions of how to
make it work.```
on 2012-11-27 18:40
```Hi,

so what does "not work" mean? I don't know your StortedArray class (or
rather SortedArray?), but if it works like I think it works, then your
method should indeed output the four smallest elements:

[1, 2, 5, 7]

But what's all this effort for? Why not simply use Ruby's built-in
methods for that?

smallest = m.sort.first 4

This might not be particularly efficient, but your method isn't either.```
on 2012-11-28 15:06
`Thank you very much`
on 2012-11-29 12:02
```>> def n_min(l,n) (1..n).map {a=l.min ; l=l-[a]; a } end
=> nil
>> n_min([1,2,-1,3,4],2)
=> [-1, 1]

>> l=(1..100000).to_a.sort { |a,b| rand(2)-1 }
>> chrono(100) { l.sort[0..3] }
14 millis

>> l=(1..100000).to_a.sort { |a,b| rand(2)-1 }
>> chrono(100) { n_min(l,4) }
54 millis```
on 2012-11-29 13:38
```2012/11/29 Regis d'Aubarede <lists@ruby-forum.com>:
>>> def n_min(l,n) (1..n).map {a=l.min ; l=l-[a]; a } end
> => nil
>>> n_min([1,2,-1,3,4],2)
> => [-1, 1]

-- Matma Rex```
on 2012-11-29 14:09
```On Thu, Nov 29, 2012 at 1:37 PM, Bartosz Dziewoński
<matma.rex@gmail.com> wrote:
> 2012/11/29 Regis d'Aubarede <lists@ruby-forum.com>:
>>>> def n_min(l,n) (1..n).map {a=l.min ; l=l-[a]; a } end
>> => nil
>>>> n_min([1,2,-1,3,4],2)
>> => [-1, 1]
>

You lose the relative ordering of the chosen elements. Don't know if
it's important or not.

Jesus.```
on 2012-11-29 14:26
```Try this:
n = [4, 6, 7, 2, 15]

def min(n)
minimum = n[0]
n.each do |element|
if element < minimum
minimum = element
end
end
return minimum
end

puts min(n)```
on 2012-11-29 15:00
```"Иван Бишевац" <ivan.bisevac@gmail.com> wrote in post #1087106:
> def min(n)
> minimum = n[0]
> n.each do |element|
>  if element < minimum
> minimum = element
> end
>  end
>  return minimum
> end

Great, you've reimplemented Array#min.

Problem is, this has nothing to do with what the OP wants.```
on 2012-11-29 15:32
```> Uhh... how about ary.sort.first(n)?

If sort complexity is O(n*log(n), i thought that sorting the complete
array was less smart than my solution which  should be
O(m*n) / m=number of minus item selected.

but it is not true:
nlog(n) with n=100_000           >> 1_115_000
m*n     with n=100_000 and m=10  >> 1_000_000

perhaps there is a mistake somewhere ?```
on 2012-11-29 15:46
```On Thu, Nov 29, 2012 at 12:02 PM, Regis d'Aubarede
<lists@ruby-forum.com> wrote:
>>> def n_min(l,n) (1..n).map {a=l.min ; l=l-[a]; a } end
> => nil
>>> n_min([1,2,-1,3,4],2)
> => [-1, 1]

Without having measured it that's quite an inefficient way to do it
IMHO:
- you are creating a lot temporary arrays (two for each step)
- you are traversing most elements n times

I think, sorting once and then taking the first n elements is more
efficient.

Turns out, measurements confirm this: this approach is faster only for
n=1 but then one would use Enumerable#min anyway.

require 'benchmark'

REP = 1_000

puts 'Generating data...'

data = [0, 1, 10, 17].map do |i|
(1 << i).times.map { rand 1000000 }
end

puts 'done'

def n_min(l,n) (1..n).map {a=l.min ; l=l-[a]; a } end

Benchmark.bm 30 do |x|
[1, 10, 1000].each do |n|
data.each do |array|
x.report "#{n}/#{array.size} items sort" do
REP.times do
array.sort[0, n]
end
end

x.report "#{n}/#{array.size} items n_min" do
REP.times do
n_min array, n
end
end
end
end
end

Test run (I interrupted it because it took too long):

Generating data...
done
user     system      total
real
1/1 items sort                   0.000000   0.000000   0.000000 (
0.000500)
1/1 items n_min                  0.000000   0.000000   0.000000 (
0.003000)
1/2 items sort                   0.000000   0.000000   0.000000 (
0.000500)
1/2 items n_min                  0.000000   0.000000   0.000000 (
0.002001)
1/1024 items sort                0.109000   0.000000   0.109000 (
0.112014)
1/1024 items n_min               0.093000   0.000000   0.093000 (
0.094012)
1/131072 items sort             25.709000   0.141000  25.850000 (
25.836281)
1/131072 items n_min            11.685000   0.280000  11.965000 (
11.967520)
10/1 items sort                  0.000000   0.000000   0.000000 (
0.000500)
10/1 items n_min                 0.015000   0.000000   0.015000 (
0.014002)
10/2 items sort                  0.000000   0.000000   0.000000 (
0.000000)
10/2 items n_min                 0.016000   0.000000   0.016000 (
0.014002)
10/1024 items sort               0.094000   0.016000   0.110000 (
0.112514)
10/1024 items n_min              0.904000   0.000000   0.904000 (
0.929618)
10/131072 items sort            25.850000   0.109000  25.959000 (
25.948795)
10/131072 items n_min          117.609000   2.137000 119.746000
(119.802713)
1000/1 items sort                0.000000   0.000000   0.000000 (
0.000500)
1000/1 items n_min               1.279000   0.000000   1.279000 (
1.279162)
1000/2 items sort                0.000000   0.000000   0.000000 (
0.001000)
1000/2 items n_min               1.279000   0.000000   1.279000 (
1.276662)
1000/1024 items sort             0.109000   0.000000   0.109000 (
0.112514)
1000/1024 items n_min           48.579000   0.000000  48.579000 (
48.599671)
1000/131072 items sort          25.694000   0.188000  25.882000 (
25.885787)
1000/131072 items n_min        x.rb:14:in `block in n_min': Interrupt
from x.rb:14:in `each'
from x.rb:14:in `map'
from x.rb:14:in `n_min'
from x.rb:27:in `block (5 levels) in <main>'
from x.rb:26:in `times'
from x.rb:26:in `block (4 levels) in <main>'
from /usr/lib/ruby/1.9.1/benchmark.rb:280:in `measure'
from /usr/lib/ruby/1.9.1/benchmark.rb:362:in `item'
from x.rb:25:in `block (3 levels) in <main>'
from x.rb:18:in `each'
from x.rb:18:in `block (2 levels) in <main>'
from x.rb:17:in `each'
from x.rb:17:in `block in <main>'
from /usr/lib/ruby/1.9.1/benchmark.rb:174:in `benchmark'
from /usr/lib/ruby/1.9.1/benchmark.rb:205:in `bm'
from x.rb:16:in `<main>'

Kind regards

robert```
on 2012-11-29 16:42
```> def n_min(l,n) (1..n).map {a=l.min ; l=l-[a]; a } end
>
>           array.sort[0, n]
>           n_min array, n

You compare ruby implementation for n_min() with c implementation for
sort...

here result with a qsort from http://rosettacode.org, REP=10

class Array
def qsort
return self if length <= 1
pivot = self[0]
less, greatereq = self[1..-1].partition { |x| x < pivot }
less.qsort +
[pivot] +
greatereq.qsort
end
end

Generating data...
done
user     system      total
real

4/1024 items sort              0.031000   0.000000   0.031000 (
0.029001)
4/1024 items n_min              0.000000   0.000000   0.000000 (
0.006001)

4/131072 items sort            6.022000   0.016000   6.038000 (
6.039345)
4/131072 items n_min            0.671000   0.016000   0.687000 (
0.701040)

20/1024 items sort             0.031000   0.000000   0.031000 (
0.028002)
20/1024 items n_min             0.031000   0.000000   0.031000 (
0.027001)

20/131072 items sort           6.069000   0.031000   6.100000 (
6.117349)
20/131072 items n_min           3.510000   0.078000   3.588000 (
3.595206)

100/1024 items sort            0.031000   0.000000   0.031000 (
0.030002)
100/1024 items n_min            0.125000   0.000000   0.125000 (
0.135008)

100/131072 items sort          6.286000   0.000000   6.286000 (
6.300360)
100/131072 items n_min         17.223000   0.265000  17.488000 (
17.496001)```
on 2012-11-29 18:37
```On Thu, Nov 29, 2012 at 4:42 PM, Regis d'Aubarede <lists@ruby-forum.com>
wrote:
>> def n_min(l,n) (1..n).map {a=l.min ; l=l-[a]; a } end
>>
>>           array.sort[0, n]
>>           n_min array, n
>
>
> You compar ruby implementation for n_min() with c implementation for
> sort...

Is that forbidden?  You are making use of C implementation as well.
With that argumentation of yours you would also need to reimplement
methods you use inside n_min in Ruby (notably l.min and l-[a]).  Fact
remains that your algorithm will visit most of the elements of a 2*n
times.  I find that inelegant but YMMV of course.

Kind regards

robert```
on 2012-11-29 22:15
```On 2012-11-29, Robert Klemme <shortcutter@googlemail.com> wrote:
> Is that forbidden?  You are making use of C implementation as well.
> With that argumentation of yours you would also need to reimplement
> methods you use inside n_min in Ruby (notably l.min and l-[a]).  Fact
> remains that your algorithm will visit most of the elements of a 2*n
> times.  I find that inelegant but YMMV of course.

I agree with Robert, using #sort is probably the simplest approach and
reasonably fast in most situations. This is exactly what I would do.
But I also like playing around with algorithms ;) So if you're really
keen on both, performance *and* doing as much as possible in ruby, you
can try the following implementation of "min_n", which uses a binary
heap. It has a running-time of O(n*log m) (n=array size, m=number of
small elements) and should by quite fast if m << n. Then it also seems
to be quite competetive with the sort approach.

====
def hdown(a, i)
x, n = a[i], a.size
while true do
l = 2*i + 1
r = l + 1
break if l >= n
nxt = r >= n || a[l] >= a[r] ? l : r
break if x >= a[nxt]
a[i], i = a[nxt], nxt
end
a[i] = x
end

def hbuild(a)
(a.size/2).downto(0) do |i| hdown(a, i) end
end

def min_n(a, n)
if a.size <= n then
return a.dup
else
h = a[0,n]
hbuild(h)
n.upto(a.size-1) do |i|
if h[0] > a[i] then
h[0] = a[i]
hdown(h, 0)
end
end
return h
end
end
====

The following tests are from jruby-1.7.0

1/32768 items sort               2.700000   0.020000   2.720000 (
2.685000)
1/32768 items min_n              1.400000   0.050000   1.450000 (
1.246000)
1/131072 items sort             17.070000   0.010000  17.080000 (
17.054000)
1/131072 items min_n             4.510000   0.010000   4.520000 (
4.418000)
10/32768 items sort              2.660000   0.000000   2.660000 (
2.655000)
10/32768 items min_n             1.170000   0.010000   1.180000 (
1.142000)
10/131072 items sort            17.060000   0.000000  17.060000 (
17.036000)
10/131072 items min_n            4.540000   0.010000   4.550000 (
4.437000)
100/32768 items sort             2.660000   0.000000   2.660000 (
2.656000)
100/32768 items min_n            1.440000   0.010000   1.450000 (
1.405000)
100/131072 items sort           17.070000   0.000000  17.070000 (
17.039000)
100/131072 items min_n           4.900000   0.010000   4.910000 (
4.782000)
1000/32768 items sort            2.650000   0.000000   2.650000 (
2.653000)
1000/32768 items min_n           3.850000   0.000000   3.850000 (
3.794000)
1000/131072 items sort          17.080000   0.010000  17.090000 (
17.042000)
1000/131072 items min_n          8.550000   0.010000   8.560000 (
8.320000)

and with 1.9.3

1/32768 items sort               3.420000   0.030000   3.450000 (
3.437513)
1/32768 items min_n              2.540000   0.000000   2.540000 (
2.544122)
1/131072 items sort             15.840000   0.160000  16.000000 (
16.007184)
1/131072 items min_n            10.050000   0.000000  10.050000 (
10.037374)
10/32768 items sort              3.420000   0.000000   3.420000 (
3.422531)
10/32768 items min_n             2.620000   0.000000   2.620000 (
2.616946)
10/131072 items sort            15.910000   0.000000  15.910000 (
15.913246)
10/131072 items min_n           10.250000   0.000000  10.250000 (
10.246652)
100/32768 items sort             3.410000   0.010000   3.420000 (
3.413907)
100/32768 items min_n            3.500000   0.000000   3.500000 (
3.502954)
100/131072 items sort           15.910000   0.010000  15.920000 (
15.923128)
100/131072 items min_n          11.340000   0.010000  11.350000 (
11.353193)
1000/32768 items sort            3.420000   0.000000   3.420000 (
3.417772)
1000/32768 items min_n          11.060000   0.000000  11.060000 (
11.074263)
1000/131072 items sort          15.880000   0.000000  15.880000 (
15.882260)
1000/131072 items min_n         22.170000   0.020000  22.190000 (
22.197999)

Best regards,
Frank```
on 2012-11-30 12:54
```On Thu, Nov 29, 2012 at 10:14 PM, Frank Fischer

> I agree with Robert, using #sort is probably the simplest approach and
> reasonably fast in most situations. This is exactly what I would do.
> But I also like playing around with algorithms ;)

I put a few more variants which traverse the original Enumerable just
once up at github to play around with:
https://gist.github.com/4175335

Kind regards

robert```