Hello guys,
I have been working with ranking algorithm, wich collects the smallest
numbers chronologically. but i couldn't make it work.
Here is the code.
module Enumerable
def min_n(n, &block)
block = Proc.new { |x,y| x <=> y} if block == nil
stable = StortedArray.new(&block)
each do |x|
stable << x if stable.size < n or block.call(x, stable[-1]) == -1
stable.pop until stable.size <=n
end
return stable
end
end
m = [1,5,2,8,7,9,10,100,50]
m.min_n(4)
I would be more than glad to get your feedback/suggestions of how to
make it work.
on 2012-11-27 18:04
on 2012-11-27 18:40
Hi, so what does "not work" mean? I don't know your StortedArray class (or rather SortedArray?), but if it works like I think it works, then your method should indeed output the four smallest elements: [1, 2, 5, 7] But what's all this effort for? Why not simply use Ruby's built-in methods for that? smallest = m.sort.first 4 This might not be particularly efficient, but your method isn't either.
on 2012-11-29 12:02
>> def n_min(l,n) (1..n).map {a=l.min ; l=l-[a]; a } end => nil >> n_min([1,2,-1,3,4],2) => [-1, 1] >> l=(1..100000).to_a.sort { |a,b| rand(2)-1 } >> chrono(100) { l.sort[0..3] } 14 millis >> l=(1..100000).to_a.sort { |a,b| rand(2)-1 } >> chrono(100) { n_min(l,4) } 54 millis
on 2012-11-29 13:38
2012/11/29 Regis d'Aubarede <lists@ruby-forum.com>: >>> def n_min(l,n) (1..n).map {a=l.min ; l=l-[a]; a } end > => nil >>> n_min([1,2,-1,3,4],2) > => [-1, 1] Uhh... how about ary.sort.first(n)? -- Matma Rex
on 2012-11-29 14:09
On Thu, Nov 29, 2012 at 1:37 PM, Bartosz Dziewoński <matma.rex@gmail.com> wrote: > 2012/11/29 Regis d'Aubarede <lists@ruby-forum.com>: >>>> def n_min(l,n) (1..n).map {a=l.min ; l=l-[a]; a } end >> => nil >>>> n_min([1,2,-1,3,4],2) >> => [-1, 1] > > Uhh... how about ary.sort.first(n)? You lose the relative ordering of the chosen elements. Don't know if it's important or not. Jesus.
on 2012-11-29 14:26
Try this: n = [4, 6, 7, 2, 15] def min(n) minimum = n[0] n.each do |element| if element < minimum minimum = element end end return minimum end puts min(n)
on 2012-11-29 15:00
"Иван Бишевац" <ivan.bisevac@gmail.com> wrote in post #1087106: > def min(n) > minimum = n[0] > n.each do |element| > if element < minimum > minimum = element > end > end > return minimum > end Great, you've reimplemented Array#min. Problem is, this has nothing to do with what the OP wants.
on 2012-11-29 15:32
> Uhh... how about ary.sort.first(n)?
If sort complexity is O(n*log(n), i thought that sorting the complete
array was less smart than my solution which should be
O(m*n) / m=number of minus item selected.
but it is not true:
nlog(n) with n=100_000 >> 1_115_000
m*n with n=100_000 and m=10 >> 1_000_000
perhaps there is a mistake somewhere ?
on 2012-11-29 15:46
On Thu, Nov 29, 2012 at 12:02 PM, Regis d'Aubarede <lists@ruby-forum.com> wrote: >>> def n_min(l,n) (1..n).map {a=l.min ; l=l-[a]; a } end > => nil >>> n_min([1,2,-1,3,4],2) > => [-1, 1] Without having measured it that's quite an inefficient way to do it IMHO: - you are creating a lot temporary arrays (two for each step) - you are traversing most elements n times I think, sorting once and then taking the first n elements is more efficient. Turns out, measurements confirm this: this approach is faster only for n=1 but then one would use Enumerable#min anyway. require 'benchmark' REP = 1_000 puts 'Generating data...' data = [0, 1, 10, 17].map do |i| (1 << i).times.map { rand 1000000 } end puts 'done' def n_min(l,n) (1..n).map {a=l.min ; l=l-[a]; a } end Benchmark.bm 30 do |x| [1, 10, 1000].each do |n| data.each do |array| x.report "#{n}/#{array.size} items sort" do REP.times do array.sort[0, n] end end x.report "#{n}/#{array.size} items n_min" do REP.times do n_min array, n end end end end end Test run (I interrupted it because it took too long): Generating data... done user system total real 1/1 items sort 0.000000 0.000000 0.000000 ( 0.000500) 1/1 items n_min 0.000000 0.000000 0.000000 ( 0.003000) 1/2 items sort 0.000000 0.000000 0.000000 ( 0.000500) 1/2 items n_min 0.000000 0.000000 0.000000 ( 0.002001) 1/1024 items sort 0.109000 0.000000 0.109000 ( 0.112014) 1/1024 items n_min 0.093000 0.000000 0.093000 ( 0.094012) 1/131072 items sort 25.709000 0.141000 25.850000 ( 25.836281) 1/131072 items n_min 11.685000 0.280000 11.965000 ( 11.967520) 10/1 items sort 0.000000 0.000000 0.000000 ( 0.000500) 10/1 items n_min 0.015000 0.000000 0.015000 ( 0.014002) 10/2 items sort 0.000000 0.000000 0.000000 ( 0.000000) 10/2 items n_min 0.016000 0.000000 0.016000 ( 0.014002) 10/1024 items sort 0.094000 0.016000 0.110000 ( 0.112514) 10/1024 items n_min 0.904000 0.000000 0.904000 ( 0.929618) 10/131072 items sort 25.850000 0.109000 25.959000 ( 25.948795) 10/131072 items n_min 117.609000 2.137000 119.746000 (119.802713) 1000/1 items sort 0.000000 0.000000 0.000000 ( 0.000500) 1000/1 items n_min 1.279000 0.000000 1.279000 ( 1.279162) 1000/2 items sort 0.000000 0.000000 0.000000 ( 0.001000) 1000/2 items n_min 1.279000 0.000000 1.279000 ( 1.276662) 1000/1024 items sort 0.109000 0.000000 0.109000 ( 0.112514) 1000/1024 items n_min 48.579000 0.000000 48.579000 ( 48.599671) 1000/131072 items sort 25.694000 0.188000 25.882000 ( 25.885787) 1000/131072 items n_min x.rb:14:in `block in n_min': Interrupt from x.rb:14:in `each' from x.rb:14:in `map' from x.rb:14:in `n_min' from x.rb:27:in `block (5 levels) in <main>' from x.rb:26:in `times' from x.rb:26:in `block (4 levels) in <main>' from /usr/lib/ruby/1.9.1/benchmark.rb:280:in `measure' from /usr/lib/ruby/1.9.1/benchmark.rb:362:in `item' from x.rb:25:in `block (3 levels) in <main>' from x.rb:18:in `each' from x.rb:18:in `block (2 levels) in <main>' from x.rb:17:in `each' from x.rb:17:in `block in <main>' from /usr/lib/ruby/1.9.1/benchmark.rb:174:in `benchmark' from /usr/lib/ruby/1.9.1/benchmark.rb:205:in `bm' from x.rb:16:in `<main>' Kind regards robert
on 2012-11-29 16:42
> def n_min(l,n) (1..n).map {a=l.min ; l=l-[a]; a } end > > array.sort[0, n] > n_min array, n You compare ruby implementation for n_min() with c implementation for sort... here result with a qsort from http://rosettacode.org, REP=10 class Array def qsort return self if length <= 1 pivot = self[0] less, greatereq = self[1..-1].partition { |x| x < pivot } less.qsort + [pivot] + greatereq.qsort end end Generating data... done user system total real 4/1024 items sort 0.031000 0.000000 0.031000 ( 0.029001) 4/1024 items n_min 0.000000 0.000000 0.000000 ( 0.006001) 4/131072 items sort 6.022000 0.016000 6.038000 ( 6.039345) 4/131072 items n_min 0.671000 0.016000 0.687000 ( 0.701040) 20/1024 items sort 0.031000 0.000000 0.031000 ( 0.028002) 20/1024 items n_min 0.031000 0.000000 0.031000 ( 0.027001) 20/131072 items sort 6.069000 0.031000 6.100000 ( 6.117349) 20/131072 items n_min 3.510000 0.078000 3.588000 ( 3.595206) 100/1024 items sort 0.031000 0.000000 0.031000 ( 0.030002) 100/1024 items n_min 0.125000 0.000000 0.125000 ( 0.135008) 100/131072 items sort 6.286000 0.000000 6.286000 ( 6.300360) 100/131072 items n_min 17.223000 0.265000 17.488000 ( 17.496001)
on 2012-11-29 18:37
On Thu, Nov 29, 2012 at 4:42 PM, Regis d'Aubarede <lists@ruby-forum.com> wrote: >> def n_min(l,n) (1..n).map {a=l.min ; l=l-[a]; a } end >> >> array.sort[0, n] >> n_min array, n > > > You compar ruby implementation for n_min() with c implementation for > sort... Is that forbidden? You are making use of C implementation as well. With that argumentation of yours you would also need to reimplement methods you use inside n_min in Ruby (notably l.min and l-[a]). Fact remains that your algorithm will visit most of the elements of a 2*n times. I find that inelegant but YMMV of course. Kind regards robert
on 2012-11-29 22:15
On 2012-11-29, Robert Klemme <shortcutter@googlemail.com> wrote: > Is that forbidden? You are making use of C implementation as well. > With that argumentation of yours you would also need to reimplement > methods you use inside n_min in Ruby (notably l.min and l-[a]). Fact > remains that your algorithm will visit most of the elements of a 2*n > times. I find that inelegant but YMMV of course. I agree with Robert, using #sort is probably the simplest approach and reasonably fast in most situations. This is exactly what I would do. But I also like playing around with algorithms ;) So if you're really keen on both, performance *and* doing as much as possible in ruby, you can try the following implementation of "min_n", which uses a binary heap. It has a running-time of O(n*log m) (n=array size, m=number of small elements) and should by quite fast if m << n. Then it also seems to be quite competetive with the sort approach. ==== def hdown(a, i) x, n = a[i], a.size while true do l = 2*i + 1 r = l + 1 break if l >= n nxt = r >= n || a[l] >= a[r] ? l : r break if x >= a[nxt] a[i], i = a[nxt], nxt end a[i] = x end def hbuild(a) (a.size/2).downto(0) do |i| hdown(a, i) end end def min_n(a, n) if a.size <= n then return a.dup else h = a[0,n] hbuild(h) n.upto(a.size-1) do |i| if h[0] > a[i] then h[0] = a[i] hdown(h, 0) end end return h end end ==== The following tests are from jruby-1.7.0 1/32768 items sort 2.700000 0.020000 2.720000 ( 2.685000) 1/32768 items min_n 1.400000 0.050000 1.450000 ( 1.246000) 1/131072 items sort 17.070000 0.010000 17.080000 ( 17.054000) 1/131072 items min_n 4.510000 0.010000 4.520000 ( 4.418000) 10/32768 items sort 2.660000 0.000000 2.660000 ( 2.655000) 10/32768 items min_n 1.170000 0.010000 1.180000 ( 1.142000) 10/131072 items sort 17.060000 0.000000 17.060000 ( 17.036000) 10/131072 items min_n 4.540000 0.010000 4.550000 ( 4.437000) 100/32768 items sort 2.660000 0.000000 2.660000 ( 2.656000) 100/32768 items min_n 1.440000 0.010000 1.450000 ( 1.405000) 100/131072 items sort 17.070000 0.000000 17.070000 ( 17.039000) 100/131072 items min_n 4.900000 0.010000 4.910000 ( 4.782000) 1000/32768 items sort 2.650000 0.000000 2.650000 ( 2.653000) 1000/32768 items min_n 3.850000 0.000000 3.850000 ( 3.794000) 1000/131072 items sort 17.080000 0.010000 17.090000 ( 17.042000) 1000/131072 items min_n 8.550000 0.010000 8.560000 ( 8.320000) and with 1.9.3 1/32768 items sort 3.420000 0.030000 3.450000 ( 3.437513) 1/32768 items min_n 2.540000 0.000000 2.540000 ( 2.544122) 1/131072 items sort 15.840000 0.160000 16.000000 ( 16.007184) 1/131072 items min_n 10.050000 0.000000 10.050000 ( 10.037374) 10/32768 items sort 3.420000 0.000000 3.420000 ( 3.422531) 10/32768 items min_n 2.620000 0.000000 2.620000 ( 2.616946) 10/131072 items sort 15.910000 0.000000 15.910000 ( 15.913246) 10/131072 items min_n 10.250000 0.000000 10.250000 ( 10.246652) 100/32768 items sort 3.410000 0.010000 3.420000 ( 3.413907) 100/32768 items min_n 3.500000 0.000000 3.500000 ( 3.502954) 100/131072 items sort 15.910000 0.010000 15.920000 ( 15.923128) 100/131072 items min_n 11.340000 0.010000 11.350000 ( 11.353193) 1000/32768 items sort 3.420000 0.000000 3.420000 ( 3.417772) 1000/32768 items min_n 11.060000 0.000000 11.060000 ( 11.074263) 1000/131072 items sort 15.880000 0.000000 15.880000 ( 15.882260) 1000/131072 items min_n 22.170000 0.020000 22.190000 ( 22.197999) Best regards, Frank
on 2012-11-30 12:54
On Thu, Nov 29, 2012 at 10:14 PM, Frank Fischer <frank-fischer@shadow-soft.de> wrote: > I agree with Robert, using #sort is probably the simplest approach and > reasonably fast in most situations. This is exactly what I would do. > But I also like playing around with algorithms ;) I put a few more variants which traverse the original Enumerable just once up at github to play around with: https://gist.github.com/4175335 Kind regards robert
Please log in before posting. Registration is free and takes only a minute.
Existing account
(Switch to SSL-encrypted connection)
NEW: Do you have a Google/GoogleMail or Yahoo account? No registration required!
Log in with Google account | Log in with Yahoo account
Log in with Google account | Log in with Yahoo account
No account? Register here.