On Sun, Dec 04, 2005 at 10:21:02AM +0900, [email protected] wrote:
On Sun, 4 Dec 2005, Mauricio [iso-8859-1] Fernández wrote:
sort_by{ rand } is actually biased, since sort_by will preserve the
relative order of the elements for which rand() returned the same value.
I did some numbers and found the following probability estimates:
array size P(#rand() collision)
1000 5.54558e-11
10000 5.55056e-09
1000000 5.55096e-05
10000000 5.53574e-03
A collision implies that the associated pair of elements has got the
same
ordering in the output array, instead of 50-50 chances of it being
reversed.
More info, including the Ruby code I used, available at
eigenclass.org
does
%w( a b c d ).sort{ rand <=> rand }
help?
I’m not totally sure it’s really unbiased, but at first sight it looks
good.
When called with no arguments, rand will use
static double
genrand_real(void)
{
unsigned long a=genrand_int32()>>5, b=genrand_int32()>>6;
return(a67108864.0+b)(1.0/9007199254740992.0);
}
so it would seem that two consecutive calls to rand() cannot return the
same value, but I still have to think about the effect of qsort.
It is however somewhat slower than sort_by{ rand }.
[100, 1000, 10000, 100000].each do |n|
GC.start
t0 = Time.new
(1…n).sort_by{rand}
a = (t1 = Time.new) - t0 # => 0.000425, 0.003002,
0.054392, 0.939692
(1…n).sort{rand <=> rand}
b = Time.new - t1 # => 0.001161, 0.026275,
0.289807, 3.791833
“%4.2f” % [b / a] # => “2.73”, “8.75”, “5.33”,
“4.04”
end
RUBY_VERSION # => “1.8.3”
RUBY_RELEASE_DATE # => “2005-09-24”