Not So Random (#173)

mansfiem · August 15, 2008, 5:37pm

-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-

The three rules of Ruby Q. 2:

Please do not post any solutions or spoiler discussion for this
quiz until 48 hours have passed from the time on this message.
Support Ruby Q. 2 by submitting ideas as often as you can! (A
permanent, new website is in the works for Ruby Q. 2. Until then,
please visit the temporary website at

http://splatbang.com/rubyquiz/.
Enjoy!

Suggestion: A [QUIZ] in the subject of emails about the problem
helps everyone on Ruby T. follow the discussion. Please reply to
the original quiz message, if you can.

-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-

Not So Random (#173)

As part of Ruby’s standard library, we have access to a good
pseudorandom number generator (PRNG), the Mersenne twister. (In
particular, the MT19937 variant.) Ruby provides two kernel functions
to make use of this generator: srand and rand. This week’s quiz
involves generating random numbers in a predictable manner.

The first part is to create a wrapper around the random number
generator rand, supporting the appropriate parameter (i.e. the
parameter intrinsic to rand). Your wrapper should implement two
additional functions: next which returns the next random number, and
reset which restarts the sequence. So, for example:

irb> x = Random.new(100)
=> #<Random:...>

irb> Array.new(5) { x.next }
=> [51, 92, 14, 71, 60]

irb> Array.new(5) { x.next }
=> [20, 82, 86, 74, 74]

irb> x.reset
=> nil

irb> Array.new(5) { x.next }
=> [51, 92, 14, 71, 60]     # after reset, sequence restarts

You may do this as a class, as depicted here, or as a function,
lambda, code block… whatever you like.

The second part is a little trickier: creating multiple, concurrent,
reproducible sequences of pseudorandom numbers isn’t so easy. As
convenient as the built-in generator is, there is only one seed. For
example, assume we have two Random objects, created as shown above,
x and y.

irb> x = Random.new(100)
=> #<Random:...>

irb> Array.new(6) { x.next }
=> [51, 92, 14, 71, 60, 20]

irb> y = Random.new(100)
=> #<random:...>

irb> Array.new(6) { y.next }
=> [66, 92, 98, 17, 83, 57]

irb> x.reset
=> nil

irb> Array.new(2) { x.next }
=> [51, 92]       # ok: sequence restarted as requested

irb> Array.new(2) { y.next }
=> [14, 71]       # fail: this is part of _x_ sequence

irb> Array.new(2) { x.next }
=> [60, 20]       # more fail: _x_ is now [51, 92, 60, 20]? wrong...

The reason for the failure should be obvious: my current
implementation of Random just blindly uses srand and rand
without considering that there may be multiple instances of Random.

So, for this second part, expand your wrapper to support concurrent
use. Please note that you are required to make use of the built-in
generator: you should not implement your own PRNG.

One final note… It is up to you whether the seed for each wrapper
will be user-settable or hidden (as in my examples above). However, if
hidden, each wrapper should have a different seed. (Generated
randomly, perhaps?)

mansfiem · August 15, 2008, 7:16pm

On 15 Aug 2008, at 16:33, Matthew M. wrote:

<http://splatbang.com/rubyquiz/>.
Enjoy!

Suggestion: A [QUIZ] in the subject of emails about the problem
helps everyone on Ruby T. follow the discussion. Please reply to
the original quiz message, if you can.

Great to have the quiz back! This one looks fun.

Fred

mansfiem · August 15, 2008, 7:23pm

On Fri, Aug 15, 2008 at 11:33 AM, Matthew M. [email protected]
wrote:

convenient as the built-in generator is, there is only one seed.
A couple of unit tests:

#!/usr/bin/env ruby

class Random
def initialize(ceiling, seed = nil)
#…
end

def next
#…
end

def reset
#…
end
end

require ‘test/unit’

class RandomTest < Test::Unit::TestCase
def test_001
x = Random.new(100, 2112)
assert_equal( [38, 1, 5, 57, 11], Array.new(5) { x.next })
assert_equal( [ 1, 31, 3, 56, 14], Array.new(5) { x.next })
x.reset
assert_equal( [38, 1, 5, 57, 11], Array.new(5) { x.next })
end

def test_002
x = Random.new(100, 2112)
assert_equal( [38, 1, 5, 57, 11], Array.new(5) { x.next })
y = Random.new(100, 1221)
assert_equal( [ 5, 99, 88, 48, 86], Array.new(5) { y.next })
x.reset
assert_equal( [38, 1, 5, 57, 11], Array.new(5) { x.next })
assert_equal( [34, 28, 72, 77, 87], Array.new(5) { y.next })
assert_equal( [ 1, 31, 3, 56, 14], Array.new(5) { x.next })
end
end

Loaded suite foo
Started
…
Finished in 0.251163 seconds.

2 tests, 8 assertions, 0 failures, 0 errors

mansfiem · August 16, 2008, 1:54am

The first part is to create a wrapper around the random number
generator rand, supporting the appropriate parameter (i.e. the
parameter intrinsic to rand). Your wrapper should implement two
additional functions: next which returns the next random number, and
reset which restarts the sequence. So, for example:
…
You may do this as a class, as depicted here, or as a function,
lambda, code block… whatever you like.

Let me add a bit of additional info to this… I indicated that you
should implement methods next and reset on your wrapper. This is
not a hard requirement.

You do need to provide a way to access a particular generator’s
(i.e. wrapper’s) next number, and provide a way to reset the sequence.
For a class wrapper, this is somewhat natural to do as methods next
and reset, hence my suggestion.

However, if you find want to use another technique (function, code
block, closure, etc.) is more suitable for your solution, please do…
and if that means providing alternate means (other than next/reset
methods) to access the next random number and reset the sequence,
that’s fine. Just state in your submission how you provided for next/
reset equivalents.

mansfiem · August 16, 2008, 2:19am

I know that a Mersenne Twister has a large internal state, so it takes
time to “get started” and produce good quality random numbers from a
non-random initial state. If it gets frequently reseeded, either the
seeding procedure refreshes all the state and runs it for a bit (i.e.
takes a lot of time) or the quality will greatly degrade, I guess. Is
this a problem?

mansfiem · August 16, 2008, 5:04pm

My solution seems to work:

irb(main):022:0> x = Random.new(100, 1234)
=> #<Random: …>
irb(main):023:0> Array.new(6) { x.next }
=> [47, 83, 38, 53, 76, 24]
irb(main):024:0> y = Random.new(100, 2345)
=> #<Random: …>
irb(main):025:0> Array.new(6) { y.next }
=> [80, 7, 96, 16, 2, 96]
irb(main):026:0> x.reset
=> 2345
irb(main):027:0> Array.new(2) { x.next }
=> [47, 83]
irb(main):028:0> Array.new(2) { y.next }
=> [0, 98]
irb(main):029:0> Array.new(2) { x.next }
=> [38, 53]

It’s not the most beatiful way, but it works.

mansfiem · August 16, 2008, 3:19am

On Aug 15, 7:16 pm, Martin B. [email protected] wrote:

I know that a Mersenne Twister has a large internal state, so it takes
time to “get started” and produce good quality random numbers from a
non-random initial state. If it gets frequently reseeded, either the
seeding procedure refreshes all the state and runs it for a bit (i.e.
takes a lot of time) or the quality will greatly degrade, I guess. Is
this a problem?

For the purposes of this quiz, no.

mansfiem · August 17, 2008, 5:39pm

On 15 Aug 2008, at 16:33, Matthew M. wrote:

Not So Random (#173)

A crude solution: we remember the seed and how many times we’ve been
called. When called for the n+1 time, we reset the seed, call rand n
times and then return the result of the n+1th call.
As far as the selection of the original seed goes, srand can already
cook one up for us so we lean on that.

class Random
def initialize(*rand_args)
@rand_args = rand_args
ignored, @start_seed = srand, srand
@sequence = 0
end

def next
srand(@start_seed)
@sequence.times { rand *@rand_args}
@sequence += 1
rand *@rand_args
end

def reset
@sequence = 0
end
end

Performance wise this isn’t great - generating n random numbers
require n(n+1)/2 calls to rand. Things can be improved by generating
(say) 100 numbers in one go, since what is expensive is recovering our
previous state.

This sample achieves that

class Random
def initialize(*rand_args)
@rand_args = rand_args
ignored, @start_seed = srand, srand
@sequence = 0
@cache = []
end

def next
populate if @cache.empty?
@cache.shift
end

def populate
srand(@start_seed)
@sequence.times { rand @rand_args}
100.times do
@cache << rand(@rand_args)
end
@sequence += 100
end

def reset
@cache = []
@sequence = 0
end
end

It’s a lot faster (0.25s versus 22.3 s to generate 10000 numbers) but
the complexity isn’t any better.

Fred

mansfiem · August 17, 2008, 6:06pm

On Fri, Aug 15, 2008 at 11:33 AM, Matthew M. [email protected]
wrote:

convenient as the built-in generator is, there is only one seed.
Since a single Ruby process only has one PRNG, slave off an extra
process for each Random class:

#!/usr/bin/env ruby

require ‘rubygems’
require ‘slave’

class Random
def initialize(ceiling, seed = nil)
@ceiling = ceiling
@seed = seed || rand(2112)
@slave = Slave::new{
lambda {|reset|
if reset
srand(@seed)
else
rand(@ceiling)
end
}
}
self.reset
end

def next
@slave.object.call(false)
end

def reset
@slave.object.call(true)
end
end

require ‘test/unit’

class RandomTest < Test::Unit::TestCase
def test_001
x = Random.new(100, 2112)
assert_equal( [38, 1, 5, 57, 11], Array.new(5) { x.next })
assert_equal( [ 1, 31, 3, 56, 14], Array.new(5) { x.next })
x.reset
assert_equal( [38, 1, 5, 57, 11], Array.new(5) { x.next })
end

def test_002
x = Random.new(100, 2112)
assert_equal( [38, 1, 5, 57, 11], Array.new(5) { x.next })
y = Random.new(100, 1221)
assert_equal( [ 5, 99, 88, 48, 86], Array.new(5) { y.next })
x.reset
assert_equal( [38, 1, 5, 57, 11], Array.new(5) { x.next })
assert_equal( [34, 28, 72, 77, 87], Array.new(5) { y.next })
assert_equal( [ 1, 31, 3, 56, 14], Array.new(5) { x.next })
end
end

Loaded suite foo
Started
…
Finished in 0.251163 seconds.

2 tests, 8 assertions, 0 failures, 0 errors

mansfiem · August 17, 2008, 6:00pm

My first solution takes a similar approach as FC first one but with
some caching. The sequence is only regenerated when the context has
changed:

class Random
@@context = nil

def initialize(ceiling, seed = nil)
    @seed = seed || srand(srand)
    @rand = Hash.new do |h,k|
        unless @@context == self
            srand(@seed)
            1.upto(@index - 1) {rand(ceiling)}
            @@context = self
        end
        h[k] = rand(ceiling)
    end
    reset
end

def next
    @rand[@index += 1]
end

def reset
    @index = 0
end

end

I haven’t had the time to give it too much thought so I’m not sure if
the following actually makes sense. The idea is to seed the random
generator for the next random number with an initial seed + object-
specific sequence number. The sequence is deterministic and mostly
random but differs from the default sequence.

class Random
def initialize(ceiling, seed = nil)
@seed = seed || (srand; rand(ceiling))
@rand = Hash.new do |h,k|
srand(@seed + k)
h[k] = rand(ceiling)
end
reset
end

def next
    @rand[@index += 1]
end

def reset
    @index = 0
end

end

mansfiem · August 17, 2008, 8:32pm

Very nice, I like…

mansfiem · August 17, 2008, 9:14pm

From: [email protected]

Since a single Ruby process only has one PRNG, slave off an extra
process for each Random class:

#!/usr/bin/env ruby

require ‘rubygems’
require ‘slave’

Nifty. I hadn’t seen the slave library before. Is it a wrapper
around fork? gem --list doesn’t find any description:

slave (1.2.1, 1.2.0, 1.1.0, 1.0.0, 0.2.0, 0.0.1, 0.0.0)
slave

Anyway… I took a similar approach, but using popen. I meant it
to be a somewhat tongue-in-cheek solution… but it does work.


class Random

  def initialize(ceiling=0, seed=nil)
    @ceiling = ceiling
    @io = nil
    @seed = seed || (rand(0xFFFFFFFF) + 1)
  end

  def next
    restart_pipe unless @io
    n = @io.gets.chomp
    if n.index(".")
      Float(n)
    else
      Integer(n)
    end
  end

  def reset
    kill_pipe
  end

  protected

  def restart_pipe
    kill_pipe
    @io = IO.popen(%{ruby -e "srand(#@seed); loop{puts 
rand(#@ceiling)}"}, "r")
  end

  def kill_pipe
    if @io
      Process.kill("KILL", @io.pid)
      @io = nil
    end
  end
end


if $0 == __FILE__

# tests from brabuhr-at-gmail.com

require 'test/unit'

class RandomTest < Test::Unit::TestCase
  def test_001
    x = Random.new(100, 2112)
    assert_equal( [38,  1,  5, 57, 11], Array.new(5) { x.next })
    assert_equal( [ 1, 31,  3, 56, 14], Array.new(5) { x.next })
    x.reset
    assert_equal( [38,  1,  5, 57, 11], Array.new(5) { x.next })
    x.reset
  end

  def test_002
    x = Random.new(100, 2112)
    assert_equal( [38,  1,  5, 57, 11], Array.new(5) { x.next })
    y = Random.new(100, 1221)
    assert_equal( [ 5, 99, 88, 48, 86], Array.new(5) { y.next })
    x.reset
    assert_equal( [38,  1,  5, 57, 11], Array.new(5) { x.next })
    assert_equal( [34, 28, 72, 77, 87], Array.new(5) { y.next })
    assert_equal( [ 1, 31,  3, 56, 14], Array.new(5) { x.next })
    x.reset
    y.reset
  end
end

end

Regards,

Bill

mansfiem · August 17, 2008, 9:28pm

On Sun, Aug 17, 2008 at 3:10 PM, Bill K. [email protected] wrote:

From: [email protected]

require ‘rubygems’
require ‘slave’

Nifty. I hadn’t seen the slave library before. Is it a wrapper
around fork? gem --list doesn’t find any description:

Fork and Drb:

http://codeforpeople.com/lib/ruby/slave/slave-1.2.1/README

mansfiem · August 17, 2008, 10:54pm

From: “Martin B.” [email protected]

Does Ruby use such a dreadful function for its internal hash tables? If
so, filling a hash with previously ordered data is going to (over)fill
the hash buckets one by one… @___@

I’m not sure why that would follow. The number of buckets is typically
a prime number, and the hash value is mapped to a bucket number by
modulo arithmetic. So . . . .

num_buckets = 7
=> 7
“a”.upto(“z”) {|x| puts “#{x} hash=#{x.hash} → bucket=#{x.hash % num_buckets}” }
a hash=100 → bucket=2
b hash=101 → bucket=3
c hash=102 → bucket=4
d hash=103 → bucket=5
e hash=104 → bucket=6
f hash=105 → bucket=0
g hash=106 → bucket=1
h hash=107 → bucket=2
i hash=108 → bucket=3
j hash=109 → bucket=4
k hash=110 → bucket=5
l hash=111 → bucket=6
m hash=112 → bucket=0
n hash=113 → bucket=1
o hash=114 → bucket=2
p hash=115 → bucket=3
q hash=116 → bucket=4
r hash=117 → bucket=5
s hash=118 → bucket=6
t hash=119 → bucket=0
u hash=120 → bucket=1
v hash=121 → bucket=2
w hash=122 → bucket=3
x hash=123 → bucket=4
y hash=124 → bucket=5
z hash=125 → bucket=6

I’m far from an expert on such matters… but it’s not clear to me why
sequential hash values would be a problem for this application?

Regards,

Bill

mansfiem · August 17, 2008, 9:58pm

Hallo everybody. Please go easy on me, this quiz solution is my third
ruby program This language is a new discovery and it fascinates me,
but I learnt to program in the old-fashioned imperative world in high
school and I’m sure that it shows pretty clearly in the code (I made a
couple of “shout-outs” in the code to all fellow former users of the one
and only Borland Turbo Pascal 7 compiler

I decided to enter the “contest” hoping that somebody will show how
these “translated Pascal” constructs of mine can be rephrased in a “more
Ruby” way, maybe Then, along the way, i discovered a really strange
thing that I’d like to discuss with you.

The first class adheres to the quiz specifications, but I will be frank,
anyway: I cheated. The class simply stores the already generated
numbers… after all, I tought, the tradeoff beween a few kilobytes of
memory and the millions of cycles and butchering of mathematics caused
by continuously reseeding the main PRNG should be acceptable. If
somebody wants to pull hundreds of thousands of numbers in different
sequences from such a thing, he is doing the wrong thing anyway…

class Rseq

@@Instances = 0
@@Previous = 0
@@Randomize = rand 2**29
@@Randomize.freeze

def getID
  @@Instances = @@Instances.succ
  return ( @@Instances * 43589 + @@Randomize )
end

def initialize(arg_to_rand=0)
  @ID = getID
  @ID.freeze
  @arg = arg_to_rand
  @@Previous = @ID
  srand(@ID)
  @cache = [ rand( @arg ) ]
  @pos = 0
end

def next
  @pos = @pos.succ
  if @pos <= @cache.length
    return @cache[(@pos-1)]
  else
    if @@Previous != @ID
      @@Previous = @ID
      srand ( @ID + @pos )
      end
    @cache.push rand(@arg)
    return @cache.last
  end
end

def reset
  @pos = 0
end

end

A better way to answer the basic quiz specifications for arbitrarily
large outputs could be to hash a counter, I realized, so I wrote another
class that works that way. Just for fun. A literal counter should work
fine with a good hash function (maybe with some feedback), but I decided
to use a string as “counter” because I believed that the Ruby hash
function was optimized for those.

class HRand

MAXINT = (2**(1.size*8 - 2) - 1).to_f

def initialize(arg_to_rand=0)
  @arg = arg_to_rand.to_i
  if @arg.is_a? Bignum
    @FIXNUM = false
    bytes = @arg.size
    @D = (0x7F<< ((bytes-1)*8))
    (bytes-2).times {|k| @D = (@D && (0xFF << (k*8))) }
    else
    @FIXNUM = true
  end
  @FIXNUM.freeze

  @s = ''
  10.times { @s = @s + rand(255).chr }
  @s.freeze
  @index = 0
end

def next
@index = @index.succ
if @FIXNUM
  h = (@s + @index.to_s).hash.abs
  return (h / MAXINT * @arg).to_i
  else
  num = 0
  words = @arg.size/1.size
  words.times {|k| n = n + ((@s + k.chr + @index.to_s).hash.abs <<

k1.size8) }
num[bytes*8-1] = 0
fraction = Rational(n,@D)
return (@arg * fraction).to_i
end
end

def reset
@index = 0
end

end

And it doesn’t work.
Not surprising, in itself.
But it’s totally unexpected and surprising that the fault, as far as I
can tell, doesn’t lie in my code but in core Ruby o___O
The HRand objects output long runs of the same numbers because the
string.hash method returns SEQUENTIAL numbers for alphabetically
sequential strings!

I was taught that a hash function should strive to distribute the inputs
in its keyspace with maximal randomness, and that it should mask any
normally possible input pattern (those that are not a knowledgeable
attack at the function, I mean)… That’s not a hash function, that’s
a checksum!

Does Ruby use such a dreadful function for its internal hash tables? If
so, filling a hash with previously ordered data is going to (over)fill
the hash buckets one by one… @___@

mansfiem · August 17, 2008, 11:11pm

Bill K. wrote:

I’m not sure why that would follow. The number of buckets is typically
a prime number, and the hash value is mapped to a bucket number by
modulo arithmetic. So . . . .

Oh, right. You usually reduce the output to the size of the table with
the modulus, obviously… anyway, I used to think that the output of an
hash function should look random to anybody who isn’t actively trying to
break it (the output of a cryptography-grade hash should look random
EVEN IF YOU TRY, obviously :D)

mansfiem · August 18, 2008, 12:17am

[email protected] wrote:

reproducible sequences of pseudorandom numbers isn’t so easy. As
class Random
}
end
assert_equal( [38, 1, 5, 57, 11], Array.new(5) { x.next })
assert_equal( [ 1, 31, 3, 56, 14], Array.new(5) { x.next })

This is probably the cleanest solution you can get without implementing
your own PRNG or mucking around in C code to copy the internal state of
the PRNG. As long as it uses fork, it shouldn’t eat up RAM or anything.
The popen solution probably would if a large number of Random objects
were created.

mansfiem · August 18, 2008, 12:12am

Matthew M. wrote:

Not So Random (#173)

additional functions: next which returns the next random number, and

reproducible sequences of pseudorandom numbers isn’t so easy. As
irb> y = Random.new(100)

Without knowing how the PRNG is implemented (even if in C or Ruby), this
solution solves the problem. It’ll be slow if you’re switching between
generators deep in the sequence, but it works. It also accounts for
Kernel#rand and Kernel#srand screwing up your Random objects. It passes
the tests posted by [email protected].

module Kernel
alias_method :old_rand, :rand
alias_method :old_srand, :srand

def rand(*a)
Random.dirty
old_rand(*a)
end

def srand(*a)
Random.dirty
old_srand(*a)
end
end

class Random
@@seed = nil

def self.dirty
@@seed = nil
end

def initialize(range, seed = Time.now.to_i)
@range = range
@seed = seed
reset
end

def reset
@sequence = 0
reinitialize
end

def reinitialize
old_srand(@seed)
@@seed = @seed
@sequence.times { old_rand(@range) }
end

def next
reinitialize if @@seed.nil? or @@seed != @seed
@sequence += 1
old_rand(@range)
end
end

mansfiem · August 18, 2008, 2:22am

Here’s a “cheater” solution. The sequences are reproducible
via the #reset / #next API, for as long as a given instance of the
Random object remains in existence. However, independent instances
of the Random object, even if started with the same seed, won’t
produce identical sequences.

Probably too stupid to post, but oh well…



class Random
  def initialize(ceiling=0, seed=nil)
    @ceiling = ceiling
    @seq = []
    @idx = 0
    srand(seed || (rand(0xFFFFFFFF) + 1))
  end

  def next
    unless @idx < @seq.length
      @seq.push( rand(@ceiling) )
      @idx = @seq.length - 1
    end

    n = @seq[@idx]
    @idx += 1
    n
  end

  def reset
    @idx = 0
  end
end


if $0 == __FILE__

require 'test/unit'

class RandomTest < Test::Unit::TestCase
  def test_001
    x = Random.new(100)
    y = Random.new(1000)

    xseq = Array.new(5) { x.next }
    yseq = Array.new(5) { y.next }

    xseq += Array.new(5) { x.next }
    yseq += Array.new(5) { y.next }

    x.reset
    assert_equal( xseq, Array.new(10) { x.next } )

    y.reset
    assert_equal( yseq, Array.new(10) { y.next } )

    xseq += Array.new(5) { x.next }
    yseq += Array.new(5) { y.next }

    x.reset
    assert_equal( xseq, Array.new(15) { x.next } )

    y.reset
    assert_equal( yseq, Array.new(15) { y.next } )
  end
end

end

Regards,

Bill

mansfiem · August 18, 2008, 12:15pm

My solution:

class Random
def initialize(n, seed = rand(232))
@n = n
@seed = @iseed = seed
end
def next
Thread.critical = true
srand(@seed)
@seed = @seed ^ rand(232)
val = rand(@n)
Thread.critical = false
return val
end
def reset
@seed = @iseed
end
end