Apologies for the late summary… I’m moving to a new place today,
tomorrow, this weekend… and start classes on Monday. Also, look for
the next quiz to show up this Saturday, not tomorrow.
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
Procedural generation is a valuable technique for computer
graphics and games that relies on algorithms to produce content on the
fly. Use of procedural techniques have increased in recent years, due
to the large volume of content that must be created… too large or
unwieldy to be completed by hand in a short duration. But procedural
generation is not new. Early computer games such as Elite and
Rescue on Fractalus used algorithms, rather than preformed data, to
represent large worlds on machines with limited storage.
On one hand, CPU cycles are traded for storage limitations. On the
other, CPU cycles are exchanged for human and schedule limitations.
Essentially, procedural generation is a way to swap CPU time for other
limited resources. But if the same algorithms are run repeatedly,
won’t everything look the same? An algorithm, given the same input, by
definition produces the same output.
That’s where pseudorandom numbers can help, both to provide variety
and maintain repeatability. A random sequence starting with a seed
value of 42 will – when put through a particular algorithm – always
generate the same output. Yet, start the sequence with seed value 43,
and the same algorithm will generate new output. We can generate a
forest of randomly generated trees, each looking similar but unique,
and not have the trees randomly change every frame.
Do we need more than one generator? It depends on your algorithms and
how you generate the content. It could work, but it may be easier to
keep some things separate. One random number generator for the forest.
Another generator for the city buildings. Another one for people. That
way, I can avoid generating information for the forest if it happens
to go offscreen, because the PRNGs for the buildings and the people
are independent of the forest PRNG.
Generally, for task such as this, you don’t usually need a generator
such as the Mersenne Twister. A linear congruential generator,
while not really very random, is quite simple, fast and often
sufficient. For such a task, Ruby code of a few lines would have been
completely sufficient.
But I wanted to see how we could convince Ruby’s rand
to do what we
wanted, without writing a new PRNG. I’m glad I asked, because after
playing with closures and continuations, I conceived only one solution
myself… I knew there had to be another, better way.
But first, let’s look at the first solution from Frederick C.,
which implements the only idea I had.
class Random
def initialize(*rand_args)
@rand_args = rand_args
ignored, @start_seed = srand, srand
@sequence = 0
end
def next
srand(@start_seed)
@sequence.times { rand *@rand_args}
@sequence += 1
rand *@rand_args
end
def reset
@sequence = 0
end
end
At first, I thought Frederick was the only one who implemented a
solution that would allow rand
to produce random floats between zero
and one (i.e. what you get when you call rand
with no parameters). I
consider this important behavior. However, it turns out that passing
zero to rand
induces the same behavior, so props to everyone who
provided a default value of zero.
This implementation is straightforward. Each time we call next
to
get the next random number in the sequence, we restart the sequence,
calling rand
a number of times equal to @sequence
, which is
incremented each call. This is very simple and guarantees we always
get back to the correct point, whether or not other Random
objects
have generated numbers.
(Note that while I did mention concurrent in the quiz description, I
did not mean to imply multithreaded, so for our purposes here, this is
sufficient.)
I found this line in the initializer a little curious:
ignored, @start_seed = srand, srand
srand
returns the previous seed. Calling it twice in a row will
first set the seed, then return that and set it again to a new value.
But the saved seed is what is used in calls to next
. It’s a nice
trick to get the (time-based) seed generated by srand
.
While simple, this solution is highly inefficient. Procedural
generation can require a lot of random numbers, and this PRNG
algorithm is quadratic. (That is, O(n^2)
.) Very quickly, this
generator would become painfully slow.
One possible improvement, provided in a few solutions, is caching.
Since we’re generating the same exact data, we can save our work and
reuse it later. In our contrived situation, This isn’t as bad as it
sounds. Although memory use must be considered, it’s not unlikely that
a particular Random
object – a particular sequence – will always
generate the same quantity of numbers every time. If we’re not
uprooting trees in that forest, it’s likely we need to continue
generating the same number of trees. Of course, in typical situations,
you wouldn’t force yourself into this quiz’s confines, but if a small
cache and a one-time generation is sufficient, you could trade a
little memory for CPU time, if need be.
To finish up, let’s look at brabuhr’s clean, compact solution.
require 'slave'
class Random
def initialize(ceiling, seed = nil)
@ceiling = ceiling
@seed = seed || rand(2112)
@slave = Slave::new{
lambda {|reset|
if reset
srand(@seed)
else
rand(@ceiling)
end
}
}
self.reset
end
def next
@slave.object.call(false)
end
def reset
@slave.object.call(true)
end
end
The Slave library combines a forked Ruby process with DRb
(Distributed Ruby) and creates a client-server relationship between
the two processes. This relationship is created in the call to
Slave::new
, which passes the following block to the slave:
lambda { |reset|
if reset
srand(@seed)
else
rand(@ceiling)
end
}
Calling this block with a boolean (as done in the next
and reset
methods) simply invokes either srand
or rand
. But because the
slave object is a forked Ruby process, it contains a separate built-in
seed for those methods. Each Random
instance will create such a
slave, thereby creating independent seeds that cannot interfere with
one another. In my own experience, I am so used to threading (i.e.
shared state and address space), the idea of using a forked process
(i.e. independent state and address space) didn’t cross my mind.
Using Slave may have some overhead from the interprocess
communication. However, it is a constant-time operation: it should
easily beat the first Random
implementation for performance for
typical use.