Forum: Ruby Dice Roller (#61)

Announcement (2017-05-07): www.ruby-forum.com is now read-only since I unfortunately do not have the time to support and maintain the forum any more. Please see rubyonrails.org/community and ruby-lang.org/en/community for other Rails- und Ruby-related community platforms.
Matthew M. (Guest)
on 2006-01-12 09:07
(Received via mailing list)
My reason for choosing a dice roller is somewhat selfish: I was
interested
to see how people would solve a problem that required parsing a
mini-language. I've written lexers and parsers before, years ago, but I
wanted to see what methods the Ruby gurus would employ.

I was not unsurprised. While there were some "traditional" solutions,
there
were also a few that made me remember something I realized in past Ruby
Quizzes: it's all about pattern matching.  One of the solutions to past
quiz
#24 (Texas Hold'Em) showed how much power can be gained by a careful
examination of the patterns in the problem; with a few carefully built
regular expressions, some gsub calls and a bit of magic, you can turn
what
looks like a complex problem into something much similar.  I should have
remembered that (or, at the least, realized that someone else would).

Discussion on the list about the quiz was rather active, most of the
time
getting into the nitty-gritty details of what d10 and 5d6d7 meant, and
occassionally joking about munchkins and their stats of all 18 (with a
strength of 18/00, of course). As solutions came in, it was nice to see
about four or five people making their first attempt. With them, this
quiz
gets the bronze medal for submission count, behind the LCD Numbers quiz
(#14) and the Numeric Maze quiz (#60).

I found unique bits in most every solution; even those that took almost
identical approaches would often, at the least, have different regular
expressions. If you are afraid of the mighty regexp, now would be a good
time to study some samples, since the syntax for the dice expressions is
reasonably simple, and many people documented their regular expressions.

Most solutions fell into one of a few types. I'll cover each a bit and
point
out anything in particular that attracted my attention.

The first class of solutions massaged the input expression a bit, then
used
Ruby's eval method to do the work. This simple solution eluded me, since
I
was so fixed on seeing parsing code. Apparently a bunch of you realized
that
(as Paul N. so nicely put) "we don't need no steenking parsers." A
few
substitutions and a helper function or two was enough to do the trick,
since
aside from the 'd' operator, the expression were valid Ruby already.
Let's
take a look at Christian N.'s solution:

    class Integer
      def d(n)
        (1..self).inject(0) { |a, e| a + rand(n) + 1 }
      end
    end

First off we have the random number generation; most solutions had this
or a
very similar variant. So the call 3.d(6) will roll and accumulate three
six-sided dice, and the expression 3.d(6) is almost a valid dice
expression.
It is in Christian's initialization method that dice expressions are
turned
into Ruby expressions:

    def initialize(dice)
       @src = dice.gsub(/d(%|00)(\D|$)/, 'd100\2').
                   gsub(/d(\d+)/, 'd(\1)').
                   gsub(/(\d+|\))d/, '\1.d').
                   gsub(/\d+/) { $&.gsub(/^0+/, '') }

       @dice = eval "lambda { #@src }"
    end

(I've left out a bit of error checking; see the full solution to see
everything.)  The first substitution fixes up percentage and double-zero
values as 100. The second turns 'd' as a binary operator into 'd' as a
method call. The third substitution provides the default dice count of 1
if it wasn't specified. And the last substitution removes leading zeroes
of
integers; this last substitution prevents the Ruby from interpreting
values
as octal.

The morphed express is saved away in a lambda, which allows Christian to
reevaluate the expression repeatedly without performing those
substitutions
every time roll is called.

There were several variations of the eval method solution, mostly in the
regular expression substitutions. A couple variations rewrote existing
operators (rather than adding new methods to Integer or Fixnum). Rather
than
change the 'd' operator into a method call, he made a slightly different
twist and rolled the dice in method_missing. Perhaps Bill didn't know
that
classes could be extended, or maybe he was hacking around for fun.
Dennis
Ranke had a eval-looking solution with no eval to be found, because the
gsub
calls actually substituted the results during parsing.  And Stefan W.
wins
for the shortest solution: three lines of hard to read Ruby code.

The second class of solutions involved using a parsing library or parser
generator tool. Three of these showed up: Pierre Barbier de Reuille used
racc, Austin Z. used syntax.rb, and David T. pulled out ANTLR to
generate his parser. Each of these will let you describe your language
in
BNF format, or something quite similar to it, and it will generate a
Ruby
language parser for you. These can be quite powerful solutions, although
readability of the language specification and the support code varies
widely. But I do recommend looking into these tools if you have need to
do
something more powerful that dice expressions; generating a complex
parser
with one of these would save much time and effort over a handcoded
solution.

And now we're at the third class of solutions: the handcrafted parsers.
About nine people handcrafted parsers, mostly recursive descent which is
rather easy to code. Because these generally involve more code, I won't
show
much, but I suggest taking a look at some of them.

Dominik B.'s solution, while it prefers to use the eval method as it
goes along rather than build a parse tree, is a easy to read solution
and
makes good use of StringScanner. Morus W.'s solution is decent to
follow
and understand and builds up an expression tree with his Expr class,
which
gets evaluated at the end with a single call to roll. My own solution is
similar, but creates a tree of proc objects.  Andrew McGuinness and
Christer
Nilsson also have recursive descent parsers that can not only roll dice,
but
calculation frequency distributions and histograms, and cheat.

I want to look now at a couple of standouts.  First, let's look at
Dennis
Ranke's second submission. It's a recursive descent parser, contained in
his
class RDParser which is included with the submission but a little too
long
to summarize here. However, what RDParser has done is allow Dennis to
write
this solution:

    parser = RDParser.new do
      token(/\s+/)
      token(/\d+/) {|m| m.to_i }
      token(/./) {|m| m }

      start :expr do
        match(:expr, '+', :term) {|a, _, b| a + b }
        match(:expr, '-', :term) {|a, _, b| a - b }
        match(:term)
      end

      rule :term do
        match(:term, '*', :dice) {|a, _, b| a * b }
        match(:term, '/', :dice) {|a, _, b| a / b }
        match(:dice)
      end

      def roll(times, sides)
        (1..times).inject(0) {|a, b| a + rand(sides) + 1 }
      end

      rule :dice do
        match(:atom, 'd', :sides) {|a, _, b| roll(a, b) }
        match('d', :sides) {|_, b| roll(1, b) }
        match(:atom)
      end

      rule :sides do
        match('%') { 100 }
        match(:atom)
      end

      rule :atom do
        match(Integer)
        match('(', :expr, ')') {|_, a, _| a }
      end
    end

Do I need to say how beautiful that is? It's short, clean, easy to read
and
understand...  and all Ruby. Package that RDParser into a module and
ship
it! I think I've found my next parsing tool...

Let's look at the guts of one more solution, by Pablo Hoch. Pablo's
solution
didn't exactly fit into the other classes of solution: it's somewhere
between a parser and an eval-uator. He decided to take the infix dice
expression and turn it into a RPN (Reverse Polish Notation, that is,
postfix) expression:

    def to_rpn(infix)
       stack, rpn, last = [], [], nil
       infix.scan(/\d+|[-+*\/()d%]/) do |token|
         case token
           when /\d+/
             rpn << token
           when '%'
             rpn << "100"
           when /[-+*\/d]/
             while stack.any? && stronger(stack.last, token)
               rpn << stack.pop
             end
             rpn << "1" unless last =~ /\d+|\)|%/
             stack << token
           when '('
             stack << token
           when ')'
             while (op = stack.pop) && (op != '(')
               rpn << op
             end
         end
         last = token
       end
       while op = stack.pop
         rpn << op
       end
       rpn
    end

I love to see code that I can understand immediately, and also recognize
how
it could be useful for other applications. This is definitely one of
those
because postfix expressions (or prefix, to be complete) are a breeze for
a
computer to evaluate:

    def roll
       stack = []
       @expr.each do |token|
         case token
           when /\d+/
             stack << token.to_i
           when /[-+*\/d]/
             b = stack.pop
             a = stack.pop
             stack << a.send(token.to_sym, b)
         end
       end
       stack.pop
    end


Thanks to everyone for the great submissions and lively discussion on
the
mailing list (and the correction to the quiz that I missed!).  I just
hope
when we all sit down to play that I don't see any of you with all
characters
stats of 18.
James G. (Guest)
on 2006-01-12 17:02
(Received via mailing list)
On Jan 12, 2006, at 1:04 AM, Matthew M. wrote:

> Thanks to everyone for the great submissions and lively discussion
> on the
> mailing list (and the correction to the quiz that I missed!).

I want to thank Matthew M. for a wonderful problem and a
sensational summary!  I also happen to know that Matthew already has
another quiz in the queue for a week from tomorrow.  How cool is that?

Matthew, you rock.

James Edward G. II
Christian N. (Guest)
on 2006-01-12 17:23
(Received via mailing list)
Matthew M. <removed_email_address@domain.invalid> writes:

>                    gsub(/\d+/) { $&.gsub(/^0+/, '') }
> as octal.
The third substitution doesn't provide a default count (Dice#d does
that),
instead, it converts 2d(6) to 2.d(6).

> The morphed express is saved away in a lambda, which allows Christian to
> reevaluate the expression repeatedly without performing those substitutions
> every time roll is called.

I could have evaluated @src each time without substituting anew, but a
lambda only makes it need to *parse* once.

Thanks for the nice writeup,
Gregory S. (Guest)
on 2006-01-12 17:57
(Received via mailing list)
On Thu, Jan 12, 2006 at 04:04:58PM +0900, Matthew M. wrote:
[...]
} Most solutions fell into one of a few types. I'll cover each a bit and
point
} out anything in particular that attracted my attention.
}
} The first class of solutions massaged the input expression a bit, then
used
} Ruby's eval method to do the work.
[...]
} The second class of solutions involved using a parsing library or
parser
} generator tool.
[...]
} And now we're at the third class of solutions: the handcrafted
parsers.
} About nine people handcrafted parsers, mostly recursive descent which
is
} rather easy to code.
[...]

I want to go through my handcrafted parser solution, which may have
gotten
missed in the shuffle. This is especially likely since I also submitted
one
using the eval method. In fact, my handcrafted parser used eval, but
only
for the +, -, /, and * operators.

The important bit is that all of the nodes in my syntax tree respond to
to_i with the computed result of its subexpression. This was largely to
make numbers behave the same way that my syntactic elements did, and
vice
versa. My parsing method looks like this (with a fail check to make sure
nothing went wrong after each line, mainly for debugging purposes) to
create the syntax tree:

1) token_array = tokenize(expression)
	uses String#scan and checks for garbage
2) token_array = validate_and_cook(token_array)
	replaces some tokens with symbols, deals with dX => 1dX, d00 =>
	d100, d% => d100, makes sure open parens have close parens and
	the operators and operands are in grammatically appropriate places
3) parse_tokens(token_array)
   a) token_array = parse_parens(token_array)
	finds parenthesized expressions and runs the subarray through
	parse_tokens; nested parens are handled recursively. Note that
	there is no need for error checking, since the expression has
	already been validated.
   b) token_array = parse_ops(token_array, /^d$/) if token_array.length
> 1
	finds all d operators and creates a DieOperator object with its
	left and right operands. Since all the operators in this grammar
	are left-associative, parse_ops is left-associative. The regex is
	used to determine if the token is the operator(s) being handled.
   c) token_array = parse_ops(token_array, /^[*\/]$/) if
token_array.length > 1
   	same as b, but for * and / operators. In this case, ArithOperator
	objects are created. The parse_ops method uses an OpsClass hash
	that matches the operator (d, *, /, +, or -) to the appropriate
	class to call new on.
   d) token_array = parse_ops(token_array, /^[-+]$/) if
token_array.length > 1
	same as c, but for + and - operators. Still using ArithOperator
	objects.

The ArithOperator class is what uses eval. It is given the operator as a
string, and its to_i method calls eval
"#{@left.to_i}#{@op}#{@right.to_i}"
to produce its value.

The points I'm trying to make are:

1) the grammar was so simple that there was no need for a proper
   recursive-descent parser
2) duck typing is way cool and let me treat numbers, DieOperators, and
   ArithOperators identically
3) a little eval let me use one class instead of four for ArithOperator
4) doing all that validation ahead of time let me replace my syntax tree
   with using eval for the whole expression without giving up my nice,
   clear failure messages

} Thanks to everyone for the great submissions and lively discussion on
the
} mailing list (and the correction to the quiz that I missed!).  I just
hope
} when we all sit down to play that I don't see any of you with all
characters
} stats of 18.

This one was loads of fun, if not as challenging as the Numeric Maze
(which
was my first). It's a pity I'll be out of town this weekend and won't be
able to do the next one.

--Greg
Matthew M. (Guest)
on 2006-01-13 00:04
(Received via mailing list)
And I can see I already that I need to read my regexps a little more
carefully.  The third regexp in Christian N.'s solution doesn't
provide the default 1 to the 'd' operator; it actually turns the 'd'
operator into a method call on Integer. And the second regexp sets up
parenthesis around the number following the 'd' as arguments for that
call.
Silly me....
Bill K. (Guest)
on 2006-01-13 00:17
(Received via mailing list)
From: "Matthew M." <removed_email_address@domain.invalid>
>
> Rather than change the 'd' operator into a method call, he made a slightly
> different twist and rolled the dice in method_missing. Perhaps Bill didn't
> know that classes could be extended, or maybe he was hacking around for fun.

Yes I could probably dominate the "cheesiest approach" category.  :)
I did extend a class in my "solution"... String#die_to_meth  :-P


Thanks for a fun quiz and an excellent summary.


Regards,

Bill
Matthew M. (Guest)
on 2006-01-13 00:29
(Received via mailing list)
Ya, you did extend a class.

Man, I must really be tired. Or forgot how to read. Or had a few brain
cells
short out. Or something. Or be really tired.
J. Ryan S. (Guest)
on 2006-01-13 21:58
(Received via mailing list)
On Jan 12, 2006, at 10:22 AM, Christian N. wrote:

> I could have evaluated @src each time without substituting anew, but a
> lambda only makes it need to *parse* once.

Hey Christian,

I didn't have time to write my own solution to quiz #61, so I'm going
over yours, which is similar to what I was kicking around in my
head.  Unfortunately, I confused by your eval statement, which is
wrapped over a new Proc object.  Instead, why not wrap the lambda
around the eval like so?

@dice = lambda { eval @src }

The both _seem_ to produce the same output, yet conceptually, my
approach makes more sense to me.  Perhaps I'm missing something
crucial in my statement like...

Does eval parse @src for each call() message?  Or does the parsing
happen on initialization of the Proc (like I would expect)?  I
suppose I could trace the execution using the debugger, but I'm
unaware how to step inside a Kernel method like eval or lambda with it.

~ ryan ~
James G. (Guest)
on 2006-01-13 22:05
(Received via mailing list)
On Jan 13, 2006, at 1:57 PM, J. Ryan S. wrote:

>>> every time roll is called.
> is wrapped over a new Proc object.  Instead, why not wrap the
> lambda around the eval like so?
>
> @dice = lambda { eval @src }
>
> The both _seem_ to produce the same output, yet conceptually, my
> approach makes more sense to me.  Perhaps I'm missing something
> crucial in my statement like...
>
> Does eval parse @src for each call() message?

Yes.  You pass lambda() a block of code that is executed each time
the code it called.

James Edward G. II
Logan C. (Guest)
on 2006-01-13 22:11
(Received via mailing list)
On Jan 13, 2006, at 2:57 PM, J. Ryan S. wrote:

>>> every time roll is called.
> is wrapped over a new Proc object.  Instead, why not wrap the
> suppose I could trace the execution using the debugger, but I'm
> unaware how to step inside a Kernel method like eval or lambda with
> it.
>
> ~ ryan ~
>

@dice.call has eval parse @src everytime (@src may have changed). By
doing something like eval "lambda { #{@src} }" you parse @src once
and get back a lambda.
J. Ryan S. (Guest)
on 2006-01-13 22:17
(Received via mailing list)
On Jan 13, 2006, at 3:03 PM, James Edward G. II wrote:

>> Does eval parse @src for each call() message?
>
> Yes.  You pass lambda() a block of code that is executed each time
> the code it called.

Executed yes.  But when is it parsed / checked for valid grammar,
syntax, etc.?  Did you mean to say that the block of code is parsed
AND executed for each call() it receives?

~ ryan ~
James G. (Guest)
on 2006-01-13 22:23
(Received via mailing list)
On Jan 13, 2006, at 2:15 PM, J. Ryan S. wrote:

> On Jan 13, 2006, at 3:03 PM, James Edward G. II wrote:
>
>>> Does eval parse @src for each call() message?
>>
>> Yes.  You pass lambda() a block of code that is executed each time
>> the code it called.
>
> Executed yes.  But when is it parsed / checked for valid grammar,
> syntax, etc.?  Did you mean to say that the block of code is parsed
> AND executed for each call() it receives?

lambda() doesn't imply that, but eval() does.  It must be reparsed
with each call.

James Edward G. II
J. Ryan S. (Guest)
on 2006-01-13 22:32
(Received via mailing list)
On Jan 13, 2006, at 3:15 PM, J. Ryan S. wrote:

>
> ~ ryan ~

obj = lambda { eval "#{2**10}" }

(1..1_000_000).each do
   obj.call
end

$ time ruby test.rb

real    0m26.046s
user    0m24.377s
sys     0m0.358s

-------------------------------------------

obj = eval "lambda { #{2**10} }"

(1..1_000_000).each do
   obj.call
end

$ time ruby test.rb

real    0m3.132s
user    0m2.895s
sys     0m0.047s

What a difference!  I'll agree that parsing of eval's string happens
on every call() message in the first example.  :)

~ ryan ~
This topic is locked and can not be replied to.