Forum: Ruby Infix to Postfix Method

Announcement (2017-05-07): www.ruby-forum.com is now read-only since I unfortunately do not have the time to support and maintain the forum any more. Please see rubyonrails.org/community and ruby-lang.org/en/community for other Rails- und Ruby-related community platforms.
Peter M. (Guest)
on 2007-04-29 03:34
I'm still getting to grips with Ruby, but I think I'm doing well. My
last post generated a lot of positive comments/advice which really
helped me, so I thought I'd post some more of my code to try and repeat
that!

The following is a method which will convert an infix expression (1+1)
to a postfix expression (1 1 +). It expects a string and returns an
array. Once again I'd really appreciate your comments.

def postfix(expression)

  precedence = {nil=>0,40=>0,42 =>2, 43=>1, 45=>1,47=>2}

  output = Array.new
  stack = Array.new
  temp = ''

  expression.each_byte do |asc|

    case asc
      when 41
        output.push(temp)
        temp = ''
        loop{
          if stack.empty? == false
            popped = stack.pop
            if popped == 40
              break
            else
              output.push(popped.chr)
            end
          else
            raise 'Missing "("!'
          end
        }
      when 40,42, 43, 45, 47 #*+-/
        output.push(temp)
        temp = ''
        unless asc == 40
          p_asc = precedence[asc]
          loop{
            if  (p_asc > precedence[stack.last])
              break
            else
              output.push(stack.pop.chr)
            end
          }
        end
        stack.push(asc)
      else
        temp += asc.chr
    end
  end
  stack.each{|asc| output.push(asc.chr)}
  !output.include?('(') or raise 'Missing ")"!'
  output.delete('')
  output
end
Ezra Z. (Guest)
on 2007-04-29 04:42
(Received via mailing list)
Hi~

On Apr 28, 2007, at 4:34 PM, Peter M. wrote:

> I'm still getting to grips with Ruby, but I think I'm doing well. My
> last post generated a lot of positive comments/advice which really
> helped me, so I thought I'd post some more of my code to try and
> repeat
> that!
>
> The following is a method which will convert an infix expression (1+1)
> to a postfix expression (1 1 +). It expects a string and returns an
> array. Once again I'd really appreciate your comments.
>

  I have a very similar approach in a little parser I wrote for role
based auth system. It uses a stack and converts to postfix before
evaluating the expressions. It can parse strings like  "(admin |
moderator & !blacklist) | root" and compare to roles. I thought you
might like to see it as it's very similar to the way your infix to
postfix code works.


class RoleExp

   def initialize(expression = nil)
     @expression = expression
     # using @@class vars here so they don't show up in #inspect
     @@symbols ||= {"(" => :lparen, ")" => :rparen, "&" => :and, "|"
=> :or, "!" => :not}
     @@precedence ||= {:lparen => 0, :rparen => 0, :and => 1, :or =>
1, :not => 2}
     compile unless @expression.empty?
   end

   def compile(expression = nil)
     @expression = expression || @expression
     @postfix = []
     stack = []
     @expression.scan(/(\w+|\W)/).flatten.each do |term|
       term = @@symbols[term] || term.strip
       next if term.to_s.empty?
       case term
         when :lparen
           stack.push term
         when :and, :or, :not
           until stack.empty?
             break if @@precedence[term] > @@precedence[stack.last]
             @postfix << stack.pop
           end
           stack.push term
         when :rparen
           while stack.last != :lparen
             @postfix << stack.pop
           end
           stack.pop
         else
         @postfix << term
       end
     end
     @postfix.concat(stack.reverse)
   end

   def compute
     return false if @postfix.empty?
     stack = []
     @postfix.each do |term|
       case term
         when :and
           rhs = stack.pop
           stack.push(stack.pop && rhs)
         when :or
           rhs = stack.pop
           stack.push(stack.pop || rhs)
         when :not
           stack.push(!stack.pop)
         else
           stack.push(yield(term))
       end
     end
     stack.first
   end

end

class User
   attr_accessor :roles
   def initialize(*roles)
     @roles = roles
   end
end


bool = RoleExp.new "(admin | moderator & !blacklist)"
p bool
puts '='*40

user = User.new 'admin', 'moderator'
p user
p bool.compute {|term| user.roles.include? term}
puts '='*40

user.roles << 'blacklist'
p user
p bool.compute {|term| user.roles.include? term}


Cheers-
-- Ezra Z.
-- Lead Rails Evangelist
-- removed_email_address@domain.invalid
-- Engine Y., Serious Rails Hosting
-- (866) 518-YARD (9273)
Paul B. (Guest)
on 2007-04-29 05:47
(Received via mailing list)
On Sun, Apr 29, 2007 at 08:34:33AM +0900, Peter M. wrote:
>   precedence = {nil=>0,40=>0,42 =>2, 43=>1, 45=>1,47=>2}

In general it's better not to explicitly specify ASCII values, so I'd
write instead:

    precedence = {nil=>0, ?(=>0, ?*=>2, ?+=>1, ?-=>1, ?/=>2}

Note though that on newer versions of ruby, ?x will give you a string
containing "x" rather than the ascii value of x.  So to be more
compatible, instead of writing:

>   expression.each_byte do |asc|

you'll want to write:

    expression.each_byte do |c|
      c = c.chr if ?x == "x"

and use c wherever you were using asc before.

(I feel like there should be a better way to do this, but I haven't
found one.  Pity there's no String#each_char).

Paul
Peter M. (Guest)
on 2007-04-29 14:01
Paul B. wrote:
> On Sun, Apr 29, 2007 at 08:34:33AM +0900, Peter M. wrote:
>>   precedence = {nil=>0,40=>0,42 =>2, 43=>1, 45=>1,47=>2}
>
> In general it's better not to explicitly specify ASCII values, so I'd
> write instead:

Could you explain why please, I can't see how it makes any difference,
really (but I am a noob!).
Dan Z. (Guest)
on 2007-04-29 14:36
(Received via mailing list)
Peter M. wrote:
> Paul B. wrote:
>> On Sun, Apr 29, 2007 at 08:34:33AM +0900, Peter M. wrote:
>>>   precedence = {nil=>0,40=>0,42 =>2, 43=>1, 45=>1,47=>2}
>> In general it's better not to explicitly specify ASCII values, so I'd
>> write instead:
>
> Could you explain why please, I can't see how it makes any difference,
> really (but I am a noob!).
>

In a high level language, it's good to use the high level functions.
That way if low level representations of things ever change in a newer
version, your existing scripts will be more likely to work in the newer
version.

Consider this example, even though it's probably not a very good one:
What if the upcoming unicode support changed something with the internal
representation of strings so that "*" was no longer represented as "42"?
Then the entry in the hash {42 => 2} would no longer be applicable, but
{?* => 2} will continue to work, because if ruby's internal
representation of "*" changes in some version, so will the value of ?*.

It also makes me happy that someone else is working on a converter from
infix to postfix. I just (almost) finished implementing Dijkstra's
Shunting Yard algorithm
(http://en.wikipedia.org/wiki/Shunting_yard_algorithm) in Ruby, as part
of a bigger project. If you want to look at it, it's at
http://paste-bin.com/11526, but it's 227 lines of poorly commented meaty
algorithm. It currently works on mathematical equations with variables.
It doesn't do any evaluation, though--just parses to an array.

Anyway, have fun.
Dan
Peter M. (Guest)
on 2007-04-29 14:47
Thanks for that explination, it makes quite a lot of sense!
Thomas H. (Guest)
on 2007-04-29 22:35
(Received via mailing list)
* Dan Z. (removed_email_address@domain.invalid) wrote:

>  It also makes me happy that someone else is working on a converter
>  from infix to postfix. I just (almost) finished implementing
>  Dijkstra's Shunting Yard algorithm
>  (http://en.wikipedia.org/wiki/Shunting_yard_algorithm) in Ruby, as
>  part of a bigger project.  If you want to look at it, it's at
>  http://paste-bin.com/11526, but it's 227 lines of poorly commented
>  meaty algorithm. It currently works on mathematical equations with
>  variables. It doesn't do any evaluation, though--just parses to an
>  array.

Your implementation is very similar to mine (though I'm parsing boolean
search queries, ala Google) -- mine is a bit simpler:

    InputPriority = Hash.new(1000)
    InputPriority[:and]   = 800
    InputPriority[:or]    = 500
    InputPriority[:not]   = 900
    InputPriority[nil]    = 0
    InputPriority[:open]  = 1000
    InputPriority[:close] = 1
    StackPriority = InputPriority.dup
    StackPriority[:open]  = 1
    StackPriority[:close] = 1000

    def rpnise(itokens)
      stack = []
      instructions = []

      itokens.each do |itoken|
        if InputPriority[itoken] > StackPriority[stack.last]
          stack.push(itoken)
        elsif InputPriority[itoken] <= StackPriority[stack.last]
          while StackPriority[stack.last] >= InputPriority[itoken]
            instructions << stack.pop
          end
          stack.push(itoken)
        end
      end
      instructions.concat(stack.reverse)
    end

After this I sanitize the token stream and use it to build a tree of
BooleanAnd/Or/Not/SubStringQuery objects to represent the search (which
then rewrite themselves in a vague attempt at optimization and sorting).

Demo:
  # Token stream
  irb(main):009:0> toks = analyser.simplify(tokenizer.tokenize("foo AND
bar OR (wibble OR wobble NOT foo)"))
  [#<NSearch::SubStringQuery:0x18a4a20 @substr="foo">,
   :and,
   #<NSearch::SubStringQuery:0x18a49f8 @substr="bar">,
   :or,
   :open,
   #<NSearch::SubStringQuery:0x18a49a8 @substr="wibble">,
   :or,
   #<NSearch::SubStringQuery:0x18a4958 @substr="wobble">,
   :and,
   :not,
   #<NSearch::SubStringQuery:0x18a4908 @substr="foo">,
   :close]

   irb(main):010:0> analyser.rpnise(toks)
   [#<NSearch::SubStringQuery:0x188a670 @substr="foo">,
    #<NSearch::SubStringQuery:0x188a648 @substr="bar">,
    :and,
    #<NSearch::SubStringQuery:0x188a5f8 @substr="wibble">,
    #<NSearch::SubStringQuery:0x188a5a8 @substr="wobble">,
    #<NSearch::SubStringQuery:0x188a558 @substr="foo">,
    :not,
    :and,
    :or,
    :or]

You can then follow the instructions, popping each SubStringQuery onto a
stack and popping them off on :and, :or and :not to build a tree of
query objects:

#to_s
  OR(OR(AND(NOT("foo"),"wobble"),"wibble"),AND("bar","foo"))

#to_sql
  (((NOT (Title LIKE "%foo%") AND Title LIKE "%wobble%") OR Title LIKE
    "%wibble%") OR (Title LIKE "%bar%" AND Title LIKE "%foo%"))


I'd recommend against using Ruby's own Generator, btw; it uses callcc
and has a tendancy towards being slow and leaky.  If you just need a
#next/#peek method, just turn it into an array and shove it in an
appropriate object which tracks the position.
Dan Z. (Guest)
on 2007-04-30 00:04
(Received via mailing list)
Thomas H. wrote:
>    #<NSearch::SubStringQuery:0x18a4958 @substr="wobble">,
>     #<NSearch::SubStringQuery:0x188a5a8 @substr="wobble">,
> #to_s
> appropriate object which tracks the position.
>

That's pretty neat. I hadn't thought of using this stuff to generate
SQL. I'll probably acually use that idea some day. And thanks for the
bit about Generator. I'm currently looking through a computational
script (ported from c++)--I could never figure out why it used all my
ram within a few seconds, but it used Generators several thousand times.
I'm rewriting parts of it now, and we'll see what happens.

Dan
Logan C. (Guest)
on 2007-04-30 00:19
(Received via mailing list)
On 4/28/07, Paul B. <removed_email_address@domain.invalid> wrote:
>
>
> (I feel like there should be a better way to do this, but I haven't
> found one.  Pity there's no String#each_char).


There is a String#each_char, but it's spelled a little funny ;) :

"abc".scan(/./m) do |char|
    p char
end


Paul
Thomas H. (Guest)
on 2007-04-30 03:32
(Received via mailing list)
* Dan Z. (removed_email_address@domain.invalid) wrote:

>  That's pretty neat. I hadn't thought of using this stuff to generate
>  SQL. I'll probably acually use that idea some day.

It can perform the match itself too:

 irb(main):025:0> query = NSearch::QueryParser.new.parse('foo bar NOT
heh')
 #<NSearch::RootQuery:0x1715380
  @query=
   #<NSearch::BooleanAndQuery:0x1714fe8
    @criteria=
     [#<NSearch::SubStringQuery:0x1715240 @substr="bar">,
      #<NSearch::SubStringQuery:0x1715268 @substr="foo">,
      #<NSearch::BooleanNotQuery:0x1715038
       @criteria=#<NSearch::SubStringQuery:0x17151f0 @substr="heh">>]>>

I can then query.rewrite to have it normalize itself (sort, minor
optimization), have it turn itself into various forms (SQL, boolean),
and of course execute it:

 irb(main):029:0> query.match('foo bar')
 => true
 irb(main):030:0> query.match('foo bar heh')
 => false

I also have a C library and a Ruby extension which uses it which
accelerates matches.  We're currently using it in our new search engine.
Performance is on the order of 5 million matches/second on a single
2GHz Opteron.  In pure Ruby it's closer to 200,000/sec.

I'm planning to_ferret to produce an equivilent ferret query, and
extending it to provide tagged queries: "category:ruby lang:en OR
lang:de".  Obviously the trickiest bit here is making it work in C.

None of this is public, alas, but if there's interest..

> And thanks for the bit about Generator. I'm currently looking through
> a computational script (ported from c++)--I could never figure out why
> it used all my ram within a few seconds, but it used Generators
> several thousand times. I'm rewriting parts of it now, and we'll see
> what happens.

Yeah, that'll be the callcc.  I really hope it's improved for 1.9.
This topic is locked and can not be replied to.