Infix to Postfix Method


#1

I’m still getting to grips with Ruby, but I think I’m doing well. My
last post generated a lot of positive comments/advice which really
helped me, so I thought I’d post some more of my code to try and repeat
that!

The following is a method which will convert an infix expression (1+1)
to a postfix expression (1 1 +). It expects a string and returns an
array. Once again I’d really appreciate your comments.

def postfix(expression)

precedence = {nil=>0,40=>0,42 =>2, 43=>1, 45=>1,47=>2}

output = Array.new
stack = Array.new
temp = ‘’

expression.each_byte do |asc|

case asc
  when 41
    output.push(temp)
    temp = ''
    loop{
      if stack.empty? == false
        popped = stack.pop
        if popped == 40
          break
        else
          output.push(popped.chr)
        end
      else
        raise 'Missing "("!'
      end
    }
  when 40,42, 43, 45, 47 #*+-/
    output.push(temp)
    temp = ''
    unless asc == 40
      p_asc = precedence[asc]
      loop{
        if  (p_asc > precedence[stack.last])
          break
        else
          output.push(stack.pop.chr)
        end
      }
    end
    stack.push(asc)
  else
    temp += asc.chr
end

end
stack.each{|asc| output.push(asc.chr)}
!output.include?(’(’) or raise ‘Missing “)”!’
output.delete(’’)
output
end


#2

Hi~

On Apr 28, 2007, at 4:34 PM, Peter M. wrote:

I’m still getting to grips with Ruby, but I think I’m doing well. My
last post generated a lot of positive comments/advice which really
helped me, so I thought I’d post some more of my code to try and
repeat
that!

The following is a method which will convert an infix expression (1+1)
to a postfix expression (1 1 +). It expects a string and returns an
array. Once again I’d really appreciate your comments.

I have a very similar approach in a little parser I wrote for role
based auth system. It uses a stack and converts to postfix before
evaluating the expressions. It can parse strings like “(admin |
moderator & !blacklist) | root” and compare to roles. I thought you
might like to see it as it’s very similar to the way your infix to
postfix code works.

class RoleExp

def initialize(expression = nil)
@expression = expression
# using @@class vars here so they don’t show up in #inspect
@@symbols ||= {"(" => :lparen, “)” => :rparen, “&” => :and, “|”
=> :or, “!” => :not}
@@precedence ||= {:lparen => 0, :rparen => 0, :and => 1, :or =>
1, :not => 2}
compile unless @expression.empty?
end

def compile(expression = nil)
@expression = expression || @expression
@postfix = []
stack = []
@expression.scan(/(\w+|\W)/).flatten.each do |term|
term = @@symbols[term] || term.strip
next if term.to_s.empty?
case term
when :lparen
stack.push term
when :and, :or, :not
until stack.empty?
break if @@precedence[term] > @@precedence[stack.last]
@postfix << stack.pop
end
stack.push term
when :rparen
while stack.last != :lparen
@postfix << stack.pop
end
stack.pop
else
@postfix << term
end
end
@postfix.concat(stack.reverse)
end

def compute
return false if @postfix.empty?
stack = []
@postfix.each do |term|
case term
when :and
rhs = stack.pop
stack.push(stack.pop && rhs)
when :or
rhs = stack.pop
stack.push(stack.pop || rhs)
when :not
stack.push(!stack.pop)
else
stack.push(yield(term))
end
end
stack.first
end

end

class User
attr_accessor :roles
def initialize(*roles)
@roles = roles
end
end

bool = RoleExp.new “(admin | moderator & !blacklist)”
p bool
puts ‘=’*40

user = User.new ‘admin’, ‘moderator’
p user
p bool.compute {|term| user.roles.include? term}
puts ‘=’*40

user.roles << ‘blacklist’
p user
p bool.compute {|term| user.roles.include? term}

Cheers-
– Ezra Z.
– Lead Rails Evangelist
– removed_email_address@domain.invalid
– Engine Y., Serious Rails Hosting
– (866) 518-YARD (9273)


#3

Paul B. wrote:

On Sun, Apr 29, 2007 at 08:34:33AM +0900, Peter M. wrote:

precedence = {nil=>0,40=>0,42 =>2, 43=>1, 45=>1,47=>2}

In general it’s better not to explicitly specify ASCII values, so I’d
write instead:

Could you explain why please, I can’t see how it makes any difference,
really (but I am a noob!).


#4

On Sun, Apr 29, 2007 at 08:34:33AM +0900, Peter M. wrote:

precedence = {nil=>0,40=>0,42 =>2, 43=>1, 45=>1,47=>2}

In general it’s better not to explicitly specify ASCII values, so I’d
write instead:

precedence = {nil=>0, ?(=>0, ?*=>2, ?+=>1, ?-=>1, ?/=>2}

Note though that on newer versions of ruby, ?x will give you a string
containing “x” rather than the ascii value of x. So to be more
compatible, instead of writing:

expression.each_byte do |asc|

you’ll want to write:

expression.each_byte do |c|
  c = c.chr if ?x == "x"

and use c wherever you were using asc before.

(I feel like there should be a better way to do this, but I haven’t
found one. Pity there’s no String#each_char).

Paul


#5

Peter M. wrote:

Paul B. wrote:

On Sun, Apr 29, 2007 at 08:34:33AM +0900, Peter M. wrote:

precedence = {nil=>0,40=>0,42 =>2, 43=>1, 45=>1,47=>2}
In general it’s better not to explicitly specify ASCII values, so I’d
write instead:

Could you explain why please, I can’t see how it makes any difference,
really (but I am a noob!).

In a high level language, it’s good to use the high level functions.
That way if low level representations of things ever change in a newer
version, your existing scripts will be more likely to work in the newer
version.

Consider this example, even though it’s probably not a very good one:
What if the upcoming unicode support changed something with the internal
representation of strings so that "" was no longer represented as “42”?
Then the entry in the hash {42 => 2} would no longer be applicable, but
{?
=> 2} will continue to work, because if ruby’s internal
representation of "" changes in some version, so will the value of ?.

It also makes me happy that someone else is working on a converter from
infix to postfix. I just (almost) finished implementing Dijkstra’s
Shunting Yard algorithm
(http://en.wikipedia.org/wiki/Shunting_yard_algorithm) in Ruby, as part
of a bigger project. If you want to look at it, it’s at
http://paste-bin.com/11526, but it’s 227 lines of poorly commented meaty
algorithm. It currently works on mathematical equations with variables.
It doesn’t do any evaluation, though–just parses to an array.

Anyway, have fun.
Dan


#6

Thanks for that explination, it makes quite a lot of sense!


#7
  • Dan Z. (removed_email_address@domain.invalid) wrote:

It also makes me happy that someone else is working on a converter
from infix to postfix. I just (almost) finished implementing
Dijkstra’s Shunting Yard algorithm
(http://en.wikipedia.org/wiki/Shunting_yard_algorithm) in Ruby, as
part of a bigger project. If you want to look at it, it’s at
http://paste-bin.com/11526, but it’s 227 lines of poorly commented
meaty algorithm. It currently works on mathematical equations with
variables. It doesn’t do any evaluation, though–just parses to an
array.

Your implementation is very similar to mine (though I’m parsing boolean
search queries, ala Google) – mine is a bit simpler:

InputPriority = Hash.new(1000)
InputPriority[:and]   = 800
InputPriority[:or]    = 500
InputPriority[:not]   = 900
InputPriority[nil]    = 0
InputPriority[:open]  = 1000
InputPriority[:close] = 1
StackPriority = InputPriority.dup
StackPriority[:open]  = 1
StackPriority[:close] = 1000

def rpnise(itokens)
  stack = []
  instructions = []

  itokens.each do |itoken|
    if InputPriority[itoken] > StackPriority[stack.last]
      stack.push(itoken)
    elsif InputPriority[itoken] <= StackPriority[stack.last]
      while StackPriority[stack.last] >= InputPriority[itoken]
        instructions << stack.pop
      end
      stack.push(itoken)
    end
  end
  instructions.concat(stack.reverse)
end

After this I sanitize the token stream and use it to build a tree of
BooleanAnd/Or/Not/SubStringQuery objects to represent the search (which
then rewrite themselves in a vague attempt at optimization and sorting).

Demo:

Token stream

irb(main):009:0> toks = analyser.simplify(tokenizer.tokenize(“foo AND
bar OR (wibble OR wobble NOT foo)”))
[#<NSearch::SubStringQuery:0x18a4a20 @substr=“foo”>,
:and,
#<NSearch::SubStringQuery:0x18a49f8 @substr=“bar”>,
:or,
:open,
#<NSearch::SubStringQuery:0x18a49a8 @substr=“wibble”>,
:or,
#<NSearch::SubStringQuery:0x18a4958 @substr=“wobble”>,
:and,
:not,
#<NSearch::SubStringQuery:0x18a4908 @substr=“foo”>,
:close]

irb(main):010:0> analyser.rpnise(toks)
[#<NSearch::SubStringQuery:0x188a670 @substr=“foo”>,
#<NSearch::SubStringQuery:0x188a648 @substr=“bar”>,
:and,
#<NSearch::SubStringQuery:0x188a5f8 @substr=“wibble”>,
#<NSearch::SubStringQuery:0x188a5a8 @substr=“wobble”>,
#<NSearch::SubStringQuery:0x188a558 @substr=“foo”>,
:not,
:and,
:or,
:or]

You can then follow the instructions, popping each SubStringQuery onto a
stack and popping them off on :and, :or and :not to build a tree of
query objects:

#to_s
OR(OR(AND(NOT(“foo”),“wobble”),“wibble”),AND(“bar”,“foo”))

#to_sql
(((NOT (Title LIKE “%foo%”) AND Title LIKE “%wobble%”) OR Title LIKE
“%wibble%”) OR (Title LIKE “%bar%” AND Title LIKE “%foo%”))

I’d recommend against using Ruby’s own Generator, btw; it uses callcc
and has a tendancy towards being slow and leaky. If you just need a
#next/#peek method, just turn it into an array and shove it in an
appropriate object which tracks the position.


#8

On 4/28/07, Paul B. removed_email_address@domain.invalid wrote:

(I feel like there should be a better way to do this, but I haven’t
found one. Pity there’s no String#each_char).

There is a String#each_char, but it’s spelled a little funny :wink: :

“abc”.scan(/./m) do |char|
p char
end

Paul


#9

Thomas H. wrote:

#<NSearch::SubStringQuery:0x18a4958 @substr=“wobble”>,
#<NSearch::SubStringQuery:0x188a5a8 @substr=“wobble”>,
#to_s
appropriate object which tracks the position.

That’s pretty neat. I hadn’t thought of using this stuff to generate
SQL. I’ll probably acually use that idea some day. And thanks for the
bit about Generator. I’m currently looking through a computational
script (ported from c++)–I could never figure out why it used all my
ram within a few seconds, but it used Generators several thousand times.
I’m rewriting parts of it now, and we’ll see what happens.

Dan


#10
  • Dan Z. (removed_email_address@domain.invalid) wrote:

That’s pretty neat. I hadn’t thought of using this stuff to generate
SQL. I’ll probably acually use that idea some day.

It can perform the match itself too:

irb(main):025:0> query = NSearch::QueryParser.new.parse(‘foo bar NOT
heh’)
#<NSearch::RootQuery:0x1715380
@query=
#<NSearch::BooleanAndQuery:0x1714fe8
@criteria=
[#<NSearch::SubStringQuery:0x1715240 @substr=“bar”>,
#<NSearch::SubStringQuery:0x1715268 @substr=“foo”>,
#<NSearch::BooleanNotQuery:0x1715038
@criteria=#<NSearch::SubStringQuery:0x17151f0 @substr=“heh”>>]>>

I can then query.rewrite to have it normalize itself (sort, minor
optimization), have it turn itself into various forms (SQL, boolean),
and of course execute it:

irb(main):029:0> query.match(‘foo bar’)
=> true
irb(main):030:0> query.match(‘foo bar heh’)
=> false

I also have a C library and a Ruby extension which uses it which
accelerates matches. We’re currently using it in our new search engine.
Performance is on the order of 5 million matches/second on a single
2GHz Opteron. In pure Ruby it’s closer to 200,000/sec.

I’m planning to_ferret to produce an equivilent ferret query, and
extending it to provide tagged queries: “category:ruby lang:en OR
lang:de”. Obviously the trickiest bit here is making it work in C.

None of this is public, alas, but if there’s interest…

And thanks for the bit about Generator. I’m currently looking through
a computational script (ported from c++)–I could never figure out why
it used all my ram within a few seconds, but it used Generators
several thousand times. I’m rewriting parts of it now, and we’ll see
what happens.

Yeah, that’ll be the callcc. I really hope it’s improved for 1.9.