It also makes me happy that someone else is working on a converter
from infix to postfix. I just (almost) finished implementing
Dijkstra’s Shunting Yard algorithm
(Shunting yard algorithm - Wikipedia) in Ruby, as
part of a bigger project. If you want to look at it, it’s at
http://paste-bin.com/11526, but it’s 227 lines of poorly commented
meaty algorithm. It currently works on mathematical equations with
variables. It doesn’t do any evaluation, though–just parses to an
array.
Your implementation is very similar to mine (though I’m parsing boolean
search queries, ala Google) – mine is a bit simpler:
InputPriority = Hash.new(1000)
InputPriority[:and] = 800
InputPriority[:or] = 500
InputPriority[:not] = 900
InputPriority[nil] = 0
InputPriority[:open] = 1000
InputPriority[:close] = 1
StackPriority = InputPriority.dup
StackPriority[:open] = 1
StackPriority[:close] = 1000
def rpnise(itokens)
stack = []
instructions = []
itokens.each do |itoken|
if InputPriority[itoken] > StackPriority[stack.last]
stack.push(itoken)
elsif InputPriority[itoken] <= StackPriority[stack.last]
while StackPriority[stack.last] >= InputPriority[itoken]
instructions << stack.pop
end
stack.push(itoken)
end
end
instructions.concat(stack.reverse)
end
After this I sanitize the token stream and use it to build a tree of
BooleanAnd/Or/Not/SubStringQuery objects to represent the search (which
then rewrite themselves in a vague attempt at optimization and sorting).
Demo:
Token stream
irb(main):009:0> toks = analyser.simplify(tokenizer.tokenize(“foo AND
bar OR (wibble OR wobble NOT foo)”))
[#<NSearch::SubStringQuery:0x18a4a20 @substr=“foo”>,
:and,
#<NSearch::SubStringQuery:0x18a49f8 @substr=“bar”>,
:or,
:open,
#<NSearch::SubStringQuery:0x18a49a8 @substr=“wibble”>,
:or,
#<NSearch::SubStringQuery:0x18a4958 @substr=“wobble”>,
:and,
:not,
#<NSearch::SubStringQuery:0x18a4908 @substr=“foo”>,
:close]
irb(main):010:0> analyser.rpnise(toks)
[#<NSearch::SubStringQuery:0x188a670 @substr=“foo”>,
#<NSearch::SubStringQuery:0x188a648 @substr=“bar”>,
:and,
#<NSearch::SubStringQuery:0x188a5f8 @substr=“wibble”>,
#<NSearch::SubStringQuery:0x188a5a8 @substr=“wobble”>,
#<NSearch::SubStringQuery:0x188a558 @substr=“foo”>,
:not,
:and,
:or,
:or]
You can then follow the instructions, popping each SubStringQuery onto a
stack and popping them off on :and, :or and :not to build a tree of
query objects:
#to_s
OR(OR(AND(NOT(“foo”),“wobble”),“wibble”),AND(“bar”,“foo”))
#to_sql
(((NOT (Title LIKE “%foo%”) AND Title LIKE “%wobble%”) OR Title LIKE
“%wibble%”) OR (Title LIKE “%bar%” AND Title LIKE “%foo%”))
I’d recommend against using Ruby’s own Generator, btw; it uses callcc
and has a tendancy towards being slow and leaky. If you just need a
#next/#peek method, just turn it into an array and shove it in an
appropriate object which tracks the position.