Hello.
I'm trying to get rex to parse my inputs. After reading some of the
sample files provided with rex, I created this simple(?) file:
file: test.rex ------------------------------------------------------------
-- ruby --
##########################################################################
class Lexer
macro
BLANKS \s+
DIGITS \d+
LETTERS [a-zA-Z]+
rule
{BLANKS}
{LETTERS} { puts "ID: '@{text}'"; [ :ID, text ] }
{DIGITS} { puts "NUMBER: '@{text}'"; [ :NUMBER, text.to_f ] }
.|\n { puts "text: '@{text}'"; [ text, text ] }
inner
end
##########################################################################
lexer = Lexer.new
while 1
  str = $stdin.gets.strip
  puts "str=@{str}"
  lexer.scan_str(str)
  puts "--------------------------------------------------------------------------"
end
end of file: test.rex -----------------------------------------------------
After running 'rex test.rex' and then 'ruby -Ku test.rex.rb', I always get errors like

test.rex.rb:60:in `scan_evaluate': can not match: '2' (Lexer::ScanError)

whenever I type input.
Can anybody tell me why? Thanks.
Hey Fabrice,
The problem you’re seeing is due to rex’s assumption that you are
generating a parser in tandem with your lexer. The generated method
Lexer::scan_str looks like this:
def scan_str( str )
  scan_evaluate str
  do_parse
end
While scan_evaluate(str) is the method generated by your token
definitions, do_parse() depends on a racc grammar having been defined
and initialized. The bad news is that the default scan_str() won’t
work for your purposes. The good news is that scan_evaluate() will. If
you examine your generated test.rex.rb file, you’ll see that
scan_evaluate() identifies your tokens and pushes them one by one into
a queue named @rex_tokens. To pull them out of the queue, simply call
next_token(). Here’s a quick replacement for the bottom of your token
definition file:
lexer = Lexer.new
while 1
  str = $stdin.gets.strip
  puts "str=#{str}"
  # Here we're scanning the string for tokens
  lexer.scan_evaluate(str)
  # And then printing each one out to stdout
  while token = lexer.next_token
    p token
  end
  puts "--------------------------------------------------------------------------"
end
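If you want that scan-then-drain loop in a reusable form, here's a rough sketch. Note that StubLexer below is not rex output; it only mimics the scan_evaluate/next_token surface of a generated lexer (the @rex_tokens queue name is taken from the generated test.rex.rb) so the helper can be shown self-contained:

```ruby
# Stub with the same scan_evaluate/next_token surface as a rex-generated
# lexer. The real class comes from 'rex test.rex'; this is only a mock.
class StubLexer
  def scan_evaluate(str)
    # Queue up [type, value] pairs, the same shape the rule actions return.
    @rex_tokens = str.scan(/\d+|[a-zA-Z]+/).map do |t|
      t =~ /\d/ ? [:NUMBER, t.to_f] : [:ID, t]
    end
  end

  def next_token
    @rex_tokens.shift
  end
end

# Drain the token queue into an array -- the same pattern as the driver loop.
def tokens_for(lexer, str)
  lexer.scan_evaluate(str)
  result = []
  while token = lexer.next_token
    result << token
  end
  result
end

p tokens_for(StubLexer.new, "abc 12")  # => [[:ID, "abc"], [:NUMBER, 12.0]]
```

With the real generated Lexer you'd call tokens_for(Lexer.new, str) the same way.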
The only other minor change I made was to "@{str}". Ruby's string
interpolation escape sequence is actually "#{ }". Let us know if you
have more questions. Happy lexing!
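The difference is easy to see in irb (name here is just an illustrative variable):

```ruby
name = "Fabrice"
# "#{ }" interpolates; "@{ }" has no special meaning in a string literal
greeting = "str=#{name}"   # => "str=Fabrice"
literal  = "str=@{name}"   # => "str=@{name}", printed verbatim
puts greeting
puts literal
```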
-Nick