Treetop positive lookahead problem

Hi,

I just started playing around with treetop and have probably barely
scratched the surface of what this great tool can do. Anyway, I have
a problem that I think should be quite trivial.

My test rule is:

    rule test
      'foo' &'bar' {
        def value
          text_value
        end
      }
    end

This should as I understand it parse the string “foobar” and return only
“foo”. For some reason this returns nil. I tried the opposite (switched
‘&’ with ‘!’) and that works like a charm.

I run: ruby 1.8.4 (maybe time to upgrade, I know)
and treetop-1.2.4 (also tested with treetop from the git repo).

Maybe I have missed out on something fundamental and I will appreciate
any response to help me with this matter.

Best regards,

Tom

Tom Aadland wrote:

This should as I understand it parse the string “foobar” and return only
“foo”. For some reason this returns nil. I tried the opposite (switched
‘&’ with ‘!’) and that works like a charm.

Tom,

I don’t think you’ve given us enough context to answer the question.
Positive lookahead does work in this situation, but when you say
that something “returns nil”, you’re saying that the whole parse fails.
The rule might succeed, yet the parser may still fail and return nil
for a different reason.

For us to know why, you should post the entire grammar, together with
the text you’re parsing, and preferably also a snippet showing how
you’re calling the parser.

Clifford H…

Clifford H. wrote:

Tom Aadland wrote:

This should as I understand it parse the string “foobar” and return only
“foo”. For some reason this returns nil. I tried the opposite (switched
‘&’ with ‘!’) and that works like a charm.

Tom,

For us to know why, you should post the entire grammar, together with
the text you’re parsing, and preferably also a snippet showing how
you’re calling the parser.

Hi Clifford,

Thank you for responding and I’m sorry I didn’t provide more
information. Also as I said I think I was missing something fundamental.
I was gonna paste this in pastie, but pastie is down :(

I thought that the ‘foo’ &‘bar’ rule meant that bar has to come together
with foo, as in foobar, and on top of that not really consume bar, so that
the text_value returned would only be foo. However, the following test
below runs fine with that in mind :)

What I want with this rule is: match foobar or even foobarsomething, but
return the text_value without bar.

    grammar FooGrammar
      rule test_with_bar
        'foo' (&'bar' .+) {
          def value
            text_value
          end
        }
      end
    end

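    require 'rubygems'
    require 'treetop'
    require 'test/unit'

    class TestFoo < Test::Unit::TestCase
      # Compile foo_grammar.treetop and define FooGrammarParser
      Treetop.load "foo_grammar"

      def test_parse_foo_with_bar
        res = FooGrammarParser.new.parse("foobar")

        # assert_equal takes the expected value first, then the actual one
        assert_equal "foobar", res.value
      end
    end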

Thanks,

Tom

Tom Aadland wrote:

Thank you for responding and I’m sorry I didn’t provide more
information.

Not a problem, we’re here to help :)

I thought that the ‘foo’ &‘bar’ rule meant that bar has to come together
with foo as in foobar and on top of that not really consume bar so that
the text_value returned would only be foo.

That’s correct.

However the following test below runs fine with that in mind :)

Your test doesn’t ask for that, it asks that foo should be followed
by a sequence of one or more of any character, starting with bar,
and it consumes all that input.

What I want with this rule is: match foobar or even foobarsomething, but
return the textvalue without bar.

I believe the only problem with your original rule is that by default,
for a Treetop parse to succeed, all input must be consumed. If you
change to the following:

    parser = FooGrammarParser.new
    parser.consume_all_input = false
    res = parser.parse("foobar")

then the original rule ('foo' &'bar') will match and return “foo”,
without the parse failing due to not having consumed all input.
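
To make the difference concrete without needing Treetop installed, here is a rough analogy using Ruby's built-in Regexp, whose `(?=...)` lookahead is zero-width in the same way as Treetop's `&`. This is only a sketch of the idea, not how Treetop works internally: the `\z` anchor stands in for the "all input must be consumed" default, and the constant names are made up for illustration.

```ruby
# Analogy only: Regexp lookahead stands in for Treetop's & predicate.
SAMPLE_INPUT = "foobar"

# Like 'foo' &'bar' with consume_all_input = false:
# bar must follow, but it is not part of the match.
PARTIAL_MATCH = SAMPLE_INPUT.match(/\Afoo(?=bar)/)

# Like 'foo' &'bar' with the default setting: the match must also
# end at the end of the input, and "bar" is still left over.
FULL_MATCH = SAMPLE_INPUT.match(/\Afoo(?=bar)\z/)

# Like 'foo' (&'bar' .+): the lookahead checks for bar,
# then .+ goes on to consume it.
CONSUMING_MATCH = SAMPLE_INPUT.match(/\Afoo(?=bar).+\z/)

puts PARTIAL_MATCH[0]      # foo
puts FULL_MATCH.inspect    # nil
puts CONSUMING_MATCH[0]    # foobar
```

The middle case is the one that corresponds to the original rule returning nil: the rule itself matched, but "bar" was left unconsumed.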

You should always print “parser.failure_reason” when a parse fails.
It’ll tell you how far the parser got and what input would have allowed
it to get further.

Also, a common trap for new Treetoppers is to forget that matching is
greedy. For that reason, you almost never want to say “.*” or “.+”
because those will consume all remaining input and leave nothing for
subsequent rules to even look at. For example, this rule will always fail:

    rule big_fail
      'foo' .* 'bar'
    end

This must be rewritten, for example like this:

    rule ok
      'foo' (!'bar' .)* 'bar'
    end
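
If it helps to see why the first rule can never succeed, here is a minimal hand-rolled sketch of PEG-style matching in plain Ruby. It is not Treetop's implementation, and every helper name below (lit, any_char, seq, star, not_ahead) is hypothetical. Each matcher takes the input and a position and returns the new position, or nil on failure. The point is that repetition takes as much as it can and never gives anything back, unlike a regular expression, where `.*` would backtrack.

```ruby
# Hypothetical mini PEG matchers, for illustration only (not Treetop API).
# Each returns a lambda: (input, pos) -> new position, or nil on failure.

# Match a literal string at the current position.
def lit(str)
  ->(input, pos) { input[pos, str.length] == str ? pos + str.length : nil }
end

# Match any single character (Treetop's ".").
def any_char
  ->(input, pos) { pos < input.length ? pos + 1 : nil }
end

# Match each part in order; fail as soon as one part fails.
def seq(*parts)
  ->(input, pos) { parts.reduce(pos) { |p, part| p && part.call(input, p) } }
end

# Zero or more repetitions: greedy, and it never gives input back.
def star(part)
  lambda do |input, pos|
    loop do
      nxt = part.call(input, pos)
      break if nxt.nil? || nxt == pos
      pos = nxt
    end
    pos
  end
end

# Negative lookahead (Treetop's "!"): succeeds without consuming
# anything only when the part does NOT match here.
def not_ahead(part)
  ->(input, pos) { part.call(input, pos).nil? ? pos : nil }
end

# 'foo' .* 'bar'
BIG_FAIL = seq(lit("foo"), star(any_char), lit("bar"))

# 'foo' (!'bar' .)* 'bar'
OK_RULE = seq(lit("foo"), star(seq(not_ahead(lit("bar")), any_char)), lit("bar"))

p BIG_FAIL.call("foobazbar", 0)  # nil
p OK_RULE.call("foobazbar", 0)   # 9
```

In the first rule, `.*` runs to the end of the input, so the final 'bar' has nothing left to match. In the second, the repetition stops just before 'bar' because the negative lookahead fails there.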

Clifford H…

Clifford,

I can’t say much more than thank you very much :). This really helped a
bunch.

Tom.