Oniguruma question

For those adventurous souls who are already using Oniguruma…

With the following Oniguruma re:

print "aabb".index(/.+?(?<!a)/)
print "\n"

My reasoning tells me I should be getting 3, but I’m getting 0.
Basically, the .+ should force the re ‘cursor’ past at least the first
a; after that, the negative lookbehind should force it past the first b.
Could someone explain where I’m misunderstanding?

Thanks,
Ken

2008/9/8 Kenneth McDonald:

> explain where I’m misunderstanding?
Your reasoning about “.+?” is wrong. Non-greedy matching only means
that the /end point/ of a match is moved as far to the left as possible;
the /start point/ is never moved to the right. So .+? matches directly at
the beginning of the string.
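
To make that concrete, here is a small sketch run against the string from the original post. It assumes a Ruby whose regexp engine supports lookbehind (Oniguruma, or the engine in Ruby 1.9 and later); the variable names are only for illustration.

```ruby
s = "aabb"

# Lazy version from the original post.
lazy = s.match(/.+?(?<!a)/)
lazy_start = lazy.begin(0)  # the engine tries position 0 first and succeeds there
lazy_text  = lazy[0]        # .+? grows one character at a time until the
                            # lookbehind stops seeing an "a" just before the end

# Greedy version, for comparison.
greedy = s.match(/.+(?<!a)/)
greedy_start = greedy.begin(0)
greedy_text  = greedy[0]

puts lazy_start    # → 0
puts lazy_text     # → aab
puts greedy_start  # → 0
puts greedy_text   # → aabb
```

Both matches start at index 0; laziness only changes where the match ends ("aab" versus "aabb").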

Any practical problem that you’re trying to solve (and we can try to
tackle)? :)

Kind regards

robert

I’m writing a library to make re’s easier to use :) This is in the
test suite.

Thanks for the feedback, that comes as a real surprise to me. So maybe
I’ll try /…+?/

Cheers,
Ken

On Mon, Sep 08, 2008 at 10:18:01PM +0900, Kenneth McDonald wrote:

> Could someone explain where I’m misunderstanding?
Not exactly an answer, but . . .

Why use two print statements, the second just to add a newline
character, rather than using puts?

'Cause I’m a Ruby newbie who’s still learning this stuff :)

Ken

On Tue, Sep 09, 2008 at 11:30:56AM +0900, Kenneth McDonald wrote:

> 'Cause I’m a Ruby newbie who’s still learning this stuff :)

Ah! I understand.

In that case, take that as a suggestion instead of a question:

“You can clean up that code a little by using puts instead of print.”

Something like that.
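
For illustration, a minimal sketch of the two styles side by side, using the expression from the original post (the variable name idx is just for this example):

```ruby
idx = "aabb".index(/.+?(?<!a)/)

# Original style: print adds no newline, so one is written by hand.
print idx
print "\n"

# Same output with puts, which appends the newline itself.
puts idx
```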

Also, I think it’s generally considered preferable on this list to avoid
TOFU[1] posting. I’m still not 100% sure how strict or official such a
rule is on ruby-talk, though.

[1]: TOFU = Text Over, Fullquote Under; also known sometimes as “top
posting”, without trimming quoted text

It was part of a unit testing suite. I did figure out where I was
going wrong, a silly mistake on my part which I should’ve been past
years ago. Thanks for the help.

Ken

On 08.09.2008 19:38, Kenneth McDonald wrote:

> I’m writing a library to make re’s easier to use :) This is in the
> test suite.
>
> Thanks for the feedback, that comes as a real surprise to me. So maybe
> I’ll try /…+?/

This one won’t match after “aa”. Instead it will match “aa”. I do not
know what you intend to match so I cannot really comment. This is
clearly a toy example, isn’t it? There are multiple ways to match “bb”
in “aabb” and it is completely unclear to me what you are after.

Your original RX, translated literally, means “match at least one
character, but as few as possible, so that the match ends at a point not
preceded by a”. Such a match can already start at the beginning of the
string.
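
Since the goal is not stated, the following is only a sketch of a few patterns one might try against "aabb", with the index each one returns. None of them is meant as the answer to the original problem, and the variable names are made up for the example.

```ruby
s = "aabb"

# A negative lookbehind at the very start of a string always succeeds,
# because there is nothing before position 0.
at_start = s.index(/(?<!a)./)

# Two of the several ways to find "bb":
literal = s.index(/bb/)          # plain literal
after_a = s.index(/(?<=a)b+/)    # a run of "b" directly preceded by "a"

# A "b" that is not preceded by "a" is the second "b".
second_b = s.index(/(?<!a)b/)

puts at_start  # → 0
puts literal   # → 2
puts after_a   # → 2
puts second_b  # → 3
```

Putting the lookbehind in front of the character it is supposed to guard is what yields 3 here.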

Btw, I can recommend “Mastering Regular Expressions” - it is a really
good book on the matter.

Kind regards

robert