Forum: Ruby Finding un-escaped characters in scan()

Fred P. (Guest)
on 2007-07-28 17:04
(Received via mailing list)
I have a string that looks like this: "test,,\,test". And I'm trying to
count the commas which don't follow a \. But when using scan(/[^\\],/) it
only finds the first comma. I think this is because after scan() has found a
comma without a backslash, it continues from where it left off, and since
there's no character before the next comma, it won't match it. Does anyone
know how to fix it?
Felix W. (Guest)
on 2007-07-28 17:21
(Received via mailing list)
> backslash, it continues from where it left off, and since
> there's no character before the next comma, it won't match
> it. Does anyone know how to fix it?
>

I don't see that behaviour:

irb(main):001:0> "test,,\,test,test2,\,test3".scan(/[^\\],/)
=> ["t,", ",,", "t,", "2,"]
irb(main):002:0>

4 total, as expected.
Todd B. (Guest)
on 2007-07-28 17:22
(Received via mailing list)
On 7/28/07, Fred P. <removed_email_address@domain.invalid> wrote:
> I have a string that looks like this: "test,,\,test". And I'm trying to
> count the commas which don't follow a \. But when using scan(/[^\\],/) it
> only finds the first comma. I think this is because after scan() has found a
> comma without a backslash, it continues from where it left off, and since
> there's no character before the next comma, it won't match it. Does anyone
> know how to fix it?
>

Works for me:

irb(main):001:0> "test,,\,test".scan(/[^\\],/)
=> ["t,", ",,"]

Todd
Fred P. (Guest)
on 2007-07-28 17:31
(Received via mailing list)
Hmm, it appears I made a typo in my actual code. But this still isn't the
result I want.

For instance "test,,test" only returns one comma. And the results you have
shown above, are returning the wrong number as well.
Stefan R. (Guest)
on 2007-07-28 17:41
Fred P. wrote:
> I have a string that looks like this: "test,,\,test". And I'm trying to
> count the commas which don't follow a \. But when using scan(/[^\\],/) it
> only finds the first comma. I think this is because after scan() has found a
> comma without a backslash, it continues from where it left off, and since
> there's no character before the next comma, it won't match it. Does anyone
> know how to fix it?

Just wanted to point out a mistake the OP and all of the replying people
made:
",,\,," # => ",,,,"
The , is *not* escaped. You need ',,\,,' # => ',,\\,,'
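
To make the quoting point concrete, here is a small sketch using the string from the original post. The variable names are only for illustration.

```ruby
# Double quotes process backslash escapes. "\," is not a recognised escape
# sequence, so the backslash is dropped and only the comma remains.
double_quoted = "test,,\,test"

# Single quotes keep the backslash (only \\ and \' are special there).
single_quoted = 'test,,\,test'

p double_quoted          # "test,,,test"
p single_quoted          # "test,,\\,test"
p double_quoted.length   # 11
p single_quoted.length   # 12

# Inside double quotes the backslash has to be doubled to survive.
p single_quoted == "test,,\\,test"   # true
```

So the irb sessions earlier in the thread were scanning a string that contains no backslash at all.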

Regards
Stefan
Harry K. (Guest)
on 2007-07-28 17:55
(Received via mailing list)
On 7/28/07, Fred P. <removed_email_address@domain.invalid> wrote:
> I have a string that looks like this: "test,,\,test". And I'm trying to
> count the commas which don't follow a \. But when using scan(/[^\\],/) it
> only finds the first comma. I think this is because after scan() has found a
> comma without a backslash, it continues from where it left off, and since
> there's no character before the next comma, it won't match it. Does anyone
> know how to fix it?
>
Are you expecting something like this,

str = "test,,,test"
p str.scan(/,,/) #[",,"]

to give you [",,",",,"]     ?

Harry
Todd B. (Guest)
on 2007-07-28 18:25
(Received via mailing list)
On 7/28/07, Fred P. <removed_email_address@domain.invalid> wrote:
> Hmm, it appears I made a typo in my actual code. But this still isn't the
> result I want.
>
> For instance "test,,test" only returns one comma. And the results you have
> shown above, are returning the wrong number as well.
>

Fred:

Ah, I see what you are saying in your original post.

"test,,test".scan(/.,/)
=> ["t,"]

It should find ",," too, right?

Interesting behavior and has nothing to do with backslashes, just the
fact that scan won't find it if the character is adjacent to itself.
Like you said, where it starts the next scan, maybe.  I'm not sure how
to get around it without playing around a bit.
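
A short sketch of what seems to be going on: scan returns non-overlapping matches, so a character consumed by one match cannot also be the "previous character" of the next. One possible workaround that should also work on Ruby 1.8 is a zero-width lookahead, which checks for the comma without consuming it. The variable names are only for illustration.

```ruby
str = "test,,test"

# Each match consumes the characters it covers, so the first comma is
# used up by "t," and is no longer available in front of the second comma.
p str.scan(/.,/)         # ["t,"]

# The lookahead (?=,) only checks that a comma follows, without consuming it,
# so the first comma can itself be matched as the character before the second.
p str.scan(/.(?=,)/)     # ["t", ","]

# Applied to the backslash case (single quotes keep the backslash):
escaped = 'test,,\,test'
p escaped.scan(/[^\\](?=,)/).length   # 2
```

Note that this pattern still needs some character in front of the comma, so a comma at the very start of the string would not be counted.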

Todd
Harry K. (Guest)
on 2007-07-28 20:09
(Received via mailing list)
On 7/28/07, Harry K. <removed_email_address@domain.invalid> wrote:
> str = "test,,,test"
> p str.scan(/,,/) #[",,"]
>
> to give you [",,",",,"]     ?
>
> Harry

If so,
here is one way.
There is probably a more elegant way.

require 'enumerator'
str = "test,,\\,,,\\,,"
str.split(//).each_cons(2){|x|p x[1] if x[0] != "\\" and x[1] == ","}
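
For reference, a sketch of the same pairwise idea that returns a count instead of printing each comma. It assumes a Ruby recent enough to have String#each_char and String#start_with? built in (so no require is needed), and the method name count_with_each_cons is made up for this example. It also counts a comma at the very start of the string, which the pairwise walk alone cannot see because that comma has no predecessor.

```ruby
# Hypothetical helper: counts commas that are not directly preceded by a backslash.
def count_with_each_cons(str)
  # Walk over adjacent character pairs and count the unescaped commas.
  pairs = str.each_char.each_cons(2).count do |prev, cur|
    prev != "\\" && cur == ","
  end
  # A leading comma has no predecessor, so it is handled separately.
  str.start_with?(",") ? pairs + 1 : pairs
end

p count_with_each_cons("test,,\\,,,\\,,")   # 5
p count_with_each_cons(",test")             # 1
```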

Harry
Fred P. (Guest)
on 2007-07-28 21:01
(Received via mailing list)
Excellent, that works like a charm. Thanks.
Wolfgang N. (Guest)
on 2007-07-30 23:50
Fred P. wrote:
> I have a string that looks like this: "test,,\,test". And I'm trying to
> count the commas which don't follow a \. But when using scan(/[^\\],/) it
> only finds the first comma. I think this is because after scan() has found a
> comma without a backslash, it continues from where it left off, and since
> there's no character before the next comma, it won't match it. Does anyone
> know how to fix it?

If you can use Ruby 1.9, you can simply express it using a 'negative
lookbehind':


irb(main):001:0> "test,,\\,test".scan(/(?<!\\),/)
=> [",", ","]
irb(main):002:0> "test,,\\,test,test2,\\,test3".scan(/(?<!\\),/)
=> [",", ",", ",", ","]
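
Since the original question was about counting, the lookbehind can be wrapped as in the sketch below. The method names are made up for this example. The second method rests on an assumption that is not stated anywhere in the thread: that a backslash can itself be escaped, in which case a comma after two backslashes should count as unescaped.

```ruby
# Hypothetical helper: counts commas not directly preceded by a backslash.
def count_with_lookbehind(str)
  str.scan(/(?<!\\),/).size
end

# Assumption (not from the thread): a doubled backslash is an escaped
# backslash, so a comma following it is a real, unescaped comma.
# Each backslash is consumed together with the character after it,
# and only the standalone commas are counted.
def count_with_escape_pairs(str)
  str.scan(/\\.|,/m).count(",")
end

p count_with_lookbehind('test,,\,test')     # 2
p count_with_lookbehind(',test')            # 1

# 'a\\\\,b' is the five characters: a, backslash, backslash, comma, b
p count_with_lookbehind('a\\\\,b')          # 0
p count_with_escape_pairs('a\\\\,b')        # 1
p count_with_escape_pairs('test,,\,test')   # 2
```

Which of the two is right depends on how the data was escaped in the first place. If backslashes are never doubled, the plain lookbehind is enough.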

Wolfgang Nádasi-Donner