Finding un-escaped characters in scan()


#1

I have a string that looks like this: “test,test”. And I’m trying to
count the commas which don’t follow a backslash. But when using
scan(/[^\\],/) it only finds the first comma. I think this is because
after scan() has found a comma without a backslash, it continues from
where it left off, and since there’s no character before the next comma,
it won’t match it. Does anyone know how to fix it?


#2

backslash, it continues from where it left off, and since
there’s no character before the next comma, it won’t match
it. Does anyone know how to fix it?

I don’t see that behaviour:

irb(main):001:0> "test,test,test2,test3".scan(/[^\\],/)
=> ["t,", ",", "t,", "2,"]
irb(main):002:0>

4 total, as expected.


#3

On 7/28/07, Fred P. removed_email_address@domain.invalid wrote:

I have a string that looks like this: “test,test”. And I’m trying to
count the commas which don’t follow a backslash. But when using
scan(/[^\\],/) it only finds the first comma. I think this is because
after scan() has found a comma without a backslash, it continues from
where it left off, and since there’s no character before the next comma,
it won’t match it. Does anyone know how to fix it?

Works for me:

irb(main):001:0> "test,test".scan(/[^\\],/)
=> ["t,", ","]

Todd


#4

Fred P. wrote:

I have a string that looks like this: “test,test”. And I’m trying to
count the commas which don’t follow a backslash. But when using
scan(/[^\\],/) it only finds the first comma. I think this is because
after scan() has found a comma without a backslash, it continues from
where it left off, and since there’s no character before the next comma,
it won’t match it. Does anyone know how to fix it?

Just wanted to point out a mistake the OP and all of the replying people
made:
"\," # => ","
The \ is not escaped. You need '\,' # => "\\,"

Regards
Stefan
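
To make the escaping point concrete, here is a small sketch of how Ruby reads the three spellings. The variable names are only for illustration and do not come from the thread.

```ruby
# In a double-quoted literal, "\," is not a recognized escape sequence,
# so Ruby drops the backslash and keeps only the comma.
dropped = "\,"

# In a single-quoted literal the backslash before the comma is kept.
kept = '\,'

# The double-quoted spelling of the same two characters needs "\\".
doubled = "\\,"

p dropped   # prints ","
p kept      # prints "\\,"
p doubled   # prints "\\,"
```

So a test string typed as "test,\,test" in irb contains no backslash at all, which is why the earlier results look fine.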


#5

Hmm, it appears I made a typo in my actual code. But this still isn’t the
result I want.

For instance “test,test” only returns one comma. And the results you have
shown above, are returning the wrong number as well.


#6

On 7/28/07, Fred P. removed_email_address@domain.invalid wrote:

Hmm, it appears I made a typo in my actual code. But this still isn’t the
result I want.

For instance “test,test” only returns one comma. And the results you have
shown above, are returning the wrong number as well.

Fred:

Ah, I see what you are saying in your original post.

"test,,test".scan(/.,/)
=> ["t,"]

It should find “,,” too, right?

Interesting behavior and has nothing to do with backslashes, just the
fact that scan won’t find it if the character is adjacent to itself.
Like you said, where it starts the next scan, maybe. I’m not sure how
to get around it without playing around a bit.

Todd
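
For what it's worth, a small sketch of what seems to be going on here: scan looks for the next match starting where the previous match ended, so matches never overlap. The lookahead variant below is only one possible workaround and is not something proposed in this thread.

```ruby
str = "test,,test"

# /.,/ consumes the character before the comma. After matching "t," the
# scan resumes at the second comma, which is followed by "t", so nothing
# else matches.
consuming = str.scan(/.,/)

# A lookahead checks for the comma without consuming it, so the first
# comma is still available as the "character before" the second one.
lookahead = str.scan(/.(?=,)/)

p consuming  # prints ["t,"]
p lookahead  # prints ["t", ","]
```

The same reasoning applies to /[^\\],/: the character class eats the character in front of the comma, so two adjacent commas can only produce one match.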


#7

On 7/28/07, Fred P. removed_email_address@domain.invalid wrote:

I have a string that looks like this: “test,test”. And I’m trying to
count the commas which don’t follow a backslash. But when using
scan(/[^\\],/) it only finds the first comma. I think this is because
after scan() has found a comma without a backslash, it continues from
where it left off, and since there’s no character before the next comma,
it won’t match it. Does anyone know how to fix it?

Are you expecting something like this,

str = "test,test"
p str.scan(/,/) #[","]

to give you [",",","] ?

Harry


#8

Excellent, that works like a charm. Thanks.


#9

Fred P. wrote:

I have a string that looks like this: “test,test”. And I’m trying to
count the commas which don’t follow a backslash. But when using
scan(/[^\\],/) it only finds the first comma. I think this is because
after scan() has found a comma without a backslash, it continues from
where it left off, and since there’s no character before the next comma,
it won’t match it. Does anyone know how to fix it?

If you can use Ruby 1.9, you can simply express it using ‘negative look
behind’:

irb(main):001:0> "test,\,test".scan(/(?<!\\),/)
=> [",", ","]
irb(main):002:0> "test,\,test,test2,\,test3".scan(/(?<!\\),/)
=> [",", ",", ",", ","]

Wolfgang Nádasi-Donner
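
A minimal sketch of the same lookbehind, assuming the test string is written as a single-quoted literal so that the backslash actually ends up in the string. Note that this simple pattern also treats a comma that follows an escaped backslash (\\,) as escaped, which may or may not be what you want.

```ruby
# Single quotes keep the backslash: the string is test,,\,test
str = 'test,,\,test'

# (?<!\\) is a negative lookbehind: match a comma only when the
# character right before it is not a backslash. Requires Ruby 1.9+.
unescaped = str.scan(/(?<!\\),/)

p unescaped        # prints [",", ","]
p unescaped.size   # prints 2
```

Because the lookbehind does not consume anything, adjacent commas are each matched on their own.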


#10

On 7/28/07, Harry K. removed_email_address@domain.invalid wrote:

str = "test,test"
p str.scan(/,/) #[","]

to give you [",",","] ?

Harry

If so,
here is one way.
There is probably a more elegant way.

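require 'enumerator' # only needed on Ruby 1.8
str = "test,\\,\\,"
# Walk over adjacent character pairs and print each comma whose
# predecessor is not a backslash.
str.split(//).each_cons(2) { |x| p x[1] if x[0] != "\\" and x[1] == "," }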

Harry
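
Another possible way that avoids both lookbehind and each_cons: let the regex swallow every backslash together with the character it escapes, and capture only the commas that are left over. This is a sketch, and count_unescaped_commas is a made-up helper name.

```ruby
# Hypothetical helper: counts commas that are not escaped by a backslash.
def count_unescaped_commas(str)
  # The first alternative eats a backslash plus the escaped character;
  # the second captures a bare comma. With a capture group, scan returns
  # one array per match, holding nil when the first alternative matched.
  str.scan(/\\.|(,)/).flatten.compact.size
end

p count_unescaped_commas('test,,\,test')   # prints 2
p count_unescaped_commas('test,\,\,')      # prints 1
p count_unescaped_commas(',test')          # prints 1
```

Unlike the pairwise comparison, this also counts a comma at the very start of the string, and it should work on Ruby 1.8 since it needs no lookbehind.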