Return number of matches

This seems like an easy question but I cannot find any documentation on
it anywhere on the web.

I want to write a script that will count the number of dollars in a
string, excluding escaped dollars (i.e. “\$”). My first hack was to use
split and count the number of elements in the array, but I came unstuck
trying to exclude \$. Surely there’s a nice simple elegant solution?

I’m very (as in this morning) new to Ruby so apologies if this is a
trivial question and I’m posting on the wrong forum.

Chris

On Feb 11, 2008 3:24 PM, Christopher C. wrote:

This seems like an easy question but I cannot find any documentation on
it anywhere on the web.

I want to write a script that will count the number of dollars in a
string, excluding escaped dollars (i.e. “\$”). My first hack was to use
split and count the number of elements in the array, but I came unstuck
trying to exclude \$. Surely there’s a nice simple elegant solution?

Easy solution: count $ and subtract \$ ;-)

string.scan(/\$/).size - string.scan(/\\\$/).size

It’s possible to do it in one regex, but it won’t be nice and elegant
and I don’t have the time now to try it…

Jano S. wrote:

It’s possible to do it in one regex, but it won’t be nice and elegant
and I don’t have the time now to try it…

I was so psyched in doing it in one regexp that I completely missed the
absolute beauty that you posted. Thanks a lot for the help.

On 11.02.2008 15:48, Christopher C. wrote:

Jano S. wrote:

It’s possible to do it in one regex, but it won’t be nice and elegant
and I don’t have the time now to try it…

I was so psyched in doing it in one regexp that I completely missed the
absolute beauty that you posted. Thanks a lot for the help.

With 1.9 you can use negative lookbehind. If that’s not available then
it becomes ugly (if it is possible at all). The simplest pre-1.9
single-pass solution that comes to mind is this:

irb(main):012:0> c=0
=> 0
irb(main):013:0> "$\\$$".scan(/\$/) { c+=1 unless $`[-1] == ?\\ }
=> "$\\$$"
irb(main):014:0> c
=> 2
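
For comparison, the 1.9 negative lookbehind mentioned above might look roughly like this. It is only a minimal sketch, assuming Ruby 1.9 or later; the helper name count_unescaped_dollars is made up for illustration. Note that it only looks at the single character before each dollar, so it treats a dollar after an escaped backslash as escaped too.

```ruby
# Hypothetical helper (not from the thread): count dollars that are not
# immediately preceded by a backslash, using a negative lookbehind.
def count_unescaped_dollars(str)
  # (?<!\\) asserts that the previous character is not a backslash;
  # \$ then matches a literal dollar sign.
  str.scan(/(?<!\\)\$/).size
end

# The string from the irb session above: characters are $ \ $ $
p count_unescaped_dollars("$\\$$")   # => 2

# Characters $ \ \ $: the last dollar follows a backslash, so it is
# skipped even though that backslash is itself escaped.
p count_unescaped_dollars("$\\\\$")  # => 1
```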

Kind regards

robert

On Feb 11, 8:41 am, Jano S. wrote:

Easy solution: count $ and subtract \$ ;-)

string.scan(/\$/).size - string.scan(/\\\$/).size

It’s possible to do it in one regex, but it won’t be nice and elegant
and I don’t have the time now to try it…

That fails when the backslash preceding a $
is itself escaped.

irb --prompt xmp
string = "$\\\\$"
==>"$\\\\$"
string.scan(/\$/).size - string.scan(/\\\$/).size
==>1

Try this:

p DATA.read.scan(/(\\.|[$])/).flatten.grep(/^.$/).size

__END__
$\$$$

string.scan(/[^\\]\$/).size

This fails on input like “$$$$”, since each match consumes the character
before the dollar as well: the scan finds “$$” and “$$” and counts two
instead of four. It will also not find the first $ in the string, but
this could be amended by using (^|[^\\]).
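
To make that concrete, here is a small sketch of both patterns run against a few inputs. The constant names NAIVE and AMENDED are mine, and the amended pattern uses a non-capturing group (?:...) only so that scan returns the matched text rather than the group contents.

```ruby
# Pattern from the question: one non-backslash character, then a dollar.
NAIVE = /[^\\]\$/
# Amended pattern: additionally accept a dollar at the start of a line.
AMENDED = /(?:^|[^\\])\$/

# Each match swallows the character in front of the dollar, so
# neighbouring dollars are consumed in pairs.
p "$$$$".scan(NAIVE)    # => ["$$", "$$"]

# A lone dollar has no preceding character for [^\\] to match.
p "$".scan(NAIVE)       # => []

# The amendment finds the leading dollar but still undercounts runs.
p "$".scan(AMENDED)     # => ["$"]
p "$$$$".scan(AMENDED)  # => ["$", "$$"]
```

So the amendment fixes the first-character case but not the consecutive-dollar case, which is why a lookbehind or a manual check of the preceding character is needed.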

William J. wrote:

On Feb 11, 8:41 am, Jano S. wrote:

Easy solution: count $ and subtract \$ ;-)

string.scan(/\$/).size - string.scan(/\\\$/).size

It’s possible to do it in one regex, but it won’t be nice and elegant
and I don’t have the time now to try it…

That fails when the backslash preceding a $
is itself escaped.

irb --prompt xmp
string = "$\\\\$"
==>"$\\\\$"
string.scan(/\$/).size - string.scan(/\\\$/).size
==>1

Erm, not sure what you’re trying to say. One is the answer I would have
wanted. Thanks anyway for the one liner though.

Just out of interest, why doesn’t my first effort work?

string.scan(/[^\\]\$/).size

?

On Feb 12, 3:10 am, Christopher C. wrote:

is itself escaped.

irb --prompt xmp
string = "$\\\\$"
==>"$\\\\$"
string.scan(/\$/).size - string.scan(/\\\$/).size
==>1

Erm, not sure what you’re trying to say. One is the answer I would have
wanted. Thanks anyway for the one liner though.

You said that you wanted to count all dollar signs except the
escaped ones. So the correct answer is 2. The first backslash
escapes the second one, which escapes nothing.

On Feb 12, 2008 10:10 AM, Christopher C. wrote:

is itself escaped.

irb --prompt xmp
string = "$\\\\$"
==>"$\\\\$"
string.scan(/\$/).size - string.scan(/\\\$/).size
==>1

Erm, not sure what you’re trying to say. One is the answer I would have
wanted. Thanks anyway for the one liner though.

William’s trying to say that you have three interesting items in your
string, and you haven’t counted one:
a plain dollar $, an escaped dollar \$ and an escaped backslash \\. If
an escaped backslash precedes an unescaped dollar, \\$,
in your definition of the problem (and in my solution) it is not
counted.
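
One way to count all three kinds of item correctly is to tokenize the string first, in the spirit of William's one-liner. This is just a sketch; the method name count_plain_dollars is hypothetical, and it assumes a backslash escapes whatever single character follows it.

```ruby
# Hypothetical helper: split the string into escape pairs and bare
# dollars, then count only the bare dollars.
def count_plain_dollars(str)
  # \\. consumes a backslash together with the character it escapes
  # (the /m flag lets that character be a newline as well);
  # \$ picks up any dollar not already consumed by an escape pair.
  tokens = str.scan(/\\.|\$/m)
  tokens.count("$")
end

s = "$\\\\$"                 # four characters: $ \ \ $
p s.scan(/\\.|\$/m)          # => ["$", "\\\\", "$"]
p count_plain_dollars(s)     # => 2

# The subtraction approach sees "\$" at the end and reports 1 here.
p s.scan(/\$/).size - s.scan(/\\\$/).size  # => 1
```

Because the alternation tries the escape pair first at each position, an escaped backslash is consumed as a unit and cannot be mistaken for the escape of the dollar that follows it.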

William J. wrote:

You said that you wanted to count all dollar signs except the
escaped ones. So the correct answer is 2. The first backslash
escapes the second one, which escapes nothing.

Yes, thanks, I realize now. 1 was the answer I would have wanted because
it wasn’t preceded by a space, but I could have made that much clearer
in my question.

Thanks a lot guys. You’ve been most helpful. Have loads more basic
questions like that. Is this the right forum for such questions?