Doing a simple count

Hello,
I’ve got some enormous RTF files and I need to get a count of the number
of footnotes in them. So, I’m trying this:

Dir.chdir("T:/rtf")
file_contents = File.read("1.rtf")
count = file_contents.count "\footnote"
puts count

I’m getting an enormous value (115683) in the hundreds of thousands.
And, with my text editor, I know that there are only hundreds (582). Can
someone please explain why this is happening?

Thanks,
Peter

On Wednesday 16 March 2011 21:41:25 Peter B. wrote:

> And, with my text editor, I know that there are only hundreds (582). Can
> someone please explain why this is happening?
>
> Thanks,
> Peter

String#count doesn’t work the way you expect. It doesn’t count the number
of occurrences of the argument in the receiver, but the number of
occurrences of any one of the characters making up the argument. For
example:

"ab ac ad".count "ab"
=> 4

"ab ac ad".count "ae"
=> 3

In the first example, 4 is obtained by summing the 3 occurrences of ‘a’
and the one occurrence of ‘b’. In the second, ‘e’ is never found, so only
the three occurrences of ‘a’ are counted. For other examples, see the ri
documentation for String#count.
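
To make the character-by-character behaviour easy to check, here is a small sketch using the same string as above. The variable names are only for illustration.

```ruby
text = "ab ac ad"

# Each character of the argument is counted separately.
both   = text.count("ab")   # occurrences of 'a' plus occurrences of 'b'
only_a = text.count("ae")   # 'e' never appears, so only the 'a's count

# The same total, built by adding the per-character counts by hand.
by_hand = text.count("a") + text.count("b")

puts both, only_a, by_hand
```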

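To obtain what you want, you can use

file_contents.scan(/\\footnote/).count

The backslash is doubled because, inside a regexp, \f on its own stands
for a form feed character rather than a backslash followed by ‘f’.

I don’t know if there’s a better way.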
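
As a rough sketch of the difference between the two approaches, the snippet below runs both on a short made-up string instead of the real file. The sample text and variable names are assumptions for illustration only.

```ruby
# Hypothetical sample standing in for the contents of an RTF file.
# Single quotes keep the backslashes literal.
sample = 'Body{\footnote first note} more text{\footnote second note}'

# In a regexp, \\ matches one literal backslash.
with_regexp = sample.scan(/\\footnote/).size

# scan also accepts a plain string, which is matched literally.
with_string = sample.scan('\footnote').size

# What the original attempt did: in double quotes "\f" is a form feed,
# so this counts every form feed, 'o', 't', 'n' and 'e' in the text.
char_count = sample.count("\footnote")

puts with_regexp, with_string, char_count
```

On real RTF the last figure grows with the amount of ordinary text, which would explain a count in the hundreds of thousands.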

I hope this helps

Stefano