lirh
1
$ ruby -v
ruby 1.9.1p429 (2010-07-02 revision 28523) [i386-darwin9]
$ irb --simple-prompt
s = "a\u4e00" # \u4e00 is the Chinese character for "one", in case you can't read the next line
=> "a一"
s.encoding
=> #<Encoding:UTF-8>
s.length
=> 2
s.count("^a")
=> 0
Why is the result above 0 rather than 1? After all, there are 2 characters
in the string s. Is this a bug in String#count?
Thanks in advance.
Ruohao
lirh
2
s = "a\u4e00"
s.count("^a")
=> 0
Why the above result is 0 not 1? After all there are 2 characters
in the string s. Is this a bug of String#count?
count returns the sum of occurrences of characters. I don’t see any ^a
in the original string…
lirh
3
Roger P. wrote:
s = "a\u4e00"
s.count("^a")
=> 0
Why the above result is 0 not 1? After all there are 2 characters
in the string s. Is this a bug of String#count?
count returns the sum of occurrences of characters. I don’t see any ^a
in the original string…
But according to the documentation, "Any other_str that starts with a
caret (^) is negated", hence the following behavior:
$ irb --simple-prompt
s = "abc"
=> "abc"
s.count("^a")
=> 2
There are two characters in s that are not "a".
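To spell out the documented negation rule with plain ASCII input, here is a small sketch. The variable names are only for illustration; the one method used is String#count itself. A leading caret makes count tally every character that is not in the set, so the negated and non-negated counts should add up to the string's length.

```ruby
s = "abc"

# Characters in s that are NOT "a" (leading caret negates the set).
negated = s.count("^a")

# Characters in s that ARE "a".
direct = s.count("a")

puts negated                        # 2
puts direct                         # 1
puts negated + direct == s.length   # true
```

By the same reasoning, one would expect the two counts to add up to 2 for the two-character string in the original question.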
lirh
4
Ruohao Li wrote:
Roger P. wrote:
s = "a\u4e00"
s.count("^a")
=> 0
Why the above result is 0 not 1? After all there are 2 characters
in the string s. Is this a bug of String#count?
count returns the sum of occurrences of characters. I don’t see any ^a
in the original string…
But according to the documentation, “Any other_str that starts with a
caret (^)
is negated”, thus the following behavior:
Probably because non-ASCII characters no longer count as matching /\w/. You
might want to ping core to see whether this is expected or not.
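Whether or not the 0 above is intended, a per-character count does not depend on how String#count interprets a negated set, so it can serve as a cross-check in the meantime. This is only a sketch: count_excluding is a hypothetical helper name, not part of Ruby, and it assumes the string is validly encoded so that each_char yields whole characters.

```ruby
# Hypothetical helper (not a core method): counts the characters of str
# that do not appear in excluded, comparing one character at a time.
def count_excluding(str, excluded)
  str.each_char.count { |ch| !excluded.include?(ch) }
end

s = "a\u4e00"

puts s.length                     # 2
puts count_excluding(s, "a")      # 1
puts count_excluding("abc", "a")  # 2
```

If this helper and s.count("^a") disagree on the same string, that difference is a concrete thing to include in a report to core.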