What I’d like is to grab all the “words” in the string. So in the above
example I’d like two matches, cost and tax.
Any ideas?
'cost * tax'.scan(/\w+/)
=> ["cost", "tax"]
How do you people do that? The last time I had a regexp question someone
came down from the clouds and handed me something about that short. Why
do I think it’s more difficult than it is?
After making the example a little more complex I had to change it
ever-so-slightly…
Well, a regexp always matches the longest possible string at the point where it matches.
What you wrote is effectively equivalent to ([a-z]*).
A single regexp match can’t return multiple strings; it always returns
one. It can’t match the space after ‘cost’ either, since that
character wasn’t included in your regexp.
If you want to match two words, you should write e.g.
([[:alpha:]]+)[[:space:]]+([[:alpha:]]+)
This regexp will match two words separated by whitespace.
A single match can’t capture an undefined number of words; you have to know
in advance how many words you want to match.
For more info on regexps see e.g. re_format(7).
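To see what the two-word pattern above does in Ruby, here is a minimal sketch. The constant name TWO_WORDS and the helper two_words are made up for illustration; only the pattern itself comes from the post.

```ruby
# The two-word pattern quoted above, written with POSIX bracket classes.
# TWO_WORDS and two_words are hypothetical names used only for this sketch.
TWO_WORDS = /([[:alpha:]]+)[[:space:]]+([[:alpha:]]+)/

def two_words(str)
  m = TWO_WORDS.match(str)
  # captures returns the text of each parenthesised group, in order
  m ? m.captures : nil
end

p two_words('cost tax')    # two words separated only by whitespace
p two_words('cost * tax')  # the "*" is neither a letter nor whitespace
```

The first call should print ["cost", "tax"]; the second should print nil, because the pattern has no way to step over the asterisk between the two words.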
Hmm… if what you say is true, why does the second poster’s solution
capture multiple words? Wait, I know why. String#scan is different from
String#match. Interesting…
So how does that work if I wanted to match ALL occurrences of \w+
WITHOUT scan?
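One way this could be done without scan is to call Regexp#match repeatedly, each time starting where the previous match ended. This is only a sketch: the helper name all_matches is invented here, and it relies on the optional start-position argument of Regexp#match, which is available in Ruby 1.9 and later.

```ruby
# Collect every match of pattern in str without using String#scan.
# all_matches is a hypothetical helper name, not a built-in method.
def all_matches(str, pattern)
  matches = []
  pos = 0
  # Regexp#match(str, pos) starts searching at character offset pos
  while (m = pattern.match(str, pos))
    matches << m[0]
    # Continue after this match; step one further on an empty match
    # so the loop cannot get stuck at the same position.
    pos = m.end(0) == m.begin(0) ? m.end(0) + 1 : m.end(0)
  end
  matches
end

p all_matches('cost * tax', /\w+/)
```

For the example string this should print ["cost", "tax"], the same result scan gives.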
Is there a book you recommend to learn more about regular expressions?
How did YOU learn them?
“Mastering Regular Expressions” by Jeffrey Friedl. I haven’t seen the
third edition, so I don’t know whether it has any Ruby-specific examples, but even
with all the Perl examples in the first edition, I still use it as a
reference because of the similarities between Perl’s and Ruby’s regular
expressions.
  word.each_byte do |code|
    if code < ?a or code > ?z
      good_word = false
      break
    end
  end
  if good_word
    words << word
  end
end
p words
--output:--
["cost", "tax"]
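The start of that snippet is missing, so here is a self-contained sketch of the same idea. The method name lowercase_words, the use of split to produce the words, and the initial good_word = true are assumptions, not the original poster's code. It also uses 'a'.ord and 'z'.ord rather than ?a and ?z, so that the comparison is between integers on any Ruby version.

```ruby
# Keep only the whitespace-separated tokens made up entirely of a-z.
# lowercase_words and the split step are assumptions for this sketch.
def lowercase_words(str)
  words = []
  str.split.each do |word|
    good_word = true
    word.each_byte do |code|
      # Reject the token as soon as one byte falls outside 'a'..'z'
      if code < 'a'.ord || code > 'z'.ord
        good_word = false
        break
      end
    end
    words << word if good_word
  end
  words
end

p lowercase_words('cost * tax')
```

For 'cost * tax' this should print ["cost", "tax"], since the lone asterisk fails the byte check.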
That’s clever use of ?a, which I recognize but have never seen anyone
use before. Thanks for the example!
Except under the upcoming revision (1.9) of the (Ruby) Rules of Golf,
the R(uby)&A(ncient) has outlawed that usage, and instituted the
penalty that ?d will no longer be 100, but “d”.
Well, then the least they can do is add Integer#to as an alias for
Integer#upto so we can have a net loss of 1 character in the above
code