Need a better explanation on this \b

class String
  def titleize
    self.gsub(/\b\w/) { |letter| letter.upcase }
  end
end

puts "Hello i'm bob".titleize

Question_1: I’m a bit confused about how \b works. I read Peter Cooper’s
book, which says that it separates non-words from words. Can someone give
a clearer explanation of \b? An explanation of \w would be good as well.


class String
  def titleize1
    self.gsub(/(\A|\s)\w/) { |letter| letter.upcase }
  end
end

puts "Hello i'm Bob".titleize1

Question_2: I don’t understand \A either, or how it works inside
(\A|\s)\w.

I admit I’m not amazing at regular expressions, but with a bit of help
from rubular.com I think I’ve got a handle on this one. If you haven’t
seen it before, rubular is really helpful for figuring out regular
expressions. It has a key of what the different matchers are, and you
can plug in regular expressions and example strings and it will tell you
what matches.

I’m looking at the descriptions at rubular.com and here’s what I see:

\b Any word boundary
\w Any word character (letter, number, underscore)
\A Start of string
\s Any whitespace character

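So your first one is going to match the first word character (letter,
number, underscore) after a “word boundary”. A word boundary is not a
character itself: it is the zero-width position between a word character
and a non-word character (or the start or end of the string). Whitespace
is one common place for it, but so is an apostrophe.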
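
To see where \b actually matches, here is a minimal irb sketch. It assumes the sample string from the question, typed with a straight apostrophe, and uses "|" purely as a visible marker for each boundary.

```ruby
text = "Hello i'm bob"

# \b is zero-width: it matches a position, not a character.
# Replacing every boundary with "|" makes those positions visible.
puts text.gsub(/\b/, "|")   # |Hello| |i|'|m| |bob|

# \b\w is therefore "one word character that starts a run of word characters".
p text.scan(/\b\w/)         # ["H", "i", "m", "b"]
```

Note that the apostrophe creates boundaries on both sides of it, which is why "m" counts as the start of a word here.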

Here’s a rubular example showing that regular expression:

As for the second one, it looks like it’s “start of string or
whitespace, followed by a word character”. Here it is in rubular:
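
The same comparison can be reproduced in plain Ruby. This is only a sketch using the sample string from the question (straight apostrophe assumed); the non-capturing group (?:...) in the last line is my own substitution so that scan returns whole matches instead of captured groups.

```ruby
text = "Hello i'm bob"

# \b\w: the apostrophe is not a word character, so "m" also starts a "word".
puts text.gsub(/\b\w/) { |match| match.upcase }        # Hello I'M Bob

# (\A|\s)\w: "start of string or one whitespace character, then one word character".
# The block receives the whitespace too; upcase leaves it unchanged.
puts text.gsub(/(\A|\s)\w/) { |match| match.upcase }   # Hello I'm Bob

# What the second pattern actually matches, whitespace included.
p text.scan(/(?:\A|\s)\w/)                             # ["H", " i", " b"]
```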

I hope that helps!

<3 Pat

Ronald F. wrote in post #1185011:

Say that your string is

'+., -ABC012_:/,'

It contains one word (ABC012_) and two word boundaries (the one between
‘-’ and ‘A’, and the one between ‘_’ and ‘:’).

These boundaries (i.e. substrings of length zero) are matched by a \b

Each of the word characters (‘A’, …, ‘_’) is matched by a \w
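
A short sketch to check this in irb, using the string above. The "|" marker is just an illustration device, not part of the original example.

```ruby
text = '+., -ABC012_:/,'

# Mark each zero-length word boundary with "|": there are exactly two.
puts text.gsub(/\b/, "|")   # +., -|ABC012_|:/,

# \w+ takes the whole run of word characters, underscore included.
p text.scan(/\w+/)          # ["ABC012_"]

# A single \w matches one character at a time: A, B, C, 0, 1, 2, _
p text.scan(/\w/).length    # 7
```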

Ok, I have tried your string ‘+., -ABC012_:/,’ at rubular.com.
The result is as follows:
ABC012_

Why did it include the _ (underscore)? I kind of get the basic \b: if
the string is “Hello my name is bob”, then it breaks it up like
|Hello| |my| |name| |is| |bob|.

Suresh Ilankovan wrote in post #1185012:

Ronald F. wrote in post #1185011:
why did they include the _(underscore) to it?

Because it is a word character.

Ronald F. wrote in post #1185016:

Ok, thanks a lot. I have one last question.

If I apply \b\w to the string “hello hello” and upcase it,

why does \w only upcase the first letter of each word?
Is it because of gsub? I know \w+ takes each whole word.

What do you mean by “upcase it”? What did you upcase? Please provide a
complete example.

\w+ matches one or more word characters.
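
On the gsub part of the question, a minimal sketch, assuming the “hello hello” string mentioned above: gsub only decides how many matches get replaced, while the pattern decides how long each match is.

```ruby
text = "hello hello"

# sub replaces only the first match of \b\w.
puts text.sub(/\b\w/)  { |letter| letter.upcase }   # Hello hello

# gsub replaces every match, but each match is still a single character.
puts text.gsub(/\b\w/) { |letter| letter.upcase }   # Hello Hello
```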

Suresh Ilankovan wrote in post #1185024:

Ronald F. wrote in post #1185023:
My question is: why did it only upcase the first letter of each word?
Doesn’t \w also match more than one character?

No. As I said in my previous answers:

“Each of the word characters (‘A’, …, ‘_’) is matched by a \w”
(maybe I should have said more clearly “…matched by ONE \w”), and
“\w+ matches one or more word characters.”

Of course, if you match against \w+, there is - in your example - no
point in using \b.
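
A small sketch of the difference, assuming the “hello hello” string from the question. The capitalize line is my own addition, shown only as one way to get title case out of whole-word matches.

```ruby
text = "hello hello"

# One \w after a boundary: the block sees a single letter per match.
puts text.gsub(/\b\w/)  { |match| match.upcase }      # Hello Hello

# \w+ is greedy and takes the whole word, with or without \b.
puts text.gsub(/\b\w+/) { |match| match.upcase }      # HELLO HELLO
puts text.gsub(/\w+/)   { |match| match.upcase }      # HELLO HELLO

# With whole-word matches, capitalize gives the title-case result.
puts text.gsub(/\w+/)   { |match| match.capitalize }  # Hello Hello
```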

Please study the documentation about the elements in the Ruby regular
expressions.

Ronald F. wrote in post #1185023:

Say, for example, I have this method:


class String
  def titleize
    self.gsub(/\b\w/) { |letter| letter.upcase }
  end
end

puts "Hello i'm bob".titleize

The output is “Hello I’M Bob”.
My question is: why did it only upcase the first letter of each word?
Doesn’t \w also match more than one character?