Forum: Ruby
Need a better explanation of \b

Suresh Ilankovan (sureshhey)
on 2016-09-14 17:42
class String
  def titleize
    self.gsub(/\b\w/) {|letter| letter.upcase}
  end
end

puts "Hello i'm bob".titleize

Question 1: I'm a bit confused about how \b works. I read in Peter
Cooper's book that it separates non-words from words. Can someone give a
clearer explanation of \b? An explanation of \w would be good as well.

--------------------------------------------------------------------
class String
  def titleize1
    self.gsub(/(\A|\s)\w/) {|letter| letter.upcase}
  end
end

puts "Hello i'm Bob".titleize1

Question 2: I don't understand \A either, or how it works within
(\A|\s)\w.
Pat Maddox (patmaddox)
on 2016-09-15 02:46
I admit I'm not amazing at regular expressions, but with a bit of help
from rubular.com I think I've got a handle on this one. If you haven't
seen it before, rubular is really helpful for figuring out regular
expressions. It has a key of what the different matchers are, and you
can plug in regular expressions and example strings and it will tell you
what matches.

I'm looking at the descriptions at rubular.com and here's what I see:

\b  Any word boundary
\w  Any word character (letter, number, underscore)
\A  Start of string
\s  Any whitespace character

So your first one is going to match the first word character (letter,
number, underscore) after a "word boundary". I'd guess a word boundary
is white space, but I'm not 100% sure.

Here's a rubular example showing that regular expression:
http://rubular.com/r/tVLEpp2u5J
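
To check the whitespace guess, here is a minimal sketch in plain Ruby. The helper name word_starts is made up for illustration; it lists each /\b\w/ match together with its offset. The output suggests that a word boundary is a zero-width position between a word character and a non-word character (or the edge of the string), so an apostrophe or a hyphen creates one just as a space does.

```ruby
# Hypothetical helper (not from the thread): collect [offset, character]
# for every match of /\b\w/ in the given text.
def word_starts(text)
  starts = []
  text.scan(/\b\w/) do
    match = Regexp.last_match        # MatchData for the current match
    starts << [match.begin(0), match[0]]
  end
  starts
end

p word_starts("Hello i'm bob")  # => [[0, "H"], [6, "i"], [8, "m"], [10, "b"]]
p word_starts("a-b")            # => [[0, "a"], [2, "b"]]
```

Note that "a-b" contains no whitespace at all, yet both letters are matched.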

As for the second one, it looks like it's "start of string or
whitespace, followed by a word character". Here it is in rubular:
http://rubular.com/r/ybK2BYoCNG
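
A small sketch comparing the two patterns side by side. The method names titleize_boundary and titleize_space are assumptions for this example (plain methods rather than the String patches from the question). The visible difference is the "m" after the apostrophe: \b sees a boundary there, while (\A|\s) only accepts the start of the string or whitespace.

```ruby
# Upcase the first word character after every word boundary.
def titleize_boundary(text)
  text.gsub(/\b\w/) { |letter| letter.upcase }
end

# Upcase a word character only at the start of the string or after
# whitespace. The match includes the whitespace itself, and upcasing
# whitespace leaves it unchanged.
def titleize_space(text)
  text.gsub(/(\A|\s)\w/) { |match| match.upcase }
end

puts titleize_boundary("hello i'm bob")  # Hello I'M Bob
puts titleize_space("hello i'm bob")     # Hello I'm Bob
```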

I hope that helps!

<3 Pat
Ronald Fischer (rovf)
on 2016-09-15 09:28
Say that your string is

    '+., -ABC012_:/,'

It contains one word (ABC012_) and two word boundaries (the one between
'-' and 'A', and the one between '_' and ':').

These boundaries (i.e. substrings of length zero) are matched by a \b

Each of the word characters ('A', ...., '_') are matched by a \w
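
One way to make the zero-width boundaries visible is to substitute a marker at every \b. This is only a sketch; the helper names mark_boundaries and words are invented for the example.

```ruby
# Hypothetical helpers: insert "|" at every word boundary, and list the
# runs of word characters.
def mark_boundaries(text)
  text.gsub(/\b/, "|")
end

def words(text)
  text.scan(/\w+/)
end

sample = '+., -ABC012_:/,'
puts mark_boundaries(sample)  # +., -|ABC012_|:/,
p words(sample)               # => ["ABC012_"]
```

The two "|" markers sit exactly at the two boundaries described above, and nothing from the original string is consumed.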
Suresh Ilankovan (sureshhey)
on 2016-09-15 14:46
Ronald Fischer wrote in post #1185011:
> Say that your string is
>
>     '+., -ABC012_:/,'
>
> It contains one word (ABC012_) and two word boundaries (the one between
> '-' and 'A', and the one between '_' and ':'.
>
> This boundaries (i.e. substrings of length zero) are matched by a \b
>
> Each of the word characters ('A', ...., '_') are matched by a \w

OK, I tried your string '+., -ABC012_:/,' at rubular.com.
The result is as follows:
ABC012_

Why is the _ (underscore) included? I think I get the basic \b case:
if the string is "Hello my name is bob", then it gets broken up like
|Hello| |my| |name| |is| |bob|.
Ronald Fischer (rovf)
on 2016-09-16 07:49
Suresh Ilankovan wrote in post #1185012:
> Ronald Fischer wrote in post #1185011:
> why did they include the _(underscore) to it?

Because it is a word character.
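
A quick check, as a sketch: for ASCII input, \w in Ruby behaves like the character class [a-zA-Z0-9_]. The helper name word_char? is made up for this example.

```ruby
# Hypothetical helper: true if the single character ch matches \w.
def word_char?(ch)
  !(ch =~ /\A\w\z/).nil?
end

p word_char?("_")  # => true
p word_char?("7")  # => true
p word_char?("-")  # => false
p word_char?("'")  # => false
```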
Suresh Ilankovan (sureshhey)
on 2016-09-16 14:07
Ronald Fischer wrote in post #1185016:
> Suresh Ilankovan wrote in post #1185012:
>> Ronald Fischer wrote in post #1185011:
>> why did they include the _(underscore) to it?
>
> Because it is a word character.

OK, thanks a lot. I have one last question.

If I apply \b\w with upcase to the string "hello hello", why does \w
only upcase the first letter of each word? Is it because of gsub? I know
\w+ takes each whole word.
Ronald Fischer (rovf)
on 2016-09-16 14:22
What do you mean by "upcase it"? What did you upcase? Please provide a
complete example.

\w+ matches one or more word characters.
Suresh Ilankovan (sureshhey)
on 2016-09-16 16:05
Ronald Fischer wrote in post #1185023:
> What to you mean by "upcase it"? What did you upcase? Please provide a
> complete example.
>
> \w+ matches one or more word characters.


Say for example i have this method

-----------------------------------------------
class String
  def titleize
    self.gsub(/\b\w/) {|letter| letter.upcase}
  end
end

puts "Hello i'm bob".titleize
----------------------------------------------

So the answer would be "Hello I'M Bob".
My question is: why did it only upcase the first letter of each word?
Doesn't \w also match more than one character?
Ronald Fischer (rovf)
on 2016-09-16 16:27
Suresh Ilankovan wrote in post #1185024:
> Ronald Fischer wrote in post #1185023:
> why question is why did it only upcase the first letter of each word?
> since \w is also more than one characters right?

No. As I said in my previous answers:

"Each of the word characters ('A', ...., '_') are matched by a \w"
(maybe I should have said more clearly "... matched by ONE \w"), and
"\w+ matches one or more word characters."

Of course, if you match against \w+, there is, in your example, no
point in using \b, because each run of word characters already starts
at a word boundary.

Please study the documentation about the elements in the Ruby regular
expressions.
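
A minimal sketch of the difference, using the "hello hello" string from the question. The helper name upcase_matches is an assumption for this example; it upcases whatever each match of the given pattern covers.

```ruby
# Hypothetical helper: upcase every match of pattern in text.
def upcase_matches(text, pattern)
  text.gsub(pattern) { |match| match.upcase }
end

p "hello hello".scan(/\b\w/)  # => ["h", "h"]          one character per match
p "hello hello".scan(/\w+/)   # => ["hello", "hello"]  a whole word per match

puts upcase_matches("hello hello", /\b\w/)   # Hello Hello
puts upcase_matches("hello hello", /\w+/)    # HELLO HELLO
puts upcase_matches("hello hello", /\b\w+/)  # HELLO HELLO
```

gsub only decides that every match gets replaced; how much text each match covers is up to the pattern. A single \w covers exactly one character, so /\b\w/ only ever reaches the first letter of a word.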