Gsub("\\", "\\\\") seems unintuitive

The following confusing behavior is noted in the pickaxe book (2nd ed)
on page 75:

I would expect two backslashes in the result

irb> puts "\\".gsub("\\","\\\\")
\

I would expect four backslashes in the result

irb> puts "\\".gsub("\\","\\\\\\\\")
\\

I can certainly work around it, but it seems unintuitive. Is there a
reason why gsub behaves this way? Just curious…

John W. wrote:

I can certainly work around it, but it seems unintuitive. Is there a
reason why gsub behaves this way? Just curious…

It’s not a gsub thing, per se; it’s a string thing. Backslashes are used
in strings to escape special characters. One such character is a " mark.
If you want to write a " mark in the middle of a string, you have to
escape it with a backslash:
"They call me \"Mellow Yellow\" etc."

If you didn’t, then the " would signify the end of the string!
Similarly, in the example you listed, if you just did:
"\"
then you end up with a string that ISN’T ended, because you escaped the
closing ". So, if you want a literal backslash, you have to escape the
backslash too: "\\"

It just looks confusing because you are escaping the escape character :)
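
A minimal sketch of the string-literal escaping described above, with no gsub involved yet. The variable names are only illustrative:

```ruby
# String-literal escaping only: what the source text says vs. what the string holds.
quoted    = "They call me \"Mellow Yellow\" etc."
backslash = "\\"        # two characters in the source, one in the string

puts quoted             # They call me "Mellow Yellow" etc.
puts backslash          # \
puts backslash.length   # 1
puts "\\\\".length      # 2
```

Checking the length is a quick way to see how many backslashes a literal really contains.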

On Feb 22, 11:27 am, John W. wrote:

I can certainly work around it, but it seems unintuitive. Is there a
reason why gsub behaves this way? Just curious…

puts "\\".gsub("\\"){"\\\\"}

On Feb 22, 2008, at 12:27 PM, John W. wrote:

I can certainly work around it, but it seems unintuitive. Is there a
reason why gsub behaves this way? Just curious…

Notwithstanding the earlier responses…

Since the replacement string is evaluated ‘twice’, once as a Ruby
string literal and then again by gsub to look for group references like
‘\1’, you need to provide two levels of escaping for a backslash.

\ is "\\"
so two of them is "\\\\"
and you want gsub to see that, so it needs to have them escaped:
"\\\\\\\\"

Whew! Yeah, it’s unfortunate, but backslash is doing double duty
here: introducing a reference to a group in the regular expression and
escaping characters in a string literal (just like "\n", but also
itself).
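
The two levels of escaping can be checked directly. This is a small sketch with illustrative variable names; it compares the replacement-string form with the block form, whose return value gsub does not scan for backslash escapes:

```ruby
one = "\\"                        # a string holding one backslash

# Replacement-string form: gsub processes backslash escapes in the replacement.
r1 = one.gsub("\\", "\\\\")       # replacement holds 2 backslashes, gsub emits 1
r2 = one.gsub("\\", "\\\\\\\\")   # replacement holds 4 backslashes, gsub emits 2

# Block form: the block's return value is inserted as-is.
r3 = one.gsub("\\") { "\\\\" }    # block returns 2 backslashes, result has 2

puts r1.length   # 1
puts r2.length   # 2
puts r3.length   # 2
```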

-Rob

Rob B. http://agileconsultingllc.com

Just as a side note, this is typical in all real programming languages:
C, C++, Java, Perl, sh, etc.
I believe it’s also true in Python, lithp/scheme, & (o)caml, but for
those I’ve either not used them, or not used them in so long I’m
unsure.

Some languages, like vb{6|script|.net}, use a doubled quote, but those
aren’t really proper programming languages ;)

–Kyle

Thanks Rob, that’s exactly what I was missing – the second round of
escaping is necessary to make escaped references to regex groups work.
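
As a sketch of why gsub scans the replacement at all, here is a group reference next to a literal \1. The sample strings are made up for illustration:

```ruby
# "\\2" in source is the two characters \2, which gsub reads as "group 2".
swapped = "John Smith".gsub(/(\w+) (\w+)/, "\\2, \\1")
puts swapped   # Smith, John

# To emit a literal \1 instead, the backslash must also be escaped for gsub.
literal = "x".gsub(/x/, "\\\\1")
puts literal   # \1
```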

I stumbled onto this thread trying to figure out a similar problem…

Joe P. wrote:

I stumbled onto this thread trying to figure out a similar problem…

Dangit, I didn’t mean to post the above message :)

Okay, I need to read a site name (like "joe.com") and put a \ in front
of the period (so it would be "joe\.com"). I’m having trouble with
this; I can’t seem to get only ONE \ in front of the period…

"joe.com".gsub(/\./, "\\.")
=> "joe\\.com"
"joe.com".gsub(/\./, "\\\\.")
=> "joe\\.com"

Anyone have a clue what to do? I’m probably missing something simple.

On Wed, Dec 17, 2008 at 2:12 PM, Joe P. wrote:

Anyone have a clue what to do? I’m probably missing something simple.

That’s because you are in irb. You are getting what you want; try this:

puts "joe.com".gsub(/\./, "\\\\.")

Actually it might be more readable as gsub('.', '\.')


Okay, those last two explanations helped clear up my confusion. Thanks.

On Wed, Dec 17, 2008 at 3:12 PM, Joe P. wrote:

Anyone have a clue what to do? I’m probably missing something simple.

You are looking at a string with its escape characters shown (i.e. what
it would look like in double quotes). You can verify this by using #puts
or "joe.com".gsub('.', '\.').length.
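
A short sketch of that check, comparing what puts prints, what irb-style inspect output shows, and the actual length. The variable name is illustrative:

```ruby
# A string pattern "." is matched literally, so only the dot is replaced.
escaped = "joe.com".gsub(".", "\\.")

puts escaped          # joe\.com      (the real contents)
p escaped             # "joe\\.com"   (inspect form, as irb displays it)
puts escaped.length   # 8, i.e. the 7 original characters plus ONE backslash
```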

hth,
Todd