'\\\\\\\\' madness?

fstroehle · September 7, 2009, 10:40am

hi,
while trying to solve a problem someone posted on the mailing list, I just
created my own.
Why do 2 backslashes produce the same result as 4, and 6 the same as 8?

014:0> 'ab'.gsub('a', '\\')
=> "\\b"
015:0> 'ab'.gsub('a', '\\\\')
=> "\\b"
016:0> 'ab'.gsub('a', '\\\\\\')
=> "\\\\b"
017:0> 'ab'.gsub('a', '\\\\\\\\')
=> "\\\\b"
018:0> RUBY_DESCRIPTION
=> "ruby 1.8.7 (2009-06-12 patchlevel 174) [i686-linux]"

Doesn’t that strike anyone as odd?
Greetz!

fstroehle · September 7, 2009, 11:22am

On Mon, 2009-09-07 at 17:36 +0900, Fabian S. wrote:

017:0> 'ab'.gsub('a', '\\\\\\\\')
=> "\\\\b"
018:0> RUBY_DESCRIPTION
=> "ruby 1.8.7 (2009-06-12 patchlevel 174) [i686-linux]"

Doesn’t that strike anyone as odd?
Greetz!

Be careful with gsub, it does a lot of magic. From the documentation:

If a string is used as the replacement, special variables from the
match (such as +$&+ and +$1+) cannot be substituted into it, as
substitution into the string occurs before the pattern match
starts. However, the sequences +\1+, +\2+, and so on may be used to
interpolate successive groups in the match.

To fix, put the replacement into a block:

irb(main):001:0> s = '\\'
=> "\\"
irb(main):002:0> s.size
=> 1
irb(main):003:0> s = s * 10
=> "\\\\\\\\\\\\\\\\\\\\"
irb(main):004:0> s.size
=> 10
irb(main):005:0> 'a'.gsub('a', s).size
=> 5
irb(main):006:0> 'a'.gsub('a') { s }.size
=> 10
irb(main):007:0> 'a'.gsub('a') { s }
=> "\\\\\\\\\\\\\\\\\\\\"
irb(main):008:0> 'a'.gsub('a', s)
=> "\\\\\\\\\\"
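The same comparison as a standalone script, in case the IRB quoting is hard to follow. This is only a sketch: the variable names are made up for illustration, and it checks sizes instead of relying on #inspect output.

```ruby
# Illustrative names only; plain Ruby, no gems needed.
backslashes = '\\' * 4   # four literal backslash characters

# String replacement: gsub reads each pair of backslashes as one escaped backslash.
via_string = 'a'.gsub('a', backslashes)

# Block replacement: the block's return value is inserted as-is.
via_block = 'a'.gsub('a') { backslashes }

puts backslashes.size   # 4
puts via_string.size    # 2
puts via_block.size     # 4
```

So the block form is the one to reach for whenever the replacement text should go in untouched.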

Martin

fstroehle · September 7, 2009, 4:24pm

2009/9/7 Fabian S.:

017:0> 'ab'.gsub('a', '\\\\\\\\')
=> "\\\\b"
018:0> RUBY_DESCRIPTION
=> "ruby 1.8.7 (2009-06-12 patchlevel 174) [i686-linux]"

Doesn’t that strike anyone as odd?

Well, not me anyway. You can find a lot of threads about this in
the archives… (hint, hint)

The short story: you need to keep in mind that there are multiple
levels of escaping going on, which unfortunately all use the same character
("\") as meta character, AND output in IRB is done via #inspect.

To get a single backslash in a string you need to type two:

irb(main):001:0> puts '\\'
\
=> nil

In order to make gsub use a backslash literally (and not interpret it
as meta character) you need to have two of them in the string:

irb(main):002:0> puts '\\\\'
\\
=> nil
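Putting the layers side by side may help: what you type, what the string holds, what gsub inserts, and what #inspect shows. A small sketch, with variable names chosen only for illustration:

```ruby
# Layer 1: the source literal. Four typed backslashes store two characters.
replacement = '\\\\'

# Layer 2: gsub's own escaping. The stored pair collapses to one backslash.
result = 'ab'.gsub('a', replacement)

puts replacement.size   # 2
puts result.size        # 2 (one backslash plus "b")
puts result             # \b
# Layer 3: #inspect, which is what IRB prints, escapes the backslash again.
puts result.inspect     # "\\b"
```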

Now, there are some corner cases: your first example uses a
replacement string with just a single backslash, so it loses its
metaness (because there is nothing after it to escape). This is true
for all cases where nothing that can be escaped follows it:

irb(main):004:0> puts '\\1', '\1'
\1
\1
=> nil
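For contrast, here is roughly how those cases play out inside gsub itself. This is a sketch with made-up variable names, using a regexp with one group so that \1 has something to refer to:

```ruby
# A lone backslash has nothing after it to escape, so gsub keeps it literally.
lone = 'ab'.gsub('a', '\\')

# Backslash plus digit is a group reference: \1 becomes the first group.
group = 'ab'.gsub(/(a)/, '<\1>')

# An escaped backslash followed by a digit gives a literal backslash and "1".
literal = 'ab'.gsub(/(a)/, '<\\\\1>')

puts lone.size   # 2
puts group       # <a>b
puts literal     # <\1>b
```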

HTH

Kind regards

robert

fstroehle · September 8, 2009, 1:29pm

Well, not me anyway. You can find a lot of threads about this in
the archives… (hint, hint)

ah, yeah, there was something… I guess I’m gonna hand myself a quick
JFGI here…
sorry…

well… thanks anyways, I think I get it now. that magic is reaaaally
mean.
Greetz!