Surprise in sub

irb(main):001:0> s = "\\\\"
=> "\\\\"
irb(main):002:0> s.length
=> 2
irb(main):003:0> s = "howdy".sub("howdy", s)
=> "\\"
irb(main):004:0> s.length
=> 1

So merely using a string as the second param of sub (the replacement
value) can cause that string to be altered.

Now, the documentation does “warn” that sequences \1, \2 etc. are valid
in the replacement string. This suggests that the replacement string is
processed before use; to be sure, it says nothing about “\\” explicitly,
but I do see of course that one must deal with “\\” in order to escape
the escaping. Furthermore, there’s a “workaround”, namely to write the
third line as follows:

s = "howdy".sub("howdy") {|x| s}

Still, I got seriously caught by this behavior and it was tricky to
track down. m.
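
For anyone who wants to reproduce this outside irb, here is a minimal sketch of the session and the block workaround as a script. The variable names `replacement`, `via_string`, and `via_block` are made up for illustration and are not from the session above.

```ruby
# Two backslash characters (the literal needs four to produce two).
replacement = "\\\\"

# String form: sub processes the replacement string, so the pair of
# backslashes collapses to a single backslash in the result.
via_string = "howdy".sub("howdy", replacement)

# Block form: the block's return value is inserted as-is.
via_block = "howdy".sub("howdy") { replacement }

puts replacement.length  # 2
puts via_string.length   # 1
puts via_block.length    # 2
```

Note that `replacement` itself is never modified; only the text that ends up in the result differs between the two forms.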

On Apr 11, 9:15 am, [email protected] (matt neuburg) wrote:

matt neuburg, phd = [email protected],Matt Neuburg’s Home Page
Leopard -http://www.takecontrolbooks.com/leopard-customizing.html
AppleScript -Amazon.com
Read TidBITS! It’s free and smart.http://www.tidbits.com

s is changing because you assigned to it, not because of using it
as the second parameter of sub(). Try assigning the result to a
different variable like so:

ss = "howdy".sub("howdy", s)

x17y19 [email protected] wrote:

You’re missing the point… m.

From: matt neuburg [mailto:[email protected]]

You’re missing the point… m.

i think i missed the point too :)

you did mention: “So merely using a string as the second param of sub
(the replacement value) can cause that string to be altered.”…

kind regards -botp

Hi,

On Sat, Apr 12, 2008 at 2:20 AM, matt neuburg [email protected] wrote:

irb(main):001:0> s = "\\\\"
=> "\\\\"
irb(main):002:0> s.length
=> 2
irb(main):003:0> s = "howdy".sub("howdy", s)
=> "\\"
irb(main):004:0> s.length
=> 1

Yeah, escaping and escaping-of-escaping with substitution and Strings
being used as a poor-man’s-regexp always catches me out. Thanks for the
heads up on this one.

Arlen

On Fri, Apr 11, 2008 at 9:20 AM, matt neuburg [email protected] wrote:

So merely using a string as the second param of sub (the replacement
value) can cause that string to be altered.

Nope, using the string (s) as the second parameter of sub did nothing to
alter it. This is clear if you use a different variable as the assignment
target:

irb(main):001:0> s='\\\\'
=> "\\\\"
irb(main):002:0> s.length
=> 2
irb(main):003:0> foo = "howdy".sub("howdy",s)
=> "\\"
irb(main):004:0> s
=> "\\\\"
irb(main):005:0> s.length
=> 2
irb(main):006:0> foo
=> "\\"
irb(main):007:0> foo.length
=> 1

s isn’t changed by being used as the second argument to sub. Instead,
the string sent as the second argument to sub is processed for escape
sequences, so that the substring '\\' occurring in that string is
treated as a single literal '\' when used in the replacement.

But it’s not changed, as the above irb session shows. s is unmodified.
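
If the string form is still wanted, one way to sidestep the processing described above is to double every backslash in the replacement first, so that sub turns each pair back into one. This is only a sketch: `literal_replacement` is a hypothetical helper name, not something Ruby provides.

```ruby
# Hypothetical helper: double every backslash so that sub/gsub's
# replacement processing turns each pair back into a single backslash.
# The block form is used here so the doubled backslashes are inserted as-is.
def literal_replacement(str)
  str.gsub("\\") { "\\\\" }
end

s = "\\\\"        # two backslash characters
t = "cost: \\1"   # a literal backslash followed by the digit 1

r1 = "howdy".sub("howdy", literal_replacement(s))
r2 = "howdy".sub("howdy", literal_replacement(t))

puts r1 == s   # true
puts r2 == t   # true
```

Without the helper, `t` would lose its `\1`, since sub would read it as a reference to a group that does not exist.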

On Sat, Apr 12, 2008 at 9:25 AM, Christopher D. [email protected]
wrote:

s isn’t changed by being used as the second argument to sub. Instead, the
string sent as the second argument to sub is processed for escape sequences,
so that the substring '\\' occurring in that string is treated as a
single literal '\' when used in the replacement.

But it’s not changed, as the above irb session shows. s is unmodified.

When I first read the post, I immediately wanted to strike out with a
“well, you’re assigning” response.

I’m not sure, but I think the OP was referring to what you said;
namely, how the escaping happens before subbing.

Todd

Christopher D. [email protected] wrote:

So merely using a string as the second param of sub (the replacement
value) can cause that string to be altered.

Nope, using the string (s) as the second parameter of sub did nothing to alter
it.

I didn’t say that s was altered. I said that the string you provide as
the second param of sub might not be the string that gets substituted in
- as the example demonstrates. If you don’t find this counterintuitive,
you don’t; great. But some people might. Those are the people I’m trying
to help here. m.

From: matt neuburg [mailto:[email protected]]

I didn’t say that s was altered. I said that the string you
provide as the second param of sub might not be the string
that gets substituted in - as the example demonstrates. If
you don’t find this counterintuitive, you don’t; great. But
some people might. Those are the people I’m trying
to help here. m.

I think the confusion stems from the fact that the replacement string
gets unescaped twice:

1 by Ruby's string literal parsing, as usual, for escaping chars like \\ and \"

and

2 by sub/gsub, for the group references like \1 (and for \\ itself)

Note that this behaviour is present in other languages too.

It’s been a long time since I used the string (as 2nd param) form. I
have gotten used to the block form, since it not only handles the
double-escaping issue/confusion but also gives access to the match
vars $1, $`, $& among others…

so, this one, e.g.

irb(main):019:0> "hello".gsub(/([aeiou])/, "<\\1>")
=> "h<e>ll<o>"

now becomes this

irb(main):020:0> "hello".gsub(/([aeiou])/) {|s| "<#{s}>"}
=> "h<e>ll<o>"

or this

irb(main):026:0> "hello".gsub(/([aeiou])/) {"<#$1>"}
=> "h<e>ll<o>"

your choice though.

kind regards -botp
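
A small sketch of the two points made about the block form: the match variables are available per match, and the block's result is not scanned for backreferences. The variable names `marked` and `literal` are made up for illustration.

```ruby
# $& is the matched text; $` is everything before the match.
marked = "hello".gsub(/[aeiou]/) { "[#{$`.length}:#{$&}]" }

# The block's return value is inserted as-is, so "\\1" here stays a
# literal backslash followed by 1 rather than a reference to group 1.
literal = "hello".gsub(/([aeiou])/) { "\\1" }

puts marked   # h[1:e]ll[4:o]
puts literal  # h\1ll\1
```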