Forum: Ruby Issue with regexp pattern matcher within String#gsub

Craig Jolicoeur (craigpj)
on 2009-04-29 04:03
I'm having a strange issue I can't wrap my head around.  I've posted the
full details as a gist here: http://gist.github.com/103523 but I will
also paste the same contents here for everyone's benefit.

## The pattern matcher in both of the following will match either the
## single or double quote at the start of the src= attribute
## (this is what is available in $1)

## this regexp replacement works fine and will convert
## src="/foo/bar.png"
## to
## src="/RAILS_ROOT/public/foo/bar.png"

html_string.gsub!( /src=(["'])[\.\/]/ ) { |m| 'src=' + $1 +
"#{RAILS_ROOT}/public/" }

## this regexp errors with the following error message
## TypeError: can't convert nil into String
##  from (irb):84:in `+'
##  from (irb):84
##  from (irb):84:in `gsub'
##  from (irb):84

html_string.gsub!( /src=(["'])\S+\?\d*\1/ ) { |s| s.split('?').first +
$1 }

## the question is: why can I do string concat with + $1 in the first
## regexp, and why am I unable to do it in the second regexp, where
## doing str + $1 throws the error?
Craig Jolicoeur (craigpj)
on 2009-04-29 04:33
I've just changed how I'm doing the matching and am going with this
instead

html_string.gsub!( /src=["'](\S+\?\d*)["']/i ) { |m| 'src="' +
$1.split('?').first + '"' }
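
A possible variation on the line above, assuming the goal is to drop the "?digits" suffix while keeping whichever quote character the attribute used. It uses backreferences in the replacement string rather than a block, so $1 is never read. The method name strip_asset_timestamps is made up for illustration and is not part of the original code.

```ruby
# Hypothetical helper: removes a "?12345"-style suffix from src attributes.
def strip_asset_timestamps(html)
  # \1 = opening quote, \2 = path up to (not including) the "?".
  # Single quotes keep \1 and \2 literal so gsub expands them itself.
  html.gsub(/src=(["'])([^"'?\s]+)\?\d*\1/, 'src=\1\2\1')
end

puts strip_asset_timestamps(%q{<img src='/foo/bar.png?12345'>})
puts strip_asset_timestamps(%q{<img src="/foo/bar.png?12345">})
```

Unlike the block version, this keeps single-quoted attributes single-quoted instead of always emitting a double quote.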
7stud -- (7stud)
on 2009-04-29 04:37
Craig Jolicoeur wrote:
> ## the question is why can I do string concat with + $1 in the first
> regexp and
> ## why am I unable to do it in the second regexp where doing str + $1
> throws the error??

There's no match with the second regex, so all of the regex $ variables
are set to nil.  Your second regex requires that the character "?" be in
the string.  It isn't.
7stud -- (7stud)
on 2009-04-29 04:59
...and that means the error message you posted can't occur.   When I run
the code:

    str = %q{src="/foo/bar.png"}

    str.gsub!( /src=(["'])\S+\?\d*\1/ ) { |s| s.split('?').first + $1 }

there are no errors and no output.  So you must have done something
differently in your code.
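
A small sketch of why that test is silent, assuming the same input string: when the pattern does not match, gsub! never calls the block and returns nil. The method name gsub_without_match and the call counter are illustrative additions.

```ruby
# Hypothetical helper: runs the second regexp and reports what happened.
def gsub_without_match(str)
  block_calls = 0
  result = str.gsub!(/src=(["'])\S+\?\d*\1/) do |s|
    block_calls += 1   # only reached when the pattern actually matches
    s
  end
  [result, block_calls]
end

# No "?" in the string, so the pattern cannot match.
result, block_calls = gsub_without_match(%q{src="/foo/bar.png"})
p result       # nil: gsub! returns nil when nothing was substituted
p block_calls  # 0: the block was never invoked
```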
Heesob Park (phasis)
on 2009-04-29 05:25
(Received via mailing list)
2009/4/29 Craig Jolicoeur <cpjolicoeur@gmail.com>:
> ## to
> ##  from (irb):84
>
> html_string.gsub!( /src=(["'])\S+\?\d*\1/ ) { |s| s.split('?').first +
> $1 }
>
> ## the question is why can I do string concat with + $1 in the first
> regexp and
> ## why am I unable to do it in the second regexp where doing str + $1
> throws the error??
The side effect of String#split is changing the value of $1.

You can work around it like this:
html_string.gsub!( /src=(["'])\S+\?\d*\1/ ) {|s| t=$1; s.split('?').first +
t }
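
The same workaround as a self-contained sketch, assuming an input like src="/foo/bar.png?12345". The method name strip_query is hypothetical. The point is only that $1 is copied into a local before any other call runs, so the result does not depend on whether split touches the match variables.

```ruby
# Hypothetical wrapper around the workaround above.
def strip_query(html)
  html.gsub(/src=(["'])\S+\?\d*\1/) do |match|
    quote = $1                       # copy first, before anything else can reset $1
    match.split('?').first + quote   # operator at end of line keeps this one expression
  end
end

puts strip_query(%q{src="/foo/bar.png?12345"})
```

Note that the + has to end the first line if the expression is wrapped. A line that starts with + t is parsed as a separate statement.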

Regards,

Park Heesob
Craig Jolicoeur (craigpj)
on 2009-04-29 15:16
Heesob Park wrote:
> 2009/4/29 Craig Jolicoeur <cpjolicoeur@gmail.com>:
>> ## to
>> ##  from (irb):84
>>
>> html_string.gsub!( /src=(["'])\S+\?\d*\1/ ) { |s| s.split('?').first +
>> $1 }
>>
>> ## the question is why can I do string concat with + $1 in the first
>> regexp and
>> ## why am I unable to do it in the second regexp where doing str + $1
>> throws the error??
> The side effect of String#split is changing the value of $1.
>
> You can work around it like this:
> html_string.gsub!( /src=(["'])\S+\?\d*\1/ ) {|s| t=$1; s.split('?').first +
> t }
>
> Regards,
>
> Park Heesob

ahh, interesting.  thanks so much.  I didn't realize that #split would
overwrite those methods.

What exactly does the split function set them to?
Craig Jolicoeur (craigpj)
on 2009-04-29 15:18
7stud -- wrote:
> ...and that means the error message you posted can't occur.   When I run
> the code:
>
>     str = %q{src="/foo/bar.png"}
>
>     str.gsub!( /src=(["'])\S+\?\d*\1/ ) { |s| s.split('?').first + $1 }
>
> there are no errors and no output.  So you must have done something
> differently in your code.

you don't get an error because your initial string doesn't produce a match.

try it with this instead

    str = %q{src="/foo/bar.png?12345"}

you will get the error in that case since the regexp pattern is found.
the pattern wasn't found in your test
Brian Candler (candlerb)
on 2009-04-29 15:21
Heesob Park wrote:
> 2009/4/29 Craig Jolicoeur <cpjolicoeur@gmail.com>:
>> ## to
>> ##  from (irb):84
>>
>> html_string.gsub!( /src=(["'])\S+\?\d*\1/ ) { |s| s.split('?').first +
>> $1 }
>>
>> ## the question is why can I do string concat with + $1 in the first
>> regexp and
>> ## why am I unable to do it in the second regexp where doing str + $1
>> throws the error??
> The side effect of String#split is changing the value of $1.

It doesn't in ruby 1.8.6:

irb(main):007:0> /(.)/ =~ "abc"; puts $1; "a b".split(" "); puts $1
a
a
=> nil

It will if the split is done on a Regexp (but the OP wasn't doing that)

irb(main):008:0> /(.)/ =~ "abc"; puts $1; "a b".split(/ /); puts $1
a
nil
=> nil
Heesob Park (phasis)
on 2009-04-29 15:29
(Received via mailing list)
2009/4/29 Brian Candler <b.candler@pobox.com>:
>>> ## why am I unable to do it in the second regexp where doing str + $1
> It will if the split is done on a Regexp (but the OP wasn't doing that)
>
> irb(main):008:0> /(.)/ =~ "abc"; puts $1; "a b".split(/ /); puts $1
> a
> nil
> => nil
It does in ruby 1.8.6
irb(main):007:0> /(.)/ =~ "abc"; puts $1; "a?b".split("?"); puts $1
a
nil
=> nil

Regards,

Park Heesob
7stud -- (7stud)
on 2009-04-29 16:57
Craig Jolicoeur wrote:
> 7stud -- wrote:
>> ...and that means the error message you posted can't occur.   When I run
>> the code:
>>
>>     str = %q{src="/foo/bar.png"}
>>
>>     str.gsub!( /src=(["'])\S+\?\d*\1/ ) { |s| s.split('?').first + $1 }
>>
>> there are no errors and no output.  So you must have done something
>> differently in your code.
>
> you don't get an error because your initial string doesn't produce a match.
>

I used the string you provided.  Generally, if you want relevant help,
you should post relevant data.

>> The side effect of String#split is changing the value of $1.
>
> It doesn't in ruby 1.8.6:
>

/(.)/ =~ "abc"
puts $1

"acb".split(" ")
puts $1

--output:--
a
a

> It will if the split is done on a Regexp (but the OP wasn't doing that)
>
> irb(main):008:0> /(.)/ =~ "abc"; puts $1; "a b".split(/ /); puts $1
> a
> nil

My tests show that every string but " " causes split() to change the
value of $1.  The docs say that if the pattern given to split() is a
single space, the string will be split on whitespace with runs of
whitespace ignored, for example:

result = "a       b".split(" ")
p result

--output:--
["a", "b"]

So a string containing a single space is treated differently than other
strings by split().  Why that would cause a different result for the
value of $1, I have no idea.
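
The results in this thread come from ruby 1.8.6 and may differ on other versions, so here is a sketch for checking your own interpreter. The method name split_clobbers_capture? and the sample strings are made up for illustration. No expected output is shown because it depends on the Ruby version in use.

```ruby
# Hypothetical probe: does split with this separator change $1?
def split_clobbers_capture?(separator)
  /(.)/ =~ "abc"             # $1 is now "a"
  "a?b c".split(separator)   # the call under test
  $1 != "a"                  # true if $1 was reset or replaced
end

puts "Ruby #{RUBY_VERSION}"
[" ", "?", / /].each do |sep|
  puts "#{sep.inspect}: #{split_clobbers_capture?(sep)}"
end
```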


> ahh, interesting.  thanks so much.  I didn't realize that #split would
> overwrite those [global variables].
>
> What exactly does the split function set them to?

"abcd" =~ /a(b)(c)d/
puts $~, $1, $2

--output:--
abcd
b
c

#same program:
"AAAxyzBBB".split(/x(y)z/)
puts $~, $1, $2

--output:--
nil
nil
nil

$1 is supposed to be the string that matches the first grouping of a
successful match.
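
One way to avoid depending on the $ variables at all, sketched with the same strings as above: keep the MatchData object returned by Regexp#match in a local variable. Later calls cannot change an object you already hold. The method name match_then_split is hypothetical.

```ruby
# Hypothetical demo: a MatchData held in a local survives a later split.
def match_then_split
  m = /a(b)(c)d/.match("abcd")         # MatchData stored in a local
  parts = "AAAxyzBBB".split(/x(y)z/)   # regexp split; captures land in the result
  [m, parts]
end

m, parts = match_then_split
p parts              # ["AAA", "y", "BBB"]
p [m[0], m[1], m[2]] # ["abcd", "b", "c"]
```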