Issue with regexp pattern matcher within String#gsub


#1

I’m having a strange issue I can’t wrap my head around. I’ve posted the
full details as a gist here: http://gist.github.com/103523 but I will
also paste the same contents here for everyone's benefit.

The pattern in both of the following will match either the single or
double quote at the start of the src= attribute (this is what is
available in $1).

this regexp replacement works fine and will convert

src="/foo/bar.png"

to

src="/RAILS_ROOT/public/foo/bar.png"

html_string.gsub!( /src=(["'])[.\/]/ ) { |m| 'src=' + $1 + "#{RAILS_ROOT}/public/" }
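For anyone who wants to try the first replacement outside Rails, here is a minimal sketch. The `app_root` variable is a made-up stand-in for RAILS_ROOT, and the escaped slash inside the character class is an assumption about how the pattern looked before the forum formatting touched it.

```ruby
# app_root is a hypothetical stand-in for RAILS_ROOT so this runs outside Rails.
app_root = "/var/www/app"

html_string = %q{<img src="/foo/bar.png">}

# $1 holds the opening quote. It is read before any other method is called
# inside the block, so nothing has had a chance to reset it.
rewritten = html_string.gsub(/src=(["'])[.\/]/) do |m|
  'src=' + $1 + "#{app_root}/public/"
end

puts rewritten
```

Here the block only reads $1 and builds a string, which is probably why this version never runs into the nil problem.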

this regexp errors with the following error message

TypeError: can't convert nil into String
from (irb):84:in `+'
from (irb):84
from (irb):84:in `gsub'
from (irb):84

html_string.gsub!( /src=(["'])\S+\?\d*\1/ ) { |s| s.split('?').first + $1 }

The question is: why can I do string concatenation with + $1 in the
first regexp, and why am I unable to do it in the second regexp, where
doing str + $1 throws the error?


#2

I've just changed how I'm doing the matching and am going with this
instead:

html_string.gsub!( /src="'["']/i ) { |m| 'src="' + $1.split('?').first + '"' }


#3

Craig J. wrote:

the question is why can I do string concat with + $1 in the first
regexp and why am I unable to do it in the second regexp where doing
str + $1 throws the error??

There’s no match with the second regex, so all of the regex $ variables
are set to nil. Your second regex requires that the character “?” be in
the string. It isn’t.
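The rule being described here is easy to check in isolation. A small sketch with arbitrary strings: a failed match resets the match globals, and gsub never calls its block when the pattern does not match at all.

```ruby
"abc" =~ /(b)/            # successful match
after_success = $1        # "b"

"abc" =~ /(x)/            # failed match: $~ and $1 are reset to nil
after_failure = $1        # nil

# With no "?" in the string the pattern cannot match, so the block never runs
# and gsub returns an unchanged copy.
block_ran = false
original  = %q{src="/foo/bar.png"}
unchanged = original.gsub(/src=(["'])\S+\?\d*\1/) { block_ran = true; "" }

p after_success, after_failure, block_ran, unchanged
```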


#4

…and that means the error message you posted can’t occur. When I run
the code:

str = %q{src="/foo/bar.png"}

str.gsub!( /src=(["'])\S+\?\d*\1/ ) { |s| s.split('?').first + $1 }

there are no errors and no output. So you must have done something
differently in your code.


#5

Heesob P. wrote:

2009/4/29 Craig J. removed_email_address@domain.invalid:

html_string.gsub!( /src=(["'])\S+\?\d*\1/ ) { |s| s.split('?').first + $1 }

the question is why can I do string concat with + $1 in the first
regexp and why am I unable to do it in the second regexp where doing
str + $1 throws the error??

The side effect of String#split is changing the value of $1.

You can work around it like this:
html_string.gsub!( /src=(["'])\S+\?\d*\1/ ) { |s| t = $1; s.split('?').first + t }

Regards,

Park H.

ahh, interesting. thanks so much. I didn’t realize that #split would
overwrite those methods.

What exactly does the split function set them to?


#6

7stud – wrote:

…and that means the error message you posted can’t occur. When I run
the code:

str = %q{src="/foo/bar.png"}

str.gsub!( /src=(["'])\S+\?\d*\1/ ) { |s| s.split('?').first + $1 }

there are no errors and no output. So you must have done something
differently in your code.

you don't get an error because your initial quote doesn't make the match.

try it with this instead

str = %q{src="/foo/bar.png?12345"}

you will get the error in that case since the regexp pattern is found.
the pattern wasn’t found in your test


#7

2009/4/29 Craig J. removed_email_address@domain.invalid:

html_string.gsub!( /src=(["'])\S+\?\d*\1/ ) { |s| s.split('?').first + $1 }

the question is why can I do string concat with + $1 in the first
regexp and why am I unable to do it in the second regexp where doing
str + $1 throws the error??

The side effect of String#split is changing the value of $1.

You can work around it like this:
html_string.gsub!( /src=(["'])\S+\?\d*\1/ ) { |s| t = $1; s.split('?').first + t }

Regards,

Park H.
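The workaround above can be sketched as a self-contained script. The sample HTML is invented for illustration, and the second pattern (with an extra capture group for the path) is my own variant rather than anything posted in this thread.

```ruby
html_string = %q{<img src="/foo/bar.png?12345">}

# Copy $1 into a local before calling anything else in the block.
cleaned = html_string.gsub(/src=(["'])\S+\?\d*\1/) do |s|
  quote = $1                     # the opening quote character
  s.split('?').first + quote     # safe even if split resets $1
end

# Variant: capture the path as well, so the block needs no split at all.
cleaned_alt = html_string.gsub(/src=(["'])([^"'?\s]+)\?\d*\1/) do
  "src=#{$1}#{$2}#{$1}"
end

puts cleaned, cleaned_alt
```

Both forms should give the same result. The first stays close to the original code, while the second avoids calling any other method between the match and the use of the captures.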


#8

Heesob P. wrote:

2009/4/29 Craig J. removed_email_address@domain.invalid:

html_string.gsub!( /src=(["'])\S+\?\d*\1/ ) { |s| s.split('?').first + $1 }

the question is why can I do string concat with + $1 in the first
regexp and why am I unable to do it in the second regexp where doing
str + $1 throws the error??

The side effect of String#split is changing the value of $1.

It doesn’t in ruby 1.8.6:

irb(main):007:0> /(.)/ =~ "abc"; puts $1; "a b".split(" "); puts $1
a
a
=> nil

It will if the split is done on a Regexp (but the OP wasn’t doing that)

irb(main):008:0> /(.)/ =~ "abc"; puts $1; "a b".split(/ /); puts $1
a
nil
=> nil
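Since the results reported in this thread are for ruby 1.8.6 and may differ on other interpreters, a small probe makes it easy to check your own version. This is only a sketch; `split_resets_match?` is a made-up helper name.

```ruby
# Returns true if calling split with the given separator leaves $1 unset.
# The match globals are local to the method, so each call starts fresh.
def split_resets_match?(separator)
  /(.)/ =~ "abc"               # sets $1 to "a" in this method's frame
  "a?b c".split(separator)
  $1.nil?
end

[" ", "?", / /, /\?/].each do |sep|
  puts "ruby #{RUBY_VERSION}: split(#{sep.inspect}) resets $1? #{split_resets_match?(sep)}"
end
```

The script prints whatever the running interpreter does rather than assuming an answer, so it can be used to compare string and Regexp separators directly.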


#9

2009/4/29 Brian C. removed_email_address@domain.invalid:

why am I unable to do it in the second regexp where doing str + $1

It will if the split is done on a Regexp (but the OP wasn’t doing that)

irb(main):008:0> /(.)/ =~ "abc"; puts $1; "a b".split(/ /); puts $1
a
nil
=> nil

It does in ruby 1.8.6:
irb(main):007:0> /(.)/ =~ "abc"; puts $1; "a?b".split("?"); puts $1
a
nil
=> nil

Regards,

Park H.


#10

Craig J. wrote:

7stud – wrote:

…and that means the error message you posted can’t occur. When I run
the code:

str = %q{src="/foo/bar.png"}

str.gsub!( /src=(["'])\S+\?\d*\1/ ) { |s| s.split('?').first + $1 }

there are no errors and no output. So you must have done something
differently in your code.

you don't get an error because your initial quote doesn't make the match.

I used the string you provided. Generally, if you want relevant help,
you should post relevant data.

The side effect of String#split is changing the value of $1.

It doesn’t in ruby 1.8.6:

/(.)/ =~ "abc"
puts $1

"acb".split(" ")
puts $1

--output:--
a
a

It will if the split is done on a Regexp (but the OP wasn’t doing that)

irb(main):008:0> /(.)/ =~ "abc"; puts $1; "a b".split(/ /); puts $1
a
nil

My tests show that every string but " " causes split() to change the
value of $1. The docs say that if the pattern given to split() is a
single space, the string will be split on whitespace with runs of
whitespace ignored, for example:

result = "a b".split(" ")
p result

--output:--
["a", "b"]

So a string containing a single space is treated differently than other
strings by split(). Why that would cause a different result for the
value of $1, I have no idea.
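The special treatment of a single-space separator is easier to see with runs of whitespace in the input. A minimal sketch with made-up sample strings:

```ruby
text = " a  b   c "

# A single-space string splits on whitespace; leading whitespace and runs of
# whitespace are ignored.
whitespace_split = text.split(" ")

# A Regexp that matches one space treats every space as its own separator,
# so consecutive spaces produce empty strings.
literal_split = "a  b".split(/ /)

p whitespace_split   # ["a", "b", "c"]
p literal_split      # ["a", "", "b"]
```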

ahh, interesting. thanks so much. I didn’t realize that #split would
overwrite those [global variables].

What exactly does the split function set them to?

"abcd" =~ /a(b)(c)d/
puts $~, $1, $2

--output:--
abcd
b
c
b
c

#same program:
"AAAxyzBBB".split(/x(y)z/)
puts $~, $1, $2

--output:--
nil
nil
nil

$1 is supposed to be the string that matches the first grouping of a
successful match
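One way to sidestep the globals entirely is to hold on to the MatchData object, which later calls cannot change. A minimal sketch reusing the strings from the example above:

```ruby
m = "abcd".match(/a(b)(c)d/)   # a MatchData object, or nil if nothing matched

"AAAxyzBBB".split(/x(y)z/)     # whatever this does to $~, $1 and $2 ...

# ... the MatchData stored in m is unaffected.
whole, first, second = m[0], m[1], m[2]
p whole, first, second
```

Because `m` is an ordinary local variable, its captures stay readable no matter which methods run afterwards.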