Are MatchData objects mutable?


#1

A few times in my programs I’ve tried to modify elements of the array
returned by .match, and it always causes strange errors later in my
code.

For example, if I do:

s = "aaa bbb ccc"
m = /(.*) (.*) (.*)/.match(s)
m[1].sub!("aa", "")

I get strange results, but if I instead do:

s = "aaa bbb ccc"
m = /(.*) (.*) (.*)/.match(s)
x = m[1].dup
x.sub!("aa", "")

it works fine. What’s going on?

Mike S.


#2

Sorry, but now I’m completely confused. Arrays don’t have a .sub! method,
and m[1] is a string. And I can reference m[1], m[2], etc. in any order and
get the same strings each time.

Why are the object ids of m[1] different each time you reference it (in
your example)?

Mike S.


#3

Hi –

On Sun, 27 May 2007, Mike S. wrote:

s = "aaa bbb ccc"
m = /(.*) (.*) (.*)/.match(s)
x = m[1].dup
x.sub!("aa", "")

it works fine. What’s going on?

Looking at the source (fairly quickly, but I think this is what’s
happening), it looks like m[1] calls match_aref, which calls
rb_reg_nth_match, which hands back a different string object each
time. So doing a sub! on it doesn’t have any impact on the string you
get the next time you do the [1] operation.

You can also see this in irb, where each m[1] string is a different
object:

irb(main):006:0> m = /(abc)/.match("abc")
=> #<MatchData:0x34bf98>
irb(main):007:0> m[1].object_id
=> 1713410
irb(main):008:0> m[1].object_id
=> 1698110
irb(main):009:0> m[1].object_id
=> 1684420

etc.
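
The same effect can be shown with the strings from the original question. This is only a minimal sketch, assuming a current Ruby where m[1] still builds a new String on every call; the variable name first is made up for illustration:

```ruby
s = "aaa bbb ccc"
m = /(.*) (.*) (.*)/.match(s)

first = m[1]            # a String object created by this call to []
first.sub!("aa", "")    # mutates only that object

puts first              # the modified copy
puts m[1]               # a fresh String built from the match again
puts m[1].equal?(m[1])  # compares object identity of two separate calls
```

The in-place change lives only in first; the next m[1] is rebuilt from the match data, so the edit appears to vanish.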

David


#4

Now I get it. Thanks!

Mike S.


#5

On Thu, May 31, 2007 at 03:06:06AM +0900, Mike S. wrote:

Sorry, but now I’m completely confused. Arrays don’t have a .sub! method,
and m[1] is a string. And I can reference m[1], m[2], etc. in any order and
get the same strings each time.

Why are the object id’s of m[1] different each time you reference it (in
your example)?

Because m isn’t an Array, it’s a MatchData object.

$ irb1.8
irb(main):001:0> m = /(b+)/.match("abbbc")
=> #<MatchData:0xb7d40020>

MatchData has a [] method, but that doesn’t mean it has to behave like an
Array. If m[1] were to return the same string each time, it would have to
cache its own answers, or build an array containing all the captures (which
would be wasteful in those cases where they’re not used).
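
If you do want strings you can modify in place, one option is to build that array yourself, once. A rough sketch using MatchData#captures (the name caps is just for illustration):

```ruby
m = /(.*) (.*) (.*)/.match("aaa bbb ccc")

caps = m.captures        # a plain Array of the capture strings, built once
caps[0].sub!("aa", "")   # the Array holds on to this String, so the change sticks

puts caps.inspect        # the Array now reflects the in-place edit
puts m[1]                # the MatchData itself still reports the original capture
```

Here caps[0] is the same object every time you index it, because it is an ordinary Array element rather than something rebuilt on each call.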

Regards,

Brian.