Forum: Ruby Odd behavior of String#scan

Warren B. (Guest)
on 2005-12-21 23:35
(Received via mailing list)
First off, the problem I am trying to solve can be simplified down
to:

'abcSTARTdef,ghi,jkl,ENDmno'.scan(/START([^,]*,)*END/)

    What I want is [["def,"], ["ghi,"], ["jkl,"]] (or the same thing
without the commas), and I still need a way to achieve this.  I can
accomplish it with:

'abcSTARTdef,ghi,jkl,ENDmno'.scan(/START(.*)END/)[0][0].split(/,/)

    But this does two operations where it seems like one should suffice.
Does someone know of a way to do this in a single operation?


    Back to the odd behavior, the first expression actually returns
[["jkl,"]].  I can't figure out how that is the correct answer by any
reasonable definition of "scan".  However, the equivalent String#match
does the same kind of thing, so I must be missing something.  Can
someone please explain this behavior?


    Thanks,

    - Warren B.
Bob S. (Guest)
on 2005-12-21 23:50
(Received via mailing list)
Warren B. wrote:
>
>     But this does two operations where it seems like one should suffice.
> Does someone know of a way to do this in a single operation?

I would suggest:

   'abcSTARTdef,ghi,jkl,ENDmno'.match(/START(.*)END/)[1].split(',')

I don't know of a way to accomplish it in one step. String#scan attempts
to match the whole regex at multiple places within the string; but you
need the START and END to delimit the substring over which scan
operates.
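
As a minimal sketch of what "match the whole regex at multiple places" means, here is a made-up input (not from the original question) with two complete START...END spans. Each full match of the pattern contributes one entry to the result:

```ruby
# Hypothetical input with two separate START...END spans.
text = 'START1END START2END'

# scan restarts the whole pattern after each match and returns
# one array of captures per full match.
matches = text.scan(/START(\d)END/)

p matches  # [["1"], ["2"]]
```

In the original string there is only one START...END span, so scan can produce at most one entry for that pattern.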

>
>
>     Back to the odd behavior, the first expression actually returns
> [["jkl,"]].  I can't figure out how that is the correct answer by any
> reasonable definition of "scan".  However, the equivalent String#match
> does the same kind of thing, so I must be missing something.  Can
> someone please explain this behavior?

Because you have this:

   ([^,]*,)*

The final * lets the group repeat, so it matches "def,", then "ghi,", then
"jkl,". A capture group keeps only the text from its last repetition,
however, so the MatchData holds just "jkl,".
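
A short sketch of that behavior, using the string from the question. The variable names are only illustrative. The whole match spans all three items, but the group reports only its final repetition, which is why the two-step approach is used to get every item:

```ruby
s = 'abcSTARTdef,ghi,jkl,ENDmno'

# The repeated group matches three times, but only the last
# repetition is kept as capture 1.
m = s.match(/START([^,]*,)*END/)
whole_match  = m[0]  # "STARTdef,ghi,jkl,END"
last_capture = m[1]  # "jkl,"

# Two steps: capture everything between the delimiters, then split.
# String#[] with a regex and a group index returns that capture.
items = s[/START(.*)END/, 1].split(',')

p whole_match, last_capture, items
```

String#split drops the trailing empty field left by the final comma, so items comes out as ["def", "ghi", "jkl"].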
Dan D. (Guest)
on 2005-12-22 00:32
(Received via mailing list)
Will this do?

  s="abcSTARTdef,ghi,jkl,ENDmno"
  # join rather than to_s: Array#to_s concatenated elements only in Ruby 1.8
  s.scan(/START(.*)END/).join.split(",")

  => ["def", "ghi", "jkl"]