Problem with regexp split

I am trying to split some text into an array, separated by one or more \n characters.

Here is some test code:

s = "one\ntwo\n\nthree\n\n\nfour"
p s.split(/(\n)+/)

It should split into ["one", "two", "three", "four"], because the /(\n)+/
pattern should use one or more \n as the delimiter to split around,

but it does this:
["one", "\n", "two", "\n", "three"]

Why does it do this and what split could I use to get it to work?

Note: I know that I could just fix it by removing the "\n" elements
from the array afterwards, but it seems that the regular
expression in split should work.

On Mar 1, 2010, at 9:50 AM, [email protected] wrote:

but it does this
["one", "\n", "two", "\n", "three"]

Why does it do this and what split could I use to get it to work?

Note: I know that I could just fix it by removing the "\n" elements
from the array afterwards, but it seems that the regular
expression in split should work.

Interesting. Docs say:

If pattern is a String, then its contents are used as the delimiter
when splitting str. If pattern is a single space, str is split on
whitespace, with leading whitespace and runs of contiguous whitespace
characters ignored.

If pattern is a Regexp, str is divided where the pattern matches.
Whenever the pattern matches a zero-length string, str is split into
individual characters.

Which seems to be saying exactly what you are describing. If a
regexp is used the match isn’t “eaten”, but simply divided on.

You could split it on "\n" and then remove any blank elements… not
sure if that’s any better than your alternative approach though.
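For what it's worth, the documented cases quoted above are easy to check in irb. The snippet below is only a sketch with made-up sample strings (none of them come from the thread). It suggests that a regexp without parentheses does drop the matched delimiter, so the leftover "\n" elements probably have another cause.

```ruby
# Sample strings are invented for illustration only.

# String pattern: the string itself is the delimiter.
by_string = "a,b,,c".split(",")
p by_string   # ["a", "b", "", "c"]

# Single space: runs of whitespace collapse, leading whitespace is ignored.
by_space = "  a b   c".split(" ")
p by_space    # ["a", "b", "c"]

# Regexp that matches the empty string: split into individual characters.
by_empty = "abc".split(//)
p by_empty    # ["a", "b", "c"]

# Regexp with no parentheses: the matched delimiter is not kept.
by_regexp = "one\ntwo\n\nthree".split(/\n+/)
p by_regexp   # ["one", "two", "three"]
```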

Yeah, I have been using regexps and Ruby for years, and this is a
puzzle.

Also does not behave with this code:

s = "onexytwoxyxythreexyxyxyfour"
p s.split(/(xy)+/)

array.compact.reject { |i| i.nil? or i.empty? } seemed to leave some
unwanted elements, at least on my Ruby 1.8.6.

But array.delete_if { |i| i.nil? or i.empty? } worked as expected on
my machine.

HTH,
Richard
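A possible explanation for the difference described above (an assumption, nothing in the thread confirms it): reject returns a new array and leaves the receiver untouched, whereas delete_if modifies the array in place. So if the result of reject is not assigned or printed, the original array still holds the empty strings. A minimal sketch, using the sample string from the original question:

```ruby
s = "one\ntwo\n\nthree\n\n\nfour"
array = s.split("\n")
p array            # ["one", "two", "", "three", "", "", "four"]

# reject builds a new array; `array` itself is unchanged.
filtered = array.reject { |i| i.nil? || i.empty? }
p filtered         # ["one", "two", "three", "four"]
length_after_reject = array.length
p length_after_reject   # 7

# delete_if removes the matching elements from `array` itself.
array.delete_if { |i| i.nil? || i.empty? }
p array            # ["one", "two", "three", "four"]
```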


On Mon, Mar 1, 2010 at 9:50 AM, [email protected] <
[email protected]> wrote:

Gerry, you can do the following:

p s.gsub(/\n/, " ").split

Good luck,

-Conrad
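One caveat with this approach, sketched below: split with no arguments splits on any whitespace, so it only gives one element per line when the lines themselves contain no spaces. The string t is an invented example, not something from the thread.

```ruby
s = "one\ntwo\n\nthree\n\n\nfour"
words = s.gsub(/\n/, " ").split
p words        # ["one", "two", "three", "four"]

# Hypothetical input whose lines contain spaces.
t = "first line\n\nsecond line"
broken = t.gsub(/\n/, " ").split
p broken       # ["first", "line", "second", "line"]

# Splitting on runs of newlines keeps each line whole.
lines = t.split(/\n+/)
p lines        # ["first line", "second line"]
```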

[email protected] wrote:

Also does not behave with this code:

s = "onexytwoxyxythreexyxyxyfour"
p s.split(/(xy)+/)

Try this:

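s = "one\ntwo\n\nthree\n\n\nfour"
# Double quotes are needed here: in Ruby, '\n' is a backslash followed by "n", not a newline.
array = s.split("\n")
p array.compact.reject { |i| i.nil? or i.empty? }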

This will produce:

["one", "two", "three", "four"]

Regards,

Atc.,
Kirk Patrick

On Mar 1, 1:00 pm, Philip H. [email protected] wrote:

Why does it do this and what split could I use to get it to work?

If pattern is a Regexp, str is divided where the pattern matches.
Whenever the pattern matches a zero-length string, str is split into
individual characters.

Which seems to be saying exactly what you are describing. If a
regexp is used the match isn’t “eaten”, but simply divided on.

You could split it on "\n" and then remove any blank elements… not
sure if that’s any better than your alternative approach though.

The trick here is a feature inherited from Perl - groups (in parens)
in the regexp cause the delimiters to be included. This works like
you’d expect:

s.split(/(?:\n)+/)

The ?: modifier tells the parens to group without capturing, so no backref is created.

–Matt J.
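To make the difference concrete, here is a small sketch using the two test strings from earlier in the thread. The variable names are just for illustration. A plain /\n+/ with no parentheses at all should behave the same as the (?: ) form.

```ruby
s = "one\ntwo\n\nthree\n\n\nfour"

# Capturing group: the captured text is inserted into the result
# after each split (for a repeated group, only the last repetition).
with_capture = s.split(/(\n)+/)
p with_capture      # ["one", "\n", "two", "\n", "three", "\n", "four"]

# Non-capturing group: only the pieces between delimiters are returned.
without_capture = s.split(/(?:\n)+/)
p without_capture   # ["one", "two", "three", "four"]

# No group at all gives the same result here.
no_group = s.split(/\n+/)
p no_group          # ["one", "two", "three", "four"]

# The multi-character delimiter from the second test case needs the group.
xy = "onexytwoxyxythreexyxyxyfour".split(/(?:xy)+/)
p xy                # ["one", "two", "three", "four"]
```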

RichardOnRails wrote:

array.compact.reject { |i| i.nil? or i.empty? } seemed to leave some
unwanted elements, at least on my Ruby 1.8.6.

But array.delete_if { |i| i.nil? or i.empty? } worked as expected on
my machine.


My Ruby version is 1.8.7.
But the more important thing is that the problem is solved. =P
