DRY gsub

On 1/12/07, Josselin [email protected] wrote:

I wrote the following Ruby statements… I get the result I need, but I
tried to DRY it for 2 hours without being successful:

  d = d.gsub(/\r\n/,' ')   # replace carriage return + newline by space
  d = d.gsub(/;/,' ')  # replace semicolon by space
  d = d.gsub(/,/,' ') # replace comma by space
  a  = d.split(' ')   # split into components, space as divider

tfyl

How about a = d.gsub(/\r\n|;|,/,' ').split(' ') ?
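
A quick sanity check of that suggestion, as a sketch only: the sample string below is made up for illustration and is not from the original post. It runs the three-step version and the single-alternation version side by side.

```ruby
# Hypothetical sample input (not from the original post).
d = "alpha;beta,gamma\r\ndelta epsilon"

# Original three-step version.
step = d.gsub(/\r\n/, ' ')
step = step.gsub(/;/, ' ')
step = step.gsub(/,/, ' ')
stepwise = step.split(' ')

# Consolidated version: one alternation, one gsub call.
combined = d.gsub(/\r\n|;|,/, ' ').split(' ')

p stepwise
p combined
```

Both arrays come out the same for this input, since every delimiter is turned into a space before the final split.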

Josselin wrote:

Joss

I would probably go with

a = d.chomp.split(/[\s,;]/)

Best regards,
Henrik S.

On 1/12/07, Josselin [email protected] wrote:

Joss

I know you got lots of answers but what about

a = d.gsub(/;|,/," ").split

If I am not mistaken it will work on Unix too.

HTH
Robert

On Jan 12, 2007, at 11:35 AM, Robert D. wrote:

tfyl

Joss

I know you got lots of answers but what about

a = d.gsub(/;|,/," ").split

No need for a Regexp there:

a = d.tr(";,", " ").split

James Edward G. II
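
A small sketch of why the tr version works, using made-up sample strings (an assumption, not data from the thread). When the second argument to tr is shorter than the first, its last character is repeated, so both ";" and "," map to a space. A bare split then breaks on any run of whitespace, which also covers "\r\n".

```ruby
# Hypothetical sample inputs.
d1 = "one;two,three four"
d2 = "a;b\r\nc"

via_gsub = d1.gsub(/;|,/, " ").split
via_tr   = d1.tr(";,", " ").split

p via_gsub
p via_tr

# split with no argument treats "\r\n" as whitespace too.
p d2.tr(";,", " ").split
```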

[email protected] wrote:

a = d.split(/(?:\r\n|[;, ])/)

Hope that helps.

Out of interest, what does the ?: do in there? I’ve googled, etc, honest!

It says, “Please don’t capture the stuff in parentheses here, because
that changes what split returns”.

irb(main):001:0> s = "a b,c"
=> "a b,c"
irb(main):002:0> s.split( /( |,)/ )
=> ["a", " ", "b", ",", "c"]
irb(main):003:0> s.split( /(?: |,)/ )
=> ["a", "b", "c"]

On 1/12/07, James Edward G. II [email protected] wrote:

  d = d.gsub(/,/,' ') # replace comma by space

No need for a Regexp there:

a = d.tr(";,", " ").split

James Edward G. II

Nice one (I thought I got it optimal, naaa)
and faster too of course :)

robert@PC:~/log/ruby/ML 18:46:00
535/35 > ruby split.rb split.rb
Rehearsal ---------------------------------------------
regex       4.891000   0.000000   4.891000 (  4.602000)
translate   3.812000   0.000000   3.812000 (  3.685000)
------------------------------------ total: 8.703000sec

                user     system      total        real
regex       5.016000   0.000000   5.016000 (  4.669000)
translate   3.859000   0.000000   3.859000 (  3.805000)

Cheers
Robert
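
The split.rb script itself was not posted, so the following is only a guess at how such a comparison could be set up with the standard Benchmark library. The input string and the iteration count are assumptions chosen for illustration, and the timings it prints will differ from the ones above.

```ruby
require 'benchmark'

# Hypothetical input and iteration count; the original script is unknown.
d = "foo;bar,baz qux\r\n" * 1_000
n = 200

# bmbm runs a rehearsal pass first, then the measured pass.
Benchmark.bmbm do |x|
  x.report("regex")     { n.times { d.gsub(/;|,/, " ").split } }
  x.report("translate") { n.times { d.tr(";,", " ").split } }
end
```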

On 1/12/07, Josselin [email protected] wrote:

I wrote the following Ruby statements… I get the result I need, but I
tried to DRY it for 2 hours without being successful:

  d = d.gsub(/\r\n/,' ')   # replace carriage return + newline by space
  d = d.gsub(/;/,' ')  # replace semicolon by space
  d = d.gsub(/,/,' ') # replace comma by space
  a  = d.split(' ')   # split into components, space as divider

what about:

a = d.gsub(/\r\n|;|,/,' ').split(' ')

or

a = d.split(/\r\n|;|,| /)

(not tested, I’m too lazy/busy to write the tests now)

Josselin wrote:

Joss

Specific to this example, everyone else is right, and the best way is to
consolidate the regex or simply use a condensed split call. However, in
the general case, you could do this

[ /\r\n/ , /;/ , /,/].inject(d) { |s,reg| s.gsub(reg,' ') }.split(' ')
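
To make the inject version easier to follow, here is the same idea spelled out on a made-up sample string (the input and the variable names are assumptions for illustration). inject starts with d as the accumulator and feeds the result of each gsub into the next one.

```ruby
# Hypothetical sample input.
d = "a;b,c\r\nd e"

patterns = [/\r\n/, /;/, /,/]

# Each pass replaces one pattern; the block's return value
# becomes s for the next pattern.
cleaned = patterns.inject(d) { |s, reg| s.gsub(reg, ' ') }
a = cleaned.split(' ')

p cleaned
p a
p d   # the original string is left untouched by gsub
```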

On 1/12/07, Simon S. [email protected] wrote:

On 1/12/07, [email protected] [email protected] wrote:
[snip]

Out of interest, what does the ?: do in there? I’ve googled, etc, honest!

(?: ) is a non-capturing group

example if you want to match a repeating pattern,
but don’t want the repeating stuff in your output

"abcde xyx xyx xyx abcde".scan(/(?:xyx ){2,}.*(b.*d)/)
#=> [["bcd"]]

if you use ( ) then it shows up in the output

"abcde xyx xyx xyx abcde".scan(/(xyx ){2,}.*(b.*d)/)
#=> [["xyx ", "bcd"]]

Phrogz wrote:

James Edward G. II wrote:

a = d.split(/(?:\r\n|[;, ])/)

Way more elegant. Way to see beyond the step-by-step process to the end
goal.

Except that there’s no need for the non-capturing group, so
(simplifying, not golfing):

a = d.split( /\r\n|[;, ]/ )

Unless, of course, you have a string like this:
d = "foo; bar\r\n\r\nwhee"
and you only wanted [ "foo", "bar", "whee" ], in which case:
a = d.split(/(?:\r\n|[;, ])+/)

\Man, I’m the king of multiple posting today
\Fark slashies ftw!
\\Back to work
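
For reference, the two patterns from that post run against the sample string it gives. This is just a sketch to show the difference; here the non-capturing group does matter, because the + has to apply to the whole alternation.

```ruby
d = "foo; bar\r\n\r\nwhee"

# Each delimiter splits on its own, so adjacent delimiters
# leave empty strings between them.
single = d.split(/\r\n|[;, ]/)

# A run of delimiters is treated as one separator.
runs = d.split(/(?:\r\n|[;, ])+/)

p single
p runs
```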

On Sat, Jan 13, 2007 at 09:51:47AM +0900, James B. wrote:

"\r\n",
";",
","
].map{ |c| Regexp.escape(c) }.join("|"))

a = d.split(DELIMITERS)

Unless these chunks of code are right next to each other, it may be hard
to know the purpose for the delimiters or what’s driving the split.

The cleaned up version includes the delimiters in an array of individual
strings. Your original complaint was about readability and code
maintenance. While I agree that a long literal Regexp can be hard to read
and hard to maintain, you can achieve the same efficiency of that Regexp
without sacrificing readability using the solution above. Perhaps the
following would make you happier?

module Whatever
  DELIMITERS = [
    " ",
    "\r\n",
    ";",
    ","
  ]

  def split_string(str)
    @delimiter_regexp ||= Regexp.new(DELIMITERS.map{ |c|
      Regexp.escape(c) }.join("|"))
    str.split(@delimiter_regexp)
  end
  extend self
end

a = Whatever.split_string(d)

(If you want to make it even fancier so you can modify DELIMITERS at
runtime you’ll have to do something clever with hashes.)

If the code above does not fulfill what you were intending, please do
explain why; if I've missed the point, I'd like to know it and to try
again at understanding.

James B.
–Greg

On 1/12/07, Phrogz [email protected] wrote:

\Man, I’m the king of multiple posting today
\Fark slashies ftw!
\\Back to work

+1

:)

Josselin [email protected] writes:

I wrote the following Ruby statements… I get the result I need, but I
tried to DRY it for 2 hours without being successful:

 d = d.gsub(/\r\n/,' ')   # replace carriage return + newline by space
 d = d.gsub(/;/,' ')  # replace semicolon by space
 d = d.gsub(/,/,' ') # replace comma by space
 a  = d.split(' ')   # split into components, space as divider

What’s wrong with:

a = d.split(/\r\n|[;, ]/)

Or do you need d to be mangled as before?

Although I probably would do something even shorter like this:

a = d.split(/[;,\s]+/)

However, for certain inputs that won’t give exactly the same as your
initial multi-step procedure.
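
One such input, as a sketch with a made-up string: when the data starts with a delimiter, split(' ') silently drops the leading whitespace, while a Regexp split keeps an empty string at the front.

```ruby
# Hypothetical input that begins with a delimiter.
d = ";foo, bar\r\n"

# Multi-step procedure from the original post, chained.
original = d.gsub(/\r\n/, ' ').gsub(/;/, ' ').gsub(/,/, ' ').split(' ')

# Short version.
short = d.split(/[;,\s]+/)

p original
p short
```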

Also, any time you write:

d = d.gsub(…)

You’re probably better off with:

d.gsub!(…)
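
One caveat worth keeping in mind with the bang version, shown as a small sketch on a made-up string: gsub! returns nil when nothing was replaced, so its return value should not be assigned back to d or chained.

```ruby
d = "a;b".dup   # hypothetical sample; dup guarantees a mutable string

first  = d.gsub!(/;/, ' ')   # changes d in place and returns it
second = d.gsub!(/;/, ' ')   # nothing left to replace, returns nil

p d
p first
p second
```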

James Edward G. II wrote:

a = d.split(/(?:\r\n|[;, ])/)

Way more elegant. Way to see beyond the step-by-step process to the end
goal.

I wrote the following Ruby statements… I get the result I need, but I
tried to DRY it for 2 hours without being successful:

  d = d.gsub(/\r\n/,' ')   # replace carriage return + newline by space
  d = d.gsub(/;/,' ')  # replace semicolon by space
  d = d.gsub(/,/,' ') # replace comma by space
  a  = d.split(' ')   # split into components, space as divider

There’s nothing in these four lines of code that violates the idea of
DRY. There is no repeated code. Multiple calls to the same method
are perfectly OK.

xoa