DRY gsub

Josselin · January 19, 2007, 4:31pm

I wrote the following Ruby statements. I get the result I need, but I
tried to DRY it up for 2 hours without success:

  d = d.gsub(/\r\n/,' ')   # replace CR+LF line endings with a space
  d = d.gsub(/;/,' ')  # replace semicolon with a space
  d = d.gsub(/,/,' ') # replace comma with a space
  a  = d.split(' ')   # split into components, using space as the divider

tfyl

Joss

Josselin · September 25, 2007, 10:29pm

On 2007-01-12 17:11:54 +0100, “Phrogz” said:

method repeatedly but with different parameters is not “repeating
yourself”.
[…]
code more compact (but not necessarily golfing).
Thanks to all of you… as a newbie I try to keep this kind of useful
comment in mind: DRY vs WET
(it’s now engraved…)

Josselin · September 25, 2007, 10:29pm

a = d.split(/(?:\r\n|[;, ])/)

Hope that helps.

Out of interest, what does the ?: do in there? I’ve googled, etc,
honest!

Cheers,
Benjohn

Josselin · September 25, 2007, 10:29pm

Gregory S. wrote:

Cleaned up:

The whole point was not to clean it up, but to make obvious what and
why something was happening in the code.

Brevity is the soul of wit, but it can play havoc with code maintenance.

DELIMITERS = Regexp.new([
  " ",
  "\r\n",
  ";",
  ","
].map{ |c| Regexp.escape(c) }.join("|"))

a = d.split(DELIMITERS)

Unless these chunks of code are right next to each other, it may be hard
to know the purpose for the delimiters or what’s driving the split.

Josselin · September 25, 2007, 10:30pm

On Jan 12, 2007, at 10:00 AM, Josselin wrote:

I wrote the following Ruby statements. I get the result I need, but
I tried to DRY it up for 2 hours without success:
 d = d.gsub(/\r\n/,' ')   # replace CR+LF line endings with a space
 d = d.gsub(/;/,' ')  # replace semicolon with a space
 d = d.gsub(/,/,' ') # replace comma with a space
 a  = d.split(' ')   # split into components, using space as the divider

a = d.split(/(?:\r\n|[;, ])/)

Hope that helps.

James Edward G. II

Josselin · September 25, 2007, 10:30pm

On 1/12/07, [email protected] [email protected] wrote:

a = d.split(/(?:\r\n|[;, ])/)

Hope that helps.

Out of interest, what does the ?: do in there? I’ve googled, etc, honest!

(?: ) is a non-capturing group.

It's not necessary here:
"ab;c,xx\r\nyy zz".split(/\r\n|[;, ]/)
#=> ["ab", "c", "xx", "yy", "zz"]

Josselin · September 25, 2007, 10:32pm

Bira wrote:

How about a = d.gsub(/\r\n|;|,/,' ').split(' ') ?

Or a = d.split(/\r\n|,|;| /) ?

Josselin · September 25, 2007, 10:33pm

2007/1/12, [email protected] [email protected]:

a = d.split(/(?:\r\n|[;, ])/)

Hope that helps.

Out of interest, what does the ?: do in there? I’ve googled, etc, honest!

( ) groups and captures
(?: ) groups but does not capture

Non-capturing groupings:
http://perldoc.perl.org/perlretut.html

It’s irrelevant in this case anyways
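
To make the difference concrete, here is a minimal sketch. The date string and the variable names are made up for illustration and do not come from the thread.

```ruby
# ( ) groups and captures; (?: ) groups without storing the match.
capturing     = "2007-01-12".match(/(\d+)-(\d+)-(\d+)/)
non_capturing = "2007-01-12".match(/(?:\d+)-(\d+)-(\d+)/)

p capturing.captures      # => ["2007", "01", "12"]
p non_capturing.captures  # => ["01", "12"]  (the year was matched but not stored)
```

Both patterns match the same text; only what is kept afterwards differs.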

Josselin · September 25, 2007, 10:33pm

Phrogz wrote:

method repeatedly but with different parameters is not “repeating
yourself”.

Looking at this, and some of the suggested alternatives, I can see how
it would get tedious to add more characters to the “replace with space”
set.

The use of compact regular expressions doesn’t make the code easier to
read or maintain.

It may be useful to define the set of special characters, then use that
to drive a string transformation.

REPLACE_WITH_SPACE = %w{
  \r\n
  ;
  ,
}.map{ |c| Regexp.new(c) }

class String
  def swap_to_spaces
    s = self.dup   # work on a copy so the receiver is left untouched
    REPLACE_WITH_SPACE.each do |re|
      s.gsub!( re, ' ')
    end
    s
  end
end

a = d.swap_to_spaces.split( ' ' )

Or something along those lines.

–
James B.

http://www.ruby-doc.org - Ruby Help & Documentation
http://www.rubystuff.com - The Ruby Store for Ruby Stuff
http://www.jamesbritt.com - Playing with Better Toys

Josselin · September 25, 2007, 10:34pm

On 12.01.2007 17:26, [email protected] wrote:

a = d.split(/(?:\r\n|[;, ])/)

Hope that helps.

Out of interest, what does the ?: do in there? I’ve googled, etc, honest!

It’s a non-capturing group. You cannot retrieve the characters it
matched, which is at times more efficient because the regex engine does
not need to do the bookkeeping and storage for the group.

robert

Josselin · September 25, 2007, 10:35pm

Josselin wrote:

I wrote the following Ruby statements. I get the result I need, but I
tried to DRY it up for 2 hours without success:
  d = d.gsub(/\r\n/,' ')   # replace CR+LF line endings with a space
  d = d.gsub(/;/,' ')  # replace semicolon with a space
  d = d.gsub(/,/,' ') # replace comma with a space
  a  = d.split(' ')   # split into components, using space as the divider

BTW, that is already reasonably DRY, in my opinion. Calling the same
method repeatedly but with different parameters is not “repeating
yourself”. It would be WET (hrm…Way Extra Toomuchcode) if you had
something like:

d = d.gsub( /\r\n/, ' ' )
e = e.gsub( /\r\n/, ' ' )
f = f.gsub( /\r\n/, ' ' )
g = g.gsub( /\r\n/, ' ' )
etc.

It’s just semantics, but IMO what you’re asking for is to make your
code more compact (but not necessarily golfing).
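
As a rough sketch of how the genuinely repetitive case above could be dried up, the same substitution can be applied to a collection in one pass. The sample strings are invented for illustration; only the variable names d, e, f, g come from the post.

```ruby
# Hypothetical sample data standing in for the four variables in the post.
d, e, f, g = "a\r\nb", "c\r\nd", "e", "f\r\n"

# One statement of the rule, applied to every string.
d, e, f, g = [d, e, f, g].map { |s| s.gsub(/\r\n/, ' ') }

p [d, e, f, g]   # => ["a b", "c d", "e", "f "]
```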

Josselin · September 25, 2007, 10:33pm

On Sat, Jan 13, 2007 at 02:04:31AM +0900, James B. wrote:
[…]

REPLACE_WITH_SPACE = %w{
end
s
end
end

a = d.swap_to_spaces.split( ' ' )

Or something along those lines.

Cleaned up:

DELIMITERS = Regexp.new([
  " ",
  "\r\n",
  ";",
  ","
].map{ |c| Regexp.escape(c) }.join("|"))

a = d.split(DELIMITERS)

James B.
–Greg
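
A possible shortcut, offered as a sketch and not as what the poster wrote: Ruby's Regexp.union escapes plain strings and joins them with "|", which is what the map/escape/join chain does by hand. A local variable is used here instead of the DELIMITERS constant, and the sample string is borrowed from another reply in this thread.

```ruby
# Regexp.union escapes each string and builds an alternation from them.
delimiters = Regexp.union(" ", "\r\n", ";", ",")

d = "ab;c,xx\r\nyy zz"
a = d.split(delimiters)

p a   # => ["ab", "c", "xx", "yy", "zz"]
```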

Josselin · September 25, 2007, 10:35pm

Josselin wrote:

I wrote the following Ruby statements. I get the result I need, but I
tried to DRY it up for 2 hours without success:
  d = d.gsub(/\r\n/,' ')   # replace CR+LF line endings with a space
  d = d.gsub(/;/,' ')  # replace semicolon with a space
  d = d.gsub(/,/,' ') # replace comma with a space
  a  = d.split(' ')   # split into components, using space as the divider

d = d.gsub( /\r\n|[;,]/, ' ' ).split
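
Since every delimiter here is replaced by a space, String#tr is another option. This is a swapped-in technique, not something proposed in the thread, and it differs slightly: tr also replaces a lone \r or \n, not only the \r\n pair. The sample string is borrowed from another reply.

```ruby
d = "ab;c,xx\r\nyy zz"

# tr maps each listed character to a space; split with no argument
# then splits on runs of whitespace.
a = d.tr("\r\n;,", " ").split

p a   # => ["ab", "c", "xx", "yy", "zz"]
```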

Josselin · September 25, 2007, 10:36pm

On 1/12/07, Daniel M. wrote:

However, for certain inputs that won’t give exactly the same as your
initial multi-step procedure.
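
One kind of input where the results diverge, presumably the sort of case meant here, is adjacent delimiters: split(' ') collapses runs of whitespace, while a regex split yields an empty string between two neighbouring delimiters. The sample string below is made up for illustration.

```ruby
d = "ab, c;;d\r\n"

multi_step = d.gsub(/\r\n/, ' ').gsub(/;/, ' ').gsub(/,/, ' ').split(' ')
one_regex  = d.split(/\r\n|[;, ]/)
with_plus  = d.split(/(?:\r\n|[;, ])+/)   # treat a run of delimiters as one

p multi_step   # => ["ab", "c", "d"]
p one_regex    # => ["ab", "", "c", "", "d"]
p with_plus    # => ["ab", "c", "d"]
```

Even with the + quantifier, a delimiter at the very start of the string still produces a leading empty field, which split(' ') would not.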

Also, any time you write:

d = d.gsub(…)

You’re probably better off with:

d.gsub!(…)

…unless you don’t want to modify the original object passed as an
argument (I’m not sure my English is clear here: I mean that in that
case the caller will see the modifications as well).
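
A small sketch of the point about the caller seeing the change. The method names tokens_copy and tokens_in_place are hypothetical, chosen only for this illustration.

```ruby
def tokens_copy(d)
  d = d.gsub(/[;,]/, ' ')   # rebinds the local name; the caller's string is untouched
  d.split
end

def tokens_in_place(d)
  d.gsub!(/[;,]/, ' ')      # mutates the very object the caller passed in
  d.split
end

original = "ab;c,xx".dup
tokens_copy(original)
p original   # => "ab;c,xx"

tokens_in_place(original)
p original   # => "ab c xx"

# gsub! returns nil when nothing was replaced, so avoid chaining on it.
p "no delimiters".dup.gsub!(/[;,]/, ' ')   # => nil
```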

Josselin · September 25, 2007, 10:36pm

On Jan 12, 2007, at 10:26 AM, [email protected] wrote:

a = d.split(/(?:\r\n|[;, ])/)

Hope that helps.

Out of interest, what does the ?: do in there? I’ve googled, etc,
honest!

(?: … ) is just ( … ) without capturing the contents into a
variable.

James Edward G. II

Josselin · September 25, 2007, 10:37pm

On Jan 12, 2007, at 10:30 AM, Phrogz wrote:

a = d.split( /\r\n|[;, ]/ )

You’re right, it’s not needed. I’m just in the habit of always
surrounding | options of a regex with grouping to control their
scope. I guess I’ve been bitten by those matching issues one time
too many.

James Edward G. II
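
The kind of scoping issue being described can be sketched as follows. The patterns and test words are invented for illustration; the point is that | has the lowest precedence, so anchors bind to only one branch unless the alternatives are grouped.

```ruby
loose = /^cat|dog$/        # means (^cat) OR (dog$)
tight = /^(?:cat|dog)$/    # means the whole line is "cat" or "dog"

p loose.match?("catalog")   # => true  (the ^cat branch matches)
p tight.match?("catalog")   # => false
p tight.match?("dog")       # => true
```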

Josselin · September 25, 2007, 10:38pm

Josselin wrote:

I wrote the following Ruby statements. I get the result I need, but I
tried to DRY it up for 2 hours without success:
 d = d.gsub(/\r\n/,' ')   # replace CR+LF line endings with a space
 d = d.gsub(/;/,' ')  # replace semicolon with a space
 d = d.gsub(/,/,' ') # replace comma with a space
 a  = d.split(' ')   # split into components, using space as the divider

a = d.split(/(\r\n)|([;, ])/)
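
A caveat worth checking before using this form: when the pattern given to String#split contains capturing groups, the captured text is returned in the array too. The sketch below compares it with the ungrouped pattern, using a sample string borrowed from another reply in this thread.

```ruby
d = "ab;c,xx\r\nyy zz"

plain    = d.split(/\r\n|[;, ]/)
captured = d.split(/(\r\n)|([;, ])/)

p plain      # => ["ab", "c", "xx", "yy", "zz"]
p captured   # the matched delimiters appear between the fields as well
```

If only the fields are wanted, dropping the parentheses or using (?: ) avoids the extra entries.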

Josselin · September 25, 2007, 10:37pm

Josselin wrote:

Joss

I would probably go with

a = d.chop.split(/[\s,;]/)

Best regards,
Henrik S.
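
Two details of this variant, sketched with made-up sample strings: chop always removes the last character (or a trailing \r\n pair), so it eats real data when the input has no line ending, whereas chomp only removes a trailing line ending. Also, \s matches \r and \n separately, so a \r\n in the middle of the string yields an empty field unless the class is followed by +.

```ruby
with_newline    = "ab;c,xx yy\r\n"
without_newline = "ab;c,xx yy"

p with_newline.chop       # => "ab;c,xx yy"
p without_newline.chop    # => "ab;c,xx y"   (last data character lost)
p without_newline.chomp   # => "ab;c,xx yy"

p "xx\r\nyy".split(/[\s,;]/)    # => ["xx", "", "yy"]
p "xx\r\nyy".split(/[\s,;]+/)   # => ["xx", "yy"]
```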

Josselin · September 25, 2007, 10:38pm

On 1/12/07, Bira wrote:

How about a = d.gsub(/\r\n|;|,/,' ').split(' ') ?

If you don’t care about \r then maybe this

"ab;c,xx\r\nyy zz".scan(/[^ ;,\n]+/)
#=> ["ab", "c", "xx\r", "yy", "zz"]
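
If the \r does matter, adding it to the negated character class is enough. The sketch below uses the same sample string; the second input is made up to show that scan never returns empty strings for adjacent delimiters.

```ruby
d = "ab;c,xx\r\nyy zz"

p d.scan(/[^ ;,\n]+/)     # => ["ab", "c", "xx\r", "yy", "zz"]
p d.scan(/[^ ;,\r\n]+/)   # => ["ab", "c", "xx", "yy", "zz"]

# Runs of delimiters are simply skipped.
p ";;ab,,".scan(/[^ ;,\r\n]+/)   # => ["ab"]
```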

Josselin · September 25, 2007, 10:38pm

On 2007-01-12 17:05:34 +0100, Bira said:

How about a = d.gsub(/\r\n|;|,/,' ').split(' ') ?

Thanks… I did not notice that I could use ‘|’ inside the gsub;
I got stuck on [. and…]