DRY gsub


#1

I wrote the following Ruby statements. I get the result I need, but I
tried to DRY them for 2 hours without success:

  d = d.gsub(/\r\n/, ' ')  # get rid of carriage return / line feed
  d = d.gsub(/;/, ' ')     # replace semicolon by space
  d = d.gsub(/,/, ' ')     # replace comma by space
  a = d.split(' ')         # split into components, space as divider

tfyl

Joss


#2

On 2007-01-12 17:11:54 +0100, “Phrogz” removed_email_address@domain.invalid said:

[…] Calling the same method repeatedly but with different parameters is
not “repeating yourself”. […] what you’re asking for is to make your
code more compact (but not necessarily golfing).

Thanks to all of you. As a newbie I try to keep this kind of useful
comment in mind: DRY vs WET
(it’s now engraved…)


#3

a = d.split(/(?:\r\n|[;, ])/)

Hope that helps.

Out of interest, what does the ?: do in there? I’ve googled, etc,
honest!

Cheers,
Benjohn


#4

Gregory S. wrote:

Cleaned up:

The whole point was not to clean it up, but to make obvious what and
why something was happening in the code.

Brevity is the soul of wit, but it can play havoc with code maintenance.

DELIMITERS = Regexp.new([
  " ",
  "\r\n",
  ";",
  ","
].map { |c| Regexp.escape(c) }.join("|"))

a = d.split(DELIMITERS)

Unless these chunks of code are right next to each other, it may be hard
to know the purpose for the delimiters or what’s driving the split.
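
A possible middle ground, sketched here with Ruby's built-in Regexp.union (not something proposed in the thread; the constant and method names are made up for illustration), keeps the delimiter list readable and puts its purpose in a comment right next to it:

```ruby
# Characters (or sequences) that separate fields in the input.
# Regexp.union escapes each string and joins the results with "|".
FIELD_DELIMITERS = Regexp.union([" ", "\r\n", ";", ","])

# Hypothetical helper: split a raw string into its fields.
def split_fields(text)
  text.split(FIELD_DELIMITERS)
end

p split_fields("ab;c,xx\r\nyy zz")  # => ["ab", "c", "xx", "yy", "zz"]
```

Wrapping the split in a named method is one way to keep the "why" attached to the delimiters even when the two pieces of code drift apart.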


#5

On Jan 12, 2007, at 10:00 AM, Josselin wrote:

I wrote the following Ruby statements. I get the result I need, but I
tried to DRY them for 2 hours without success:

  d = d.gsub(/\r\n/, ' ')  # get rid of carriage return / line feed
  d = d.gsub(/;/, ' ')     # replace semicolon by space
  d = d.gsub(/,/, ' ')     # replace comma by space
  a = d.split(' ')         # split into components, space as divider

a = d.split(/(?:\r\n|[;, ])/)

Hope that helps.

James Edward G. II


#6

On 1/12/07, removed_email_address@domain.invalid wrote:

a = d.split(/(?:\r\n|[;, ])/)

Hope that helps.

Out of interest, what does the ?: do in there? I’ve googled, etc, honest!

(?: ) is a non-capturing group

It's not necessary here:
"ab;c,xx\r\nyy zz".split(/\r\n|[;, ]/)
#=> ["ab", "c", "xx", "yy", "zz"]


#7

Bira wrote:

How about a = d.gsub(/\r\n|;|,/, ' ').split(' ') ?

Or a = d.split(/\r\n|,|;| /) ?
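
The two suggestions are not quite interchangeable. A small sketch (the sample string and method names are made up for illustration) of how they can differ when two delimiters sit next to each other: split(' ') treats a run of whitespace as one separator, while a regex split yields an empty string between adjacent delimiters.

```ruby
# Variant 1: normalise delimiters to spaces, then split on whitespace.
def via_gsub(d)
  d.gsub(/\r\n|;|,/, ' ').split(' ')
end

# Variant 2: split directly on any single delimiter.
def via_split(d)
  d.split(/\r\n|,|;| /)
end

sample = "ab, c;\r\nxx"
p via_gsub(sample)   # => ["ab", "c", "xx"]
p via_split(sample)  # => ["ab", "", "c", "", "xx"]
```

If the input never has adjacent delimiters, both variants should give the same array.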


#8

2007/1/12, removed_email_address@domain.invalid:

a = d.split(/(?:\r\n|[;, ])/)

Hope that helps.

Out of interest, what does the ?: do in there? I’ve googled, etc, honest!

( ) groups and captures
(?: ) groups but does not capture

Non-capturing groupings:
http://perldoc.perl.org/perlretut.html

It’s irrelevant in this case anyways


#9

Phrogz wrote:

[…] Calling the same method repeatedly but with different parameters is
not “repeating yourself”.

Looking at this, and some of the suggested alternatives, I can see how
it would get tedious to add more characters to the “replace with space”
set.

The use of compact regular expressions doesn’t make the code easier to
read or maintain.

It may be useful to define the set of special characters, then use that
to drive a string transformation.

REPLACE_WITH_SPACE = %w{
  \r\n
  ;
  ,
}.map { |c| Regexp.new(c) }  # %w keeps \r\n literal; Regexp.new reads it as CR LF

class String
  def swap_to_spaces
    s = self.dup
    REPLACE_WITH_SPACE.each do |re|
      s.gsub!(re, ' ')
    end
    s
  end
end

a = d.swap_to_spaces.split(' ')

Or something along those lines.


James B.

http://www.ruby-doc.org - Ruby Help & Documentation
http://www.rubystuff.com - The Ruby Store for Ruby Stuff
http://www.jamesbritt.com - Playing with Better Toys


#10

On 12.01.2007 17:26, removed_email_address@domain.invalid wrote:

a = d.split(/(?:\r\n|[;, ])/)

Hope that helps.

Out of interest, what does the ?: do in there? I’ve googled, etc, honest!

It’s a non-capturing group. You cannot retrieve the characters it
matched, which is at times more efficient because the regex engine does
not need to do the bookkeeping and storing of the group.

robert


#11

Josselin wrote:

I wrote the following Ruby statements. I get the result I need, but I
tried to DRY them for 2 hours without success:

  d = d.gsub(/\r\n/, ' ')  # get rid of carriage return / line feed
  d = d.gsub(/;/, ' ')     # replace semicolon by space
  d = d.gsub(/,/, ' ')     # replace comma by space
  a = d.split(' ')         # split into components, space as divider

BTW, that is already reasonably DRY, in my opinion. Calling the same
method repeatedly but with different parameters is not “repeating
yourself”. It would be WET (hrm…Way Extra Toomuchcode) if you had
something like:

d = d.gsub( /\r\n/, ' ' )
e = e.gsub( /\r\n/, ' ' )
f = f.gsub( /\r\n/, ' ' )
g = g.gsub( /\r\n/, ' ' )
etc.
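
If that situation did come up, one way out (a rough sketch; the method name and sample strings are hypothetical) would be to name the operation once and apply it to each string:

```ruby
# Hypothetical helper: the repeated substitution lives in one place.
def crlf_to_space(text)
  text.gsub(/\r\n/, ' ')
end

d, e, f = ["a\r\nb", "c\r\nd", "e"].map { |s| crlf_to_space(s) }
p [d, e, f]  # => ["a b", "c d", "e"]
```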

It’s just semantics, but IMO what you’re asking for is to make your
code more compact (but not necessarily golfing).


#12

On Sat, Jan 13, 2007 at 02:04:31AM +0900, James B. wrote:
[…]

REPLACE_WITH_SPACE = %w{
end
s
end
end

a = d.swap_to_spaces.split(' ')

Or something along those lines.

Cleaned up:

DELIMITERS = Regexp.new([
  " ",
  "\r\n",
  ";",
  ","
].map { |c| Regexp.escape(c) }.join("|"))

a = d.split(DELIMITERS)

James B.
–Greg


#13

Josselin wrote:

I wrote the following Ruby statements. I get the result I need, but I
tried to DRY them for 2 hours without success:

  d = d.gsub(/\r\n/, ' ')  # get rid of carriage return / line feed
  d = d.gsub(/;/, ' ')     # replace semicolon by space
  d = d.gsub(/,/, ' ')     # replace comma by space
  a = d.split(' ')         # split into components, space as divider

d = d.gsub( /\r\n|[;,]/, ' ' ).split


#14

On 1/12/07, Daniel M. removed_email_address@domain.invalid wrote:

However, for certain inputs that won’t give exactly the same as your
initial multi-step procedure.

Also, any time you write:

d = d.gsub(…)

You’re probably better off with:

d.gsub!(…)

…unless you don’t want to modify the original object passed as an
argument (I’m not sure if this is a proper English construct ;-) I mean
that in that case the caller will see the modifications as well).
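
To make the difference concrete, here is a small sketch (the method names and sample strings are made up for illustration). Note also that gsub! returns nil when nothing was replaced, so its return value should not be assigned back to the variable.

```ruby
# Returns a new string; the caller's string is left untouched.
def clean_copy(text)
  text.gsub(/;/, ' ')
end

# Modifies the caller's string in place and returns it.
def clean_in_place(text)
  text.gsub!(/;/, ' ')   # returns nil if no substitution happened
  text
end

original = "a;b".dup
clean_copy(original)
p original                    # => "a;b"
clean_in_place(original)
p original                    # => "a b"
p "abc".dup.gsub!(/;/, ' ')   # => nil
```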


#15

On Jan 12, 2007, at 10:26 AM, removed_email_address@domain.invalid wrote:

a = d.split(/(?:\r\n|[;, ])/)

Hope that helps.

Out of interest, what does the ?: do in there? I’ve googled, etc,
honest!

(?: … ) is just ( … ) without capturing the contents into a
variable.
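
A quick way to see this for yourself (a sketch; the helper name and sample string are made up) is to look at what ends up in the first capture slot of the match:

```ruby
# Hypothetical helper: return capture group 1 of the first match, if any.
def first_capture(pattern, text)
  m = pattern.match(text)
  m && m[1]
end

p first_capture(/(;|,)/, "ab;c")    # => ";"
p first_capture(/(?:;|,)/, "ab;c")  # => nil
```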

James Edward G. II


#16

On Jan 12, 2007, at 10:30 AM, Phrogz wrote:

a = d.split( /\r\n|[;, ]/ )

You’re right, it’s not needed. I’m just in the habit of always
surrounding | options of a regex with grouping to control their
scope. I guess I’ve been bitten by those matching issues one time
too many.
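
The kind of matching issue meant here is probably the following (my own illustration with made-up patterns, not an example from the thread): without a group, the anchors bind to the individual alternatives rather than to the whole alternation.

```ruby
UNGROUPED = /\Afoo|bar\z/       # read as: (\Afoo) or (bar\z)
GROUPED   = /\A(?:foo|bar)\z/   # read as: the whole string is foo or bar

p UNGROUPED.match?("foolish")  # => true
p GROUPED.match?("foolish")    # => false
p GROUPED.match?("bar")        # => true
```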

James Edward G. II


#17

Josselin wrote:

I wrote the following Ruby statements. I get the result I need, but I
tried to DRY them for 2 hours without success:

  d = d.gsub(/\r\n/, ' ')  # get rid of carriage return / line feed
  d = d.gsub(/;/, ' ')     # replace semicolon by space
  d = d.gsub(/,/, ' ')     # replace comma by space
  a = d.split(' ')         # split into components, space as divider

a = d.split(/(\r\n)|([;, ])/)
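
One thing worth checking with this variant: when the pattern given to String#split contains capturing groups, the captured text is returned in the result array as well, so the delimiters themselves show up among the fields. A small sketch (method names and sample string are made up) comparing it with the ungrouped pattern:

```ruby
# Pattern as posted, with capturing groups.
def split_capturing(d)
  d.split(/(\r\n)|([;, ])/)
end

# Same delimiters without capturing groups.
def split_non_capturing(d)
  d.split(/\r\n|[;, ]/)
end

p split_capturing("ab;c")      # the ";" delimiter appears as an element
p split_non_capturing("ab;c")  # => ["ab", "c"]
```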


#18

Josselin wrote:

Joss

I would probably go with

a = d.chop.split(/[\s,;]/)
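
A quick check of this variant (a sketch; the sample input with a trailing "\r\n" is an assumption about the data). String#chop drops the last character, or a trailing "\r\n" as a unit. Because the character class matches \r and \n individually, a CRLF in the middle of the string produces an empty element unless the class gets a + quantifier:

```ruby
# Variant as posted.
def split_class(d)
  d.chop.split(/[\s,;]/)
end

# Same idea, but a run of delimiters counts as one separator.
def split_class_runs(d)
  d.chop.split(/[\s,;]+/)
end

sample = "ab;c,xx\r\nyy zz\r\n"
p split_class(sample)       # => ["ab", "c", "xx", "", "yy", "zz"]
p split_class_runs(sample)  # => ["ab", "c", "xx", "yy", "zz"]
```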

Best regards,
Henrik S.


#19

On 1/12/07, Bira removed_email_address@domain.invalid wrote:

How about a = d.gsub(/\r\n|;|,/, ' ').split(' ') ?

If you don’t care about \r then maybe this:

"ab;c,xx\r\nyy zz".scan(/[^ ;,\n]+/)
#=> ["ab", "c", "xx\r", "yy", "zz"]
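
If the \r does matter, adding it to the negated class (an adjustment of mine, not part of the post above; the method name is made up) should drop it from the tokens:

```ruby
# Collect runs of characters that are not delimiters.
def tokens(d)
  d.scan(/[^ ;,\r\n]+/)
end

p tokens("ab;c,xx\r\nyy zz")  # => ["ab", "c", "xx", "yy", "zz"]
```

Since scan only collects non-empty runs, adjacent delimiters do not produce empty strings here.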


#20

On 2007-01-12 17:05:34 +0100, Bira removed_email_address@domain.invalid said:

How about a = d.gsub(/\r\n|;|,/, ' ').split(' ') ?

Thanks… I did not notice that I could use ‘|’ inside the gsub;
I got stuck on [. and…]