A better way for what is currently nested substitution

addis_a · September 7, 2013, 10:50pm

There must be a better way to do this:

numstr = '17      75      -1      25      -1      52  43  37  0'
=> "17      75      -1      25      -1      52  43  37  0"
numstr.sub(/(.{46})/) {|a| a.sub(/(..$)/) {|b| b.to_i + 3 }.to_s }
=> "17      75      -1      25      -1      52  46  37  0"

What is the better way? (Note that in practice I’m using this to
operate on several strings in series, where each string has a number in
the same columnar position that needs to be incremented by three.)

cpeterson · September 7, 2013, 11:40pm

Within that particular example, with an exact position:
numstr[44..45] = (numstr[44..45].to_i + 3).to_s
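
As a quick check of the index-based assignment, here is a minimal sketch against the string from the first post. It assumes the field sits at exactly characters 44 and 45 (zero-based) and that the incremented value still fits in two characters; `bumped` is just an illustrative name.

```ruby
numstr = '17      75      -1      25      -1      52  43  37  0'

# Work on a copy so the original stays available for comparison.
bumped = numstr.dup

# 44..45 is an inclusive range: the two characters holding "43".
bumped[44..45] = (bumped[44..45].to_i + 3).to_s

puts bumped
```

Note that a three-dot range (`44...45`) excludes its end and would cover only one character, so the two-dot form is the one wanted here.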

cpeterson · September 9, 2013, 5:56pm

On Sat, Sep 07, 2013 at 11:40:28PM +0200, Joel P. wrote:

Within that particular example, with an exact position:
numstr[44..45] = (numstr[44..45].to_i + 3).to_s

Hmm.

My actual use case, as it happens, involves shelling out to Ruby from
Vim, and I do not at first glance see how this would fit into that
approach. Here’s my original approach, as a Vim command:

:.!ruby -pe 'sub(/(.{46})/) {|a| a.sub(/(..$)/) {|b| b.to_i + 3 }.to_s }'

I’ll have to come back to this after more caffeine and see if your
approach can be made to fit without becoming uglier than what I ended up
doing.
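
One way the index-based approach might fit the Vim filter: under `ruby -p`, each input line is held in `$_` and printed after the script runs, so assigning into `$_` in place should be enough. The sketch below wraps the same expression in a hypothetical helper (`bump_field!` is not from this thread) so it can be tried outside Vim. The `rjust` call is an added assumption to keep the field two characters wide.

```ruby
# Hypothetical helper: add `delta` to the integer stored in the
# `width`-character field that starts at zero-based column `start`.
def bump_field!(line, start: 44, width: 2, delta: 3)
  value = line[start, width].to_i + delta
  # rjust pads on the left so a one-digit result keeps the column width.
  line[start, width] = value.to_s.rjust(width)
  line
end

line = "17      75      -1      25      -1      52  43  37  0\n"
print bump_field!(line)

# The same expression as a Vim filter, acting on $_ directly:
#   :.!ruby -pe '$_[44, 2] = ($_[44, 2].to_i + 3).to_s.rjust(2)'
```

A line shorter than 44 characters would raise here, since `line[44, 2]` is `nil` in that case.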

cpeterson · September 9, 2013, 6:00pm

On Mon, Sep 09, 2013 at 10:34:51AM +0200, Robert K. wrote:

=> 46
irb(main):007:0> nums
=> [17, 75, -1, 25, -1, 52, 46, 37, 0]
irb(main):008:0> numstr = nums.join ' '
=> "17 75 -1 25 -1 52 46 37 0"

Unfortunately, this mangles the spacing between columns, which I do
actually need to preserve.
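
A token-based variant can keep the original spacing if the whitespace runs are kept as tokens. This is a sketch, not something proposed in the thread: `String#split` with a capture group in the pattern returns the separators too, so joining the pieces reproduces the line. It still assumes no column is empty.

```ruby
numstr = '17      75      -1      25      -1      52  43  37  0'

# The capture group makes split keep each run of whitespace as its own
# element, so numbers sit at even indices and separators at odd ones.
parts = numstr.split(/(\s+)/)

# The seventh number is at index 2 * 6 = 12.
parts[12] = (parts[12].to_i + 3).to_s

result = parts.join
puts result
```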

For the fun of it a columnar replacement could be done like this:

irb(main):009:0> numstr = '17 75 -1 25 -1 52 43 37 0'
=> "17 75 -1 25 -1 52 43 37 0"
irb(main):010:0> pos = 0
=> 0
irb(main):011:0> numstr.gsub!(/[-+]?\d+/) {|m| (pos += 1) == 7 ? m.to_i + 3 : m}
=> "17 75 -1 25 -1 52 46 37 0"

Given the contents of the strings in question, the regex could be
simplified to /-?\d+/ because positives are never indicated by plus
signs. This is a pretty good solution to the problem as I stated it,
but I unfortunately forgot to mention that sometimes some columns might
be empty – so you get a blue ribbon for solving the problem I
described, even if it doesn’t solve the problem I actually had to solve.

Thanks for the ideas for how to write a better substitution routine than
I had. These may not actually work in this specific case, but it’s
always instructive to see what approaches to solving problems come from
different perspectives.

cpeterson · September 9, 2013, 10:35am

On Sat, Sep 7, 2013 at 11:40 PM, Joel P. wrote:

Within that particular example, with an exact position:
numstr[44..45] = (numstr[44..45].to_i + 3).to_s

Normally I’d recommend using a proper representation, i.e. an Array of
Fixnums:

irb(main):004:0> numstr = '17 75 -1 25 -1 52 43 37 0'
=> "17 75 -1 25 -1 52 43 37 0"
irb(main):005:0> nums = numstr.scan(/[-+]?\d+/).map(&:to_i)
=> [17, 75, -1, 25, -1, 52, 43, 37, 0]
irb(main):006:0> nums[6] += 3
=> 46
irb(main):007:0> nums
=> [17, 75, -1, 25, -1, 52, 46, 37, 0]
irb(main):008:0> numstr = nums.join ' '
=> "17 75 -1 25 -1 52 46 37 0"

For the fun of it a columnar replacement could be done like this:

irb(main):009:0> numstr = '17 75 -1 25 -1 52 43 37 0'
=> "17 75 -1 25 -1 52 43 37 0"
irb(main):010:0> pos = 0
=> 0
irb(main):011:0> numstr.gsub!(/[-+]?\d+/) {|m| (pos += 1) == 7 ? m.to_i + 3 : m}
=> "17 75 -1 25 -1 52 46 37 0"
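
The external counter can probably be avoided. As a sketch: when `gsub` is called with a pattern but no replacement and no block, it returns an Enumerator, and chaining `with_index(1)` hands a 1-based match number to the block. The variable names are illustrative only.

```ruby
numstr = '17 75 -1 25 -1 52 43 37 0'

# gsub without a block returns an Enumerator; with_index(1) numbers the
# matches from 1. The block's return value replaces each match.
result = numstr.gsub(/[-+]?\d+/).with_index(1) do |m, i|
  i == 7 ? (m.to_i + 3).to_s : m
end

puts result
```

Like the counter version, this counts numbers rather than character columns, so it only works when no column is empty.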

Kind regards

robert

cpeterson · September 9, 2013, 7:01pm

On Mon, Sep 9, 2013 at 5:59 PM, Chad P. wrote:

irb(main):006:0> nums[6] += 3
=> 46
irb(main):007:0> nums
=> [17, 75, -1, 25, -1, 52, 46, 37, 0]
irb(main):008:0> numstr = nums.join ' '
=> "17 75 -1 25 -1 52 46 37 0"

Unfortunately, this mangles the spacing between columns, which I do
actually need to preserve.

Yes, I figured that was a downside of the approach. Hence I tried the
text based solution.

Given the contents of the strings in question, the regex could be
simplified to /-?\d+/ because positives are never indicated by plus
signs.

Plus signs are rarely used, but since you did not mention this I wanted
to stay on the safe side.

This is a pretty good solution to the problem as I stated it,
but I unfortunately forgot to mention that sometimes some columns might
be empty – so you get a blue ribbon for solving the problem I
described, even if it doesn’t solve the problem I actually had to solve.

LOL

If arbitrary columns can be empty (i.e. empty strings), if whitespace
between columns does not have a fixed length and if integer values can
be arbitrary then there is no general solution to your problem because
the system cannot know which column you intend to change. I think you
need to impose at least one of these restrictions to be able to come
up with an automated solution.

Thanks for the ideas for how to write a better substitution routine than
I had. These may not actually work in this specific case, but it’s
always instructive to see what approaches to solving problems come from
different perspectives.

That’s absolutely true!

Kind regards

robert

cpeterson · September 10, 2013, 1:04am

On Mon, Sep 09, 2013 at 09:52:52AM -0600, Chad P. wrote:

:.!ruby -pe 'sub(/(.{46})/) {|a| a.sub(/(..$)/) {|b| b.to_i + 3 }.to_s }'
I’ll have to come back to this after more caffeine and see if your
approach can be made to fit without becoming uglier than what I ended up
doing.

. . . and after thinking about this and experimenting a little, I am
momentarily stymied by the fact that the way data is passed around
between Vim, the shell, ruby, the shell, and Vim again is a bit like
black magic to me just now. I’ll have to get back to this later.

cpeterson · September 10, 2013, 12:52am

On Mon, Sep 09, 2013 at 07:01:01PM +0200, Robert K. wrote:

On Mon, Sep 9, 2013 at 5:59 PM, Chad P. wrote:

This is a pretty good solution to the problem as I stated it,
but I unfortunately forgot to mention that sometimes some columns might
be empty – so you get a blue ribbon for solving the problem I
described, even if it doesn’t solve the problem I actually had to solve.

LOL

Yeah, that’s pretty much the correct response to what I said.

If arbitrary columns can be empty (i.e. empty strings), if whitespace
between columns does not have a fixed length and if integer values can
be arbitrary then there is no general solution to your problem because
the system cannot know which column you intend to change. I think you
need to impose at least one of these restrictions to be able to come
up with an automated solution.

The columns are of fixed, but not uniform, width – based on multiples
of four character width tab-stops, essentially, but using spaces instead
of tabs. Thus, the problem is basically confined to solutions that
operate on specific character columns (that is, the Nth character from
the left-hand margin).
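
Operating on character columns could look like the following sketch. The `COLUMN_STARTS` offsets are an assumption read off the sample line in the first post, not something stated anywhere in the thread, and `bump_column` is a hypothetical helper. A missing or all-blank field is left untouched.

```ruby
# Assumed zero-based start offset of each column, taken from the sample line.
COLUMN_STARTS = [0, 8, 16, 24, 32, 40, 44, 48, 52].freeze

# Hypothetical helper: return a copy of `line` with `delta` added to the
# number in column `index` (zero-based), keeping the field's width.
def bump_column(line, index, delta)
  start = COLUMN_STARTS[index]
  stop  = COLUMN_STARTS[index + 1] || line.length
  field = line[start...stop]
  # Leave missing or blank columns alone.
  return line if field.nil? || field.strip.empty?

  result = line.dup
  # ljust pads on the right so the following columns stay where they were.
  result[start...stop] = (field.to_i + delta).to_s.ljust(field.length)
  result
end

full   = '17      75      -1      25      -1      52  43  37  0'
# Same layout with several columns empty, built up to avoid miscounting spaces.
sparse = ' ' * 8 + '75' + ' ' * 30 + '52' + ' ' * 6 + '37  0'

puts bump_column(full, 6, 3)
puts bump_column(sparse, 6, 3)
```

If the new value is wider than the field, `ljust` does not truncate it, so the line would grow and later columns would shift.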

cpeterson · September 10, 2013, 10:24am

On Tue, Sep 10, 2013 at 12:51 AM, Chad P. wrote:

Yeah, that’s pretty much the correct response to what I said.

operate on specific character columns (that is, the Nth character from
the left-hand margin).

If I understand you correctly then it is not possible to derive from a
single line how much spacing between two columns exists. Then I guess
there is no solution which can be applied by handing off a single line
from vim to ruby -pe. If you need that functionality more often, I
guess this warrants a full-fledged Ruby script with proper inputs
which operates on the whole file.

Kind regards

robert

cpeterson · September 10, 2013, 12:12pm

How about
numstr.split( /\b/ )

=> ["17", " ", "75", " -", "1", " ", "25", " -",
"1", " ", "52", " ", "43", " ", "37", " ", "0"]

Ok, at the moment it doesn’t handle minuses correctly; but with a bit of
work to iron that out, and with some rules on handling “empty” columns
(are they padded with spaces or 0 length?), that might be a place to
start.

cpeterson · September 10, 2013, 2:10pm

On Tue, Sep 10, 2013 at 12:12 PM, Joel P. wrote:

start.
That still does not work since it is based solely on a single line.
With columns optionally empty you cannot derive from a single line the
width of the whitespace separators. This is nothing which can be
solved by any amount of coding magic done with a single line. You
either need more lines or additional information which is fed into the
process. But the information contained in a single line does not allow
column positions to be identified properly.
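
To make the "more lines" option concrete, here is a rough sketch. It assumes that every separator position is blank in all of the lines supplied, and marks each character position that is non-blank in at least one line; runs of marked positions are taken as columns. `column_ranges` is a hypothetical helper, and a column that is empty in every sample line would go undetected.

```ruby
# Hypothetical helper: infer the character extent of each column from
# several lines that share one layout.
def column_ranges(lines)
  width = lines.map(&:length).max || 0
  # used[i] is true when any line has a non-space character at position i.
  used = Array.new(width) { |i| lines.any? { |l| i < l.length && l[i] != ' ' } }

  ranges = []
  start = nil
  used.each_with_index do |flag, i|
    start = i if flag && start.nil?
    if !flag && start
      ranges << (start..(i - 1))
      start = nil
    end
  end
  ranges << (start..(width - 1)) if start
  ranges
end

full   = '17      75      -1      25      -1      52  43  37  0'
sparse = ' ' * 8 + '75' + ' ' * 30 + '52' + ' ' * 6 + '37  0'

p column_ranges([sparse])
p column_ranges([full, sparse])
```

With only the sparse line the helper finds four columns; adding the full line recovers all nine, which is the point made above.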

Cheers

robert

cpeterson · September 11, 2013, 4:46am

On Tue, Sep 10, 2013 at 10:23:37AM +0200, Robert K. wrote:

The columns are of fixed, but not uniform, width – based on multiples
of four character width tab-stops, essentially, but using spaces instead
of tabs. Thus, the problem is basically confined to solutions that
operate on specific character columns (that is, the Nth character from
the left-hand margin).

If I understand you correctly then it is not possible to derive from a
single line how much spacing between two columns exists. Then I guess
there is no solution which can be applied by handing off a single line
from vim to ruby -pe.

. . . except the solution I used, then shared here hoping there was
something more elegant, I suppose. It’s not a general solution, but it
worked for the specific case in which I used it. I was just dismayed by
the uglitude of nesting a block-fed String#sub inside the block of
another String#sub.

If you need that functionality more often, I guess this warrants a
full-fledged Ruby script with proper inputs which operates on the
whole file.

I guess so. C’est la vie.

cpeterson · September 11, 2013, 4:47am

Alfred Aho
Peter Weinberger
Brian Kernighan

Err . . . okay.

cpeterson · September 11, 2013, 4:50am

On Tue, Sep 10, 2013 at 02:09:53PM +0200, Robert K. wrote:

(are they padded with spaces or 0 length?), that might be a place to
start.

That still does not work since it is based solely on a single line.
With columns optionally empty you cannot derive from a single line the
width of the whitespace separators. This is nothing which can be
solved by any amount of coding magic done with a single line. You
either need more lines or additional information which is fed into the
process. But the information contained in a single line does not allow
column positions to be identified properly.

To be precise, no general solution seems possible given the
restrictions imposed, though using split on /\b/ would have sufficed for
the one-off usage I needed at the time. I’d really like to have found
out there was a general solution, though, using some trick I had
overlooked.

I’d also like to have found out there was a way to pass multiple
parameters to the block for String#sub that would allow me to operate on
a pair of match captures, but the way String#sub works it only allows
one capture to be passed to the block as a parameter. I’m not really
sure why it was designed that way, unless there was some performance
constraint for the implementation.
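
For what it is worth, the block form of `String#sub` does yield only the matched text, but the capture groups of the current match are available inside the block through `$1`, `$2`, and `Regexp.last_match` (`$~`). A sketch of the original substitution written with two captures and no nesting:

```ruby
numstr = '17      75      -1      25      -1      52  43  37  0'

# $1 is the 44-character prefix, $2 the two-character field after it.
result = numstr.sub(/\A(.{44})(..)/) { $1 + ($2.to_i + 3).to_s }

puts result
```

The same block should also carry over to the `ruby -pe` filter, though that has not been tried here.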

cpeterson · September 10, 2013, 8:37am

On Mon, Sep 9, 2013 at 6:03 PM, Chad P. wrote:

approach. Here’s my original approach, as a Vim command:
between Vim, the shell, ruby, the shell, and Vim again is a bit like
black magic to me just now. I’ll have to get back to this later.

–
Chad P. [ original content licensed OWL: http://owl.apotheon.org ]

Alfred Aho
Peter Weinberger
Brian Kernighan

cpeterson · September 11, 2013, 8:07am

If you never have a 3 digit number and the spacing never changes,

m = "17      75      -1      25      -1      52  43  37  0"
n = "        75                              52  43  37  0"

p m
p m.scan(/.{1,2}/).tap{|a|
a[22],a[24]=(a[22].to_i+3).to_s,(a[24].to_i+4).to_s}.join
puts
p n
p n.scan(/.{1,2}/).tap{|a|
a[22],a[24]=(a[22].to_i+3).to_s,(a[24].to_i+4).to_s}.join

This may not solve your problem but maybe it will give you some other
ideas.

Harry