Remove commas from string

I have the following string:

s = "B747-400, 8,357 miles, 561 mph, 4 Pratt & Whitney PW 4056 turbofans, 56,000 lbs."

I want to remove only the commas that separate the thousands within
numbers (8,357 miles and 56,000 lbs). I want the string to read as
follows:

"B747-400, 8357 miles, 561 mph, 4 Pratt & Whitney PW 4056 turbofans, 56000 lbs."

I thought the following regex would do the trick, because it
successfully isolated the right commas on rubular.com:

s.gsub(/\d+(,)\d+/, "")

It turns out that my regex removes the entire number, not just the
comma.

Am I wrong in saying that my regex searches for one or more digits
surrounding a comma and replaces just the comma with ""?

Thank you.

one simple one here:
new_string = s.gsub(",","")
p new_string

Hi Jason

There is an example of how to do this in the gsub documentation:

irb(main):007:0> s = "B747-400, 8,357 miles, 561 mph, 4 Pratt & Whitney PW 4056 turbofans, 56,000 lbs"
=> "B747-400, 8,357 miles, 561 mph, 4 Pratt & Whitney PW 4056 turbofans, 56,000 lbs"
irb(main):008:0> s.gsub(/(\d+),(\d+)/, '\1\2')
=> "B747-400, 8357 miles, 561 mph, 4 Pratt & Whitney PW 4056 turbofans, 56000 lbs"

The regexp specifies a group of digits, a comma, and another group of
digits. The parentheses indicate that the groups should be remembered:
\1 substitutes the first group and \2 the second. Note that you need
single quotes around the replacement string, or \1 will be interpreted
as octal 001 and \2 as octal 002.
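
The quoting point can be checked directly. This is a minimal sketch, not part of Steve's post; the helper name strip_thousands_commas is made up for illustration.

```ruby
# Hypothetical helper (the name is an assumption, not from the thread).
def strip_thousands_commas(text)
  # Single quotes keep \1 and \2 intact as backreferences for gsub.
  text.gsub(/(\d+),(\d+)/, '\1\2')
end

# Same replacement in double quotes: the backslashes must be doubled.
def strip_thousands_commas_dq(text)
  text.gsub(/(\d+),(\d+)/, "\\1\\2")
end

s = "B747-400, 8,357 miles, 561 mph, 56,000 lbs."
puts strip_thousands_commas(s)      # B747-400, 8357 miles, 561 mph, 56000 lbs.
puts strip_thousands_commas_dq(s)   # same result

# Without doubling, "\1" is an octal escape: one character with code 1.
p "\1" == 1.chr                     # true
```

Both helpers give the same result; the undoubled double-quoted form would insert control characters instead of the captured digits.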

Hope this helps

Steve

Jason L. wrote:

Am I wrong in saying that my regex searches for 1 or more numbers
surrounding a comma and replaces just the comma with “”?

Yes, you are wrong:

s = "|yes|"
puts s.gsub(/y(e)s/, "")

--output:--
||

gsub() replaces the whole match with the specified replacement. gsub()
does not pick out a parenthesized group in the regex and replace only
that group. However, there is a block form of gsub:

s = "yes, 1,234, yes, 4,567"

result = s.gsub(/(\d),(\d)/) do |match|
  "#{$1}#{$2}"
end

puts result

--output:--
yes, 1234, yes, 4567

Inside the block, $1, $2, $3, etc. refer to the matches for each
parenthesized group in the regex. The return value of the block is used
as the replacement.

Ruby 1.9 supports look-behind in regular expressions (Ruby 1.8
only supports look-ahead):

$ irb1.9

irb(main):001:0> s = "B747-400, 8,357 miles, 561 mph, 56,000 lbs."
=> "B747-400, 8,357 miles, 561 mph, 56,000 lbs."

irb(main):002:0> s.gsub(/(?<=\d),(?=\d)/, '')
=> "B747-400, 8,357 miles, 561 mph, 56,000 lbs.".gsub(/(?<=\d),(?=\d)/, '') && "B747-400, 8357 miles, 561 mph, 56000 lbs."
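
To see why the look-around version leaves the digits alone, it may help to look at what the pattern actually consumes. A small sketch, assuming Ruby 1.9 or later; the helper name strip_commas_lookaround is invented here.

```ruby
# Hypothetical helper name; needs Ruby 1.9+ because of the look-behind.
def strip_commas_lookaround(text)
  # (?<=\d) and (?=\d) only assert that digits surround the comma;
  # they are zero-width, so the match itself is just ",".
  text.gsub(/(?<=\d),(?=\d)/, '')
end

s = "B747-400, 8,357 miles, 561 mph, 56,000 lbs."

# scan lists every match: only the two thousands separators show up.
p s.scan(/(?<=\d),(?=\d)/)        # [",", ","]
puts strip_commas_lookaround(s)   # B747-400, 8357 miles, 561 mph, 56000 lbs.
```

Because nothing but the comma is matched, an empty replacement string is enough and no backreferences are needed.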

7stud – wrote:

Inside the block, $1, $2, $3, etc. refer to the matches for each
parenthesized group in the regex. The return value of the block is used
as the replacement.

But note that once again, the entire match is replaced by the return
value of the block.
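
A short sketch of that last point, using the same sample string as above. The two helper names are hypothetical and exist only for this illustration.

```ruby
# Hypothetical helper names, chosen for this illustration only.

# Rebuild the replacement from the captured digits, dropping the comma.
def join_groups(text)
  text.gsub(/(\d),(\d)/) { "#{$1}#{$2}" }
end

# Return an empty string from the block: the whole match ("1,2"),
# not just the comma, disappears from the result.
def drop_match(text)
  text.gsub(/(\d),(\d)/) { "" }
end

s = "yes, 1,234, yes, 4,567"
puts join_groups(s)   # yes, 1234, yes, 4567
puts drop_match(s)    # yes, 34, yes, 67
```

The second call loses a digit on each side of every comma, which is the same effect Jason saw with his original regex and an empty replacement string.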

On Aug 5, 2009, at 3:55 AM, Robert K. wrote:

lbs) separating the thousands. I want the string to read as follows:
=> “B747-400, 8,357 miles, 561 mph, 56,000 lbs.”

robert

Why so complex? Perhaps:

s.gsub(/\b(\d+),(\d+)\b/, '\1\2')
=> "B747-400, 8357 miles, 561 mph, 56000 lbs."

Is there a corner case I’m missing here?

Ruby has the neat ability to pass a block to gsub. This can be a more
versatile solution than using backreferences. It also allows Jason to
use his original, straightforward regex. The matched string is passed
to the block, and whatever the block evaluates to is used as the
replacement string. Check it out:

irb(main):012:0> s
=> "B747-400, 8,357 miles, 561 mph, 4 Pratt & Whitney PW 4056 turbofans, 56,000 lbs."
irb(main):013:0> s.gsub(/\d+,\d+/) { |subs| subs.gsub(',', '') }
=> "B747-400, 8357 miles, 561 mph, 4 Pratt & Whitney PW 4056 turbofans, 56000 lbs."

s.ross wrote:

=> "B747-400, 8357 miles, 561 mph, 56000 lbs."
Is there a corner case I'm missing here?

This would turn 2,344,667 into 2344,667 (leaving the second comma). (Why
the "\b"s and not just /(\d),(\d{3})/ ?)

To clean up sums like those being handed to banks as gifts these days
(like 1,000,000,000,000), in 1.8 you need something like

s.gsub(/\d(,\d{3})+/) { |mo| mo.gsub(',', '') }
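
The corner case can be reproduced with a few lines. This is only a sketch; the method names are made up, and String#delete is swapped in for the inner gsub as an equivalent way to drop the commas.

```ruby
# Hypothetical helper names, not from the thread.

# s.ross's pattern: handles one comma per number.
def strip_first_group_only(text)
  text.gsub(/\b(\d+),(\d+)\b/, '\1\2')
end

# Block-based pattern: a digit followed by one or more ",ddd" groups.
# The whole number is matched, then its commas are deleted.
def strip_grouped_commas(text)
  text.gsub(/\d(,\d{3})+/) { |number| number.delete(',') }
end

puts strip_first_group_only("2,344,667")         # 2344,667
puts strip_grouped_commas("2,344,667")           # 2344667
puts strip_grouped_commas("1,000,000,000,000")   # 1000000000000
```

The first pattern stops after one pair of digit groups because gsub resumes scanning after the text it has already matched.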

R.

2009/8/6 Nick B.:

Ruby has the neat ability to pass a block to gsub. This can be a more
versatile solution than using backreferences.

It isn’t needed though in this case. Please also note that the block
form is usually slower.

The block form is most appropriate if you need to calculate each
replacement individually.
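
Anyone who wants to check the speed claim on their own machine could use something like the sketch below. It assumes Ruby 1.9 or later with the standard benchmark library; the method names and the iteration count are arbitrary choices for this illustration, and the timings will vary by Ruby version and input.

```ruby
require 'benchmark'

SAMPLE = "B747-400, 8,357 miles, 561 mph, 56,000 lbs."

# Hypothetical helper names; both patterns appear elsewhere in this thread.
def with_backreference(text)
  text.gsub(/(\d),(?=\d{3})/, '\1')
end

def with_block(text)
  text.gsub(/\d(,\d{3})+/) { |number| number.delete(',') }
end

n = 20_000  # arbitrary iteration count
plain = Benchmark.realtime { n.times { with_backreference(SAMPLE) } }
block = Benchmark.realtime { n.times { with_block(SAMPLE) } }

puts "backreference: #{plain.round(3)}s"
puts "block:         #{block.round(3)}s"
```

Both methods return the same string for this sample, so the comparison is only about run time.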

Frankly, I’d rather use any of the other non-block solutions than the
one with a block. My 0.02 EUR.

Kind regards

robert

Steve R. wrote:

Why so complex? Perhaps:

s.gsub(/\b(\d+),(\d+)\b/, '\1\2')
=> "B747-400, 8357 miles, 561 mph, 56000 lbs."

Is there a corner case I’m missing here?

So I think I would like to try a non-block option, but the code above
misses the second comma if I have, say, 12,134,650 lbs instead of 56,000
lbs in my string.

It looks like I need Ruby 1.9 to do this:

s.gsub(/(?<=\d),(?=\d{3})/, '')

If I have not yet installed Ruby 1.9, what would be a good non-block
regex that is more robust?

2009/8/6 Jason L.:

If I have not yet installed Ruby 1.9, what would be a good non-block
regex that is more robust?

s.gsub(/(\d),(?=\d{3})/, '\1')
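
A quick check of this pattern against the multi-comma case. The helper names are hypothetical, and the stricter variant with the extra negative look-ahead is an assumption added here for illustration, not something proposed in the thread.

```ruby
# Hypothetical helper names, not from the thread.

# Robert's pattern: capture the digit before the comma, and only
# look ahead at the three digits after it (look-ahead works in 1.8).
def strip_commas_18(text)
  text.gsub(/(\d),(?=\d{3})/, '\1')
end

# Assumed stricter variant: also require that exactly three digits
# follow, so "1,2345" is left alone.
def strip_commas_strict(text)
  text.gsub(/(\d),(?=\d{3}(?!\d))/, '\1')
end

puts strip_commas_18("12,134,650 lbs")       # 12134650 lbs
puts strip_commas_18("561 mph, 4 engines")   # 561 mph, 4 engines
puts strip_commas_18("1,2345")               # 12345
puts strip_commas_strict("1,2345")           # 1,2345
puts strip_commas_strict("12,134,650 lbs")   # 12134650 lbs
```

Because the three digits are only looked at and not consumed, the scan can match the next comma in the same number, which is what the two-group patterns earlier in the thread could not do.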

Cheers

robert

2009/8/5 Lars H.:

irb(main):002:0> s.gsub(/(?<=\d),(?=\d)/, '')
=> "B747-400, 8357 miles, 561 mph, 56000 lbs."

I'd rather do this to be a bit more robust:

irb(main):003:0> s.gsub(/(?<=\d),(?=\d{3})/, '')
=> "B747-400, 8357 miles, 561 mph, 56000 lbs."

Kind regards

robert

Robert K. wrote:

s.gsub(/(\d),(?=\d{3})/, '\1')

Thank you very much Robert.

And thanks to everyone else. This has been a good learning experience
for me.
