# Extract numbers from string

Say I have a phone number “(555) 555-5555” and I want to extract the
numbers from it. What’s the most succinct way to do this? E.g.:

old_num = "(555) 111-5555"
new_num = "5551115555"

old_num = "555-666-7777"
new_num = "5556667777"

On 9/25/07, eggie5 wrote:

Hi,

I would try
new_num = old_num.scan(/\d/).join('')

Hope that does what you want

Nick

A regexp with gsub is one answer.

old_num = "(555) 111-5555"
p old_num.gsub(/[^0-9]/, '')
==> "5551115555"

old_num = "555-666-7777"
p old_num.gsub(/[^0-9]/, '')
==> "5556667777"

BR Phil
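
To make the two suggestions so far concrete, here is a minimal sketch that runs both on the sample inputs from the question. The names `samples` and `results` are made up for illustration; only `scan`, `join`, and `gsub` come from the thread.

```ruby
# Sample inputs taken from the original question.
samples = ["(555) 111-5555", "555-666-7777"]

# For each input, build the digits-only string two ways.
results = samples.map do |old_num|
  via_scan = old_num.scan(/\d/).join     # collect every digit, then concatenate
  via_gsub = old_num.gsub(/[^0-9]/, "")  # replace every non-digit with nothing
  [via_scan, via_gsub]
end

results.each do |via_scan, via_gsub|
  puts "scan: #{via_scan}  gsub: #{via_gsub}"
end
```

Both approaches should agree on these inputs; which one reads better is mostly a matter of taste.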

From: David A. Black

# Don’t forget \D:

# old_num.gsub(/\D/, "")

arggh, why do I always forget the uppercase escapes

anyway, from now on, if I think of reverse cases, I'll think of \D-black

thank you and kind regards -botp

From: Nicholas Clare

# new_num = old_num.scan(/\d/).join('')

You can lose the '' argument; join already defaults to an empty separator:

irb(main):003:0> old_num.scan(/\d/).join
=> "5551115555"

also,

irb(main):014:0> old_num.gsub(/[^\d]/,"")
=> "5551115555"
irb(main):015:0> old_num.tr("^0-9","")
=> "5551115555"
irb(main):018:0> old_num.delete("^0-9")
=> "5551115555"
irb(main):025:0> old_num.split(/[^\d]/).join
=> "5551115555"

I’d hope that ruby-doc can document such related methods… something like
documenting “The Many Ways” of Ruby…

kind regards -botp
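
As a rough way to check that the variants listed above really are interchangeable for this input, the sketch below runs each one on the same string and compares the results. The hash name `approaches` and the labels are assumptions made for illustration; the method calls themselves are the ones from the thread, plus the \D form.

```ruby
old_num = "(555) 111-5555"

# Each lambda wraps one approach mentioned in the thread.
approaches = {
  "scan"   => ->(s) { s.scan(/\d/).join },        # keep the digits
  "gsub"   => ->(s) { s.gsub(/[^\d]/, "") },      # drop the non-digits
  "gsub2"  => ->(s) { s.gsub(/\D/, "") },         # same, using \D
  "tr"     => ->(s) { s.tr("^0-9", "") },         # empty replacement deletes
  "delete" => ->(s) { s.delete("^0-9") },         # ^ negates the character set
  "split"  => ->(s) { s.split(/[^\d]/).join },    # split on non-digits, rejoin
}

outputs = approaches.map { |name, fn| [name, fn.call(old_num)] }.to_h

outputs.each { |name, out| puts "#{name.ljust(7)} #{out}" }
puts "all equal: #{outputs.values.uniq.size == 1}"
```

None of these mutate `old_num`; each returns a new string.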

I always wanted to try the benchmark stuff, and this seemed like a good
opportunity :-).

cat remove_nondigits.rb && ruby remove_nondigits.rb

# 25 September 2007

require 'benchmark'

n = 100_000
old_num = "(555) 55-555-55"

Benchmark.bmbm do |x|
  x.report("scan") do
    n.times { old_num.scan(/\d/).join }
  end
  x.report("gsub") do
    n.times { old_num.gsub(/[^\d]/, "") }
  end
  x.report("gsub2") do
    n.times { old_num.gsub(/\D/, "") }
  end
  x.report("tr") do
    n.times { old_num.tr("^0-9", "") }
  end
  x.report("delete") do
    n.times { old_num.delete("^0-9") }
  end
  x.report("split") do
    n.times { old_num.split(/[^\d]/).join }
  end
end
Rehearsal ------------------------------------------
scan     1.080000   0.000000   1.080000 (  1.144654)
gsub     0.380000   0.010000   0.390000 (  0.403057)
gsub2    0.390000   0.010000   0.400000 (  0.416074)
tr       0.290000   0.020000   0.310000 (  0.320764)
delete   0.270000   0.010000   0.280000 (  0.294078)
split    0.700000   0.000000   0.700000 (  0.714139)
--------------------------------- total: 3.160000sec

             user     system      total        real
scan     1.080000   0.020000   1.100000 (  1.101550)
gsub     0.380000   0.010000   0.390000 (  0.404968)
gsub2    0.380000   0.010000   0.390000 (  0.409364)
tr       0.290000   0.020000   0.310000 (  0.318723)
delete   0.280000   0.010000   0.290000 (  0.294650)
split    0.700000   0.000000   0.700000 (  0.713261)

So it seems that in terms of speed, delete wins (about 0.29 s real time
for 100,000 iterations), tr takes second place (about 0.32 s), and in the
fight between [^\d] and \D the first one wins by a very small margin
(0.405 s vs. 0.409 s) :-).

Cheers,

Jesus.
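
Given that delete came out fastest in the run above, one way to use it is to wrap it in a small helper. This is only a sketch: the method name `digits_only` is hypothetical, and the timings above are from a 2007 machine and Ruby version, so they may not carry over to a current setup.

```ruby
# Hypothetical helper: returns a new string containing only the ASCII
# digits of the input. The "^0-9" argument tells String#delete to remove
# every character that is NOT in the range 0-9.
def digits_only(str)
  str.to_s.delete("^0-9")
end

puts digits_only("(555) 111-5555")
puts digits_only("555-666-7777")
```

Calling `to_s` first is an assumption about how lenient the helper should be; drop it if a non-string argument should raise instead.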

From: Jesús Gabriel y Galán

# like a good opportunity :-).

wow. thanks for the benchmark, Jesus.

kind regards -botp

Hi –

On Tue, 25 Sep 2007, Peña, Botp wrote:

=> "5551115555"

Don’t forget \D:

old_num.gsub(/\D/, "")

David

I find myself most often reaching for gsub! or tr, but hey, that helps.

Thanks

TerryP.
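
One caveat when reaching for the bang versions: gsub! (like tr! and delete!) modifies the string in place and returns nil when nothing was changed, so its return value should not be assigned blindly. A minimal sketch, with variable names invented for illustration:

```ruby
messy = "(555) 111-5555"
already_clean = "5551115555"

# dup so the original strings above stay untouched.
changed   = messy.dup.gsub!(/\D/, "")          # something was removed
unchanged = already_clean.dup.gsub!(/\D/, "")  # nothing to remove, so nil
safe      = already_clean.gsub(/\D/, "")       # non-bang form always returns a string

p changed
p unchanged
p safe
```

If the result is going to be assigned to a new variable anyway, the non-bang gsub (or tr, or delete) avoids the nil case.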

On Sep 25, 1:31 am, “Jesús Gabriel y Galán” wrote:

So it seems that in terms of speed, delete wins, second place for tr,
and in the fight between [^\d] and \D the first one wins by a little
(very little) margin :-).

Cheers,

Jesus.

That’s awesome, thanks.