Extract numbers from string


#1

Say I have a phone number “(555) 555-5555” and I want to extract the
numbers from it. What’s the most succinct way to do this? E.g.:

old_num = "(555) 111-5555"
new_num = "5551115555"

old_num = "555-666-7777"
new_num = "5556667777"


#2

On 9/25/07, eggie5 removed_email_address@domain.invalid wrote:

Hi,

I would try
new_num = old_num.scan(/\d/).join('')

Hope that does what you want

Nick
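
For anyone wondering what scan hands back before the join, here is a minimal sketch (the variable names single_digits and digit_groups are made up for illustration, they are not from the thread). It also shows that matching runs of digits with /\d+/ gives the same final string:

```ruby
old_num = "(555) 111-5555"

# scan(/\d/) collects every single-digit match into an array of strings.
single_digits = old_num.scan(/\d/)

# scan(/\d+/) collects each run of consecutive digits instead.
digit_groups = old_num.scan(/\d+/)

# Either array joins back into the same digits-only string.
new_num = single_digits.join
p digit_groups  # => ["555", "111", "5555"]
p new_num       # => "5551115555"
```

Matching runs produces a much shorter intermediate array, though whether that matters for speed is something you would have to measure yourself.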


#3

eggie5 wrote:

Say I have a phone number “(555) 555-5555” and I want to extract the
numbers from it. What’s the most succinct way to do this? E.g.:

old_num = "(555) 111-5555"
new_num = "5551115555"

old_num = "555-666-7777"
new_num = "5556667777"

A regexp with gsub is one answer.

old_num = "(555) 111-5555"
p old_num.gsub(/[^0-9]/, '')
==> "5551115555"

old_num = "555-666-7777"
p old_num.gsub(/[^0-9]/, '')
==> "5556667777"

BR Phil


#4

From: David A. Black [mailto:removed_email_address@domain.invalid]

#Don’t forget \D:

old_num.gsub(/\D/, "")

arggh, why do i always forget the big cases :(

anyway, from now on, if i think of reverse cases, i'll think of \D-black :)

thank you and kind regards -botp


#5

From: Nicholas Clare [mailto:removed_email_address@domain.invalid]

new_num = old_num.scan(/\d/).join('')

                       lose this ^^^^

irb(main):003:0> old_num.scan(/\d/).join
=> "5551115555"

also,

irb(main):014:0> old_num.gsub(/[^\d]/,"")
=> "5551115555"
irb(main):015:0> old_num.tr("^0-9","")
=> "5551115555"
irb(main):018:0> old_num.delete("^0-9")
=> "5551115555"
irb(main):025:0> old_num.split(/[^\d]/).join
=> "5551115555"

i’d hope that ruby-doc can doc such related methods… something like
documenting “The Many-Ways” of ruby…

kind regards -botp
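
A small sketch of the "^0-9" notation used by tr and delete above: a leading caret negates the set and the hyphen denotes a range, so "^0-9" means "everything that is not a digit". The helper name digits_only is hypothetical, chosen only for this example:

```ruby
# Hypothetical helper name; not from the thread.
def digits_only(str)
  # A leading "^" negates the character set, so this deletes
  # every character that is NOT in the range 0-9.
  str.delete("^0-9")
end

samples = ["(555) 111-5555", "555-666-7777"]
results = samples.map { |s| digits_only(s) }
p results  # => ["5551115555", "5556667777"]

# String#count accepts the same set syntax, which is handy
# for checking how many digits a number contains.
digit_count = samples.first.count("0-9")
p digit_count  # => 10
```

Note that delete returns a new string and leaves the original alone; delete! is the in-place variant.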


#6

On 9/25/07, Peña, Botp removed_email_address@domain.invalid wrote:

From: David A. Black [mailto:removed_email_address@domain.invalid]

#Don’t forget \D:

old_num.gsub(/\D/, "")

I always wanted to try the benchmark stuff, and this seemed like a good
opportunity :-).

cat remove_nondigits.rb && ruby remove_nondigits.rb

# remove_nondigits.rb
# 25 September 2007

require 'benchmark'

n = 100_000
old_num = "(555) 55-555-55"

Benchmark.bmbm do |x|
  x.report("scan") do
    n.times { old_num.scan(/\d/).join }
  end
  x.report("gsub") do
    n.times { old_num.gsub(/[^\d]/, "") }
  end
  x.report("gsub2") do
    n.times { old_num.gsub(/\D/, "") }
  end
  x.report("tr") do
    n.times { old_num.tr("^0-9", "") }
  end
  x.report("delete") do
    n.times { old_num.delete("^0-9") }
  end
  x.report("split") do
    n.times { old_num.split(/[^\d]/).join }
  end
end
Rehearsal ------------------------------------------
scan     1.080000   0.000000   1.080000 (  1.144654)
gsub     0.380000   0.010000   0.390000 (  0.403057)
gsub2    0.390000   0.010000   0.400000 (  0.416074)
tr       0.290000   0.020000   0.310000 (  0.320764)
delete   0.270000   0.010000   0.280000 (  0.294078)
split    0.700000   0.000000   0.700000 (  0.714139)
--------------------------------- total: 3.160000sec

             user     system      total        real
scan     1.080000   0.020000   1.100000 (  1.101550)
gsub     0.380000   0.010000   0.390000 (  0.404968)
gsub2    0.380000   0.010000   0.390000 (  0.409364)
tr       0.290000   0.020000   0.310000 (  0.318723)
delete   0.280000   0.010000   0.290000 (  0.294650)
split    0.700000   0.000000   0.700000 (  0.713261)

So it seems that in terms of speed, delete wins, second place for tr,
and in the fight between [^\d] and \D the first one wins by a little
(very little) margin :-).

Cheers,

Jesus.
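
Before trusting any timing comparison, it may be worth confirming that all six variants really return the same string. The following is only a sketch of such a check and is not part of the script above; the names variants, outputs and all_agree are assumptions made for this example:

```ruby
old_num = "(555) 55-555-55"

# The six expressions from the benchmark, wrapped in lambdas
# so they can be called uniformly.
variants = {
  "scan"   => ->(s) { s.scan(/\d/).join },
  "gsub"   => ->(s) { s.gsub(/[^\d]/, "") },
  "gsub2"  => ->(s) { s.gsub(/\D/, "") },
  "tr"     => ->(s) { s.tr("^0-9", "") },
  "delete" => ->(s) { s.delete("^0-9") },
  "split"  => ->(s) { s.split(/[^\d]/).join },
}

# Run every variant once and collect its result by label.
outputs = variants.map { |name, f| [name, f.call(old_num)] }.to_h

# If the variants are equivalent, only one distinct result remains.
all_agree = outputs.values.uniq.size == 1
puts "all variants agree: #{all_agree}"
```

The timings themselves depend on the Ruby version and the machine, so the ranking reported here should be re-measured rather than assumed.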


#7

From: Jesús Gabriel y Galán [mailto:removed_email_address@domain.invalid]

I always wanted to try the benchmark stuff, and this seemed
like a good opportunity :-).

wow. thanks for the benchmark, Jesus.

kind regards -botp


#8

Hi –

On Tue, 25 Sep 2007, Peña, Botp wrote:

=> "5551115555"
Don't forget \D:

old_num.gsub(/\D/, "")

David


#9

Peña, Botp wrote:

From: Jesús Gabriel y Galán [mailto:removed_email_address@domain.invalid]

I always wanted to try the benchmark stuff, and this seemed
like a good opportunity :-).

wow. thanks for the benchmark, Jesus.

kind regards -botp

I find myself most often reaching for gsub! or tr, but hey, that helps
:)

Thanks

TerryP.


#10

On Sep 25, 1:31 am, “Jesús Gabriel y Galán” removed_email_address@domain.invalid
wrote:

Benchmark.bmbm do |x|
n.times {old_num.tr("^0-9","")}
gsub 0.380000 0.010000 0.390000 ( 0.403057)
tr 0.290000 0.020000 0.310000 ( 0.318723)
delete 0.280000 0.010000 0.290000 ( 0.294650)
split 0.700000 0.000000 0.700000 ( 0.713261)

So it seems that in terms of speed, delete wins, second place for tr,
and in the fight between [^\d] and \D the first one wins by a little
(very little) margin :-).

Cheers,

Jesus.

That’s awesome, thanks.