Forum: Ruby extracting numbers from a string

Announcement (2017-05-07): www.ruby-forum.com is now read-only since I unfortunately do not have the time to support and maintain the forum any more. Please see rubyonrails.org/community and ruby-lang.org/en/community for other Rails- and Ruby-related community platforms.
Matt Jones (blahblah)
on 2007-06-12 08:45
I have filenames from various digital cameras:  DSC_1234.jpg,
CRW1234.jpg, etc.  What I really want is the numeric portion of that
filename.  How would I extract just that portion?


I expect it to involve the regex /\d+/, but I'm unclear how to extract a
portion of a string matching a regex.

Thank you
Dan Zwell (Guest)
on 2007-06-12 09:21
(Received via mailing list)
Matt Jones wrote:
> I have filenames from various digital cameras:  DSC_1234.jpg,
> CRW1234.jpg, etc.  What I really want is the numeric portion of that
> filename.  How would I extract just that portion?
>
>
> I expect it to involve the regex /\d+/, but I'm unclear how to extract a
> portion of a string matching a regex.
>
> Thank you
>

This may be the simplest (and arguably the most ruby-esque):
str = "DSC_1234.jpg"
num = str.scan(/\d+/)[0]

Other ways to do it:
num = str.match(/\d+/)[0]

OR
num = (/\d+/).match(str)[0]

OR
num = nil
str.scan(/\d+/) {|match| num ||= match}  # with a block, scan returns str itself, so capture inside the block

OR
num = str =~ /(\d+)/ ? $1 : nil

That is,
num = if str =~ /(\d+)/
   $1
else
   nil
end

OR
if str =~ /\d+/
   num = $~[0]
end
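
The variants above behave differently when the string contains no digits at all. A minimal sketch of that case, assuming a hypothetical helper name first_number (it is not from the thread):

```ruby
# Hypothetical helper (the name is illustrative, not from the thread):
# returns the first run of digits as a String, or nil if there is none.
def first_number(str)
  match = str.match(/\d+/)   # MatchData, or nil when nothing matches
  match ? match[0] : nil
end

p first_number("DSC_1234.jpg")   # "1234"
p first_number("cover.jpg")      # nil

# Indexing the result of match directly fails on a miss, because nil has no []:
begin
  "cover.jpg".match(/\d+/)[0]
rescue NoMethodError => e
  puts "no digits: #{e.class}"
end

# scan returns an empty array on a miss, so [0] is simply nil:
p "cover.jpg".scan(/\d+/)[0]     # nil
```

So the scan and =~ forms degrade to nil, while the two match(...)[0] forms need a guard if the input might not contain a number.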

Some proponents of Ruby have said that Perl's "There is more than one
way to do it" is a curse, but the same is true of Ruby. However, it
seems to me that most people learn reasonable idioms and common sense
prevails.

Dan
Michael W. Ryder (Guest)
on 2007-06-12 09:25
(Received via mailing list)
Matt Jones wrote:
> I have filenames from various digital cameras:  DSC_1234.jpg,
> CRW1234.jpg, etc.  What I really want is the numeric portion of that
> filename.  How would I extract just that portion?
>
>
> I expect it to involve the regex /\d+/, but I'm unclear how to extract a
> portion of a string matching a regex.
>
> Thank you
>
a = "DSC_1234.jpg"
b = a.gsub(/[^[:digit:]]/, '')
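
Note that deleting every non-digit keeps the digits from all runs, not just the first one. A short sketch of the difference, using the "100_5142.jpg" style of name that comes up later in this thread (String#delete is shown only as a possible alternative for ASCII names like these):

```ruby
a = "DSC_1234.jpg"
b = "100_5142.jpg"   # two separate runs of digits

# Deleting every non-digit keeps all digits, from every run:
p a.gsub(/[^[:digit:]]/, '')   # "1234"
p b.gsub(/[^[:digit:]]/, '')   # "1005142"

# String#delete with a negated character set gives the same result here:
p b.delete('^0-9')             # "1005142"

# Taking only the first run gives a different answer for b:
p b[/\d+/]                     # "100"
```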
come (Guest)
on 2007-06-12 09:35
(Received via mailing list)
If you just want to extract one number from a string, you could write
something like this. Given

a = "DSC_1234.jpg"

a[/\d+/] returns the first run of consecutive digits, here "1234".

If you want to be more precise, you could use parentheses to capture
the exact portion you want:

a[/DSC_(\d+)\.jpg/, 1]   (equivalent to a.match(/DSC_(\d+)\.jpg/)[1] when the pattern matches)

or even, anchored to the whole string: a[/\ADSC_(\d+)\.jpg\Z/, 1]
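
A small sketch of how these slices behave, including one detail of the anchored form: \Z tolerates a single trailing newline, while \z (an alternative not used above) does not.

```ruby
a = "DSC_1234.jpg"

p a[/\d+/]                               # "1234"
p a[/DSC_(\d+)\.jpg/, 1]                 # "1234"
p "CRW1234.jpg"[/DSC_(\d+)\.jpg/, 1]     # nil: the pattern does not match

# \Z matches at the end of the string or just before a final newline;
# \z matches only at the very end.
p "DSC_1234.jpg\n"[/\ADSC_(\d+)\.jpg\Z/, 1]   # "1234"
p "DSC_1234.jpg\n"[/\ADSC_(\d+)\.jpg\z/, 1]   # nil
```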
Bas van Gils (Guest)
on 2007-06-12 09:56
(Received via mailing list)
On Tue, Jun 12, 2007 at 03:45:04PM +0900, Matt Jones wrote:
> I have filenames from various digital cameras:  DSC_1234.jpg,
> CRW1234.jpg, etc.  What I really want is the numeric portion of that
> filename.  How would I extract just that portion?

Some solutions have been posted already, but here's mine:

  irb(main):001:0> s="DSC_1234.jpg"
  => "DSC_1234.jpg"
  irb(main):002:0> s.sub(/\D+(\d+).*/,'\1')
  => "1234"

Basically, the regexp looks for:

  - one or more non-digits
  - one or more digits => because this is between parentheses, you can
    refer to it with \1 later on
  - anything that follows (possibly nothing)

The digits (safely stored in \1) are all you want to keep. This assumes
you are only interested in the first sequence of digits.
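
A sketch of how this substitution behaves on a few other names. The anchored variant at the end is an assumption of mine for comparison, not part of the solution above.

```ruby
pattern = /\D+(\d+).*/

p "DSC_1234.jpg".sub(pattern, '\1')   # "1234"
p "cover.jpg".sub(pattern, '\1')      # "cover.jpg" (no match, string unchanged)
p "1234.jpg".sub(pattern, '\1')       # "1234.jpg" (no non-digit before the digits)
p "100_5142.jpg".sub(pattern, '\1')   # "1005142" (match starts at "_", "100" stays)

# Assumed variant, not from the thread: anchor at the start of the string
# and make the leading non-digits optional.
anchored = /\A\D*(\d+).*/
p "1234.jpg".sub(anchored, '\1')      # "1234"
p "100_5142.jpg".sub(anchored, '\1')  # "100"
```

Because sub returns the input unchanged when nothing matches, a caller cannot tell a miss from a hit by checking for nil.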

Cheers

  Bas

Robert Klemme (Guest)
on 2007-06-12 13:05
(Received via mailing list)
On 12.06.2007 09:32, come wrote:
>
> a[/DSC_(\d+)\.jpg/,1] (<=> a.match(/DSC_(\d+)\.jpg/)[1])
>
> or even : a[/\ADSC_(\d+)\.jpg\Z/,1]

Or even simpler

irb(main):001:0> "DSC_1234.jpg"[/\d+/]
=> "1234"
irb(main):002:0> Integer("DSC_1234.jpg"[/\d+/])
=> 1234
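
One caveat with Integer() on camera counters, which are often zero-padded: a leading zero makes Integer() read the string as octal. A minimal sketch, with made-up sample strings:

```ruby
p Integer("1234")        # 1234
p Integer("0123")        # 83: a leading zero means octal to Integer()
p Integer("0123", 10)    # 123: explicit base 10
p "0123".to_i            # 123: to_i defaults to base 10

begin
  Integer("0089")        # 8 and 9 are not octal digits
rescue ArgumentError => e
  puts "rejected: #{e.message}"
end

begin
  Integer("cover.jpg"[/\d+/])   # the slice is nil when there are no digits
rescue TypeError => e
  puts "rejected: #{e.message}"
end
```

Passing the base explicitly, or using to_i, avoids the octal surprise. Integer() remains the stricter choice if a missing number should be an error rather than a silent 0.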

Kind regards

  robert
Rob Biedenharn (Guest)
on 2007-06-12 16:35
(Received via mailing list)
On Jun 12, 2007, at 2:45 AM, Matt Jones wrote:

> I have filenames from various digital cameras:  DSC_1234.jpg,
> CRW1234.jpg, etc.  What I really want is the numeric portion of that
> filename.  How would I extract just that portion?
>
>
> I expect it to involve the regex /\d+/, but I'm unclear how to
> extract a
> portion of a string matching a regex.
>
> Thank you

Last November (2006), there was a series of postings to the Columbus
Ruby Brigade list beginning with:
http://groups.google.com/group/columbusrb/browse_frm/thread/9c2e682f9926bad0

This was the pattern that I used when responding to Bill's code
because many of *my* pictures had names like "100_5142.jpg",
"100_5143.jpg", etc.

NUMBERED_FILE_PATTERN = %r{^(.*\D)?(\d+)(.+)$}

It became a constant since I used it in three places.
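
A sketch of what the three groups capture for the filenames mentioned in this thread. The loop and the variable names are illustrative only; they are not the three places the pattern was actually used.

```ruby
NUMBERED_FILE_PATTERN = %r{^(.*\D)?(\d+)(.+)$}

%w[DSC_1234.jpg CRW1234.jpg 100_5142.jpg 1234.jpg].each do |name|
  if (m = NUMBERED_FILE_PATTERN.match(name))
    # prefix is nil when the name starts directly with the number
    prefix, number, suffix = m.captures
    puts "#{name}: prefix=#{prefix.inspect} number=#{number} suffix=#{suffix}"
  end
end
```

Because the optional prefix group is greedy and must end in a non-digit, "100_5142.jpg" yields the prefix "100_" and the number "5142", that is, the last run of digits before the extension rather than the first run in the name.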

Rob Biedenharn    http://agileconsultingllc.com
Rob@AgileConsultingLLC.com
Matt Jones (blahblah)
on 2007-06-16 05:23
A big thanks to everybody and all the creative solutions!