Forum: Ruby Scanning a string for decimal numbers

Jeppe Jakobsen (Guest)
on 2006-02-04 23:20
(Received via mailing list)
Hi all, how do I scan a string without getting my decimal numbers
split into two numbers?

Example:

a = "24,4 + 55,2"
a = a.scan(/\d+/)
puts a

my output for a will be:
24
4
55
2

But I want to keep my decimal numbers intact like this:
24,4
55,2


How do I solve this problem without putting the numbers into separate
strings?
Kent Sibilev (Guest)
on 2006-02-04 23:38
(Received via mailing list)
"24,4 + 55,2".scan /[\d,]+/

Kent
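
A quick check of the character-class approach, as a sketch only. The second input string is made up for illustration and is not from the thread; it shows that [\d,]+ treats any run of digits and commas as one token, so stray commas come along too.

```ruby
# Character-class approach: any run of digits and commas counts as one token.
a = "24,4 + 55,2"
tokens = a.scan(/[\d,]+/)
p tokens          # => ["24,4", "55,2"]

# Hypothetical input (an assumption for this example): list separators and
# trailing commas are swallowed as well.
loose = "1,2,3 and 7,"
loose_tokens = loose.scan(/[\d,]+/)
p loose_tokens    # => ["1,2,3", "7,"]
```

So this works for the example string, but it is only safe if commas never appear in the input for any other reason.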
Ernest Ellingson (Guest)
on 2006-02-05 00:56
(Received via mailing list)
Kent Sibilev wrote:
>> a = "24,4 + 55,2"
>> 24,4
>> 55,2
>>
>>
>> How do I solve this problem without putting the numbers into seperate
>> strings?
>>
>>
>
>
try
a.scan(/\d+,\d+/)
Ernie
unknown (Guest)
on 2006-02-05 03:48
(Received via mailing list)
On Sun, 5 Feb 2006, Ernest Ellingson wrote:

>>>
>>> But I want to keep my decimal numbers intact like this:
> try
> a.scan(/\d+,\d+/)
> Ernie

careful.  you'll kill negatives.

-a
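
To make the warning concrete, here is a small sketch with a made-up input string (an assumption, not from the thread). Besides losing the sign, the strict \d+,\d+ pattern also skips plain integers, because it insists on a comma.

```ruby
# Hypothetical input with a negative number and a plain integer.
a = "-24,4 + 55 - 3,25"

# The pattern requires digits, a comma, then digits.
strict = a.scan(/\d+,\d+/)
p strict   # => ["24,4", "3,25"]  (sign dropped, 55 missing)
```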
Antonio Cangiano (Guest)
on 2006-02-05 12:48
(Received via mailing list)
Jeppe Jakobsen wrote:
> 24
> strings?
Hi Jeppe,
you can use the following:

a.scan /[-+]?[0-9]*\,?[0-9]+/

which will take care of negative numbers as well.

HTH,
Antonio
Antonio Cangiano (Guest)
on 2006-02-05 13:04
(Received via mailing list)
> a.scan /[-+]?[0-9]*\,?[0-9]+/

Or to be less verbose you can use \d in place of [0-9] :-)

HTH
Antonio
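
For reference, a sketch of the \d variant run against made-up inputs (both strings are assumptions for illustration). It keeps a sign that is directly attached to the number and accepts integers, but because the leading digits are optional it also accepts a number with no integer part.

```ruby
# Same pattern as above, with \d in place of [0-9].
re = /[-+]?\d*,?\d+/

a = "-24,4 + 55 - 3,25"
signed = a.scan(re)
p signed        # => ["-24,4", "55", "3,25"]

# Leading digits are optional, so a bare ",5" is matched too.
no_leading = ",5".scan(re)
p no_leading    # => [",5"]
```

Note that the "-" in "- 3,25" is separated from the number by a space, so it is not picked up as a sign.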
Xavier Noria (Guest)
on 2006-02-05 13:16
(Received via mailing list)
On Feb 5, 2006, at 13:03, Antonio Cangiano wrote:

>> a.scan /[-+]?[0-9]*\,?[0-9]+/
>
> Or to be less verbose you can use \d in place of [0-9] :-)

If you have Perl installed, the incantation

     perldoc -q float

gives some regexps for numbers.

-- fxn
Josef 'Jupp' SCHUGT (Guest)
on 2006-02-11 23:43
(Received via mailing list)
Hi!

At Sun, 05 Feb 2006 11:43:52 +0000, Antonio Cangiano wrote:
> a.scan /[-+]?[0-9]*\,?[0-9]+/

Shouldn't that rather be the following?

a.scan /[-+]?([1-9]\d*(\,[0-9]+)?)|(0(\,[0-9]+)?)/

Josef 'Jupp' Schugt
Jeppe Jakobsen (Guest)
on 2006-02-12 00:12
(Received via mailing list)
Will that expression include both integers and decimal numbers?
Wilson Bilkovich (Guest)
on 2006-02-12 00:29
(Received via mailing list)
On 2/4/06, Jeppe Jakobsen <jeppe88@gmail.com> wrote:
> 24
> strings?
>
This should handle periods or commas as the separator.

a = "24,4 + 55,2 + 55 - 44,0"
   => "24,4 + 55,2 + 55 - 44,0"
a.scan /(\d+,?.?\d*)(?=\s|$)/
   => [["24,4"], ["55,2"], ["55"], ["44,0"]]
Jeppe Jakobsen (Guest)
on 2006-02-12 00:49
(Received via mailing list)
Nice, that was the thing I was looking for :)

2006/2/12, Wilson Bilkovich <wilsonb@gmail.com>:
Alexis Reigel (Guest)
on 2006-02-12 01:41
(Received via mailing list)
>
> This should handle periods or commas as the separator.
>
> a = "24,4 + 55,2 + 55 - 44,0"
>    => "24,4 + 55,2 + 55 - 44,0"
> a.scan /(\d+,?.?\d*)(?=\s|$)/
>    => [["24,4"], ["55,2"], ["55"], ["44,0"]]
>

Some problems here:
- signs are disregarded ("-24,4" becomes "24,4")
- Invalid numbers are accepted, e.g. "24,.4" "24,." "24." "24,"
- "." should be escaped. As you used it here, it means "any character"
(except newline), so many invalid numbers are accepted (e.g. "24w"...)
- If something other than whitespace follows the number, the number is
either not matched or matched incorrectly, e.g. "24.4." becomes "4."
instead of "24.4"
- ...


Alexis.
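
The points above can be reproduced with a short sketch. It runs the pattern under discussion against four of the problem strings mentioned in the list and flattens the captures for readability; the variable names are only for this example.

```ruby
# The pattern under discussion: unescaped "." and a whitespace/end lookahead.
loose = /(\d+,?.?\d*)(?=\s|$)/

inputs = ["-24,4", "24,.", "24w", "24.4."]
results = inputs.map { |s| s.scan(loose).flatten }
p results   # => [["24,4"], ["24,."], ["24w"], ["4."]]
```

In order: the sign is lost, "24,." and "24w" are accepted as numbers, and "24.4." yields only the trailing "4.".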
Jeppe Jakobsen (Guest)
on 2006-02-12 02:32
(Received via mailing list)
2006/2/12, Alexis Reigel <mail@koffeinfrei.org>:
> Some problems here:
>
>
>

Let me see if I got it right then. I'd like to use periods only for my
decimal numbers. I also need normal integers, so 24. being accepted won't
matter. Will this fix the problems you presented?
/[-+]?(\d+\.?\d*)(?=\s|$)/


I don't know if it takes care of the last problem, because I didn't
understand it.
Jeppe Jakobsen (Guest)
on 2006-02-12 02:40
(Received via mailing list)
Seems I accidentally got my text marked as a quote in my last mail, so
I'll just send it again:

Let me see if I got it right then. I'd like to use periods only for my
decimal numbers. I also need normal integers, so 24. being accepted won't
matter. Will this fix the problems you presented?
/[-+]?(\d+\.?\d*)(?=\s|$)/


I don't know if it takes care of the last problem, because I didn't
understand it.


2006/2/12, Jeppe Jakobsen <jeppe88@gmail.com>:
Wilson Bilkovich (Guest)
on 2006-02-12 04:28
(Received via mailing list)
Well, that's what I get for dashing off a quick e-mail before dinner.
The last problem Alexis mentioned is caused by the overly-specific
lookahead at the end.  Here's a version that fixes that:

irb(main):013:0> a = '24.5 + 24 + 24. + 24.4.'
=> "24.5 + 24 + 24. + 24.4."
irb(main):014:0> a.scan /[-+]?(\d+(?:\.\d+)?)(?=[^\d])/
=> [["24.5"], ["24"], ["24"], ["24.4"]]
irb(main):015:0>

One of the characters '-' or '+', optionally
Followed by at least one digit.
Followed by an optional group containing a period, and one or more
digits.
The capturing group ends when the next character is something other
than a digit.

The (?:) mess is there so that '24.' doesn't end up with the period on
the end.
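
A side note on the groups, as a sketch with a made-up input (an assumption). When a pattern contains capturing groups, String#scan returns the captures rather than the whole match, so the optional sign, which sits outside the parentheses, is not returned. The (?:...) group is non-capturing; the trailing period is left out because \.\d+ requires at least one digit after the period.

```ruby
a = "-24.5 + 24."

# With a capturing group: scan returns the captures, and the sign is outside them.
with_group = a.scan(/[-+]?(\d+(?:\.\d+)?)(?=[^\d])/)
p with_group      # => [["24.5"], ["24"]]

# Without any capturing group: scan returns whole matches, sign included.
without_group = a.scan(/[-+]?\d+(?:\.\d+)?(?=[^\d])/)
p without_group   # => ["-24.5", "24"]
```

If negative numbers matter, dropping the capturing parentheses (or moving the sign inside them) is one way to keep the sign.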
Jeppe Jakobsen (Guest)
on 2006-02-12 21:41
(Received via mailing list)
Yes, that worked, but I intend to convert the digits of my array to floats,
and I get a NoMethodError on to_f now when I do this:

digits[0] = digits[0].to_f

I don't understand that :-/


2006/2/12, Wilson Bilkovich <wilsonb@gmail.com>:
Wilson Bilkovich (Guest)
on 2006-02-12 22:53
(Received via mailing list)
The scan process returns an array of arrays, so:
digits[0] is an Array containing '24.4'.
You could do:
digits.flatten!
just before digits[0], and get what you expect.
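
A minimal sketch of that fix, reusing the pattern from the earlier irb session on a shortened input. The non-destructive flatten plus map shown here is just one way to do it; the variable names are for illustration.

```ruby
digits = "24.5 + 24 + 24.4.".scan(/[-+]?(\d+(?:\.\d+)?)(?=[^\d])/)
p digits                        # => [["24.5"], ["24"], ["24.4"]]

# Each element is an Array, and Array has no to_f, hence the NoMethodError.
p digits[0].respond_to?(:to_f)  # => false

# Flatten to an array of strings, then convert each string.
floats = digits.flatten.map { |s| s.to_f }
p floats                        # => [24.5, 24.0, 24.4]
```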
Jeppe Jakobsen (Guest)
on 2006-02-12 22:59
(Received via mailing list)
OK, but wouldn't this regex do the same for me?

/[-+]?\d+\.?\d+/

Except that it would return a flat array containing my numbers?

2006/2/12, Wilson Bilkovich <wilsonb@gmail.com>:
Wilson Bilkovich (Guest)
on 2006-02-12 23:15
(Received via mailing list)
Yes, as long as the numbers are always at least two digits.
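
The two-digit caveat shown on a made-up input (an assumption for illustration): the pattern needs one digit for the first \d+ and another for the second, so a lone single-digit integer is silently skipped.

```ruby
re = /[-+]?\d+\.?\d+/

found = "24.5 + 7 - 3.5 + 12".scan(re)
p found   # => ["24.5", "3.5", "12"]  (the single-digit 7 is missing)
```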
Josef 'Jupp' SCHUGT (Guest)
on 2006-02-13 23:28
(Received via mailing list)
Hi!

At Sun, 12 Feb 2006 08:12:47 +0900, Jeppe Jakobsen wrote:
>
> Will that expression include both integers and decimal numbers?

[-+]?([1-9]\d*(\,[0-9]+)?)|(0(\,[0-9]+)?)

has two parts:

[-+]?
([1-9]\d*(\,[0-9]+)?)|(0(\,[0-9]+)?)


The first one is an optional sign. The second one is an alternative
between two cases:

[1-9]\d*(\,[0-9]+)?
0(\,[0-9]+)?

Let's first consider the first case

[1-9]\d*(\,[0-9]+)?

It has two parts, namely

[1-9]\d*
(\,[0-9]+)?

The first part by itself covers all integers larger than zero. The
overall expression additionally covers all floating point numbers
equal to or larger than 1.

Now the second case

0(\,[0-9]+)?

This one covers zero and all decimal numbers larger than 0 and smaller
than 1.

The regex I provided intentionally supports none of

[+-],\d+
[+-]0+\d+

You may as well use the shorter version

[-+]?(([1-9]\d*(\,\d+)?)|(0(\,\d+)?))

Wait a moment, I am not sure if that is correct. To be on the safe
side I'd rather use one of these where anything that follows the
optional sign has been put into another pair of parentheses:

[-+]?(([1-9]\d*(\,[0-9]+)?)|(0(\,[0-9]+)?))
[-+]?((([1-9]\d*(\,\d+)?)|(0(\,\d+)?)))

I am one of those guys who sometimes run out of placeholders when doing
search and replace in vim (which has nine of them).

Josef 'Jupp' Schugt
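
The doubt about the parentheses comes down to alternation precedence: | binds more loosely than everything else, so without an outer group the optional sign belongs only to the first alternative. A small sketch follows; the input strings and the non-capturing rewrite of the grouped pattern are assumptions for illustration, not taken from the posts above.

```ruby
# Without an outer group: "[-+]?" is part of the first alternative only.
unparenthesized = /[-+]?([1-9]\d*(\,[0-9]+)?)|(0(\,[0-9]+)?)/

# With an outer group (non-capturing here, so scan returns whole matches).
grouped = /[-+]?(?:[1-9]\d*(?:,\d+)?|0(?:,\d+)?)/

first_plain   = "-0,5"[unparenthesized]
first_grouped = "-0,5"[grouped]
p first_plain     # => "0,5"   (sign lost on the zero branch)
p first_grouped   # => "-0,5"

all_grouped = "-0,5 + 12,5 - 7".scan(grouped)
p all_grouped     # => ["-0,5", "12,5", "7"]
```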
Jeppe Jakobsen (Guest)
on 2006-02-14 19:09
(Received via mailing list)
Thank you for clearing things up for me, but could you explain what the
last part of the expression Wilson provided me with means?

it's (?=[^\d])


2006/2/13, Josef 'Jupp' SCHUGT <jupp@gmx.de>:
David Vallner (Guest)
on 2006-02-14 20:46
(Received via mailing list)
On Tuesday 14 February 2006 19:07 Jeppe Jakobsen wrote:
> Thank you for clearing things, up for me, but could you explain what the
> last part of the expression Wilson provided me with means?
>
> it's (?=[^\d])
>

That's a positive zero-width lookahead. I think. Gotta love regexspeak.

In English: look for a single character that's not a decimal digit, and
don't include it in the match.

David Vallner
Robert Klemme (Guest)
on 2006-02-15 10:28
(Received via mailing list)
David Vallner wrote:
> On Tuesday 14 February 2006 19:07 Jeppe Jakobsen wrote:
>> Thank you for clearing things, up for me, but could you explain what
>> the last part of the expression Wilson provided me with means?
>>
>> it's (?=[^\d])

This is equivalent to (?=\D)

A negative lookahead might work, too: (?!\d)

> That's a positive zero-width lookahead. I think. Gotta love
> regexspeak.
>
> In English: look for a single character that's not a decimal digit,
> and don't include it in the match.

I'd go with this quite simple regexp

/[-+]?\d+(?:,\d+)?/

If numbers like "1," should be detected, too, then just change the "+" in
the last group to "*".

If one wants to avoid matching numbers with leading zeros, then it
becomes more complicated, but it does not seem to be worth the effort in
this case.

Kind regards

    robert
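
Putting the pieces together, a sketch under stated assumptions: the input strings are made up, and converting the comma to a period with tr before to_f is one possible way to get floats, not something proposed in the thread. The last two lines show where the two lookaheads differ: (?=\D) needs a following character, so a number at the very end of the string is missed, while (?!\d) also succeeds at the end.

```ruby
a = "-24,4 + 55,2 + 7"

# No capturing groups, so scan returns a flat array of whole matches.
numbers = a.scan(/[-+]?\d+(?:,\d+)?/)
p numbers   # => ["-24,4", "55,2", "7"]

# to_f expects a period as the decimal separator, so swap the comma first.
floats = numbers.map { |s| s.tr(",", ".").to_f }
p floats    # => [-24.4, 55.2, 7.0]

# Positive vs. negative lookahead at the end of the string.
positive = "24 + 7".scan(/\d+(?=\D)/)
negative = "24 + 7".scan(/\d+(?!\d)/)
p positive  # => ["24"]
p negative  # => ["24", "7"]
```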