Regexp help

Hello everyone,

I have a string of the form

2h 3m

or

3m 2h

or

2h 3minutes

or

2hour 3min

and so on

Is there a smart regexp one-liner that could produce

[2, 3]

for any of the above inputs? I know that there will be an m or an h.

If anyone types just, for example,

2

then that should produce [2].

/Marcus

Hello

[2, 3]
If you want to get [2,3] in both cases, that will be really difficult.
As far as I know, you can only do that in C#, which has named capturing
groups. In all the other languages I know, the capturing groups are
numbered when they are found… That rules it out.

By the way, would it be difficult to implement named capturing groups
in regular expressions? Would that interest anyone?

Cheers!

Vince

Not so difficult, but it’s not, as far as I can see, a
one liner. I am working something up at the moment
using an array of regexps.

-- Vincent F.

Ah, neat, Jordan, and more elegant than parsing an
array of regexps :)

Marcus B. wrote:

Is there a smart regexp one liner that could produce

[2, 3]

r = Regexp.new(/(\d+)h.*(\d+)m/)
s1 = "2h 3m"
s2 = "2h 3minutes"
s3 = "2hour 3min"
m = r.match(s1)
p [m[1].to_i, m[2].to_i] # => [2, 3]
m = r.match(s2)
p [m[1].to_i, m[2].to_i] # => [2, 3]
m = r.match(s3)
p [m[1].to_i, m[2].to_i] # => [2, 3]

Regards,
Jordan
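
One thing worth checking with this pattern: `.*` is greedy, so with a multi-digit minute value it can swallow all but the last digit. The sketch below assumes an input such as "2h 30m", which is not one of the examples in the original question, and compares the pattern above with a variant that uses `\D*` (non-digits only) between the two numbers.

```ruby
greedy = /(\d+)h.*(\d+)m/
tight  = /(\d+)h\D*(\d+)m/   # \D* cannot consume digits

s = "2h 30m"
p greedy.match(s).captures.map(&:to_i)  # => [2, 0]
p tight.match(s).captures.map(&:to_i)   # => [2, 30]
```

The `\D*` variant still matches "2hour 3min", because "our " contains no digits.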

Hi,

2h 3m

2

then that should produce [2]

for any of the above input? I know that there will be an m or an h.

/Marcus

str = "2h 3m" # or something
str.scan(/(\d+)(\w*)/).sort_by{|x|x[1]}.collect{|x|x[0].to_i}

Regards,

Park H.

But, of course, that won’t capture “3m 2h”, like you described…

On 9/29/06, Park H. wrote:

str = "2h 3m" # or something
str.scan(/(\d+)(\w*)/).sort_by{|x|x[1]}.collect{|x|x[0].to_i}

Regards,

Park H.

Nice one, thanks a lot!

/Marcus

Tom A. wrote:

But, of course, that won’t capture “3m 2h”, like you described…

True…

So:

r = Regexp.new(/(\d+)h?m?.*(\d+)m?h?/)

'Course, then you’ll have [3, 2] for the edge case rather than [2,
3]…but to get the full functionality that the OP described (including
the case where just “2” is given), you’d need fancier logic than just
regexp anyhow.

Regards,
Jordan
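
As a rough sketch of the "fancier logic" mentioned above, the method below looks for the hour and minute values independently, so the order does not matter, and falls back to a bare number when no unit is given. The method name `parse_duration` is hypothetical, and treating a lone number as the only result is an assumption about what the original poster wanted.

```ruby
# Hypothetical helper, not from the thread: returns the hour value first,
# then the minute value, leaving out whichever is absent.
def parse_duration(str)
  hours   = str[/(\d+)\s*h/i, 1]   # digits directly followed by an "h..."
  minutes = str[/(\d+)\s*m/i, 1]   # digits directly followed by an "m..."
  # Assumption: a string that is only a number, such as "2", is returned as-is.
  bare    = str[/\A\s*(\d+)\s*\z/, 1]
  [hours, minutes, bare].compact.map(&:to_i)
end

p parse_duration("2h 3m")       # => [2, 3]
p parse_duration("3m 2h")       # => [2, 3]
p parse_duration("2hour 3min")  # => [2, 3]
p parse_duration("2")           # => [2]
```

Running two small patterns is less compact than a single regexp, but each one stays easy to read and neither depends on where the other unit appears.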

Vincent F. wrote:

Is there a smart regexp one liner that could produce
re = Regexp.new(/(\d+)h.*(\d+)m|(\d+)m.*(\d+)h/)

Vince

And the one-liner:

$ irb

"3m 2h".scan(/(\d+)h.*(\d+)m|(\d+)m.*(\d+)h/).flatten.values_at(0,1,3,2).compact
=> ["2", "3"]

"2h 3m".scan(/(\d+)h.*(\d+)m|(\d+)m.*(\d+)h/).flatten.values_at(0,1,3,2).compact
=> ["2", "3"]

It’s possible to add .map { |i| i.to_i } at the end of this one-liner if
the result array must contain integers instead of strings.

Hello again!

[2, 3]

If you want to get [2,3] in both cases, that will be really difficult.
As far as I know, you can only do that in C#, which has named capturing
groups. In all the other languages I know, the capturing groups are
numbered when they are found… That rules it out.

Well, just to contradict myself, although this is no one-liner:

def scan(str)
  re = Regexp.new(/(\d+)h.*(\d+)m|(\d+)m.*(\d+)h/)
  if m = re.match(str)
    # Groups 1 and 2 are set for "hours first"; otherwise use 4 (hours) and 3 (minutes).
    return [m[1], m[2]] if m[1]
    return [m[4], m[3]]
  end
end

p scan("2h 3m")
p scan("3m 2h")

Cheers!

Vince

And the one-liner:

$ irb

"3m 2h".scan(/(\d+)h.*(\d+)m|(\d+)m.*(\d+)h/).flatten.values_at(0,1,3,2).compact

That’s a nice one!

Vince

I have a string of the form
[…]

Is there a smart regexp one liner that could produce

Hello Marcus,

here’s my take on it:

times = %w{ 2hour3min 2h3minutes 3m2h 2h3m }
=> ["2hour3min", "2h3minutes", "3m2h", "2h3m"]

times.map{ |t| [t[/\d+h[a-z]*/].to_i, t[/\d+m[a-z]*/].to_i] }
=> [[2, 3], [2, 3], [2, 3], [2, 3]]

Probably a little slower than the other solutions but perhaps easier to
grasp.

Regards
Matthias

Park H. wrote:

str = "2h 3m" # or something
str.scan(/(\d+)(\w*)/).sort_by{|x|x[1]}.collect{|x|x[0].to_i}

Very nice idea, Park! I wouldn’t have thought of that. Slightly shorter:

str.scan(/(\d+)(\w)/).sort_by{|n,u|u}.map{|n,u|n.to_i}

Regards,
Pit
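
Both versions rely on "h" sorting before "m" alphabetically, which is what puts the hours first regardless of input order. A quick comparison, written here as a sketch with lambdas named `park` and `pit` for convenience, suggests the two differ on a bare number: `(\w*)` lets the unit be empty, while `(\w)` requires one character after the digits.

```ruby
inputs = ["2h 3m", "3m 2h", "2hour 3min", "2"]

# Park's version: the unit is optional (\w*), so a bare number still matches.
park = ->(s) { s.scan(/(\d+)(\w*)/).sort_by { |x| x[1] }.collect { |x| x[0].to_i } }
# Pit's version: exactly one unit character (\w) must follow the digits.
pit  = ->(s) { s.scan(/(\d+)(\w)/).sort_by { |n, u| u }.map { |n, u| n.to_i } }

inputs.each { |s| p [s, park.call(s), pit.call(s)] }
# ["2h 3m", [2, 3], [2, 3]]
# ["3m 2h", [2, 3], [2, 3]]
# ["2hour 3min", [2, 3], [2, 3]]
# ["2", [2], []]
```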

On Fri, 29 Sep 2006, Vincent F. wrote:

Is there a smart regexp one liner that could produce

[2, 3]

If you want to get [2,3] in both cases, that will be really difficult.
As far as I know, you can only do that in C#, which has named capturing
groups. In all the other languages I know, the capturing groups are
numbered when they are found… That rules it out.

irb> a
=> ["2h 3m", "3m 2h", "2h 3minutes", "2hour 3min", "2"]
irb> re
=> /(?=.*\b(\d+)(?=h|\b))(?=.*\b(\d+)m|)/
irb> a.map {|x| x.match(re).captures}
=> [["2", "3"], ["2", "3"], ["2", "3"], ["2", "3"], ["2", nil]]
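
To make that pattern easier to follow, here is the same idea written with the `x` flag so that whitespace and comments are allowed inside the literal. It assumes each lookahead starts with `.*`, which is what the output shown above implies. Because both parts are lookaheads, each one scans the whole string from the start, which is why the order of the units does not matter.

```ruby
re = /
  (?= .* \b (\d+) (?= h | \b ) )  # look ahead for a number followed by h, or standing alone
  (?= .* \b (\d+) m | )           # look ahead for a number followed by m, else match nothing
/x

inputs  = ["2h 3m", "3m 2h", "2h 3minutes", "2hour 3min", "2"]
results = inputs.map { |s| s.match(re).captures }
p results
# => [["2", "3"], ["2", "3"], ["2", "3"], ["2", "3"], ["2", nil]]
```

Note that a string with minutes only, such as "3m", makes the first lookahead fail, so `match` returns nil and calling `captures` on it would raise.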

On Sep 30, 2006, at 2:55 AM, Relm wrote:

As far as I know, you can only do that in C#, which has named
capturing
groups. In all the other languages I know, the capturing groups are
numbered when they are found… That rules it out.

Python regexps have named capturing groups. They are extremely helpful
if you need to construct complicated patterns, because the index of
each capturing group can easily change when you add and remove
things in the regexp.

Tom
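
For what it is worth, Ruby from version 1.9 onward also supports named groups with the `(?<name>...)` syntax. A minimal sketch, in which the group names `hours` and `minutes` and the variable `named_re` are illustrative choices rather than anything from the thread:

```ruby
# Each lookahead searches the whole string, so either unit may come first.
named_re = /(?=.*?(?<hours>\d+)\s*h)(?=.*?(?<minutes>\d+)\s*m)/

["2h 3m", "3m 2h", "2hour 3min"].each do |s|
  m = named_re.match(s)
  p [m[:hours].to_i, m[:minutes].to_i]   # => [2, 3] each time
end
```

Looking the values up by name means the result no longer depends on the order in which the groups appear in the pattern. This sketch requires both units to be present; a bare "2" would not match.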
