Forum: Ruby regular expressions question

ako... (Guest)
on 2005-12-14 22:04
(Received via mailing list)
hello,

i need to capture all matches for a group. for example if

'ab c' =~ /^(.)*$/

i would like to get array [ 'a', 'b', ' ', 'c' ]

could not figure out how to do it in ruby. String#scan did not seem to
be the right thing. please help.

thanks
konstantin
James Gray (bbazzarrakk)
on 2005-12-14 22:17
(Received via mailing list)
On Dec 14, 2005, at 3:02 PM, ako... wrote:

> hello,
>
> i need to capture all matches for a group. for example if
>
> 'ab c' =~ /^(.)*$/
>
> i would like to get array [ 'a', 'b', ' ', 'c' ]
>
> could not figure out how to do it in ruby. String#scan did not seem to
> be the right thing. please help.

When using scan(), you need to remove the anchoring:

 >> "ab c".scan(/./)
=> ["a", "b", " ", "c"]

Hope that helps.

James Edward Gray II
Ross Bamford (Guest)
on 2005-12-14 22:35
(Received via mailing list)
On Wed, 14 Dec 2005 21:00:56 -0000, ako... <akonsu@gmail.com> wrote:

> i need to capture all matches for a group. for example if
>
> 'ab c' =~ /^(.)*$/
>
> i would like to get array [ 'a', 'b', ' ', 'c' ]
>

You could try:

irb(main):001:0> "ab c".split('')			# split on nothing
=> ["a", "b", " ", "c"]

irb(main):002:0> "ab c".split(//)			# same again
=> ["a", "b", " ", "c"]

irb(main):003:0> "ab c".scan(/./)			# scan on any single char
=> ["a", "b", " ", "c"]
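
One caveat on the scan(/./) variant, as a small sketch: in Ruby, "." does not match a newline unless the regexp carries the /m flag, so on multi-line input the three calls above are not interchangeable. The sample string is made up for illustration.

```ruby
text = "a\nb"   # hypothetical input containing a newline

without_m = text.scan(/./)    # "." skips the newline
with_m    = text.scan(/./m)   # /m lets "." match "\n" as well
by_split  = text.split(//)    # splitting on the empty pattern keeps every character

p without_m, with_m, by_split
```

If every character matters, split(//) or scan with /m is probably the safer choice.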
ako... (Guest)
on 2005-12-14 22:38
(Received via mailing list)
thank you. this was just an example. in general, is it possible to get
a collection of captures for a group without having to write custom
code?
ako... (Guest)
on 2005-12-14 23:05
(Received via mailing list)
thank you. the question is general.

if i wanted to parse a list of letters separated by spaces and commas:

'a , b,c' =~ /^(?:(\w)\s*,\s*)*(\w)$/

i need to get ['a','b'] in group 1 and ['c'] in group 2. yes, i know i
can split, then massage the result some more and get the final result.
is there a way to get to groups' captures after a regex match? like in
microsoft's .net?
Ross Bamford (Guest)
on 2005-12-14 23:05
(Received via mailing list)
On Wed, 14 Dec 2005 21:34:52 -0000, ako... <akonsu@gmail.com> wrote:

> thank you. this was just an example. in general, is it possible to get
> a collection of captures for a group without having to write custom
> code?
>

Have to admit I'm not exactly a regex wiz, but I imagine it can be done
somehow. I assume you mean having a repeated capturing group append to
an array any number of times?

But, I still think scan is a good tool for the job, it can do any regexp
anyway. I don't think a single regexp is really intended for doing
variable numbers of captures anyway (?).

irb(main):054:0> "ab c".scan(/\w|\s/)
=> ["a", "b", " ", "c"]

or

irb(main):052:0> "this is a test".scan(/\w+/)
=> ["this", "is", "a", "test"]

or even

irb(main):053:0> "this is a test".scan(/\w+|\s/)
=> ["this", " ", "is", " ", "a", " ", "test"]

Cheers,
Ross
James Gray (bbazzarrakk)
on 2005-12-14 23:23
(Received via mailing list)
On Dec 14, 2005, at 4:03 PM, ako... wrote:

> thank you. the question is general.
>
> if i wanted to parse a list of letters separated by spaces and commas:
>
> 'a , b,c' =~ /^(?:(\w)\s*,\s*)*(\w)$/
>
> i need to get ['a','b'] in group 1 and ['c'] in group 2. yes, i know i
> can split, then massage the result some more and get the final result.
> is there a way to get to groups' captures after a regex match? like in
> microsoft's .net?

Perl-style variables:

 >> "abc" =~ /(.)(.)(.)/
=> 0
 >> p [$1, $2, $3]
["a", "b", "c"]
=> nil

Or object oriented:

 >> md = "abc".match(/(.)(.)(.)/)
=> #<MatchData:0x325dc8>
 >> p [md[1], md[2], md[3]]
["a", "b", "c"]
=> nil

Hope that helps.

James Edward Gray II
Ross Bamford (Guest)
on 2005-12-15 01:00
(Received via mailing list)
On Wed, 14 Dec 2005 21:59:27 -0000, ako... <akonsu@gmail.com> wrote:

>
I don't really get what you mean. I don't understand the rules that got
a and b into one group and c into another. When you say it's a general
question, do you mean you just want access to the captures from some
regexp match?

irb(main):009:0> "a , b,c" =~ /(\w\s*?,\s*?\w)\s*?,\s*?(\w)/
=> 0
irb(main):010:0> $1
=> "a , b"
irb(main):011:0> $2
=> "c"
irb(main):012:0> $~[1]
=> "a , b"
irb(main):013:0> $~[2]
=> "c"
irb(main):014:0> md = /(\w\s*?,\s*?\w)\s*?,\s*?(\w)/.match("a, b,c")
=> #<MatchData:0xb7a47860>
irb(main):015:0> md[1]
=> "a, b"
irb(main):016:0> md.captures[1]
=> "c"
irb(main):017:0> $~.inspect
=> "#<MatchData:0xb7a47860>"

(and others...)

Hope that helps,
Ross
Jeff Wood (Guest)
on 2005-12-15 01:18
(Received via mailing list)
You should be able to tell who this message is meant for:

PLEASE stop sending out code that uses any of the perl ${x} variables ...

They are ugly and have no place in Ruby ... they are only provided to
make the transition of Perl people easier ...

Please teach people to use MatchData objects ...

my_regex = /(\w\s*?.\s*?\w)\s*?.\s*?(\w)/

matches = my_regex.match( "a , b,c" )

element 0 of the matches object will contain the complete matched string.

each element after that will map to one of the groups you defined ...

so:

matches[0] will be the whole string
"a , b,c"
matches[1] will be your first group
"a , b"
matches[2] will be your second group
"c"

... seriously, we're not helping people make cleaner code when we show
approval for the ugly/evil ${x} warts we've kept from Perl.

... show people the beauty and cleanliness of using an OOP solution ...

I hope you agree.

j.

On 12/14/05, Ross Bamford <rosco@roscopeco.remove.co.uk> wrote:
> > is there a way to get to groups' captures after a regex match? like in
> irb(main):010:0> $1
> => "a, b"
> --
> Ross Bamford - rosco@roscopeco.remove.co.uk
> "\e[1;31mL"
>
>


--
"Remember. Understand. Believe. Yield! -> http://ruby-lang.org"

Jeff Wood
William James (Guest)
on 2005-12-15 02:03
(Received via mailing list)
ako... wrote:
> thank you. the question is general.
>
> if i wanted to parse a list of letters separated by spaces and commas:
>
> 'a , b,c' =~ /^(?:(\w)\s*,\s*)*(\w)$/
>
> i need to get ['a','b'] in group 1 and ['c'] in group 2. yes, i know i
> can split, then massage the result some more and get the final result.
> is there a way to get to groups' captures after a regex match? like in
> microsoft's .net?

t = 'a , b,c'.split( /\s*,\s*/ )
group1 = t[0..-2]
group2 = t[-1,1]
Ross Bamford (Guest)
on 2005-12-15 02:09
(Received via mailing list)
On Thu, 15 Dec 2005 00:16:52 -0000, Jeff Wood <jeff.darklight@gmail.com>
wrote:

> You should be able to tell who this message is meant for:
>

Why not just address me directly?

> PLEASE stop sending out code that uses any of the perl ${x} variables ...
>

Well, okay. No need to shout though, is there?

Just trying to put a bit back, you know?
Bill Kelly (Guest)
on 2005-12-15 02:27
(Received via mailing list)
From: "Jeff Wood" <jeff.darklight@gmail.com>
>
> PLEASE stop sending out code that uses any of the perl ${x} variables ...
>
> They are ugly and have no place in Ruby ... they are only provided to
> make the transition of Perl people easier ...

Thankfully, this is Ruby, and not Python with its rigid
Only One Way mentality.

Myself, though I've been aware of MatchData for going on
five years now, I find I don't use it that often.  The
$1..$n variables are perfectly legible to me.  They have
a fine history too: not just Perl but awk, and Unix shell
programming . . .



Regards,

Bill
William James (Guest)
on 2005-12-15 02:30
(Received via mailing list)
Ross Bamford wrote:
>
> Well, okay. No need to shout though, is there?
>
> Just trying to put a bit back, you know?

Ross, don't pay too much attention to unreasonable fanatics.

The first edition of the Pickaxe says:

"Having said all this, we have to 'fess up.  Andy and Dave normally
use the $-variables rather than worrying about MatchData objects.
For everyday use, they just end up being more convenient.
Sometimes we just can't help being pragmatic."
Nicholas Van Weerdenburg (Guest)
on 2005-12-15 02:54
(Received via mailing list)
On 12/14/05, Jeff Wood <jeff.darklight@gmail.com> wrote:
> my_regex = /(\w\s*?.\s*?\w)\s*?.\s*?(\w)/
> "a , b,c"
> I hope you agree.
> > > 'a , b,c' =~ /^(?:(\w)\s*,\s*)*(\w)$/
> > question, do you mean you just want access to the captures from some
> > irb(main):013:0> $~[2]
> > (and others...)
>
> --
> "Remember. Understand. Believe. Yield! -> http://ruby-lang.org"
>
> Jeff Wood
>
>
Regular expressions are the only area where I still use Perl magic variables
because they're concise, readable, and work well in that context. It feels
like a regexp standard to me.

The other magic variables I've dispensed with.

Nick
ako... (Guest)
on 2005-12-15 03:00
(Received via mailing list)
i give up. there seems to be no way to get all the captures for a
group. the corresponding $ variable just has the last one. thanks to
everyone who responded. sorry, did not mean to start a war over
people's coding styles.

konstantin
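
The behaviour described here can be reproduced in a few lines. This is only a sketch using the pattern and sample string from earlier in the thread; the local variable names are made up.

```ruby
md = 'a , b,c'.match(/^(?:(\w)\s*,\s*)*(\w)$/)

group1 = md[1]        # only the final repetition of group 1 survives
group2 = md[2]
all    = md.captures  # one slot per group, never one per repetition

p group1, group2, all
```

Group 1 matched "a" and then "b", but each repetition overwrites the previous one, so the MatchData has no record of "a".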
Bill Kelly (Guest)
on 2005-12-15 03:06
(Received via mailing list)
Hi,

From: "ako..." <akonsu@gmail.com>
>
>i give up. there seems to be no way to get all the captures for a
> group. the corresponding $ variable just has the last one.

Could you help us to understand why #scan didn't meet your needs?

Called without a block, #scan returns an array of matches:

>> "abc--------abc--------abc".scan(/(a)(b)(c)/)
=> [["a", "b", "c"], ["a", "b", "c"], ["a", "b", "c"]]

Called with a block, #scan calls your block each time a match is
found:

>> "abc--------abc--------abc".scan(/(a)(b)(c)/) { puts "#$1, #$2, #$3" }
a, b, c
a, b, c
a, b, c


Hope this helps,

Bill
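
For readers who prefer MatchData over $1..$n, the block form of #scan can also collect a full MatchData object per match through Regexp.last_match. A minimal sketch, with a shortened sample string and made-up variable names:

```ruby
matches = []
"abc--------abc".scan(/(a)(b)(c)/) { matches << Regexp.last_match }

offsets  = matches.map { |md| md.begin(0) }  # start position of each match
captures = matches.map(&:captures)           # the groups of each match

p offsets, captures
```

This keeps the offsets as well as the captures, which the plain array returned by #scan does not.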
ako... (Guest)
on 2005-12-15 03:15
(Received via mailing list)
Bill,

scan does not help because it can match a portion of the source string,
and what is in between the matches is skipped. so scan is just a
special case of the functionality that i was looking for. i need to
make sure the whole string has a defined structure and get parts of it
as groups.

konstantin
Neil Stevens (Guest)
on 2005-12-15 03:24
(Received via mailing list)
William James wrote:
> Ross, don't pay too much attention to unreasonable fanatics.
>
> The first edition of the Pickaxe says:
>
> "Having said all this, we have to 'fess up.  Andy and Dave normally
> use the $-variables rather than worrying about MatchData objects.
> For everyday use, they just end up being more convenient.
> Sometimes we just can't help being pragmatic."

How convenient that you quote that without quoting the drawbacks listed
first...

Sheesh, if you want perl, use perl.

--
Neil Stevens - neil@hakubi.us

'A republic, if you can keep it.' -- Benjamin Franklin
Bill Kelly (Guest)
on 2005-12-15 03:46
(Received via mailing list)
From: "ako..." <akonsu@gmail.com>
>
> scan does not help because it can match a portion of the source string,
> and what is in between the matches is skipped. so scan is just a
> special case of the functionality that i was looking for. i need to
> make sure the whole string has a defined structure and get parts of it
> as groups.

Ah, OK thanks.  From your earlier post:

> if i wanted to parse a list of letters separated by spaces and commas:
>
> 'a , b,c' =~ /^(?:(\w)\s*,\s*)*(\w)$/
>
> i need to get ['a','b'] in group 1 and ['c'] in group 2.

What about:

'a , b,c' =~ /^((?:\w\s*,\s*)*)(\w)$/
last_match = $2
first_matches = $1.scan(/\w/)

Since we first verified the whole string conforms to the required
pattern, we can then safely perform the scan on the captured group
to obtain the individual matches.


Or we could write the scan using look-ahead assertions, as another
way to prevent the skipping of in-between parts:

str = 'a , b,c'
# first verify whole pattern matches, and get final match group
if str =~ /^(?:\w\s*,\s*)*(\w)$/
  last_match = $1
  first_matches =
    str.scan(/(?:(\w)\s*,\s*)(?=(?:\w\s*,\s*)*\w$)/).flatten
end

# last_match => "c"
# first_matches => ["a", "b"]
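
The first approach (validate the whole string, then scan the captured prefix) could be wrapped in a small helper. This is a sketch only: parse_list is a hypothetical name, and it uses \A and \z instead of ^ and $ because in Ruby ^ and $ match at line boundaries, so a multi-line string could otherwise pass on one of its lines.

```ruby
# parse_list is a hypothetical helper, not something from the thread.
# Returns [first_matches, last_matches], or nil if the string is malformed.
def parse_list(str)
  md = /\A((?:\w\s*,\s*)*)(\w)\z/.match(str)
  return nil unless md

  # md[1] is the whole "a , b," prefix; it is already known to be well formed,
  # so picking the letters out of it with scan is safe.
  [md[1].scan(/\w/), [md[2]]]
end

p parse_list('a , b,c')
p parse_list("junk!\na,b")
```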


HTH,

Bill
ako... (Guest)
on 2005-12-15 03:55
(Received via mailing list)
thank you. yes, it seems to be the only way. just that it is a shame
that we have to match the same expression again! the information was
available already, it was just discarded during the first match in your
sample.

konstantin
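
One way to avoid matching the same text twice is to walk the string once with StringScanner from Ruby's standard library, collecting each repetition as it is consumed. This is a sketch under the same "letters separated by commas" assumption; parse_letters is a made-up name, and this is a different technique from the regexp-only answers above.

```ruby
require 'strscan'

# Hypothetical helper: one pass over the string, no second match.
# Returns [all_but_last, [last]], or nil if the string is malformed.
def parse_letters(str)
  ss = StringScanner.new(str)
  letters = []

  return nil unless ss.scan(/(\w)/)   # the first letter is mandatory
  letters << ss[1]

  # Each further item must be ", letter"; keep every repetition.
  letters << ss[1] while ss.scan(/\s*,\s*(\w)/)

  return nil unless ss.eos?           # leftover input means a malformed list
  [letters[0..-2], letters[-1, 1]]
end

p parse_letters('a , b,c')
```

The eos? check plays the role of the anchors in the original regexp: anything left unconsumed makes the whole string invalid.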
James Gray (bbazzarrakk)
on 2005-12-15 04:37
(Received via mailing list)
On Dec 14, 2005, at 6:16 PM, Jeff Wood wrote:

> You should be able to tell who this message is meant for:

Yes, I recognize that you are probably speaking at least in part to
me, since I did that in this very thread.  You can call me by name if
you like.  I'm a big boy and I can take it.  ;)

> PLEASE stop sending out code that uses any of the perl ${x}
> variables ...

Hang on there Mr. Code Police.  Let's not lay down the law too
heavily before we get into this...

> They are ugly and have no place in Ruby ... they are only provided to
> make the transition of Perl people easier ...

I seriously doubt those variables were invented in Perl.  They are a
common feature of many Regular Expression implementations and I'm not
sure they are even that ugly.  $1 holds what was grabbed by the first
set of parentheses.  Fairly logical.

> Please teach people to use MatchData objects ...

I also showed a MatchData example.

I've used them a time or two, but honestly, they just don't feel
right to me.  I've stopped using the default variable, I'm using a
two-space tab, etc.  I'm Ruby assimilated, but I just like the Regexp-
linked variables.

I see a lot of code running the Ruby Quiz and I feel quite confident
saying that the Regexp variables are far more common than MatchData.
I don't think that says anything bad about the latter, but it does
tell me that you are in the minority.  ;)

We won't yell at you for using MatchData, if you'll provide the same
consideration...

James Edward Gray II
Neil Stevens (Guest)
on 2005-12-15 04:43
(Received via mailing list)
James Edward Gray II wrote:
> I see a lot of code running the Ruby Quiz and I feel quite confident
> saying that the Regexp variables are far more common than MatchData.   I
> don't think that says anything bad about the latter, but it does  tell
> me that you are in the minority.  ;)

If the majority here uses globally scoped variables to store locally
used values,  then that doesn't say anything good for this portion of
the Ruby community.

--
Neil Stevens - neil@hakubi.us

'A republic, if you can keep it.' -- Benjamin Franklin
James Gray (bbazzarrakk)
on 2005-12-15 04:55
(Received via mailing list)
On Dec 14, 2005, at 9:42 PM, Neil Stevens wrote:

> the Ruby community.
I really think you are blowing the issue out of proportion.  A Regexp
is generally checked on one line and the variables used on the next.
It doesn't make a lot of sense to hold on to them for twenty lines
after you make the check.

Also, I believe they are thread-local variables, are they not?  (I'm
honestly asking.)  If so, I don't see a lot of concern about them
being stomped on before they are used.

James Edward Gray II
Neil Stevens (Guest)
on 2005-12-15 05:04
(Received via mailing list)
James Edward Gray II wrote:
> Also, I believe they are thread-local variables are they not?  (I'm
> honestly asking.)  If so, I don't see a lot of concern about them  being
> stomped on before they are used.

Actually, I just looked it up, and according to "Programming Ruby,"
$1-$9 are "local to the current scope."

My mistake, heh.  I wonder how many who use them know that, though, and
how many just do it without checking because it's popular in perl or
popular on here.

--
Neil Stevens - neil@hakubi.us

'A republic, if you can keep it.' -- Benjamin Franklin
Jeff Wood (Guest)
on 2005-12-15 05:13
(Received via mailing list)
Ross Bamford wrote:

>>
>
> Well, okay. No need to shout though, is there?
>
> Just trying to put a bit back, you know?
>
Because it wasn't *just* directed at you ... there have been other posts
that included those same "warts".

And I didn't mean to shout, I just meant to exaggerate the please ....
*Please* would have been more appropriate.

j.
Jeff Wood (Guest)
on 2005-12-15 05:22
(Received via mailing list)
James Edward Gray II wrote:

>> variables ...
> common feature to many Regular Expression implementation and I'm not
> two-space tab, etc.  I'm Ruby assimilated, but I just like the Regexp-
> James Edward Gray II
>
>
>
Quite simply, Ruby is *supposed* to be about consistency ... Having the
"everything is an object, principle of least surprise" mantra, then
using these which act like a global ( $ ) but aren't actually ( local
scope ) is just vile.

That's why I have a problem with them.  If the community uses them,
well, that's their option, I'm just one that's all for consistency,
always, as much as possible... It tends to make things more generic and
able to handle change better.

I can't speak as to whether Perl was the first language to do the ${x}
variables ... but, it ( so far of the languages I've learned ) uses it
heavily, and it contributes to all of the punctuation soup that we all
left Perl to get away from... ( again I'm speaking generally, but I
could also again be wrong ... that's one of the uglies I left because of
).

... Not trying to be language police, I just really love the MatchData,
and find it MUCH easier to deal with.  Then you can keep your datasets
from multiple matches around ... to me it *is* easier to read ...
instead of ... $1 ... where'd that come from ... I didn't assign a
glob... oh that's right ...

Anyways, I'm sorry to have caused this thread to go on this long ... I
just really thought more of the people on the list would step up and
say, yeah, those are some very ugly warts and we don't use them ... but
apparently, I was wrong.

I'll shut up now.

j.
Neil Stevens (Guest)
on 2005-12-15 05:25
(Received via mailing list)
Neil Stevens wrote:
> My mistake, heh.  I wonder how many who use them know that, though, and
> how many just do it without checking because it's popular in perl or
> popular on here.

Hold on, this wasn't really my mistake, I think.  How is one supposed to
know a dollar-sign variable isn't always global?

This sounds to me like some special-case hackery done to keep careless
coders from shooting themselves in the foot.

--
Neil Stevens - neil@hakubi.us

'A republic, if you can keep it.' -- Benjamin Franklin
Joel VanderWerf (Guest)
on 2005-12-15 08:55
(Received via mailing list)
Neil Stevens wrote:
> This sounds to me like some special-case hackery done to keep careless
> coders from shooting themselves in the foot.
>

Usually if you think about what the variable represents, it's obvious
whether it should be thread-local or global, and ruby does it that way.
(Results of something the thread did, like call an external process, or
match a regex--those are thread local. Environment that was given when
the program started--those are global.) The local/global distinction is
not hackery, but the notation (inherited from perl) is not great.

I do kinda wish there was a consistent visual cue of some kind, like
$$foo for global and $foo for local, or $foo for global and $_foo for
local. It would also be nice to have a faster way to access user-defined
thread vars: $_foo versus Thread.current[:foo].
Bill Kelly (Guest)
on 2005-12-15 09:34
(Received via mailing list)
From: "Neil Stevens" <neil@hakubi.us>
>
> Hold on, this wasn't really my mistake, I think.  How is one supposed to
> know a dollar-sign variable isn't always global?
>
> This sounds to me like some special-case hackery done to keep careless
> coders from shooting themselves in the foot.

I think it's more like when the intrepid Ruby nuby first notices
a method not suffixed with ! that modifies the receiver--and
posts to the list: This is inconsistent! This can't be right!
This violates POLS! etc.!  "All methods that modify the receiver
should end in !, right???"

And Matz points out that the rationale is somewhat different. . . .

Similarly, it doesn't seem reasonable to condemn method-local $1..$n
as special-case hackery designed to benefit careless coders, so much
as Ruby behaving in the most naturally useful way possible.

Huzzah!  &c.  :)


Regards,

Bill
Ross Bamford (Guest)
on 2005-12-15 09:43
(Received via mailing list)
On Thu, 15 Dec 2005 01:23:14 -0000, William James <w_a_x_man@yahoo.com>
wrote:

> The first edition of the Pickaxe says:
>
> "Having said all this, we have to 'fess up.  Andy and Dave normally
> use the $-variables rather than worrying about MatchData objects.
> For everyday use, they just end up being more convenient.
> Sometimes we just can't help being pragmatic."
>

:) Thanks. I don't feel nearly so bad about being too lazy to get a
MatchData now ;)
Ross Bamford (Guest)
on 2005-12-15 09:58
(Received via mailing list)
On Thu, 15 Dec 2005 03:35:15 -0000, James Edward Gray II
<james@grayproductions.net> wrote:

> On Dec 14, 2005, at 6:16 PM, Jeff Wood wrote:
>
>> You should be able to tell who this message is meant for:
>
> Yes, I recognize that you are probably speaking at least in part to me,
> since I did that in this very thread.  You can call me by name if you
> like.  I'm a big boy and I can take it.  ;)
>

(Sorry, James, I assumed that was directed at me alone)

>> PLEASE stop sending out code that uses any of the perl ${x} variables
>> ...
>
> [...]
>
>> Please teach people to use MatchData objects ...
>
> I also showed a MatchData example.

I think the $1, $2 stuff just hit Jeff's anger button, because I had
MatchData down at the bottom of my examples too but it seemed to be
missed.

>
> I've used them a time or two, but honestly, they just don't feel right
> to me.  I've stopped using the default variable, I'm using a two-space
> tab, etc.  I'm Ruby assimilated, but I just like the Regexp-linked
> variables.
>

Same here - I like to stick to the convention of whatever language I'm
using. To begin with I did use a lot more Perl-like stuff (I started with
Ruby using an old Perl book. Don't ask...) but now I'm starting to find
the Rubyisms that work for me ($LOAD_PATH instead of $: and so on).

It seems sensible to me to save the MatchData stuff for the times when
it's needed. Java's regexp support is purely OOP-based (of course, being
Java ;)) and I can't remember when I last used it for something simple -
it's just nowhere near as convenient as matching a literal regexp and
using the numbered variables...

Cheers,
Ross
Ross Bamford (Guest)
on 2005-12-15 10:07
(Received via mailing list)
On Thu, 15 Dec 2005 04:10:17 -0000, Jeff Wood <jeff.darklight@gmail.com>
wrote:

>>> PLEASE stop sending out code that uses any of the perl ${x} variables
>>> ...
>>>
>>
>> Well, okay. No need to shout though, is there?
>>
>> Just trying to put a bit back, you know?
>>
> Because it wasn't *just* directed at you ... there have been other posts
> that included those same "warts".
>

Ok, sorry. I'm kind of paranoid ;)

> And I didn't mean to shout, I just meant to exaggerate the please ....
> *Please* would have been more appropriate.
>

I wondered that after I posted. Just my old Fidonet reflexes kicking in I
suppose :)

Just to say, though, that I tend to feel when someone asks a question,
that I should give them all the alternative answers I have, and let them
choose what suits them. If we really don't want people to use something
(which I don't agree with in this case btw) then the way to do that is to
make sure they completely understand that thing - only then can they
choose for themselves whether it's good or not.

(I'm thinking specifically about the anti-pattern wars that have consumed
a lot of good work in Java over the past few years).

So anyway, that's what I did. It's just natural I think to start with the
shortest way ($1), move to the longer ways ($~[1], $~.captures[0]) and
finally the 'long' way (/.../.match etc). I did slip in a subtle hint at
the end that MatchData was worth looking up, with the $~.inspect line...

Cheers,
Ross
Robert Klemme (Guest)
on 2005-12-15 11:35
(Received via mailing list)
ako... wrote:
> thank you. yes, it seems to be the only way. just that it is a shame
> that we have to match the same expression again! the information was
> available already, it was just discarded during the first match in
> your sample.

I still didn't get what exactly you want.  Does this help?

>> 'a,b ,c'.split /\s*,\s*/
=> ["a", "b", "c"]

Kind regards

    robert
tony summerfelt (Guest)
on 2005-12-15 15:45
(Received via mailing list)
Ross Bamford wrote on 12/14/2005 4:32 PM:
>> i need to capture all matches for a group. for example if

>> 'ab c' =~ /^(.)*$/

> You could try:

a regex tool i'm finding invaluable is "redet"  (on freshmeat)

works with a number of languages including ruby...
unknown (Guest)
on 2005-12-15 17:27
(Received via mailing list)
Quoting Neil Stevens <neil@hakubi.us>:

> If the majority here uses globally scoped variables to store
> locally used values,  then that doesn't say anything good for
> this portion of the Ruby community.

Were you aware that $1 and friends aren't actually globally scoped?

-mental
Christian Neukirchen (Guest)
on 2005-12-15 18:03
(Received via mailing list)
Neil Stevens <neil@hakubi.us> writes:

> How convenient that you quote that without quoting the drawbacks listed
> first...

Which drawbacks?
unknown (Guest)
on 2005-12-15 18:12
(Received via mailing list)
Quoting James Edward Gray II <james@grayproductions.net>:

> Also, I believe they are thread-local variables are they not?
> (I'm honestly asking.)  If so, I don't see a lot of concern
> about them being stomped on before they are used.

They're method-local.

def foo
  "abc" =~ /(a)/
  p $1
end

foo     # prints "a"
p $1    # prints nil: the match made inside foo is not visible here

-mental
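
The same point can be sketched with two methods, to show that a match in a called method does not disturb the caller's $1. The method names here are made up for illustration.

```ruby
def inner_match
  "abc" =~ /(a)/
  $1                    # "a", but only within this method's frame
end

def outer_view
  "xyz" =~ /(x)/        # sets $1 to "x" for outer_view
  inner = inner_match   # matches again, in a separate frame
  [inner, $1]           # the caller's $1 is still "x"
end

p outer_view
```

So a helper that runs its own regexp cannot stomp on the caller's $1..$n, which is the concern that was raised about "globals".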
Christian Neukirchen (Guest)
on 2005-12-15 18:12
(Received via mailing list)
Neil Stevens <neil@hakubi.us> writes:

> popular on here.
Actually, these variables are thread-local in Perl too:

  use strict;
  use threads;

  "foo" =~ /(\w+)/;
  print($1, "\n");
  threads->new(sub {
  	       "bar" =~ /(\w+)/;
  	       print($1, "\n");
  	     })->join;
  print($1, "\n");

prints:

  foo
  bar
  foo
Garance A Drosehn (Guest)
on 2005-12-15 18:46
(Received via mailing list)
On 12/15/05, Robert Klemme <bob.news@gmx.net> wrote:
> ako... wrote:
> > thank you. yes, it seems to be the only way. just that it is a shame
> > that we have to match the same expression again! the information was
> > available already, it was just discarded during the first match in
> > your sample.
>
> I still didn't get what exactly you want.  Does this help?
>
> >> 'a,b ,c'.split /\s*,\s*/
> => ["a", "b", "c"]

Now that I've read the responses in this thread a few times, I think
I understand what he wants to do.  And I don't think it can be done
via scan.

First:  He wants a single regex which will verify the syntax of an
entire line.  So, first he wants a true/false value, saying "The line
is valid, or it is not valid".  Never mind any values in the line, just
"is the line *completely valid*?".

Then, if the line is valid, he wants to break out individual pieces
of what was scanned, and he wants to do that without re-doing
any of the scans he did in the first regex.  The trick is that some
of those pieces are a repeating group, such as /(\s\w)*/.

What is confusing us is that he describes this using a simple
example, and when we solve the simple example he then says
"you don't get the bigger picture!".  Ugh.

Let me give an example, and see if someone can solve it.  My
example might still be something other than what he's thinking
of, but maybe it will help.

Let's say I'm expecting command lines of the form:
   first word is either 'copy' or 'duplicate'
   followed by one or more words
   followed by the word 'before' or 'after'
   followed by one or more words

So I could do the first step with the regexp:

  /^(copy|duplicate) \s+ (\w+\s+)+ (before|after) \s+ (\w+\s*)+ $/x

 (hopefully I've done that right!).  *IF* that matches, then I know
the entire line is valid.  Then, after I know the line is valid, I want
the array of source-words, and the array of destination-words
which were matched.  I want to do that by picking out information
in Matchdata, not by doing a new scan.  The thing is, I don't think
I have a way of knowing how many times the first '(\w+\s+)+' was
matched.  So I can't just do a slice of $~.captures because I don't
know what the starting and ending indexes of that slice would be.
I could put another set of parenthesis around the two repeating
groups:

  /^(copy|duplicate) \s+ ((\w+\s+)+) (before|after) \s+ ((\w+\s*)+) $/x

But that doesn't really give me two separate arrays of the
individual values that made up each group.  It just matches
each group as a whole.

Given two data lines of:
    copy apple pear plum peach after bill bob
    duplicate tomato before joe alice alfred tommy jane

in the first case I want a way to set two arrays:
    srcfood = ["apple ", "pear ", "plum ", "peach "]
    destword = ["bill ", "bob"]
from the first line, and
    srcfood = ["tomato "]
    destword = ["joe ", "alice ", "alfred ", "tommy ", "jane"]
from the second line.

I'll agree this is a weird example, but I think it shows the issue.
If I apply the above pattern to the first line, I'll see a Matchdata
result where:

$~.captures ==
   ["copy", "apple pear plum peach ", "peach ", "after", "bill bob", "bob"]

Notice: There isn't *any* element which contains a value of just
"apple ", or just "pear ", or just "plum ", even though the regex
obviously had to match each one of those.
William James (Guest)
on 2005-12-15 20:28
(Received via mailing list)
Garance A Drosehn wrote:

> What is confusing us is that he describes this using a simple
>    followed by the word 'before' or 'after'
> in Matchdata, not by doing a new scan.  The thing is, I don't think
> each group as a whole.
>     destword = ["joe ", "alice", "alfred ", "tommy ", "jane"]
> from the second line.
>
> I'll agree this is a weird example, but I think it shows the issue.
> If I apply the above pattern to the first line, I'll see a Matchdata
> result where:
>
> $~.captures ==
>    ["copy", "apple pear plum peach ", "peach ", "after", "bill bob", "bob"]

DATA.each {|line|   line.chomp!
  md =
    /^(?:copy|duplicate) \s+
      ((?:\w+\s+)+)
      (?:after|before) \s+
      ((?:\w+\s*)+) $
    /x.match( line )
  p md.captures
  src_food = md.captures.first.split
  dest_word = md.captures.last.split
  p src_food, dest_word
}

__END__
copy apple pear plum peach after bill bob
duplicate tomato before joe alice alfred tommy jane

----- output: -----

["apple pear plum peach ", "bill bob"]
["apple", "pear", "plum", "peach"]
["bill", "bob"]
["tomato ", "joe alice alfred tommy jane"]
["tomato"]
["joe", "alice", "alfred", "tommy", "jane"]
4b174722d1b1a4bbd9672e1ab50c30a9?d=identicon&s=25 Ryan Leavengood (Guest)
on 2005-12-15 21:07
(Received via mailing list)
On 12/14/05, Jeff Wood <jeff.darklight@gmail.com> wrote:
> You should be able to tell who this message is meant for:
>
> PLEASE stop sending out code that uses any of the perl ${x} variables ...
>
> They are ugly and have no place in Ruby ... they are only provided to
> make the transition of Perl people easier ...

Just to add another voice to the maelstrom: I have never coded Perl
and don't particularly like it, but I almost always use the $1 variables
in Ruby. I have been coding Ruby for over four years and I rarely find
instances where MatchData is all that much better than =~ and the $1
variables.

They are just more convenient in the typical case:

str="abc"
re=/(.)(.)(.)/

if re =~ str
	p $2
end

# Versus:

if (md = re.match(str))
	p md[2]
end

Ryan
3c155ef399326d533efc2eb91ac992e5?d=identicon&s=25 Neil Stevens (Guest)
on 2005-12-15 21:14
(Received via mailing list)
Jeff Wood wrote:
> I'll shut up now.

I think that's exactly what some people want you to do.  People don't
want to be told that they're lousy coders, and their poor practices have
only been made to work through a special case in the language
interpreter.

--
Neil Stevens - neil@hakubi.us

'A republic, if you can keep it.' -- Benjamin Franklin
Dd76a12d66f843de5c5f8782668e7127?d=identicon&s=25 Mauricio Fernandez (Guest)
on 2005-12-15 21:23
(Received via mailing list)
On Fri, Dec 16, 2005 at 05:05:00AM +0900, Ryan Leavengood wrote:
> Just to add another voice to the maelstrom: I have never coded Perl
> and don't particularly like it, but I almost always use the $1 variables
> in Ruby. I have been coding Ruby for over four years and I rarely find
> instances where MatchData is all that much better than =~ and the $1
> variables.
>
> They are just more convenient in the typical case:

MatchData can be quite convenient too

str = "foobar 2005-12-15"
if md = /(\S+) (\d+)-(\d+)-(\d+)/.match(str)
  name, year, month, day = md.captures  # => ["foobar", "2005", "12", "15"]
  name                                  # => "foobar"
  # ....
end

If you want to name the captures,
    md.captures
looks better than
    $1, $2, $3, $4
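
For what it's worth, later Ruby versions (1.9 and up) also accept named groups, which the 1.8 interpreter of this thread's era did not. A minimal sketch, where `parse_entry` is a hypothetical helper name and not something from the posts above:

```ruby
# Hypothetical helper: fetch the fields by name instead of by position.
# Assumes Ruby 1.9 or later for the (?<name>...) group syntax.
def parse_entry(str)
  md = /(?<name>\S+) (?<year>\d+)-(?<month>\d+)-(?<day>\d+)/.match(str)
  return nil unless md   # no match: nothing to extract
  { name: md[:name], year: md[:year], month: md[:month], day: md[:day] }
end

entry = parse_entry("foobar 2005-12-15")
p entry[:name]   # prints "foobar"
p entry[:year]   # prints "2005"
```

The captures stay readable even if the pattern is later reordered, since nothing depends on group numbers.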
4299e35bacef054df40583da2d51edea?d=identicon&s=25 James Gray (bbazzarrakk)
on 2005-12-15 21:23
(Received via mailing list)
On Dec 15, 2005, at 2:12 PM, Neil Stevens wrote:

> Jeff Wood wrote:
>> I'll shut up now.
>
> I think that's exactly what some people want you to do.  People don't
> want to be told that they're lousy coders, and their poor practices
> have
> only been made to work through a special case in the language
> interpreter.

Hey, can we keep this to a discussion of language features and stop
insulting large portions of the Ruby community?  Thank you.

James Edward Gray II
8e44c65ac5b896da534ef2440121c953?d=identicon&s=25 Ezra Zygmuntowicz (Guest)
on 2005-12-15 21:29
(Received via mailing list)
On Dec 15, 2005, at 12:12 PM, Neil Stevens wrote:

> Jeff Wood wrote:
>> I'll shut up now.
>
> I think that's exactly what some people want you to do.  People don't
> want to be told that they're lousy coders, and their poor practices
> have
> only been made to work through a special case in the language
> interpreter.
>
>

Can we please try to keep it civil in here? You go code how you want
and I will go code how I want and we can both be happy.

-Ezra Zygmuntowicz
Yakima Herald-Republic
WebMaster
http://yakimaherald.com
509-577-7732
ezra@yakima-herald.com
3c155ef399326d533efc2eb91ac992e5?d=identicon&s=25 Neil Stevens (Guest)
on 2005-12-15 22:08
(Received via mailing list)
James Edward Gray II wrote:
> Hey, can we keep this to a discussion of language features and stop
> insulting large portions of the Ruby community?  Thank you.

You are the one who first suggested that more people are using these
special-case magic variables so wantonly in their code, by bringing up
what you said in the Ruby Quiz.

Yeah, it's clear that regardless of how nice the Ruby language is,
nothing can bring up the quality of the average Internet programmer,
unfortunately.  Better tools don't make a better carpenter.

--
Neil Stevens - neil@hakubi.us

'A republic, if you can keep it.' -- Benjamin Franklin
A70b7da5a3a712e800100e61ef8d8917?d=identicon&s=25 ako... (Guest)
on 2005-12-15 22:14
(Received via mailing list)
yes, thank you. this is a better description of the problem. i am not a
native english speaker, so maybe this is one of the reasons why my
question is not clear.

i saw a solution to this problem that uses split at the end. it of
course won't work if you change your example and allow quoted strings
in source-words and destination-words. a quoted string can contain
anything, spaces too and your keywords too, so the subsequent split
won't work.

well, i did not realise that the term "group's captures" is that rare.
i thought it was a standard term. but maybe i am brainwashed by
microsoft. so i have this code in .net which might help to clarify what
i am talking about:

            string text = "One car red car blue car";
            string pat = @"^(?:(\w+)\s+)*(\w+)$";
            Regex r = new Regex(pat, RegexOptions.IgnoreCase);

            // Match the regular expression pattern against a text string.
            Match m = r.Match(text);
            if (m.Success)
            {
                Console.WriteLine("match: [{0}]", m);
                foreach (Group g in m.Groups)
                {
                    Console.WriteLine("group: [{0}]", g);
                    foreach (Capture c in g.Captures)
                    {
                        Console.WriteLine("\tcapture: [{0}]", c);
                    }
                }
            }

the output is:

match: [One car red car blue car]
group: [One car red car blue car]
        capture: [One car red car blue car]
group: [blue]
        capture: [One]
        capture: [car]
        capture: [red]
        capture: [car]
        capture: [blue]
group: [car]
        capture: [car]

as you see, the first group is $0, the second group is $1, and the
third is $2. but $1 and $2 contain captures too. it is as if $1 and
$2 were arrays in Ruby.

in my opinion this is a big limitation of ruby's regular expressions.
it just must be as powerful as .net ; -)

konstantin
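
One way to approximate per-group captures in Ruby, including the quoted-string case mentioned above, is to walk the line with the standard library's StringScanner instead of relying on one big regexp. This is only a sketch under stated assumptions: `TOKEN` and `parse_command` are hypothetical names, quoted strings use double quotes with no escaped quotes inside, and an unquoted before/after ends the source list.

```ruby
require 'strscan'

# Assumption: a "word" is a double-quoted string (no escaped quotes)
# or a run of \w characters.
TOKEN = /"[^"]*"|\w+/

# Hypothetical helper: returns nil for an invalid line, otherwise a hash
# holding every source word and every destination word.
def parse_command(line)
  ss = StringScanner.new(line.strip)
  verb = ss.scan(/(?:copy|duplicate)\b/) or return nil
  src, dest, position = [], [], nil

  until ss.eos?
    return nil unless ss.skip(/\s+/)        # words must be space-separated
    if position.nil? && ss.scan(/(?:before|after)\b/)
      position = ss.matched                 # first unquoted keyword ends the source list
    elsif (tok = ss.scan(TOKEN))
      (position ? dest : src) << tok        # collect each repetition separately
    else
      return nil                            # unexpected character
    end
  end

  return nil if src.empty? || dest.empty?
  { verb: verb, position: position, src: src, dest: dest }
end

p parse_command('copy apple "red plum" after bill "before noon"')
```

Because each token is pushed onto an array as it is scanned, validation and collection happen in one pass, and a keyword inside quotes is never mistaken for the separator.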
4299e35bacef054df40583da2d51edea?d=identicon&s=25 James Gray (bbazzarrakk)
on 2005-12-15 22:26
(Received via mailing list)
On Dec 15, 2005, at 3:07 PM, Neil Stevens wrote:

> James Edward Gray II wrote:
>> Hey, can we keep this to a discussion of language features and stop
>> insulting large portions of the Ruby community?  Thank you.
>
> You are the one who first suggested that more people are using these
> special-case magic variables so wantonly in their code, by bringing up
> what you said in the Ruby Quiz.

Since you are educating me about the joys of MatchData, allow me to
educate you on the differences of what we both said.  I explained
what I have seen and made statements about how common a given
practice is.  You were, and still are, insulting people you know
nothing about.

> Yeah, it's clear that regardless of how nice the Ruby language is,
> nothing can bring up the quality of the average Internet programmer,
> unfortunately.  Better tools don't make a better carpenter.

You are rude for absolutely no reason and you have ceased to add
anything to this conversation.  I'm done trying to reason with you.

James Edward Gray II
3c155ef399326d533efc2eb91ac992e5?d=identicon&s=25 Neil Stevens (Guest)
on 2005-12-15 22:35
(Received via mailing list)
James Edward Gray II wrote:
> You are rude for absolutely no reason and you have ceased to add
> anything to this conversation.  I'm done trying to reason with you.

It's unfortunate that you think it's rude to point out facts that are
uncomfortable to you.  It's unfortunate that you're apparently seeing
personal attacks that don't exist.

--
Neil Stevens - neil@hakubi.us

'A republic, if you can keep it.' -- Benjamin Franklin
31e038e4e9330f6c75ccfd1fca8010ee?d=identicon&s=25 Gregory Brown (Guest)
on 2005-12-15 22:54
(Received via mailing list)
On 12/15/05, Neil Stevens <neil@hakubi.us> wrote:

> It's unfortunate that you think it's rude to point out facts that are
> uncomfortable to you.  It's unfortunate that you're apparently seeing
> personal attacks that don't exist.

And what "facts" have you mentioned?

To me, the magic variables fit fine with regex.

I find it really hard to understand why people are complaining about
using a little $1 after some string of absolutely cryptic pattern
matching sequences.

People tend to like Ruby's regex system because it's perl-like, no?

But in the interest of preserving TIMTOWTDI, there is an object
oriented solution which many posters have already mentioned.

James saying that it was common practice, and providing evidence of it
is much more reasonable than you simply insulting people you don't
know.

A quote from Arnold in Kindergarten Cop, "Stop Whining!" applies very
well here.
You're free to never use $1 .. $n, so I don't see what the issue is.
8e44c65ac5b896da534ef2440121c953?d=identicon&s=25 Ezra Zygmuntowicz (Guest)
on 2005-12-15 23:21
(Received via mailing list)
On Dec 15, 2005, at 1:32 PM, Neil Stevens wrote:

> Neil Stevens - neil@hakubi.us
kill-filed

-Ezra Zygmuntowicz
Yakima Herald-Republic
WebMaster
http://yakimaherald.com
509-577-7732
ezra@yakima-herald.com
A7c9c275318af9e1e3812fab9660cd7c?d=identicon&s=25 jeff.darklight@gmail.com (Guest)
on 2005-12-16 03:02
(Received via mailing list)
I know I said I'd shut up, and I am, but I did feel that after some of
the messages that have popped since mine, that I should have a final
comment...

I do apologize to you all for causing such grief.  I was simply trying
to add my $0.02 as what I thought was best practice ... and made the
assumption that most people were doing things that way ( I don't use
the ${x} vars at all, but I bet you knew that already )...

I'm not saying that people are bad coders for using those, I didn't
think I was being insulting in my OP. I was simply suggesting that I
thought it a better practice to show the most oop way of doing things
in code sent to newer programmers ...

Truly, if anybody took offense, it was unintentional.  If you knew me
better, you would understand that I tend to be passionate about things,
especially when it comes to teaching the next gen of programmers and/or
users of Ruby...

So, peace, love, and joyful programming to you all, I will try to be
more cautious with my posts in the future.

j.
3c155ef399326d533efc2eb91ac992e5?d=identicon&s=25 Neil Stevens (Guest)
on 2005-12-16 03:11
(Received via mailing list)
Ezra Zygmuntowicz wrote:
> Can we please try to keep it civil in here? You go code how you want
> and I will go code how I want and we can both be happy.

Actually, no, you're wrong.  String's gsub, for one, has been hard-wired
to force you to use magic punctuation in order to access the match data.

This forces *everyone* to depend on the pseudo-global variable hacks
unless they run their Regexp multiple times:

str = 'root:*:0:0:System Administrator:/var/root:/bin/sh'
re = /([^:]+)(:|$)/
str.gsub!(re) do |m|
	matchData = re.match(m)
	'x' * matchData[1].length + matchData[2]
end
puts str
	=> 'xxxx:x:x:x:xxxxxxxxxxxxxxxxxxxx:xxxxxxxxx:xxxxxxx'

So the language is hard-coded to force everyone to use pseudo-global
magically-scoped variables, that you have to cross your fingers and hope
work the way you need at the moment.

--
Neil Stevens - neil@hakubi.us

'A republic, if you can keep it.' -- Benjamin Franklin
37a3c73ffbf864e4b28f7f2384ee12ce?d=identicon&s=25 Timothy Hunter (tim-hunter)
on 2005-12-16 03:20
(Received via mailing list)
Neil Stevens wrote:
>
> Actually, no, you're wrong.
>
plonk
1fba4539b6cafe2e60a2916fa184fc2f?d=identicon&s=25 unknown (Guest)
on 2005-12-16 03:35
(Received via mailing list)
Hi --

On Fri, 16 Dec 2005, Neil Stevens wrote:

> Ezra Zygmuntowicz wrote:
>> Can we please try to keep it civil in here? You go code how you want
>> and I will go code how I want and we can both be happy.
>
> Actually, no, you're wrong.  String's gsub, for one, has been hard-wired
> to force you to use magic punctuation in order to access the match data.
>
> This forces *everyone* to depend on the pseudo-global variable hacks
> unless they run their Regexp multiple times:

You can minimize it by using $~.
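
A minimal sketch of that idea, reusing the example from the quoted post: inside the gsub block, $~ (also available as Regexp.last_match) already holds the MatchData for the current substitution, so the regexp does not have to be run a second time. The helper name `mask_fields` is made up for illustration.

```ruby
# Hypothetical helper: mask every colon-separated field, keeping the colons.
def mask_fields(str)
  str.gsub(/([^:]+)(:|$)/) do
    md = $~                        # MatchData for the current substitution
    'x' * md[1].length + md[2]     # same replacement as the two-pass version
  end
end

puts mask_fields('root:*:0:0:System Administrator:/var/root:/bin/sh')
```

This still touches one special variable, but only once per block, and everything after that goes through an ordinary MatchData object.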


David

--
David A. Black
dblack@wobblini.net

"Ruby for Rails: Ruby techniques for Rails developers",
coming April 2006 (http://www.manning.com/books/black)
4299e35bacef054df40583da2d51edea?d=identicon&s=25 James Gray (bbazzarrakk)
on 2005-12-16 03:56
(Received via mailing list)
On Dec 15, 2005, at 7:47 PM, jeff.darklight@gmail.com wrote:

> I do apologize to you all for causing such grief.

I too apologize if I came back at you too harsh.  It was unintentional.

Group hug everyone!  ;)

James Edward Gray II
4b174722d1b1a4bbd9672e1ab50c30a9?d=identicon&s=25 Ryan Leavengood (Guest)
on 2005-12-16 04:57
(Received via mailing list)
On 12/15/05, jeff.darklight@gmail.com <jeff.darklight@gmail.com> wrote:
>
> I do apologize to you all for causing such grief.

You have been fairly civil. What I find most amusing is how Neil
started the flame-fest by agreeing to your self-imposed shutting up,
then ended up having the same opinion as you (that the $1 variables
are bad.) Oh the irony.

Ryan
3c155ef399326d533efc2eb91ac992e5?d=identicon&s=25 Neil Stevens (Guest)
on 2005-12-16 05:24
(Received via mailing list)
Ryan Leavengood wrote:
> On 12/15/05, jeff.darklight@gmail.com <jeff.darklight@gmail.com> wrote:
>
>>I do apologize to you all for causing such grief.
>
>
> You have been fairly civil. What I find most amusing is how Neil
> started the flame-fest by agreeing to your self-imposed shutting up,
> then ended up having the same opinion as you (that the $1 variables
> are bad.) Oh the irony.

Actually, I think you should read more carefully what I wrote.  The
irony here isn't what you think.

--
Neil Stevens - neil@hakubi.us

'A republic, if you can keep it.' -- Benjamin Franklin
4b174722d1b1a4bbd9672e1ab50c30a9?d=identicon&s=25 Ryan Leavengood (Guest)
on 2005-12-16 05:36
(Received via mailing list)
On 12/15/05, Neil Stevens <neil@hakubi.us> wrote:
>
> Actually, I think you should read more carefully what I wrote.  The
> irony here isn't what you think.

Yes I see. Actually your statement can be interpreted either way,
based on context. Since I hadn't read the thread for several hours I
forgot which side of the debate you were on. So, when I read this:

"I think that's exactly what some people want you to do.  People don't
want to be told that they're lousy coders, and their poor practices have
only been made to work through a special case in the language
interpreter."

It sounded like you wanted Jeff to shut up, and that you were in the
group being called lousy coders and you didn't like it. But now I see
you were talking about other people in the nasty way shown all over
this thread.

I don't know if you will fit in the Ruby community because we are
generally a nice bunch of people and your behavior on this thread
hasn't been very nice. In case the life lesson hasn't been taught to
you yet, you attract more flies with honey than with vinegar, so to
speak.

Regards,
Ryan
3c155ef399326d533efc2eb91ac992e5?d=identicon&s=25 Neil Stevens (Guest)
on 2005-12-16 05:45
(Received via mailing list)
Ryan Leavengood wrote:
> I don't know if you will fit in the Ruby community because we are
> generally a nice bunch of people and your behavior on this thread
> hasn't been very nice. In case the life lesson hasn't been taught to
> you yet, you attract more flies with honey than with vinegar, so to
> speak.

I never called out anyone in particular, didn't have anyone in mind in
fact, but it's funny how some people are insecure enough that they're
getting offended by what I wrote, and getting so defensive that they
feel the need to lash out in return.

It reminds me of a saying: "If you throw a rock into a pack of dogs, the
dog that yelps loudest is the one that got hit."

--
Neil Stevens - neil@hakubi.us

'A republic, if you can keep it.' -- Benjamin Franklin
31e038e4e9330f6c75ccfd1fca8010ee?d=identicon&s=25 Gregory Brown (Guest)
on 2005-12-16 05:48
(Received via mailing list)
On 12/15/05, Neil Stevens <neil@hakubi.us> wrote:

>
> It reminds me of a saying: "If you throw a rock into a pack of dogs, the
> dog that yelps loudest is the one that got hit."

That's the point you missed.  Rubyists don't throw rocks at dogs.
C67cd5ddac08f7fd35dad888e3bf5182?d=identicon&s=25 Adam Sroka (Guest)
on 2005-12-16 05:51
(Received via mailing list)
Gregory Brown wrote:
> On 12/15/05, Neil Stevens <neil@hakubi.us> wrote:
>
>
>> It reminds me of a saying: "If you throw a rock into a pack of dogs, the
>> dog that yelps loudest is the one that got hit."
>>
>
> That's the point you missed.  Rubyists don't throw rocks at dogs.
>
That's just 'cause Ruby rocks are too expensive to throw ;-)
3c155ef399326d533efc2eb91ac992e5?d=identicon&s=25 Neil Stevens (Guest)
on 2005-12-16 05:54
(Received via mailing list)
Gregory Brown wrote:
> On 12/15/05, Neil Stevens <neil@hakubi.us> wrote:
>
>
>>It reminds me of a saying: "If you throw a rock into a pack of dogs, the
>>dog that yelps loudest is the one that got hit."
>
>
> That's the point you missed.  Rubyists don't throw rocks at dogs.

Heh, some humor was perfect for this point in the thread.  Very nice.

--
Neil Stevens - neil@hakubi.us

'A republic, if you can keep it.' -- Benjamin Franklin
F0223b1193ecc3a935ce41a1edd72e42?d=identicon&s=25 Zach Dennis (Guest)
on 2005-12-16 05:54
(Received via mailing list)
Jeff Wood wrote:

> Quite simply, Ruby is *supposed* to be about consistency ... Having the
> "everything is an object, principle of least surprise" mantra, then
> using these which act like a global ( $ ) but aren't actually ( local
> scope ) is just vile.

You have made a common mistake, you are thinking Ruby is meant for your
principle of least surprise, when it is matz's least surprise...

Quoting Matz,

"Besides that, he doesn't understand what POLS means.  Since someone
will surprise for any arbitrary choice, it is impossible to satisfy
"least surprise" in his sense.  The truth is two folds: a) when there
are two or more choices in the language design decision, I take the
one that makes _me_ surprise least. "

http://groups.google.com/group/comp.lang.ruby/msg/...

Zach
4b174722d1b1a4bbd9672e1ab50c30a9?d=identicon&s=25 Ryan Leavengood (Guest)
on 2005-12-16 06:00
(Received via mailing list)
On 12/15/05, Neil Stevens <neil@hakubi.us> wrote:
>
> I never called out anyone in particular, didn't have anyone in mind in
> fact, but it's funny how some people are insecure enough that they're
> getting offended by what I wrote, and getting so defensive that they
> feel the need to lash out in return.

I'm not sure if anyone lashed out really, just pointed out your
rudeness for your own good. You see, by writing the way you have here,
you have sown bad will, and any further communication from you will
automatically be tainted by your previous behavior. Your opinion will
automatically be deemed less important than most other people. I'm
sure you don't care, in your infinite wisdom and unfailing purity of
perfect thought, but I figured you should know.

> It reminds me of a saying: "If you throw a rock into a pack of dogs, the
> dog that yelps loudest is the one that got hit."

That's lovely, LOL.

Ryan
C67cd5ddac08f7fd35dad888e3bf5182?d=identicon&s=25 Adam Sroka (Guest)
on 2005-12-16 06:18
(Received via mailing list)
Ryan Leavengood wrote:
> On 12/15/05, Neil Stevens <neil@hakubi.us> wrote:
>
<<bunch of snipped stuff about rudeness and such>>

I'm kind of new here, but it seems like debating the quality of one
another's rudeness is slightly OT.

Incidentally, I earned my living as a back-end Perl programmer for a
couple years early in my career. I wrote literate OO code in Perl. The
thing I loved about Perl was that, like a natural language, there were
several ways to say the same thing: some obvious, some flowery, some a
bit too terse. Among the things I did *not* like about Perl were some of
the whacky implicit variables like $1, etc, and especially @_. However,
I never needed to use them explicitly, because when I wanted them they
were always there implicitly (Those who know Perl know what I am
saying.)

My point is this: I am (or was) a Perl programmer and I do not like the
"Perlish" syntax one bit. I didn't like it in Perl, and I don't like it
in Ruby either. However, I object to the notion that Perlishness makes
it offensive. Non-obviousness is what makes it offensive. The matcher
syntax is much clearer.
C67cd5ddac08f7fd35dad888e3bf5182?d=identicon&s=25 Adam Sroka (Guest)
on 2005-12-16 06:27
(Received via mailing list)
Adam Sroka wrote:
> The matcher syntax...
I meant MatchData. Matcher is a Java thing... <<administers self
flogging>>
4b174722d1b1a4bbd9672e1ab50c30a9?d=identicon&s=25 Ryan Leavengood (Guest)
on 2005-12-16 06:36
(Received via mailing list)
On 12/16/05, Adam Sroka <adam.s@covad.net> wrote:
>
> I'm kind of new here, but it seems like debating the quality of one
> another's rudeness is slightly OT.

You are right of course, plus I was starting to get hypocritical and
be rude myself.

> My point is this: I am (or was) a Perl programmer and I do not like the
> "Perlish" syntax one bit. I didn't like it in Perl, and I don't like it
> in Ruby either. However, I object to the notion that Perlishness makes
> it offensive. Non-obviousness is what makes it offensive. The matcher
> syntax is much clearer.

I had a strong dislike for Perl after some bad experiences at one job,
but for some reason the "Perlisms" in Ruby don't bother me as much,
because as a whole Ruby tends to be so much more readable. Like
anything, I think there are times when using the $1 variables is
appropriate, and times when using MatchData is appropriate. To say one
or the other is the only proper way to code Ruby regular expression
matching is getting a little too pedantic.

In the same way I can't control perceived mailing list "rudeness", I
don't think the people in the MatchData camp can control how other
people code (no matter how passionate they are about the topic.)
Especially on a section of Ruby code style that is not really debated
(whereas lots of people will denounce camelCaseMethods etc.)

Anyhow, even this discussion is off-topic for the original post, so
I'm stopping here.

Ryan
2cf6d8e639314abd751f83a72e9a2ac5?d=identicon&s=25 Martin DeMello (Guest)
on 2005-12-16 11:10
(Received via mailing list)
Gregory Brown <gregory.t.brown@gmail.com> wrote:
> On 12/15/05, Neil Stevens <neil@hakubi.us> wrote:
>
> >
> > It reminds me of a saying: "If you throw a rock into a pack of dogs, the
> > dog that yelps loudest is the one that got hit."
>
> That's the point you missed.  Rubyists don't throw rocks at dogs.

Nicely put! :)

martin

Not throwing rocks at dogs since 2000
5befe95e6648daec3dd5728cd36602d0?d=identicon&s=25 Robert Klemme (Guest)
on 2005-12-19 22:48
(Received via mailing list)
ako... <akonsu@gmail.com> wrote:
> well, i did not realise that the term "group's captures" is that rare.
>            Match m = r.Match(text);
>                }
>        capture: [red]
> it just must be as powerful as .net ; -)
>
> konstantin

I don't know whether your question was answered in the lengthy thread
already.  In case not: in Ruby to get all matches of a group you need to
iterate through the whole string with #scan.  There is no such thing as
this feature of .net - and frankly I haven't missed it so far.  To get
at all the words in your example this is sufficient:

>> s = "One car red car blue car"
=> "One car red car blue car"
>> s.scan /\w+/
=> ["One", "car", "red", "car", "blue", "car"]

If you actually need group matches, you'll have to do something like this

>> s.scan(/\w(\w+)/).map{|m| m[0]}
=> ["ne", "ar", "ed", "ar", "lue", "ar"]

alternative

>> ma=[]
=> []
>> s.scan(/\w(\w+)/) {|m| ma << m[0]}
=> "One car red car blue car"
>> ma
=> ["ne", "ar", "ed", "ar", "lue", "ar"]

Of course this is quite a silly example...  The main point here is that
you must refrain from anchoring the regexp at the beginning if you want
to iterate like this.
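
To make the anchoring point concrete, here is a small sketch using the string from the original question. With a leading ^ the pattern can only match at the start of a line, so #scan stops after the first hit, and a repeated group only keeps its last repetition.

```ruby
s = "ab c"

p s.scan(/./)        # prints ["a", "b", " ", "c"]
p s.scan(/^./)       # prints ["a"]      (^ only matches at the line start)
p s.scan(/^(.)*$/)   # prints [["c"]]    (one match, group keeps the last char)
```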

HTH

    robert