Forum: Ruby Regular expressions

Announcement (2017-05-07): www.ruby-forum.com is now read-only since I unfortunately do not have the time to support and maintain the forum any more. Please see rubyonrails.org/community and ruby-lang.org/en/community for other Rails- and Ruby-related community platforms.
J. mp (lerias)
on 2007-02-11 23:58
Hi folks,
I'm burning my head because I don't understand how regular expressions
work.

I just want to validate a username, where
username -> valid
user.name -> valid

Everything else is invalid.

Please let me know the regexp to do this.
Timothy Hunter (Guest)
on 2007-02-12 00:04
(Received via mailing list)
J. mp wrote:
> let me know the reg exp to do this
>
>
I think more details are necessary. What characters are allowed in
"username"? Just alphabetic? Alphabetic+numbers? Anything else? Is there
a minimum number of characters? A maximum? Just Latin characters?
Similarly for "user.name" Is it the same as "username" except with a
period? Does there have to be exactly four characters before the period
and four after it? Any other constraints?

To use regular expressions you must be able to precisely state what a
"match" means.
Vincent Fourmond (Guest)
on 2007-02-12 00:08
(Received via mailing list)
J. mp wrote:
> Hi folks,
> I'm burning my head because I don't understand how regular expressions
> work.
>
> I just want to validate a username, where
> username ->valid
> user.name ->valid
>
> everything else is invalid

  Just to get you started:

  /[a-z]+(\.[a-z]+)?/
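
  A quick check of that pattern, as a rough sketch. Without anchors it
  only has to match somewhere inside the string, so inputs like
  ".username" still match. The anchored variant with \A and \z is an
  assumed refinement for validating a whole string, not something from
  the reply itself.

```ruby
# The starter pattern, unanchored: it matches any substring.
loose = /[a-z]+(\.[a-z]+)?/

# The same pattern anchored to the whole string (an assumed refinement).
strict = /\A[a-z]+(\.[a-z]+)?\z/

samples = %w[username user.name .username user..name]

# Each row is [input, matches loose?, matches strict?]
results = samples.map { |s| [s, !!(s =~ loose), !!(s =~ strict)] }
results.each { |row| p row }
```

  The loose pattern accepts all four samples because each contains at
  least one lowercase letter; only the anchored one tells them apart.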

  Vince
J. mp (lerias)
on 2007-02-12 00:21
Vincent Fourmond wrote:
> J. mp wrote:
>> Hi folks,
>> I'm burning my head because I don't understand how regular expressions
>> work.
>>
>> I just want to validate a username, where
>> username ->valid
>> user.name ->valid
>>
>> everything else is invalid
>
>   Just to get you started:
>
>   /[a-z]+(\.[a-z]+)?/
>
>   Vince


First of all, thanks for the attention.
More details:

max size allowed is 30
min size allowed is 5

the following chars are allowed:
- _ . (hyphen, underscore, period)

these chars are not allowed as the first or the last char

any alphabetic char, English chars only
case insensitive
no numbers

e.g.
.username -> invalid
user-name -> valid
_username -> invalid
user.name -> valid
user_name -> valid

Basically I want to allow the same pattern allowed for emails, but before
the @ char :)
Timothy Hunter (Guest)
on 2007-02-12 00:57
(Received via mailing list)
J. mp wrote:
>>> user.name ->valid
>
>
>
>

/\A[[:alpha:]][-_.[:alpha:]]{3,28}[[:alpha:]]\z/

> basically I want to allow the same pattern allowed for emails but before
> the @ char :)
>
>
The above regexp does not do this. Certainly you can have numbers in
your email address, for example. Basically anything is allowed before
the @. Google "regular expression email address" for extensive
discussions about this.
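
As a sketch, here is that regexp run against the examples from the
requirements post. The constant name and the last two samples are made up
for illustration. One caveat: on current Ruby versions [[:alpha:]] also
matches non-ASCII letters in UTF-8 strings, so [a-zA-Z] may be closer to
"English chars only".

```ruby
# A letter, then 3 to 28 letters or - _ . characters, then a letter,
# which gives a total length of 5 to 30 characters.
USERNAME = /\A[[:alpha:]][-_.[:alpha:]]{3,28}[[:alpha:]]\z/

# Expected verdicts: the first five come from the requirements post,
# the last two are extra samples added for illustration.
examples = {
  ".username" => false,
  "user-name" => true,
  "_username" => false,
  "user.name" => true,
  "user_name" => true,
  "user"      => false, # shorter than 5 characters
  "user1name" => false  # digits are not allowed
}

examples.each do |name, expected|
  actual = !!(name =~ USERNAME)
  puts "#{name.inspect}: #{actual} (expected #{expected})"
end
```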
J. mp (lerias)
on 2007-02-12 01:04
Timothy Hunter wrote:
> J. mp wrote:
>>>> user.name ->valid
>>
>>
>>
>>
>
> /\A[[:alpha:]][-_.[:alpha:]]{3,28}[[:alpha:]]\z/
>
>> basically I want to allow the same pattern allowed for emails but before
>> the @ char :)
>>
>>
> The above regexp does not do this. Certainly you can have numbers in
> your email address, for example. Basically anything is allowed before
> the @. Google "regular expression email address" for extensive
> discussions about this.

It works well.
 Thanks a lot
Gavin Kistner (phrogz)
on 2007-02-12 07:10
(Received via mailing list)
On Feb 11, 4:21 pm, "J. mp" <joaomiguel.pere...@gmail.com> wrote:
> max size allowed is 30
> min size allowed is 5
>
> the following chars are allowed:
> - _ . (hyphen, underscore, period)
>
> these chars are not allowed as the first or the last char

/\A[a-z][a-z._-]{3,28}[a-z]\Z/i


Translated, that says:
* start at the beginning
* find any letter
* followed by 3-28 characters that are letters, periods, underscores, or hyphens
* followed by a letter
* followed by the end of the string
* oh, and be case insensitive, please

Note that, per your exact instructions, this allows:
  u_s_e_r_n_a_m_e
  u____________________________e
  z._-_.z
J. mp (lerias)
on 2007-02-12 11:49
Gavin Kistner wrote:
> On Feb 11, 4:21 pm, "J. mp" <joaomiguel.pere...@gmail.com> wrote:
>> max size allowed is 30
>> min size allowed is 5
>>
>> the following chars are allowed:
>> - _ . (hyphen, underscore, period)
>>
>> these chars are not allowed as the first or the last char
>
> /\A[a-z][a-z._-]{3,28}[a-z]\Z/i
>
>
> Translated, that says:
> * start at the beginning
> * find any letter
> * followed by 3-28 characters that are letters, periods, underscores, or hyphens
> * followed by a letter
> * followed by the end of the string
> * oh, and be case insensitive, please
>
> Note that, per your exact instructions, this allows:
>   u_s_e_r_n_a_m_e
>   u____________________________e
>   z._-_.z


Oh damn!! The first should be allowed, but the second and the third should
not be allowed.
Thanks
Brian Candler (Guest)
on 2007-02-12 12:27
(Received via mailing list)
On Mon, Feb 12, 2007 at 08:56:12AM +0900, Timothy Hunter wrote:
> /\A[[:alpha:]][-_.[:alpha:]]{3,28}[[:alpha:]]\z/
>
> >basically I want to allow the same pattern allowed for emails but before
> >the @ char :)
> >
> >
> The above regexp does not do this. Certainly you can have numbers in
> your email address, for example. Basically anything is allowed before
> the @. Google "regular expression email address" for extensive
> discussions about this.

And if you are being pedantic, RFC2822 doesn't allow E-mail addresses to
contain two dots next to each other, unless the local-part is quoted.
J. mp (lerias)
on 2007-02-12 12:38
Brian Candler wrote:
> On Mon, Feb 12, 2007 at 08:56:12AM +0900, Timothy Hunter wrote:
>> /\A[[:alpha:]][-_.[:alpha:]]{3,28}[[:alpha:]]\z/
>>
>> >basically I want to allow the same pattern allowed for emails but before
>> >the @ char :)
>> >
>> >
>> The above regexp does not do this. Certainly you can have numbers in
>> your email address, for example. Basically anything is allowed before
>> the @. Google "regular expression email address" for extensive
>> discussions about this.
>
> And if you are being pedantic, RFC2822 doesn't allow E-mail addresses to
> contain two dots next to each other, unless the local-part is quoted.

OK, thanks all. I need a regexp that does what I described before, but
without dots, hyphens and underscores one after another, and not at the
start nor at the end.
Thanks
Martin DeMello (Guest)
on 2007-02-12 12:52
(Received via mailing list)
On 2/12/07, J. mp <joaomiguel.pereira@gmail.com> wrote:
> >
> > Note that, per your exact instructions, this allows:
> >   u_s_e_r_n_a_m_e
> >   u____________________________e
> >   z._-_.z
>
>
> Oh damn!! The first should be allowed, but the second and the third should
> not be allowed.

The regexp could be extended to handle this, but it gets ever more
convoluted and unreadable - you'd be better off doing a separate check
for a !~ /[^A-Za-z]{2,}/ (that is, "a does not match two
non-alphabetic chars in a row").

>> tests = %w( u_s_e_r_n_a_m_e
 u____________________________e
 z._-_.z
)
=> ["u_s_e_r_n_a_m_e", "u____________________________e", "z._-_.z"]
>> tests.each {|a| p [a,  a !~ /[^A-Za-z]{2,}/]}
["u_s_e_r_n_a_m_e", true]
["u____________________________e", false]
["z._-_.z", false]
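
Putting the two checks together, as a minimal sketch. The method and
constant names are made up for illustration, and the shape pattern follows
the rules stated earlier in the thread (letters plus - _ . in the middle,
a letter at each end, 5 to 30 characters). The single-pattern alternative
is an assumption, not something posted in the thread: it describes runs of
letters joined by exactly one separator and uses a lookahead for the length.

```ruby
# Shape check: letter, then 3-28 letters or - _ . characters, then letter.
SHAPE = /\A[a-z][a-z._-]{3,28}[a-z]\z/i

# The separate check from above: two or more non-letters in a row.
DOUBLE_PUNCT = /[^A-Za-z]{2,}/

# Hypothetical helper combining both checks.
def valid_two_step?(name)
  !!(name =~ SHAPE) && name !~ DOUBLE_PUNCT
end

# Alternative single pattern (an assumption): letter runs joined by
# exactly one separator, with a lookahead enforcing 5..30 characters.
SINGLE = /\A(?=.{5,30}\z)[a-z]+(?:[-_.][a-z]+)*\z/i

tests = ["u_s_e_r_n_a_m_e", "u" + "_" * 28 + "e", "z._-_.z", "user.name"]

# Each row is [input, two-step verdict, single-pattern verdict].
tests.each { |t| p [t, valid_two_step?(t), !!(t =~ SINGLE)] }
```

Both approaches should agree on every input; the two-step version is easier
to read, the single pattern is easier to drop into one validation rule.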

Also, play around with http://weitz.de/regex-coach/

martin
Gavin Kistner (phrogz)
on 2007-02-12 16:10
(Received via mailing list)
On Feb 12, 3:49 am, "J. mp" <joaomiguel.pere...@gmail.com> wrote:
> Gavin Kistner wrote:
> > /\A[a-z][a-z._-]{3,28}[a-z]\Z/i
[snip]
> > Note that, per your exact instructions, this allows:
> >   u_s_e_r_n_a_m_e
> >   u____________________________e
> >   z._-_.z
>
> Oh damn!! The first should be allowed, but the second and the third should
> not be allowed.

OK, but *why* aren't they allowed? You haven't described exactly what
your requirements are. Is it because you can't have two non-letters in
a row? Is it because the string must contain at least three letters?

BTW, where are these requirements coming from? Are these business
requirements that must be enforced? Are you just making up what you
think people should probably have to use as a name? Or are you just
trying to learn regexp?
J. mp (lerias)
on 2007-02-12 16:25
Gavin Kistner wrote:
> On Feb 12, 3:49 am, "J. mp" <joaomiguel.pere...@gmail.com> wrote:
>> Gavin Kistner wrote:
>> > /\A[a-z][a-z._-]{3,28}[a-z]\Z/i
> [snip]
>> > Note that, per your exact instructions, this allows:
>> >   u_s_e_r_n_a_m_e
>> >   u____________________________e
>> >   z._-_.z
>>
>> Oh damn!! The first should be allowed, but the second and the third should
>> not be allowed.
>
> OK, but *why* aren't they allowed? You haven't described exactly what
> your requirements are. Is it because you can't have two non-letters in
> a row? Is it because the string must contain at least three letters?
>
> BTW, where are these requirements coming from? Are these business
> requirements that must be enforced? Are you just making up what you
> think people should probably have to use as a name? Or are you just
> trying to learn regexp?

It's a business requirement. The user name will be used before the
domain, for example:
I have the domain http://somedomain.com and for each user a unique URL
will exist, like http://user.name.somedomain.com
http://david_coperfield.somedomain.com
http://andreas-blast.somedomain.com

This is my business requirement, so I can only allow user names that can
be used in a URI.

Thanks all again,
Gavin Kistner (phrogz)
on 2007-02-12 18:01
(Received via mailing list)
On Feb 12, 8:25 am, "J. mp" <joaomiguel.pere...@gmail.com> wrote:
> Gavin Kistner wrote:
> > OK, but *why* aren't they allowed? You haven't described exactly what
> > your requirements are. Is it because you can't have two non-letters in
> > a row? Is it because the string must contain at least three letters?

You didn't answer these questions.

> http://andreas-blast.somedomain.com
>
> This is my business requirement, so I can only allow user names that can
> be used in a URI.

So the question is, what is legal in that part of a URI? The best
resource I can find is RFC 3986 [1], and it says:
"The most common name registry mechanism is the Domain Name System
(DNS). A registered name intended for lookup in the DNS uses the
syntax defined in Section 3.5 of [RFC1034] and Section 2.1 of
[RFC1123]."


Section 2.1 of RFC 1123 [2] says:
"The syntax of a legal Internet host name was specified in RFC-952
[DNS:4].  One aspect of host name syntax is hereby changed: the
restriction on the first character is relaxed to allow either a letter
or a digit.  Host software MUST support this more liberal syntax.

Host software MUST handle host names of up to 63 characters and SHOULD
handle host names of up to 255 characters."


RFC 952 [3] says:
"<domainname> ::= <hname>
<hname> ::= <name>*["."<name>]
<name>  ::= <let>[*[<let-or-digit-or-hyphen>]<let-or-digit>]"


So, my reading of that (and I'm not an expert) is that a machine name
MAY have digits in it (including at the start or end), may NOT have
underscores, and may be pretty darn long. (Though it makes sense to
put some sort of bound on it - if you think 30 chars is OK, so be it.)

A regexp for this, allowing multiple dotted names joined together:

# Regexp for a single name
/[a-z\d](?:[a-z\d-]*[a-z\d])?/i

# Regexp for 1 or more of those joined by periods
/(?:[a-z\d](?:[a-z\d-]*[a-z\d])?)(?:\.[a-z\d](?:[a-z\d-]*[a-z\d])?)*/i
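
The second pattern repeats the first one twice. As a sketch, the same
thing can be built from one piece using Regexp interpolation. The constant
names LABEL and HOSTNAME are made up here, and the \A and \z anchors are an
addition so that the whole string has to match.

```ruby
# One name from the RFC 952 grammar, with the RFC 1123 relaxation
# that the first character may also be a digit.
LABEL = /[a-z\d](?:[a-z\d-]*[a-z\d])?/i

# One or more names joined by periods, anchored to the whole string.
# Interpolating a Regexp keeps its own flags, so the labels stay
# case insensitive.
HOSTNAME = /\A#{LABEL}(?:\.#{LABEL})*\z/

%w[foo-bar.jim-jam FOO.bar foo_bar -foo].each do |s|
  verdict = (s =~ HOSTNAME) ? "match" : "no match"
  puts "#{s}: #{verdict}"
end
```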


[1] http://www.gbiv.com/protocols/uri/rfc/rfc3986.html
[2] http://rfc-ref.org/RFC-TEXTS/1123/chapter2.html#sub1
[3] http://rfc.net/rfc952.html#sA.
J. mp (lerias)
on 2007-02-12 18:22
Gavin Kistner wrote:
> On Feb 12, 8:25 am, "J. mp" <joaomiguel.pere...@gmail.com> wrote:
>> Gavin Kistner wrote:
>> > OK, but *why* aren't they allowed. You haven't described exactly what

> So, my reading of that (and I'm not an expert) is that a machine name
> MAY have digits in it (including at the start or end), may NOT have
> underscores, and may be pretty darn long. (Though it makes sense to
> put some sort of bound on it - if you think 30 chars is OK, so be it.)
>
> A regexp for this, allowing multiple dotted names joined together:
>
> # Regexp for a single name
> /[a-z\d](?:[a-z\d-]*[a-z\d])?/i
>
> # Regexp for 1 or more of those joined by periods
> /(?:[a-z\d](?:[a-z\d-]*[a-z\d])?)(?:\.[a-z\d](?:[a-z\d-]*[a-z\d])?)*/i
>
>
> [1] http://www.gbiv.com/protocols/uri/rfc/rfc3986.html
> [2] http://rfc-ref.org/RFC-TEXTS/1123/chapter2.html#sub1
> [3] http://rfc.net/rfc952.html#sA.
So, Gavin, your last regex allows only valid host names in a URI? I'm
sorry for not reading the RFC before. My requirement is what I said: the
user name will act as part of a URI, so I should allow any combination
of chars that is valid for the first part of a URI.
Gavin Kistner (phrogz)
on 2007-02-12 20:05
(Received via mailing list)
On Feb 12, 10:22 am, "J. mp" <joaomiguel.pere...@gmail.com> wrote:
> sorry for not reading the RFC before. My requirement is what I said, the
> user name will act as part of an URI, so I should allow any combination
> of chars that are valid for the first part of an URI

I think so. I haven't tested it. Actually, I see one minor mistake -
to be safe, anchor this regexp to the start/end to ensure you're
matching exactly what the user entered:
/\A(?:[a-z\d](?:[a-z\d-]*[a-z\d])?)(?:\.[a-z\d](?:[a-z\d-]*[a-z\d])?)*\Z/i

To be clear, it will match:
f
9
274
3cats7
a.b
a-b
foo
foo-bar
foo.bar
foo.bar.jim
foo-bar.jim-jam
spoofy.com.edu.gov.com.com
crazy.long.name.because.the.regexp.has.no.limits.on.it.whatsoever.for.length

And it will reject:
-foo
foo-
foo.
.foo
foo_bar
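
A small script to check most of the two lists above, as a sketch under a
few assumptions. It uses \z instead of \Z because \Z also accepts a
trailing newline. The helper name is made up, and the 5 to 30 length rule
is the one stated earlier in the thread, not part of the host name RFCs.

```ruby
# The anchored pattern, with \z so that a trailing newline is rejected.
HOST_PART = /\A(?:[a-z\d](?:[a-z\d-]*[a-z\d])?)(?:\.[a-z\d](?:[a-z\d-]*[a-z\d])?)*\z/i

# Hypothetical helper: pattern check plus the 5..30 length rule
# from the requirements earlier in the thread.
def valid_subdomain?(name)
  (5..30).cover?(name.length) && !!(name =~ HOST_PART)
end

accepted = %w[f 9 274 3cats7 a.b a-b foo foo-bar foo.bar foo.bar.jim foo-bar.jim-jam]
rejected = %w[-foo foo- foo. .foo foo_bar]

p accepted.all? { |s| s =~ HOST_PART }  # pattern only, no length rule
p rejected.none? { |s| s =~ HOST_PART }
p valid_subdomain?("foo.bar")           # long enough and well formed
p valid_subdomain?("foo")               # well formed but too short
p valid_subdomain?("foo.bar\n")         # trailing newline
```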
Lloyd Zusman (Guest)
on 2007-02-24 13:32
(Received via mailing list)
I'm trying to install ruby-1.8.5-p12 on my CentOS4 system.

  % uname -rsvp
  Linux 2.6.9-022stab078.23-enterprise #1 SMP Thu Oct 19 14:54:39 MSD 2006 i686

I do the following steps with no problem:

  ./configure --prefix=/usr --enable-shared
  make

But then, when I do "make check", the test run hangs forever after a few
hundred characters (mostly dots) print out.

When I finally do a Control-C, I get the following stack trace:

  /readline/test_readline.rb:20:in `readline': Interrupt
  from ./readline/test_readline.rb:20:in `test_readline'
  from ./readline/test_readline.rb:72:in `replace_stdio'
  from /usr/local/src/ruby/ruby-1.8.5-p12/lib/open-uri.rb:32:in `open_uri_original_open'
  from /usr/local/src/ruby/ruby-1.8.5-p12/lib/open-uri.rb:32:in `open'
  from ./readline/test_readline.rb:66:in `replace_stdio'
  from /usr/local/src/ruby/ruby-1.8.5-p12/lib/open-uri.rb:32:in `open_uri_original_open'
  from /usr/local/src/ruby/ruby-1.8.5-p12/lib/open-uri.rb:32:in `open'
  from ./readline/test_readline.rb:65:in `replace_stdio'
   ... 15 levels...
  from /usr/local/src/ruby/ruby-1.8.5-p12/lib/test/unit/ui/testrunnerutilities.rb:29:in `run'
  from /usr/local/src/ruby/ruby-1.8.5-p12/lib/test/unit/autorunner.rb:200:in `run'
  from /usr/local/src/ruby/ruby-1.8.5-p12/lib/test/unit/autorunner.rb:13:in `run'
  from runner.rb:7
  make: *** [test-all] Error 1

If I'm reading this correctly, it looks like open_uri_original_open is
somehow being called recursively and repeatedly failing.

Is this something I need to be concerned about?

Thanks.
M. Edward (Ed) Borasky (Guest)
on 2007-02-24 19:01
(Received via mailing list)
Lloyd Zusman wrote:
> But then, when I do "make check", the test run hangs forever after a few
>   from /usr/local/src/ruby/ruby-1.8.5-p12/lib/open-uri.rb:32:in `open_uri_original_open'
> somehow being called recursively and repeatedly failing.
>
> Is this something I need to be concerned about?
>
> Thanks.
>
>
>
You may have to do "make install" before "make check". Did you do it
that way?

--
M. Edward (Ed) Borasky, FBG, AB, PTA, PGS, MS, MNLP, NST, ACMC(P)
http://borasky-research.blogspot.com/

If God had meant for carrots to be eaten cooked, He would have given
rabbits fire.
Lloyd Zusman (Guest)
on 2007-02-25 00:31
(Received via mailing list)
"M. Edward (Ed) Borasky" <znmeb@cesmail.net> writes:

> that way?
No, I didn't.  I have always thought that the autoconf convention is to
perform "make check" first, and to use its result to decide whether to
then do the "make install".  It seems incorrect to install the software
and then check it.  If "make check" fails miserably, it will be a big
headache to then try to uninstall everything.

But I took a chance, and I did do the "make install" after all.  Then, I
did a "make check", and it hung in exactly the same manner as before.
Luckily, ruby seems to work for all of my usual scripts, but I don't
know whether there might be something fundamentally wrong which will
bite me later.
M. Edward (Ed) Borasky (Guest)
on 2007-02-25 04:48
(Received via mailing list)
Lloyd Zusman wrote:
>>> [ ... ]
> headache to then try to uninstall everything.
>
Yeah ... that's the way it's supposed to work -- check first, then
install. But I usually make a new home in /opt for testing stuff anyhow,
rather than letting it default into /usr/local. And I have had instances
where things broke in "make check" that didn't break after "make
install" because of some path issues. I'll take this as encouragement to
hunt them down and document them. :)

Lloyd Zusman (Guest)
on 2007-02-25 14:56
(Received via mailing list)
"M. Edward (Ed) Borasky" <znmeb@cesmail.net> writes:

>> then do the "make install".  [ ... ]
>>
> Yeah ... that's the way it's supposed to work -- check first, then
> install. But I usually make a new home in /opt for testing stuff anyhow,
> rather than letting it default into /usr/local. And I have had instances
> where things broke in "make check" that didn't break after "make
> install" because of some path issues. I'll take this as encouragement to
> hunt them down and document them. :)

Yes, I see how that approach can be helpful.  I just usually do the
following: if "make check" fails, don't even try the install.  I've
never seen a case where the check succeeded and the software blew up
after installation ... although I know that this certainly could happen,
and your procedure would catch this problem.

In any case, this time I got the same error during the "make check" both
before and after the installation.