Simple Regexp help

joe-black · January 7, 2009, 12:50am

How can I test a word to make sure it ONLY contains certain characters?

Say I have an expression like /[A-Z]/i

How could I have “Testing” pass but “Testing123” fail?

PS. I cannot dynamically create the expression so it looks like this
/[A-Z{10}]/

joe-black · January 7, 2009, 2:01am

Joe B. wrote:

How can I test a word to make sure it ONLY contains certain
characters?

Say I have an expression like /[A-Z]/i

How could I have “Testing” pass but “Testing123” fail?

PS. I cannot dynamically create the expression so it looks like this
/[A-Z{10}]/

If you only want a-z, anchor the character class to the start and end of
the string; otherwise it’ll match anything with a character that’s in the
alphabet, regardless of what follows it. I.e., /^[a-z]+$/i will only
match a-z characters from start to end. You can then change the
character class to whatever you wish. ^ is the start of the string and
$ is the end of the string. + is one or more characters, so ^[a-z]+$ is
one or more characters in the character class [], being a-z, and nothing
else.
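
A minimal sketch of the difference between the unanchored and the anchored pattern. The constant names and the helper `matches?` are made up for illustration and are not from the thread:

```ruby
# Unanchored: succeeds as soon as ANY single character is a letter.
UNANCHORED = /[a-z]/i
# Anchored with ^ and $ as described above, with + for "one or more".
ANCHORED = /^[a-z]+$/i

# Hypothetical helper: true when the regexp finds a match in the word.
def matches?(regexp, word)
  !(regexp =~ word).nil?
end

puts matches?(UNANCHORED, "Testing123")  # true
puts matches?(ANCHORED, "Testing123")    # false
puts matches?(ANCHORED, "Testing")       # true
```

The unanchored pattern accepts “Testing123” because the leading “T” alone is enough for a match.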

joe-black · January 7, 2009, 2:17am

Tim G. wrote:

Joe B. wrote:

How can I test a word to make sure it ONLY contains certain
characters?

Say I have an expression like /[A-Z]/i

How could I have “Testing” pass but “Testing123” fail?

PS. I cannot dynamically create the expression so it looks like this
/[A-Z{10}]/

If you only want a-z, set the character class to start and end the
string, otherwise it’ll match anything with a character that’s in the
alphabet, regardless of what follows it. (…)

You can also test for the opposite, anything not in the range a-z.

class String
  def all_letters?
    (self =~ /[^a-z]/i).nil?
  end
end

puts "Testing".all_letters?

=> true

puts "Tes34ting".all_letters?

=> false

hth,

Siep
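
If reopening String is not wanted, the same negated-class test can be sketched as a standalone method. The method name and its placement outside String are assumptions for illustration:

```ruby
# Hypothetical standalone version of the negated-class check;
# String itself is left unmodified.
def all_letters?(word)
  # !~ is true when the pattern finds no match,
  # i.e. when no character outside a-z is present.
  word !~ /[^a-z]/i
end

puts all_letters?("Testing")    # true
puts all_letters?("Tes34ting")  # false
puts all_letters?("")           # true
```

Note that the empty string passes this check, since it contains no character outside a-z.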

joe-black · January 7, 2009, 6:00pm

Robert, thanks. \A and \z are what I was looking for.

Can you explain what they do, though? I cannot seem to find any
relevant information on them.

joe-black · January 7, 2009, 6:30pm

On Wed, Jan 7, 2009 at 5:59 PM, Joe B. wrote:

Robert, thanks. \A and \z are what I was looking for.

Can you explain what they do, though? I cannot seem to find any
relevant information on them.

\A matches the beginning of the string and \z matches the end of the
string.
If you don’t “anchor” the regexp with those, the pattern can match
anywhere within the string, including just a substring of it. So for example:

irb(main):001:0> a = "abcdef"
=> "abcdef"
irb(main):005:0> a =~ /bc/
=> 1

This means that the regexp /bc/ has matched the string at position
one. Notice now if you anchor it that it doesn’t match:

irb(main):006:0> a =~ /\Abc\z/
=> nil
irb(main):007:0> a =~ /\Aabcdef\z/
=> 0

www.rubular.com has a regular expression editor, where you can test,
and also a quick reference guide and a link to an online copy of the
pickaxe, although to be honest, not much is said about \A and \z
specifically.

You can also search here for more info about regexps:

Hope this helps,

Jesus.

joe-black · January 7, 2009, 7:15pm

Robert K. wrote:

irb(main):001:0> s = "foo\n"
=> "foo\n"
irb(main):002:0> /^[a-z]+$/i =~ s
=> 0

Since I didn’t know exactly what the OP wanted, that was only a quick
example, so what counts as “safe” is relative. Either way, \A and \z are
indeed probably the better example, whatever the OP actually wants.

joe-black · January 7, 2009, 8:45am

On 07.01.2009 01:17, Tim G. wrote:

/[A-Z{10}]/

If you only want a-z, set the character class to start and end the
string, otherwise it’ll match anything with a character that’s in the
alphabet, regardless of what follows it. I.e., /^[a-z]+$/i will only
match a-z characters from start to end.

Note, this is not totally safe:

irb(main):001:0> s = "foo\n"
=> "foo\n"
irb(main):002:0> /^[a-z]+$/i =~ s
=> 0

As you can see, even a string with a newline at the end passes. This
version is safer because its anchors really do match only at the start
and end of the string:

irb(main):003:0> /\A[a-z]+\z/i =~ s
=> nil

You can then change the
character class to whatever you wish. ^ is the start of the string and
$ is the end of the string.

Actually ^ is a line start and $ is a line end (see above). To anchor
properly at the start and end of the string, you need \A and \z.
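
A small sketch of why this matters for multi-line input. The constant names are chosen here for illustration only:

```ruby
# A two-line string: the first line is all letters, the second is not.
TWO_LINES = "foo\n123"

LINE_ANCHORED   = /^[a-z]+$/i    # ^ and $ apply to each line
STRING_ANCHORED = /\A[a-z]+\z/i  # \A and \z apply to the whole string

p(LINE_ANCHORED =~ TWO_LINES)    # 0, because the line "foo" matches on its own
p(STRING_ANCHORED =~ TWO_LINES)  # nil, because "\n123" follows the letters
p(STRING_ANCHORED =~ "foo")      # 0
```

With ^ and $, one clean line is enough for the whole string to pass, which is rarely what a validation check intends.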

+ is one or more characters, so ^[a-z]+$ is
one or more characters in the character class [], being a-z, and nothing
else.

Note also that, depending on how you define a legal string, the expression
should probably use * instead of +, because the empty string does not
contain any illegal characters either.
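
To illustrate the point about the empty string, here is a short sketch comparing the two quantifiers. The constant names are made up for this example:

```ruby
AT_LEAST_ONE = /\A[a-z]+\z/i  # + requires at least one letter
ZERO_OR_MORE = /\A[a-z]*\z/i  # * also allows no letters at all

p(AT_LEAST_ONE =~ "")         # nil
p(ZERO_OR_MORE =~ "")         # 0
p(ZERO_OR_MORE =~ "Testing")  # 0
p(ZERO_OR_MORE =~ "Test1")    # nil
```

Which one is right depends on whether an empty word should count as valid input.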

Kind regards

robert