Forum: Ruby Email Address Regex

Jacob Fugal (Guest)
on 2006-01-04 00:45
(Received via mailing list)
On 1/3/06, Dan Kohn <dan@dankohn.com> wrote:
> Here's a rails example for validating email addresses.
>
>   validates_format_of :login, :with => /
>     ^[-^!$#%&'*+\/=?`{|}~.\w]+
>     @[a-zA-Z0-9]([-a-zA-Z0-9]*[a-zA-Z0-9])*
>     (\.[a-zA-Z0-9]([-a-zA-Z0-9]*[a-zA-Z0-9])*)+$/x,
>     :message => "must be a valid email address",
>     :on => :create

Be careful with email validation via regex, it's harder than you might
think[1][2]:

/^([a-zA-Z0-9&_?\/`!|#*$^%=~{}+'-]+|"([\x00-\x0C\x0E-\x21\x23-\x5B\x5D-\x7F]|\\[\x00-\x7F])*")(\.([a-zA-Z0-9&_?\/`!|#*$^%=~{}+'-]+|"([\x00-\x0C\x0E-\x21\x23-\x5B\x5D-\x7F]|\\[\x00-\x7F])*"))*@([a-zA-Z0-9&_?\/`!|#*$^%=~{}+'-]+|\[([\x00-\x0C\x0E-\x5A\x5E-\x7F]|\\[\x00-\x7F])*\])(\.([a-zA-Z0-9&_?\/`!|#*$^%=~{}+'-]+|\[([\x00-\x0C\x0E-\x5A\x5E-\x7F]|\\[\x00-\x7F])*\]))*$/

Jacob Fugal

[1] From
http://phantom.byu.edu/pipermail/uug-list/2004-Jan...
[2] That regex needs some serious /x treatment, which I didn't know
about at the time it was written.
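
A minimal sketch of what that /x treatment could look like in Ruby, assuming the character classes are copied unchanged from the regex above. The constant names (ATOM, QTEXT and so on) are made up for illustration and only loosely follow the RFC822 grammar. One deliberate swap: \A and \z replace ^ and $, because in Ruby the latter match at line boundaries rather than only at the ends of the string.

```ruby
# Building blocks, copied from the character classes of the long regex.
# All constant names here are illustrative, not taken from any library.
ATOM           = /[a-zA-Z0-9&_?\/`!|#*$^%=~{}+'-]+/
QTEXT          = /[\x00-\x0C\x0E-\x21\x23-\x5B\x5D-\x7F]/
DTEXT          = /[\x00-\x0C\x0E-\x5A\x5E-\x7F]/
QUOTED_PAIR    = /\\[\x00-\x7F]/
QUOTED_STRING  = /"(?:#{QTEXT}|#{QUOTED_PAIR})*"/
DOMAIN_LITERAL = /\[(?:#{DTEXT}|#{QUOTED_PAIR})*\]/
WORD           = /(?:#{ATOM}|#{QUOTED_STRING})/
SUB_DOMAIN     = /(?:#{ATOM}|#{DOMAIN_LITERAL})/

# With /x, whitespace in the pattern is ignored, so the overall
# local-part@domain shape stays visible.
ADDR_SPEC = /\A
  #{WORD} (?: \. #{WORD} )*              # local part: dot-separated words
  @
  #{SUB_DOMAIN} (?: \. #{SUB_DOMAIN} )*  # domain: dot-separated sub-domains
\z/x

ADDR_SPEC.match?('name+tag@domain')          # => true
ADDR_SPEC.match?('$@$')                      # => true
ADDR_SPEC.match?('John Doe <john@doe.com>')  # => false
```

Interpolating Regexp objects keeps each piece's own flags, so the character classes are not affected by the outer /x.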
Tim Fletcher (Guest)
on 2006-01-04 11:24
(Received via mailing list)
http://tfletcher.com/lib/rfc822.rb

(doesn't look quite as messy :)
unknown (Guest)
on 2006-01-04 13:15
(Received via mailing list)
Hi --

On Wed, 4 Jan 2006, Jacob Fugal wrote:

> Be careful with email validation via regex, it's harder than you might
> think[1][2]:
>
> /^([a-zA-Z0-9&_?\/`!|#*$^%=~{}+'-]+|"([\x00-\x0C\x0E-\x21\x23-\x5B\x5D
> -\x7F]|\\[\x00-\x7F])*")(\.([a-zA-Z0-9&_?\/`!|#*$^%=~{}+'-]+|"([\x00-\
> x0C\x0E-\x21\x23-\x5B\x5D-\x7F]|\\[\x00-\x7F])*"))*@([a-zA-Z0-9&_?\/`!
> |#*$^%=~{}+'-]+|\[([\x00-\x0C\x0E-\x5A\x5E-\x7F]|\\[\x00-\x7F])*\])(\.
> ([a-zA-Z0-9&_?\/`!|#*$^%=~{}+'-]+|\[([\x00-\x0C\x0E-\x5A\x5E-\x7F]|\\[
> \x00-\x7F])*\]))*$/

See also: http://www.ex-parrot.com/~pdw/Mail-RFC822-Address.html


David

--
David A. Black
dblack@wobblini.net

"Ruby for Rails", from Manning Publications, coming April 2006!
http://www.manning.com/books/black
Andreas S. (andreas)
on 2006-01-04 13:47
Jacob Fugal wrote:
> On 1/3/06, Dan Kohn <dan@dankohn.com> wrote:
>> Here's a rails example for validating email addresses.
>>
>>   validates_format_of :login, :with => /
>>     ^[-^!$#%&'*+\/=?`{|}~.\w]+
>>     @[a-zA-Z0-9]([-a-zA-Z0-9]*[a-zA-Z0-9])*
>>     (\.[a-zA-Z0-9]([-a-zA-Z0-9]*[a-zA-Z0-9])*)+$/x,
>>     :message => "must be a valid email address",
>>     :on => :create
>
> Be careful with email validation via regex, it's harder than you might
> think[1][2]:
>
> /^([a-zA-Z0-9&_?\/`!|#*$^%=~{}+'-]+|"([\x00-\x0C\x0E-\x21\x23-\x5B\x5D
> -\x7F]|\\[\x00-\x7F])*")(\.([a-zA-Z0-9&_?\/`!|#*$^%=~{}+'-]+|"([\x00-\
> x0C\x0E-\x21\x23-\x5B\x5D-\x7F]|\\[\x00-\x7F])*"))*@([a-zA-Z0-9&_?\/`!
> |#*$^%=~{}+'-]+|\[([\x00-\x0C\x0E-\x5A\x5E-\x7F]|\\[\x00-\x7F])*\])(\.
> ([a-zA-Z0-9&_?\/`!|#*$^%=~{}+'-]+|\[([\x00-\x0C\x0E-\x5A\x5E-\x7F]|\\[
> \x00-\x7F])*\]))*$/

It is trivial to create a formally correct address that makes absolutely
no sense, so what's the point of doing such a complicated and
error-prone validation?
Matthew Smillie (Guest)
on 2006-01-04 14:16
(Received via mailing list)
On Jan 4, 2006, at 12:47, Andreas S. wrote:

>>
>> |#*$^%=~{}+'-]+|\[([\x00-\x0C\x0E-\x5A\x5E-\x7F]|\\[\x00-\x7F])*\])
>> (\.
>> ([a-zA-Z0-9&_?\/`!|#*$^%=~{}+'-]+|\[([\x00-\x0C\x0E-\x5A\x5E-\x7F]|
>> \\[
>> \x00-\x7F])*\]))*$/
>
> It is trivial to create a formally correct address that makes
> absolutely
> no sense, so what's the point of doing such a complicated and
> error-prone validation?

Job security?  I mean, without pointer arithmetic and its associated
mysteries (negative array indices were a personal favourite), we need
something to keep us gainfully employed!

matthew smillie.
Tim Fletcher (Guest)
on 2006-01-04 18:12
(Received via mailing list)
By "error prone" do you mean that it won't detect addresses that don't
exist?

Is it not still better to catch some errors than none at all?
Jacob Fugal (Guest)
on 2006-01-04 18:13
(Received via mailing list)
On 1/4/06, Tim Fletcher <twoggle@gmail.com> wrote:
> http://tfletcher.com/lib/rfc822.rb
>
> (doesn't look quite as messy :)

Yeah, as I said in the footnote, the regex I posted needed some
readability treatment. Yours looks pretty nice, and exactly equivalent
except for a typo in quoted_pair:

 - quoted_pair = '\\x5c\\x00-\\x7f'
 + quoted_pair = '\\x5c[\\x00-\\x7f]'
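
To see why the brackets matter, here is a small check. It assumes the two strings are compiled on their own with Regexp.new; how rfc822.rb actually splices them into a larger pattern is not reproduced here, and the constant names are invented. Without the brackets, \x00-\x7f is not a range but three literal characters.

```ruby
# The two variants from the diff above, compiled as stand-alone patterns.
BROKEN_QUOTED_PAIR = Regexp.new('\\x5c\\x00-\\x7f')
FIXED_QUOTED_PAIR  = Regexp.new('\\x5c[\\x00-\\x7f]')

ESCAPED_QUOTE = '\\"'  # two characters: a backslash followed by a double quote

FIXED_QUOTED_PAIR.match?(ESCAPED_QUOTE)   # => true: backslash plus any ASCII char
BROKEN_QUOTED_PAIR.match?(ESCAPED_QUOTE)  # => false

# The broken form only matches the literal sequence backslash, NUL, "-", DEL.
BROKEN_QUOTED_PAIR.match?("\\\x00-\x7f")  # => true
```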

Jacob Fugal
Andreas S. (andreas)
on 2006-01-04 18:17
Tim Fletcher wrote:
> By "error prone" do you mean that it won't detect addresses that don't
> exist?

No, I mean that it might declare some addresses invalid although they
aren't.
Jacob Fugal (Guest)
on 2006-01-04 18:17
(Received via mailing list)
On 1/4/06, dblack@wobblini.net <dblack@wobblini.net> wrote:
> See also: http://www.ex-parrot.com/~pdw/Mail-RFC822-Address.html

Yeah, I've seen that one as well. My regex is only meant to match the
definition of an 'addr-spec' token (described as "global" or "simple"
address) in section 6.1 of the RFC822 grammar, as opposed to a
'mailbox' or 'address'. I figure people aren't going to type the "John
Doe <john@doe.com>" format into a form, nor named lists ('group' token
in the grammar).

Jacob Fugal
Jacob Fugal (Guest)
on 2006-01-04 18:26
(Received via mailing list)
On 1/4/06, Andreas S. <f@andreas-s.net> wrote:
> Tim Fletcher wrote:
> > By "error prone" do you mean that it won't detect addresses that don't
> > exist?
>
> No, I mean that it might declare some addresses invalid although they
> aren't.

You'll see from my comments in the original post[1] and in my reply to
David Black in the other thread[2] that this regex is indeed compliant
with a single, non-named address as defined by the RFC[3].

Jacob Fugal

[1] http://phantom.byu.edu/pipermail/uug-list/2004-Jan...
[2] [ruby-talk:174081]
[3] http://www.faqs.org/rfcs/rfc822.html
Andreas S. (andreas)
on 2006-01-04 18:56
Jacob Fugal wrote:
> On 1/4/06, Andreas S. <f@andreas-s.net> wrote:
>> Tim Fletcher wrote:
>> > By "error prone" do you mean that it won't detect addresses that don't
>> > exist?
>>
>> No, I mean that it might declare some addresses invalid although they
>> aren't.
>
> You'll see from my comments in the original post[1] and in my reply to
> David Black in the other thread[2] that this regex is indeed compliant
> with a single, non-named address as defined by the RFC[3].

Possibly. Still, I prefer a simple solution over a complicated one. What
type of errors do you hope to catch with this huge regex? Typing errors?
Deliberately entered rubbish? The regex accepts just about anything with
a "@", e.g. "$@$".
Jacob Fugal (Guest)
on 2006-01-04 19:21
(Received via mailing list)
On 1/4/06, Andreas S. <f@andreas-s.net> wrote:
> > David Black in the other thread[2] that this regex is indeed compliant
> > with a single, non-named address as defined by the RFC[3].
>
> Possibly. Still, I prefer a simple solution over a complicated one. What
> type of errors do you hope to catch with this huge regex? Typing errors?
> Deliberately entered rubbish? The regex accepts just about anything with
> a "@", e.g. "$@$".

Not possibly. Guaranteed. It's compliant with the portions of the RFC I
mentioned.

Still, I'll concede it doesn't prevent rubbish from being entered. The
domain of valid email addresses is much larger than the domain of
*actual* email addresses. I'm not claiming that this regex should even
be used for form validation. I dislike email validation, period. My
intent in first writing the regex two years ago and bringing it up
again now is mostly:

1) To show off my regex-fu
2) To demonstrate the inadequacy of simplistic regexes for email
validation.

For instance, I'll often use the "name+tag@domain" construct to filter
mail and/or determine who's selling my address. When I find a form
that claims that email address is invalid, I get upset. As such, I've
taken it as my own personal crusade to punch down inadequate email
validations whenever I see them. My method is to demonstrate a regex
that does allow valid addresses. My first hope is that they'll notice
the futility and just remove the email address validation altogether.
If that fails, I hope they'll actually use the compliant regex.
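
To make the complaint concrete, here is a small sketch with a hypothetical "simplistic" pattern of the sort often seen on signup forms. NAIVE_EMAIL is invented for illustration and does not come from this thread. It accepts the plain address but turns away the tagged one, even though both are legal.

```ruby
# Hypothetical simplistic check: word characters, "@", word characters,
# a dot, word characters. Not taken from any real form.
NAIVE_EMAIL = /\A\w+@\w+\.\w+\z/

NAIVE_EMAIL.match?('name@example.com')      # => true
NAIVE_EMAIL.match?('name+tag@example.com')  # => false, "+" is not a word character
```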

The only reason I defended the regex was because you claimed it was
invalid. If your original argument had been that the regex was
unnecessary, I'd probably have agreed with you. Validating email
addresses by form is pointless. If someone doesn't want to give you
their address, they won't. Requiring them to input a valid fake
address instead of an invalid fake address doesn't improve your data
at all. The only reason I can see that being necessary is to prevent
malformed addresses from breaking your application in some way. But if
that's a problem, fix the application, not the email address.

Jacob Fugal
Bill Kelly (Guest)
on 2006-01-04 19:30
(Received via mailing list)
From: "Jacob Fugal" <lukfugl@gmail.com>
>
> I figure people aren't going to type the "John
> Doe <john@doe.com>" format into a form

In my experience, a certain percentage do.  I'm guessing
it might be because they copied their email address out
of something like Outlook Express, and pasted it into the
form.  (OE will display "John Doe" as a hyperlink, which if
selected and copied turns into "John Doe <john@doe.com>"
in the clipboard.)

Personally, right or wrong, to catch that I just reject
email addresses with a "<" or ">" in them.  I'll admit I
don't really care if some spec says it's possible to legally
form email addresses with those characters.  That may make
me a bad person.  :)  But whoever wrote that spec should be
infested with the fleas of 1000 camels.


Regards,

Bill
Hal Fulton (Guest)
on 2006-01-04 20:12
(Received via mailing list)
Bill Kelly wrote:
>  But whoever wrote that spec should be
> infested with the fleas of 1000 camels.

He probably already is.


Hal
unknown (Guest)
on 2006-01-04 20:15
(Received via mailing list)
Quoting "Andreas S." <f@andreas-s.net>:

> It is trivial to create a formally correct address that makes
> absolutely no sense, so what's the point of doing such a
> complicated and error-prone validation?

Well, I might actually have one.

The comment form on my web site sends email directly to me; as a
convenience, the email address entered on the form becomes the
email's From address (I can see who it's from and reply more
easily).

Now, doing that would open up all sorts of injection attacks if I
didn't do any validation.  So I do a quick and paranoid (syntactic)
validity check -- if the address fails, then it is included in the
body of the message instead of a header field.

In this case, a nonsensical address is perfectly fine (I will see it
and know better), and it's even okay if a valid address is rejected
(I'll still get the message and be able to figure things out from
the body), but I have to be able to detect syntactically invalid
addresses.
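
A rough sketch of that fallback logic, assuming a deliberately conservative pattern and a fixed fallback sender. SAFE_ADDRESS, build_comment_mail and the webform@example.invalid address are all hypothetical, not the code actually running on the site. The point is only that a failed check moves the address into the body instead of a header.

```ruby
# Paranoid by design: a small ASCII whitelist, anchored with \A and \z so
# that no CR or LF (and therefore no extra header line) can slip through.
SAFE_ADDRESS = /\A[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\z/

# Returns the pieces of the outgoing mail as a plain Hash.
def build_comment_mail(address, comment, fallback_from = 'webform@example.invalid')
  if SAFE_ADDRESS.match?(address)
    { from: address, body: comment }
  else
    # Untrusted text goes into the body, escaped via #inspect.
    { from: fallback_from, body: "Claimed address: #{address.inspect}\n\n#{comment}" }
  end
end

mail = build_comment_mail("a@b.example\r\nBcc: victim@example.com", 'Hi!')
mail[:from]  # => "webform@example.invalid"
```

Rejecting some valid addresses is acceptable here, since nothing is lost: the message still arrives with the claimed address in the body.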

-mental
Yohanes Santoso (Guest)
on 2006-01-04 20:30
(Received via mailing list)
Hal Fulton <hal9000@hypermetrics.com> writes:

> Bill Kelly wrote:
>>  But whoever wrote that spec should be
>> infested with the fleas of 1000 camels.
>
> He probably already is.
>
>
> Hal

In any case, much of the syntax in RFC822 has been obsoleted by
RFC2822.

The complexity of RFC822 (1982) came from the need to interoperate
with wildly different systems. Consider that the RFC's title was
"STANDARD FOR THE FORMAT OF ARPA INTERNET TEXT MESSAGES", as if there
were other Internets, so that it had to specify which one.

Consider the title of RFC2822 (2001): "Internet Message Format". By
then it was already clear that the ARPA Internet was the winner, so
the RFC could afford to simplify the address syntax.

YS.
Andreas S. (andreas)
on 2006-01-04 22:00
Jacob Fugal wrote:
> The only reason I defended the regex was because you claimed it was
> invalid.

I don't remember that. I dislike complex solutions like this Regex
because they are error prone (as proved by your correction for Tim's
rfc822.rb), I didn't claim yours was invalid.

> If your original argument had been that the regex was
> unnecessary, I'd probably have agreed with you. Validating email
> addresses by form is pointless. If someone doesn't want to give you
> their address, they won't. Requiring them to input a valid fake
> address instead of an invalid fake address doesn't improve your data
> at all. The only reason I can see that being necessary is to prevent
> malformed addresses from breaking your application in some way. But if
> that's a problem, fix the application, not the email address.

I totally agree with you.
Chad Perrin (Guest)
on 2006-01-04 22:31
(Received via mailing list)
On Thu, Jan 05, 2006 at 03:27:48AM +0900, Bill Kelly wrote:
> in the clipboard.)
It's possible that someone might copy and paste something in that format
from a number of other, non-Windows email clients, too -- though it's
more difficult to do so by accident, since generally clients like mutt
won't drop more stuff into your copy/paste buffer than you actually
highlighted.

--
Chad Perrin [ CCD CopyWrite | http://ccd.apotheon.org ]

unix virus: If you're using a unixlike OS, please forward
this to 20 others and erase your system partition.
Jeffrey Moss (Guest)
on 2006-01-04 23:29
(Received via mailing list)
Here's my useful form validation:

/^\s*([-a-z0-9&\'*+.\/=?^_{}~]+@([a-z0-9]([-a-z0-9]{0,61}[a-z0-9])?\.)+[a-z]{2,5}\s*(,\s*|\z))+$/i

It may not catch EVERYTHING, but should work just fine for most
people. It will allow multiple email addresses separated by commas.

I figure if you want to go beyond that, a verification system would be
the next logical step.
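
A quick way to try the pattern above, with the regex copied verbatim into a constant (the name ADDRESS_LIST is invented here). The sample inputs are assumptions chosen to show both sides: a comma-separated list passes, while a top-level domain longer than five letters does not, because of the {2,5} quantifier.

```ruby
ADDRESS_LIST = /^\s*([-a-z0-9&\'*+.\/=?^_{}~]+@([a-z0-9]([-a-z0-9]{0,61}[a-z0-9])?\.)+[a-z]{2,5}\s*(,\s*|\z))+$/i

ADDRESS_LIST.match?('a@example.com, b@example.org')  # => true
ADDRESS_LIST.match?('name+tag@example.com')          # => true
ADDRESS_LIST.match?('curator@example.museum')        # => false
```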

-Jeff
unknown (Guest)
on 2006-01-04 23:32
(Received via mailing list)
On Jan 4, 2006, at 1:27 PM, Bill Kelly wrote:
> Personally, right or wrong, to catch that I just reject
> email addresses with a "<" or ">" in them.  I'll admit I
> don't really care if some spec says it's possible to legally
> form email addresses with those characters.  That may make
> me a bad person.  :)

It doesn't make you a bad person but it certainly makes
your application less interoperable than it might be.
For example, the vast majority of email addresses on this
mailing list are of the form:

	Bill Kelly <billk@cts.com>

In some GUI environments it is harder to select the portion
between the <>'s than to select the entire address.

From RFC 1123:

   At every layer of the protocols, there is a general rule whose
   application can lead to enormous benefits in robustness and
   interoperability:

     "Be liberal in what you accept, and conservative
      in what you send"
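
In that spirit, a form handler could accept the pasted "Name <address>" shape and keep only the part between the angle brackets. A naive sketch, assuming the brackets come last in the input. extract_addr_spec is a hypothetical helper, and it makes no attempt to handle quoted display names that themselves contain angle brackets.

```ruby
# Returns the text between a trailing <...> pair, or the stripped input
# unchanged when there is no such pair.
def extract_addr_spec(input)
  text = input.strip
  bracketed = text.match(/<([^<>]*)>\z/)
  bracketed ? bracketed[1].strip : text
end

extract_addr_spec('Bill Kelly <billk@cts.com>')  # => "billk@cts.com"
extract_addr_spec('  billk@cts.com ')            # => "billk@cts.com"
```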


Gary Wright
Bill Kelly (Guest)
on 2006-01-05 00:51
(Received via mailing list)
From: <gwtmp01@mac.com>
> For example, the vast majority of email addresses on this
> mailing list are of the form:
>
> Bill Kelly <billk@cts.com>
>
> In some GUI environments it is harder to select the portion
> between the <>'s than to select the entire address.

I'd agree that ideally my web form should be smart enough to
handle that.  I started out with no validation at all, and
only added the <> rejection after observing the occasional
submit with that syntax causing a bounced email.

As I recall, I asked my then-employer what degree of
thoroughness he wanted me to invest in coding the email
validation logic. The answer was something like, "Just give
them the chance to enter it again, properly, in the manner
requested.  If they can't follow simple directions then I
don't think we want them using our software."  Woot!  ;)

> From RFC 1123:
>
>   At every layer of the protocols, there is a general rule whose
>   application can lead to enormous benefits in robustness and
>   interoperability:
>
>     "Be liberal in what you accept, and conservative
>      in what you send"

<font size="+1"><h2>agreed in general</font>, <p>but I don't
think everyone is in agreement that browsers' willingness to
<html><u>render this muck<body> </u>have resulted in a net</html>
benefit for mankind - although, i have</p> heard it argued
</body>both ways.<head>

:)


Regards,

Bill
Chad Perrin (Guest)
on 2006-01-05 03:51
(Received via mailing list)
On Thu, Jan 05, 2006 at 08:48:48AM +0900, Bill Kelly wrote:
> <font size="+1"><h2>agreed in general</font>, <p>but I don't
> think everyone is in agreement that browsers' willingness to
> <html><u>render this muck<body> </u>have resulted in a net</html>
> benefit for mankind - although, i have</p> heard it argued
> </body>both ways.<head>

You could always just specifically disallow non-standards-compliant
(X)HTML, though depending on how you handle that it might end up
rejecting a lot of stuff meant for IE and OE that could be of use to you
(depending on what you find useful).

--
Chad Perrin [ CCD CopyWrite | http://ccd.apotheon.org ]

This sig for rent:  a Signify v1.14 production from
http://www.debian.org/
Jacob Fugal (Guest)
on 2006-01-05 19:57
(Received via mailing list)
On 1/4/06, Andreas S. <f@andreas-s.net> wrote:
> Jacob Fugal wrote:
> > The only reason I defended the regex was because you claimed it was
> > invalid.
>
> I don't remember that. I dislike complex solutions like this Regex
> because they are error prone (as proved by your correction for Tim's
> rfc822.rb), I didn't claim yours was invalid.

Ok, checking back on the flow here, this is what I saw:

  [Jacob] Be careful with email validation via regex, it's harder than
  you might think: <example regex>

  [Andreas] It is trivial to create a formally correct address that
  makes absolutely no sense, so what's the point of doing such a
  complicated and *error-prone validation*?

  [Tim] By "error prone" do you mean that it won't detect addresses
  that don't exist?

  [Andreas] No, I mean that *it might declare some addresses invalid*
  although they aren't.

In my mind, due to the use of pronouns, I believed the "error-prone
validation [that] might declare some addresses invalid" referred to my
example regex. Apparently they referred instead to inadequate regex
validation in general. Sorry for the confusion.

Jacob Fugal
Shot - Piotr Szotkowski (Guest)
on 2006-01-06 13:02
(Received via mailing list)
Hello.

Jacob Fugal:

> Be careful with email validation via regex, it's harder than you might
> think[1][2]:
>
> /^([a-zA-Z0-9&_?\/`!|#*$^%=~{}+'-]+|"([\x00-\x0C\x0E-\x21\x23-\x5B\x5D
> -\x7F]|\\[\x00-\x7F])*")(\.([a-zA-Z0-9&_?\/`!|#*$^%=~{}+'-]+|"([\x00-\
> x0C\x0E-\x21\x23-\x5B\x5D-\x7F]|\\[\x00-\x7F])*"))*@([a-zA-Z0-9&_?\/`!
> |#*$^%=~{}+'-]+|\[([\x00-\x0C\x0E-\x5A\x5E-\x7F]|\\[\x00-\x7F])*\])(\.
> ([a-zA-Z0-9&_?\/`!|#*$^%=~{}+'-]+|\[([\x00-\x0C\x0E-\x5A\x5E-\x7F]|\\[
> \x00-\x7F])*\]))*$/

It does match
" spaces! @s! \"escaped quotes!\" "@shot.pl
and it's the first one doing this that I know of, kudos!

Unfortunately, it does not match 'international' domains, so
it wouldn't pass addresses in the domain of, say, gżegżółka.pl

Cheers,
-- Shot
Jacob Fugal (Guest)
on 2006-01-06 17:57
(Received via mailing list)
On 1/6/06, Shot - Piotr Szotkowski <shot@shot.pl> wrote:
> > |#*$^%=~{}+'-]+|\[([\x00-\x0C\x0E-\x5A\x5E-\x7F]|\\[\x00-\x7F])*\])(\.
> > ([a-zA-Z0-9&_?\/`!|#*$^%=~{}+'-]+|\[([\x00-\x0C\x0E-\x5A\x5E-\x7F]|\\[
> > \x00-\x7F])*\]))*$/
>
> It does match
> " spaces! @s! \"escaped quotes!\" "@shot.pl
> and it's the first one doing this that I know of, kudos!

Not the first, I've been preceded by others that are even more correct
(and complex) :). Particularly:

  http://www.ex-parrot.com/~pdw/Mail-RFC822-Address.html

> Unfortunately, it does not match 'international' domains, so
> it wouldn't pass addresses in the domain of, say, gżegżółka.pl

Good point. When I wrote this expression, I was only considering ASCII
characters in the 0x00-0x7F range (0-127 decimal, which doesn't include
extended characters). Looking back at RFC822, it looks like that RFC
is likewise limited. It has no support for extended ASCII or UNICODE.
This is reasonable, based on the age of the RFC (1982).

As I understand from Yohanes' post in this thread, RFC2822 (2001)
supersedes RFC822, so I assume RFC2822 probably takes extended ASCII
-- and hopefully UNICODE, as well -- into account. Time to update the
regex! I'll leave it to someone else, however. ;)
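
For what it's worth, the ASCII limitation is easy to see in isolation. A tiny sketch, assuming the same \x00-\x7F range used throughout the regex. The constant names are invented, and the domain is the Polish example from Shot's post, spelled with \u escapes so the snippet itself stays plain ASCII.

```ruby
# Matches only strings made up entirely of 7-bit ASCII characters.
ASCII_ONLY = /\A[\x00-\x7F]*\z/

# The international domain from the post, written with \u escapes.
IDN_DOMAIN = "g\u017Ceg\u017C\u00F3\u0142ka.pl"

ASCII_ONLY.match?('shot.pl')   # => true
ASCII_ONLY.match?(IDN_DOMAIN)  # => false, the non-ASCII letters fall outside the class
IDN_DOMAIN.ascii_only?         # => false, the built-in equivalent
```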

Jacob Fugal
Josef 'Jupp' SCHUGT (Guest)
on 2006-01-06 23:19
(Received via mailing list)
Hi!

At Wed, 4 Jan 2006 21:47:34 +0900, Andreas S. wrote:

> It is trivial to create a formally correct address that makes
> absolutely no sense, so what's the point of doing such a complicated
> and error-prone validation?

To give one example: On German keyboards "@" is entered using
"AltGr-q". If one releases "AltGr" before pushing "q" (which may well
happen if you type the quick-and-dirty way) "nobody@example.com"
becomes "nobodyqexample.com".
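
A check that catches exactly this slip does not need a grammar at all. A minimal sketch, assuming all we insist on is a non-empty local part, an "@", and a non-empty domain. local_and_domain? is a hypothetical helper name.

```ruby
def local_and_domain?(address)
  # Split on the last "@": quoted local parts may legally contain earlier ones.
  local, at, domain = address.rpartition('@')
  !at.empty? && !local.empty? && !domain.empty?
end

local_and_domain?('nobody@example.com')  # => true
local_and_domain?('nobodyqexample.com')  # => false, the AltGr slip
```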

Also one should keep in mind the three commandments of distrust:

1. He who inputs is guilty.

2. He who inputs remains guilty unless he proves that he is *not*
   guilty.

3. If the proof under rule 2 leaves any doubt (no matter how tiny it
   may be) the first rule applies.

In short: Input is evil unless you know for sure that it is not.

Josef 'Jupp' Schugt
Tim Fletcher (Guest)
on 2006-01-09 09:40
(Received via mailing list)
The full RFC2822 regex is too big, but RMail has a parser for it.
Gavin Kistner (Guest)
on 2006-01-10 00:38
(Received via mailing list)
On Jan 4, 2006, at 12:12 PM, mental@rydia.net wrote:
> Quoting "Andreas S." <f@andreas-s.net>:
>
>> It is trivial to create a formally correct address that makes
>> absolutely no sense, so what's the point of doing such a
>> complicated and error-prone validation?

For example, a friend of mine has the email address:

?@hisdomain.net

(The domain above was changed to protect his privacy. But the single
question mark as the 'username' is all that he has :)
Jacob Fugal (Guest)
on 2006-01-10 01:02
(Received via mailing list)
On 1/9/06, Gavin Kistner <gavin@refinery.com> wrote:
>
> (The domain above was changed to protect his privacy. But the single
> question mark as the 'username' is all that he has :)

And my regex matches that address. :)

Jacob Fugal