Email address parse


#1

Hi all,
How can I parse an email string like this into an array?
#--------------------------------------------
"joe black"<joe_1@joe_black.com> joe_2@joe_black.com,
joe_3@joe_black.com
#--------------------------------------------
Email addresses from user input seem to come in various formats, and I
have to split them into an array.
Any ideas?

regards.


#2

On Tuesday, 14.03.2006 at 04:05 +0100, Joe B. wrote:

Hi all,
How can I parse an email string like this into an array?
#--------------------------------------------
"joe black"<joe_1@joe_black.com> joe_2@joe_black.com,
joe_3@joe_black.com
#--------------------------------------------
Email addresses from user input seem to come in various formats, and I
have to split them into an array.
Any ideas?

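string = '"joe black"<joe_1@joe_black.com> joe_2@joe_black.com,
joe_3@joe_black.com'
# The dot before the TLD is escaped, and there is no capture group,
# so scan returns a flat array of strings.
email_addresses = string.scan(/[a-z0-9_.-]+@[a-z0-9_-]+\.[a-z]+/i)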


Norman T.

http://blog.inlet-media.de


#3

On Tue, Mar 14, 2006 at 10:53:18AM +0100, Norman T. wrote:

string = '"joe black"<joe_1@joe_black.com> joe_2@joe_black.com,
joe_3@joe_black.com'
email_addresses = string.scan(/[a-z0-9_.-]+@[a-z0-9_-]+\.[a-z]+/i)

That will catch many addresses, but it will exclude other perfectly
valid ones.

RFC 2822 (http://www.faqs.org/rfcs/rfc2822.html) is the place to look. The
local part (left-hand side) of an address can include:

atext = ALPHA / DIGIT / ; Any character except controls,
        "!" / "#" /     ;  SP, and specials.
        "$" / "%" /     ;  Used for atoms
        "&" / "'" /
        "*" / "+" /
        "-" / "/" /
        "=" / "?" /
        "^" / "_" /
        "`" / "{" /
        "|" / "}" /
        "~"

as well as the "." character.
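
As a rough illustration of the atext list above, here is a minimal Ruby sketch that builds a scanning pattern from it. The constant names and `extract_addresses` are hypothetical, the domain part is an assumption kept as loose as the regexp quoted earlier (it even allows underscores), and quoted local parts, comments and the rest of the RFC 2822 grammar are not handled.

```ruby
# The non-alphanumeric characters RFC 2822 lists for atext.
ATEXT_SPECIALS = %q(!#$%&'*+-/=?^_`{|}~)

# One atext character, as a character class (specials escaped for safety).
ATEXT = "[A-Za-z0-9#{Regexp.escape(ATEXT_SPECIALS)}]"

# Local part as a dot-atom: runs of atext separated by single dots.
DOT_ATOM = "#{ATEXT}+(?:\\.#{ATEXT}+)*"

# Assumption: a loose domain pattern, not the RFC grammar.
DOMAIN = "[A-Za-z0-9_-]+(?:\\.[A-Za-z0-9_-]+)+"

ADDRESS = Regexp.new("#{DOT_ATOM}@#{DOMAIN}")

# Returns every substring of text that looks like an address.
def extract_addresses(text)
  text.scan(ADDRESS)
end

input = %q("joe black"<joe_1@joe_black.com> joe_2@joe_black.com, joe+tag@example.org)
p extract_addresses(input)
# => ["joe_1@joe_black.com", "joe_2@joe_black.com", "joe+tag@example.org"]
```

Because the pattern has only non-capturing groups, `scan` returns plain strings rather than nested arrays.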

Hopefully there is some library that parses addresses into canonical
form that can be used, rather than a simple regexp?

-jim


#4

On Tuesday, 14.03.2006 at 21:20 +0000, Jim C. wrote:

atext = ALPHA / DIGIT / ; Any character except controls,
as well as the "." character.

Hopefully there is some library that parses addresses into canonical
form that can be used, rather than a simple regexp?

@Jim
Is there such a library for ruby? Can you provide a link?

If a simple regexp is not enough for you, you can find a complex one here:

http://ex-parrot.com/~pdw/Mail-RFC822-Address.html

But I think that is rather disproportionate.

@Joe
The regexp at the top matches most email addresses in the wild
with almost no loss. If you expect some of the special characters in the
local part of your email addresses, just add them to the regexp. It
should fit your needs.


Norman T.

http://blog.inlet-media.de


#5

I remember seeing a PHP script a while back that would actually
initiate an SMTP connection to the host to verify whether the address was
correct. I thought that was a pretty cool trick to actually verify
not only that the address was syntactically correct, but that it was
also a valid email address. I’ll have to see if I can dig it up.

Brandon


#6

On Wed, Mar 15, 2006 at 09:25:02AM +0100, Norman T. wrote:

On Tuesday, 14.03.2006 at 21:20 +0000, Jim C. wrote:

Hopefully there is some library that parses addresses into canonical
form that can be used, rather than a simple regexp?

@Jim
Is there such a library for ruby? Can you provide a link?

I have no idea - I would expect there is one, I hope someone on the list
could provide a reference :)

Perhaps RhizMail.valid_address? would do it - I’m surprised no test is
to be found in ActionMailer, but I can’t see one in the API docs.

The regexp at the top matches most email addresses in the wild
with almost no loss. If you expect some of the special characters in the
local part of your email addresses, just add them to the regexp. It
should fit your needs.

If you’re not going to check the full validity of email addresses, you
should document it, accept it, and write some tests that clearly show
that failing to validate email addresses to the spec is expected
behaviour of your app.
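
One way to write that down, as a minimal sketch: keep a table of addresses and whether the app is expected to accept them, and check the pattern against it. The names `SIMPLE_ADDRESS`, `DOCUMENTED_BEHAVIOUR` and `behaviour_mismatches` are hypothetical, and the pattern is an anchored variant of the simple regexp from earlier in the thread, not anything the app is known to use.

```ruby
# Anchored variant of the simple pattern, for validating one address.
SIMPLE_ADDRESS = /\A[a-z0-9_.-]+@[a-z0-9_-]+\.[a-z]+\z/i

# Address => whether the app is expected to accept it.
DOCUMENTED_BEHAVIOUR = {
  "joe_1@joe_black.com"   => true,
  "joe+lists@example.com" => false, # valid per RFC 2822, known limitation
  "o'brien@example.com"   => false, # valid per RFC 2822, known limitation
  "joe@example.co.uk"     => false  # multi-label domain, known limitation
}.freeze

# Returns the addresses whose actual result differs from the documented one.
def behaviour_mismatches(pattern = SIMPLE_ADDRESS, expectations = DOCUMENTED_BEHAVIOUR)
  expectations.reject { |address, accepted| pattern.match?(address) == accepted }.keys
end

p behaviour_mismatches # an empty array means the documentation is still accurate
```

If someone later loosens the pattern, the check starts reporting mismatches, which is the prompt to update the documented behaviour.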

There’s nothing worse for a user than seeing their perfectly valid email
address rejected by a website when there’s nothing wrong with it.

-jim


#7

On Wed, Mar 15, 2006 at 04:35:50PM -0500, Brandon K. wrote:

I remember seeing a PHP script a while back that would actually
initiate an SMTP connection to the host to verify whether the address was
correct. I thought that was a pretty cool trick to actually verify

It’s not possible to verify that an email address “actually exists”.
There are lots of reasons for this, all to do with SMTP server
delivery behaviour, DNS failure and so on.

In any case, the AUP of many hosting services requires email
communications with customers to be double-opt-in, because if you’re not
allowing the user to confirm that they want to receive email from your
app, it’s spam, and the ISP might get blacklisted. Plus I believe there
are some laws governing this sort of thing in many jurisdictions.

So, when a user enters an email address into your app, that you intend
to use for sending messages later, you should:

  • Send a message to them that they need to reply to
    • (Decide how hard you will try to deliver if there are problems.
      Many people give up on the first failure, which is reasonable)
  • Wait for the reply, and change their status to ‘verified’

This is supposed to help you verify that the user really wants mail at
that address – which can eliminate the problem of someone using another
person’s email address, either by accident or maliciously.
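
The bookkeeping behind those steps could look roughly like this in Ruby. It is only a sketch: the class `PendingConfirmations` and its methods are hypothetical, everything is held in memory, and actually sending the message (with whatever retry policy you choose) is left out.

```ruby
require 'securerandom'

# Tracks addresses that are waiting for the user to confirm them.
class PendingConfirmations
  def initialize
    @pending = {}   # token => address
    @verified = {}  # address => true
  end

  # Step 1: create a token to include in the message sent to the user.
  def request(address)
    token = SecureRandom.hex(16)
    @pending[token] = address
    token
  end

  # Step 2: the user replied with the token; mark the address as verified.
  # Returns false for unknown or already used tokens.
  def confirm(token)
    address = @pending.delete(token)
    return false if address.nil?

    @verified[address] = true
    true
  end

  def verified?(address)
    @verified.key?(address)
  end
end

confirmations = PendingConfirmations.new
token = confirmations.request("joe_1@joe_black.com")
confirmations.confirm(token)
p confirmations.verified?("joe_1@joe_black.com") # => true
```

A random token means a reply only counts if it comes from someone who actually received the message at that address.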

If the only thing you intend to use the email address for is something
like lost password announcements, then don’t bother checking too hard.
Make sure that you give them some other mechanism for recovering account
access – like custom answers to questions, or direct contact with the
site administrators.

-jim


#8

Correct me if I’m wrong, but it is possible to check the domain of the
email address using an MX record check, so if the domain is valid that
will get you a lot closer to establishing whether or not the address is.

However, I’m not sure how this is implemented in Ruby; it’s possible on
Unix-based boxes using PHP.
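
In Ruby, the standard library's `resolv` can do the MX lookup. The following is a minimal sketch; `mx_hosts` and `mail_domain?` are hypothetical helper names, and the resolver is passed in as a parameter so the code can be exercised without a network. Note that a domain with no MX record may still accept mail through its address record, so an empty result is a hint rather than proof.

```ruby
require 'resolv'

# Returns the domain's mail exchangers, most preferred first.
def mx_hosts(domain, resolver = Resolv::DNS.new)
  resolver.getresources(domain, Resolv::DNS::Resource::IN::MX)
          .sort_by(&:preference)
          .map { |mx| mx.exchange.to_s }
end

# True if the part after the "@" has at least one MX record.
def mail_domain?(address, resolver = Resolv::DNS.new)
  domain = address[/@([^@\s]+)\z/, 1]
  return false if domain.nil?

  !mx_hosts(domain, resolver).empty?
end

# Usage (needs working DNS):
#   mail_domain?("joe_1@joe_black.com")
```

This only says something about the domain; it cannot tell you whether the mailbox itself exists.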

Cheers


#9

http://www.regexlib.com/ is an excellent resource for regular
expressions.

example: http://regexlib.com/Search.aspx?k=rfc%202822

Chris