Forum: Ruby on Rails email address parse

Joe B. (Guest)
on 2006-03-14 05:05
Hi all,
How can I parse an email string like this into an array?
#--------------------------------------------
"joe black"<joe_1@joe_black.com> joe_2@joe_black.com,
joe_3@joe_black.com
#--------------------------------------------
It seems that email addresses from user input come in various formats,
and I have to split them into an array.
Any ideas?

Regards.
Norman T. (Guest)
on 2006-03-14 11:54
(Received via mailing list)
On Tuesday, 14.03.2006, 04:05 +0100, Joe B. wrote:
> Hi all,
> How can I parse an email string like this into an array?
> #--------------------------------------------
> "joe black"<joe_1@joe_black.com> joe_2@joe_black.com,
> joe_3@joe_black.com
> #--------------------------------------------
> It seems that email addresses from user input come in various formats,
> and I have to split them into an array.
> Any ideas?

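string = "\"joe black\"<joe_1@joe_black.com> joe_2@joe_black.com,
joe_3@joe_black.com"
# No capture group here, so scan returns a flat array of strings
# rather than an array of one-element arrays.
email_addresses = string.scan(/[a-z0-9_.-]+@[a-z0-9_-]+\.[a-z]+/i)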

--
Norman T.

http://blog.inlet-media.de
Jim C. (Guest)
on 2006-03-14 23:23
(Received via mailing list)
On Tue, Mar 14, 2006 at 10:53:18AM +0100, Norman T. wrote:
> string = "\"joe black\"<joe_1@joe_black.com> joe_2@joe_black.com,
> joe_3@joe_black.com"
> email_addresses = string.scan(/([a-z0-9_.-]+@[a-z0-9_-]+\.[a-z]+)/i)

That will catch many addresses, but it will exclude other perfectly
valid ones.

RFC 2822 (http://www.faqs.org/rfcs/rfc2822.html) is the place to look.
The local part (left-hand side) of an address can include:
atext           =       ALPHA / DIGIT / ; Any character except controls,
                        "!" / "#" /     ;  SP, and specials.
                        "$" / "%" /     ;  Used for atoms
                        "&" / "'" /
                        "*" / "+" /
                        "-" / "/" /
                        "=" / "?" /
                        "^" / "_" /
                        "`" / "{" /
                        "|" / "}" /
                        "~"
as well as the "." character.
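
As a rough sketch of what that means for the regexp above, the atext set
can be written as a Ruby character class and used for the local part.
The names ATEXT, DOT_ATOM, DOMAIN, ADDRESS and extract_addresses are made
up for this example. The domain part stays as loose as in Norman's
pattern (apart from allowing several dot-separated labels), and quoted
local parts are not handled, so this is still not a full RFC 2822 parser.

```ruby
# atext from RFC 2822 as a character class: letters, digits and the
# listed special characters. "#" and "/" are escaped for the regexp literal.
ATEXT = /[a-z0-9!\#$%&'*+\/=?^_`{|}~-]/i

# Local part: runs of atext separated by single dots.
DOT_ATOM = /#{ATEXT}+(?:\.#{ATEXT}+)*/

# Assumption: a loose domain pattern (two or more labels), not the RFC grammar.
DOMAIN = /[a-z0-9_-]+(?:\.[a-z0-9_-]+)+/i

ADDRESS = /#{DOT_ATOM}@#{DOMAIN}/

# Returns every substring of text that looks like an address, as a flat array.
def extract_addresses(text)
  text.scan(ADDRESS)
end

input = "\"joe black\"<joe_1@joe_black.com> joe_2@joe_black.com,\njoe_3@joe_black.com"
p extract_addresses(input)
# => ["joe_1@joe_black.com", "joe_2@joe_black.com", "joe_3@joe_black.com"]
```

Because none of the groups capture, scan returns plain strings. Note that
a wider local-part class also picks up more surrounding text, for example
a leading "name=" directly in front of an address.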

Hopefully there is some library that parses addresses into canonical
form that can be used, rather than a simple regexp?

-jim
Norman T. (Guest)
on 2006-03-15 10:26
(Received via mailing list)
On Tuesday, 14.03.2006, 21:20 +0000, Jim C. wrote:
> atext           =       ALPHA / DIGIT / ; Any character except controls,
> as well as the "." character.
>
> Hopefully there is some library that parses addresses into canonical
> form that can be used, rather than a simple regexp?

@Jim
Is there such a library for ruby? Can you provide a link?

If a simple regexp is not enough for you, you can find a complex one here:

http://ex-parrot.com/~pdw/Mail-RFC822-Address.html

But I think that is a bit out of proportion.

@Joe
The regexp at the top matches most email addresses in the wild with
almost no loss. If you expect some of the special characters in the
local part of your email addresses, just add them to the regexp. It
should fit your needs.

--
Norman T.

http://blog.inlet-media.de
Jim C. (Guest)
on 2006-03-15 23:13
(Received via mailing list)
On Wed, Mar 15, 2006 at 09:25:02AM +0100, Norman T. wrote:
> On Tuesday, 14.03.2006, 21:20 +0000, Jim C. wrote:
> > Hopefully there is some library that parses addresses into canonical
> > form that can be used, rather than a simple regexp?
>
> @Jim
> Is there such a library for ruby? Can you provide a link?

I have no idea - I would expect there is one, I hope someone on the list
could provide a reference :-)

Perhaps RhizMail.valid_address? would do it - I'm surprised no test is
to be found in ActionMailer, but I can't see one in the API docs.
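
For what it is worth, more recent Ruby versions ship a ready-made pattern
in the standard library, URI::MailTo::EMAIL_REGEXP. The sketch below
assumes a Ruby version that includes this constant; the helper name
plausible_email? is made up here. The pattern is anchored, so it checks
one address at a time rather than pulling addresses out of free text, and
it is stricter about the domain than the simple regexp above (as far as
I can tell it does not accept underscores in domain labels).

```ruby
require "uri"

# Hypothetical helper: true if the string looks like a single email
# address according to the pattern shipped with Ruby's URI library.
def plausible_email?(address)
  URI::MailTo::EMAIL_REGEXP.match?(address)
end

plausible_email?("joe+news@example.com") # accepted: "+" is allowed in the local part
plausible_email?("joe black")            # rejected: no "@" and contains a space
```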

> The regexp at the top matches most email addresses in the wild with
> almost no loss. If you expect some of the special characters in the
> local part of your email addresses, just add them to the regexp. It
> should fit your needs.

If you're not going to check the full validity of email addresses, you
should document it, accept it, and write some tests that clearly show
that failing to validate email addresses to the spec is *expected*
behaviour of your app.
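
A minimal sketch of what such a documented expectation could look like,
using the simple regexp from earlier in the thread (anchored here so it
checks one whole address). The names SIMPLE_EMAIL, EXPECTED and
unexpected_results are made up for this example, and the same table could
just as well live in whatever test framework the app already uses.

```ruby
# The simple pattern from earlier in the thread, anchored to check one address.
SIMPLE_EMAIL = /\A[a-z0-9_.-]+@[a-z0-9_-]+\.[a-z]+\z/i

# Documented behaviour, including valid addresses that are knowingly rejected.
EXPECTED = {
  "joe_1@joe_black.com"    => true,
  "joe+news@example.com"   => false, # valid per RFC 2822; "+" is not in the class
  "o'brien@example.com"    => false, # valid per RFC 2822; "'" is not in the class
  "joe@mail.example.co.uk" => false  # valid; the pattern allows only one dot in the domain
}.freeze

# Returns the addresses whose actual result differs from the documented one.
def unexpected_results(pattern = SIMPLE_EMAIL, expected = EXPECTED)
  expected.reject { |address, accepted| pattern.match?(address) == accepted }.keys
end

raise "behaviour changed: #{unexpected_results.inspect}" unless unexpected_results.empty?
```

If someone later widens the regexp, the check fails and the table has to
be updated on purpose, which keeps the limitation visible.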

There's nothing worse for a user than to see their perfectly valid email
address rejected by a website when there's nothing wrong with it.

-jim
Brandon K. (Guest)
on 2006-03-15 23:38
(Received via mailing list)
I remember seeing a PHP script a while back that would actually
initiate an SMTP connection to the host to verify if the address was
correct.  I thought that was a pretty cool trick to actually verify
not only that the address was syntactically correct, but that it was
also a valid email address.  I'll have to see if I can dig it up.

Brandon
Jim C. (Guest)
on 2006-03-15 23:50
(Received via mailing list)
On Wed, Mar 15, 2006 at 04:35:50PM -0500, Brandon K. wrote:
> I remember seeing a PHP script a while back that would actually
> initiate a SMTP connection to the host to verify if the address was
> correct.  I thought that was a pretty cool trick to actually verify

It's not possible to verify that an email address "actually exists".
There are lots of reasons for this, all to do with SMTP server
delivery behaviour, DNS failure and so on.

In any case, the AUP of many hosting services requires email
communications with customers to be double-opt-in, because if you're not
allowing the user to *confirm* that they want to receive email from your
app, it's spam, and the ISP might get blacklisted. Plus I believe there
are some laws governing this sort of thing in many jurisdictions.

So, when a user enters an email address into your app that you intend
to use for sending messages later, you *should* :-
  * Send a message to them that they need to reply to
    * (Decide how hard you will try to deliver if there are problems.
      Many people give up on the first failure, which is reasonable)
  * Wait for the reply, and change their status to 'verified'
This is supposed to help you verify that the user really wants mail at
that address -- which can eliminate the problem of someone using another
person's email address, either by accident or maliciously.
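
The confirmation steps above could be sketched roughly like this. The
class EmailConfirmations and its methods are hypothetical, the storage is
in memory only, and actually sending the message (and deciding how often
to retry delivery) is left out.

```ruby
require "securerandom"

# Hypothetical in-memory registry; a real app would persist this state
# and mail the token to the user.
class EmailConfirmations
  def initialize
    @pending = {}   # token => address awaiting confirmation
    @verified = []  # addresses whose owner has confirmed
  end

  # Records the address as unverified and returns the token that would
  # be included in the confirmation message.
  def request(address)
    token = SecureRandom.hex(16)
    @pending[token] = address
    token
  end

  # Marks the address as verified if the token is known.
  # Returns the address, or nil for an unknown or already used token.
  def confirm(token)
    address = @pending.delete(token)
    @verified << address if address
    address
  end

  def verified?(address)
    @verified.include?(address)
  end
end

confirmations = EmailConfirmations.new
token = confirmations.request("joe_1@joe_black.com")
confirmations.confirm(token)
confirmations.verified?("joe_1@joe_black.com") # => true
```

The token is random and single-use, so only someone who received the
message can flip the status to verified.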

If the only thing you intend to use the email address for is something
like lost password announcements, then don't bother checking too hard.
Make sure that you give them some other mechanism for recovering account
access -- like custom answers to questions, or direct contact with the
site administrators.

-jim
Chris H. (Guest)
on 2006-03-16 00:02
(Received via mailing list)
http://www.regexlib.com/ is an excellent resource for regular
expressions.

example: http://regexlib.com/Search.aspx?k=rfc%202822

Chris
Tim P. (Guest)
on 2006-04-23 16:18
Correct me if I'm wrong, but it is possible to check the domain of the
email address using an MX record check, so if the domain is valid, that
gets you a lot closer to establishing whether or not the address is.

However, I'm not sure how this is implemented in Ruby; it's possible on
Unix-based boxes using PHP.
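
In Ruby, a lookup like this can be sketched with the resolv library from
the standard distribution. The method name mx_hosts and the resolver
keyword are made up for this example; the resolver is passed in only so
the method can be tried without network access. Keep in mind that an MX
record only says the domain accepts mail somewhere, not that the mailbox
exists, and that a domain without MX records may still receive mail via
its address record.

```ruby
require "resolv"

# Returns the mail exchangers for the domain of an address, lowest
# preference value first. An empty array means no MX records were
# returned, or the address has no domain part.
# `resolver` only needs to respond to getresources(name, type).
def mx_hosts(address, resolver: Resolv::DNS.new)
  _local, at, domain = address.to_s.rpartition("@")
  return [] if at.empty? || domain.empty?

  resolver.getresources(domain, Resolv::DNS::Resource::IN::MX)
          .sort_by(&:preference)
          .map { |mx| mx.exchange.to_s }
end

# Usage (needs working DNS):
#   mx_hosts("joe@example.com")
```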

Cheers