Forum: Ruby on Rails email address parse

Joe Black (joe-black)
on 2006-03-14 04:05
Hi all,
How can I parse an email string like this into an array?
#--------------------------------------------
"joe black"<joe_1@joe_black.com> joe_2@joe_black.com,
joe_3@joe_black.com
#--------------------------------------------
Email addresses from user input seem to come in various formats, and I
have to split them into an array.
Any ideas?

regards.
Norman Timmler (Guest)
on 2006-03-14 10:54
(Received via mailing list)
On Tuesday, 14.03.2006 at 04:05 +0100, Joe Black wrote:
> Hi all,
> How can I parse an email string like this into an array?
> #--------------------------------------------
> "joe black"<joe_1@joe_black.com> joe_2@joe_black.com,
> joe_3@joe_black.com
> #--------------------------------------------
> Email addresses from user input seem to come in various formats, and I
> have to split them into an array.
> Any ideas?

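string = "\"joe black\"<joe_1@joe_black.com> joe_2@joe_black.com,
joe_3@joe_black.com"
# Without a capture group, scan returns a flat array of strings
# (with one, each match comes back as a one-element array).
email_addresses = string.scan(/[a-z0-9_.-]+@[a-z0-9_-]+\.[a-z]+/i)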

--
Norman Timmler

http://blog.inlet-media.de
Jim Cheetham (Guest)
on 2006-03-14 22:23
(Received via mailing list)
On Tue, Mar 14, 2006 at 10:53:18AM +0100, Norman Timmler wrote:
> string = "\"joe black\"<joe_1@joe_black.com> joe_2@joe_black.com,
> joe_3@joe_black.com"
> email_addresses = string.scan(/([a-z0-9_.-]+@[a-z0-9_-]+\.[a-z]+)/i)

That will catch many addresses, but it will exclude other perfectly
valid ones.

RFC 2822 (http://www.faqs.org/rfcs/rfc2822.html) is the place to look.
The local part (left-hand side) of an address can include:
atext           =       ALPHA / DIGIT / ; Any character except controls,
                        "!" / "#" /     ;  SP, and specials.
                        "$" / "%" /     ;  Used for atoms
                        "&" / "'" /
                        "*" / "+" /
                        "-" / "/" /
                        "=" / "?" /
                        "^" / "_" /
                        "`" / "{" /
                        "|" / "}" /
                        "~"
as well as the "." character.
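
As a rough sketch (not a full RFC 2822 parser), the atext set above can be turned into a Ruby regexp for the dot-atom form of the local part. The constant names and the extract_addresses helper are made up for illustration. Quoted local parts and display names are ignored, and the underscore in the domain pattern is an assumption made only because the sample data in this thread uses joe_black.com.

```ruby
# One RFC 2822 "atext" character. "#" is placed right before "-" so that
# Ruby does not read it as the start of an interpolation.
ATEXT = /[a-z0-9!$%&'*+\/=?^_`{|}~#-]/i

# Dot-atom local part: runs of atext separated by single dots.
# Quoted-string local parts ("joe black"@example.com) are NOT handled.
LOCAL_PART = /#{ATEXT}+(?:\.#{ATEXT}+)*/

# Two or more dot-separated labels. Allowing "_" is an assumption that
# matches the thread's sample data, not the RFC.
DOMAIN = /[a-z0-9_-]+(?:\.[a-z0-9_-]+)+/i

ADDRESS = /#{LOCAL_PART}@#{DOMAIN}/

# Hypothetical helper: returns every address-like substring as a flat array.
def extract_addresses(text)
  text.scan(ADDRESS)
end
```

With the string from the original question, extract_addresses returns the three addresses joe_1@joe_black.com, joe_2@joe_black.com and joe_3@joe_black.com. An address such as jim+list@mail.example.org is also picked up whole.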

Hopefully there is some library that parses addresses into canonical
form that can be used, rather than a simple regexp?

-jim
Norman Timmler (Guest)
on 2006-03-15 09:26
(Received via mailing list)
On Tuesday, 14.03.2006 at 21:20 +0000, Jim Cheetham wrote:
> atext           =       ALPHA / DIGIT / ; Any character except controls,
> as well as the "." character.
>
> Hopefully there is some library that parses addresses into canonical
> form that can be used, rather than a simple regexp?

@Jim
Is there such a library for Ruby? Can you provide a link?

If a simple regexp is not enough for you, you can find a complex one here:

http://ex-parrot.com/~pdw/Mail-RFC822-Address.html

But I think this is a bit disproportionate.

@Joe
The regexp at the top matches most email addresses in the wild with
almost no loss. If you expect some of the special characters in the
local part of your email addresses, just add them to the regexp. It
should fit your needs.

--
Norman Timmler

http://blog.inlet-media.de
Jim Cheetham (Guest)
on 2006-03-15 22:13
(Received via mailing list)
On Wed, Mar 15, 2006 at 09:25:02AM +0100, Norman Timmler wrote:
> On Tuesday, 14.03.2006 at 21:20 +0000, Jim Cheetham wrote:
> > Hopefully there is some library that parses addresses into canonical
> > form that can be used, rather than a simple regexp?
>
> @Jim
> Is there such a library for Ruby? Can you provide a link?

I have no idea - I would expect there is one, I hope someone on the list
could provide a reference :-)

Perhaps RhizMail.valid_address? would do it - I'm surprised no test is
to be found in ActionMailer, but I can't see one in the API docs.

> The regexp at the top matches most email addresses in the wild with
> almost no loss. If you expect some of the special characters in the
> local part of your email addresses, just add them to the regexp. It
> should fit your needs.

If you're not going to check the full validity of email addresses, you
should document it, accept it, and write some tests that clearly show
that failing to validate email addresses to the spec is *expected*
behaviour of your app.
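
A minimal sketch of what such tests could look like, assuming Minitest is available and using the simple regexp from earlier in the thread. The constant and test names are hypothetical. Each test pins down a known limitation so that it is visible and deliberate rather than a surprise.

```ruby
require "minitest/autorun"

# The simple pattern suggested earlier in this thread.
SIMPLE_ADDRESS = /[a-z0-9_.-]+@[a-z0-9_-]+\.[a-z]+/i

# These tests document EXPECTED shortcomings of the simple pattern.
class SimpleAddressLimitsTest < Minitest::Test
  # "+" is valid atext, but the pattern does not allow it, so the
  # local part is silently truncated.
  def test_plus_in_local_part_is_truncated
    assert_equal ["list@example.com"],
                 "jim+list@example.com".scan(SIMPLE_ADDRESS)
  end

  # Only one dot is allowed in the domain, so subdomains are cut short.
  def test_subdomain_is_cut_short
    assert_equal ["jim@mail.example"],
                 "jim@mail.example.com".scan(SIMPLE_ADDRESS)
  end
end
```

If the pattern is later widened, these tests fail and have to be updated on purpose, which keeps the documented behaviour and the real behaviour in step.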

There's nothing worse for a user than to see their perfectly valid email
address rejected by a website when there's nothing wrong with it.

-jim
Brandon Keepers (Guest)
on 2006-03-15 22:38
(Received via mailing list)
I remember seeing a PHP script a while back that would actually
initiate an SMTP connection to the host to verify if the address was
correct.  I thought that was a pretty cool trick to actually verify
not only that the address was syntactically correct, but that it was
also a valid email address.  I'll have to see if I can dig it up.

Brandon
Jim Cheetham (Guest)
on 2006-03-15 22:50
(Received via mailing list)
On Wed, Mar 15, 2006 at 04:35:50PM -0500, Brandon Keepers wrote:
> I remember seeing a PHP script a while back that would actually
> initiate an SMTP connection to the host to verify if the address was
> correct.  I thought that was a pretty cool trick to actually verify

It's not possible to verify that an email address "actually exists".
There are lots of reasons for this, all to do with SMTP server
delivery behaviour, DNS failure and so on.

In any case, the AUP of many hosting services requires email
communications with customers to be double-opt-in, because if you're not
allowing the user to *confirm* that they want to receive email from your
app, it's spam, and the ISP might get blacklisted. Plus I believe there
are some laws governing this sort of thing in many jurisdictions.

So, when a user enters an email address into your app that you intend
to use for sending messages later, you *should*:
  * Send a message to them that they need to reply to
    * (Decide how hard you will try to deliver if there are problems.
      Many people give up on the first failure, which is reasonable.)
  * Wait for the reply, and change their status to 'verified'
This is supposed to help you verify that the user really wants mail at
that address -- which can eliminate the problem of someone using another
person's email address, either by accident or maliciously.
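
The confirmation steps above could be sketched roughly as follows. This is an illustration only: the EmailConfirmations class and its method names are hypothetical, the store is a plain in-memory hash rather than a database, and sending the message itself is left out. The one assumption about the message is that it carries a random token that the user has to send back.

```ruby
require "securerandom"

# Hypothetical in-memory tracker for the confirm-by-reply flow.
class EmailConfirmations
  def initialize
    @pending = {}    # token => address still waiting for a reply
    @verified = []   # addresses whose owner has confirmed
  end

  # Step 1: create a token to include in the confirmation message.
  def request(address)
    token = SecureRandom.hex(16)
    @pending[token] = address
    token
  end

  # Step 2: the user replied with the token. Marks the address as
  # verified and returns true, or returns false for an unknown token.
  def confirm(token)
    address = @pending.delete(token)
    return false unless address

    @verified << address
    true
  end

  def verified?(address)
    @verified.include?(address)
  end
end
```

A token can be used only once, so a second confirm call with the same token returns false. An address stays unverified until its own token comes back, which is what stops someone from signing up another person's address.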

If the only thing you intend to use the email address for is something
like lost password announcements, then don't bother checking too hard.
Make sure that you give them some other mechanism for recovering account
access -- like custom answers to questions, or direct contact with the
site administrators.

-jim
Chris Hall (Guest)
on 2006-03-15 23:02
(Received via mailing list)
http://www.regexlib.com/ is an excellent resource for regular
expressions.

example: http://regexlib.com/Search.aspx?k=rfc%202822

Chris
Tim Perrett (timperrett)
on 2006-04-23 14:18
Correct me if I'm wrong, but it is possible to check the domain of the
email address using an MX record lookup, so if the domain is valid, that
will get you a lot closer to establishing whether or not the address is.

However, I'm not sure how this is implemented in Ruby; it's possible on
Unix-based boxes using PHP.
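
For what it's worth, Ruby's standard library ships a resolver (resolv) that can do an MX lookup. The following is a minimal sketch. The mx_hosts helper name and the optional resolver argument are assumptions made for illustration, and the argument exists only so that the lookup can be swapped out in tests. An empty result says that no MX records were found for the domain. It says nothing about whether the mailbox itself exists.

```ruby
require "resolv"

# Hypothetical helper: returns the MX host names for the domain of an
# address, lowest preference value first. Returns [] when the address
# has no domain part or the domain has no MX records.
def mx_hosts(address, resolver = Resolv::DNS.new)
  domain = address[/@([^@]+)\z/, 1]
  return [] if domain.nil?

  resolver.getresources(domain, Resolv::DNS::Resource::IN::MX)
          .sort_by(&:preference)
          .map { |mx| mx.exchange.to_s }
end
```

Calling mx_hosts("joe@example.com") performs a real DNS query, so the result depends on the network and on the domain's current records.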

Cheers