Forum: Ruby on Rails
Receiving windows-1252 or iso-8859-1 via TMail

Joost B. (Guest)
on 2005-12-08 16:52
My ActionMailer receives emails fine. I can access the headers, body and
attachment(s). However, this only works flawlessly when the emails are
in the US-ASCII charset:

Content-Type: text/plain; charset=US-ASCII; delsp=yes; format=flowed


When they're in either windows-1252 or iso-8859-1, the non-ASCII
characters are misrepresented or simply printed as question marks. Has
anyone successfully received emails in charsets other than US-ASCII? No
API docs, manuals or wikis mention this. The only TMail doc I could find
(from 2004) mentions that the author is not planning to add i17n (sic)
support because it's too complex for him. I don't know what that means.


As an aside, internationalisation and hi-ascii character sets seem to be
an afterthought in Rails. Am I alone in this?
scott (Guest)
on 2005-12-09 07:03
(Received via mailing list)
On Dec 8, 2005, at 9:52 AM, joost baaij wrote:

> anyone successfully received emails in charsets other than US-ASCII? No
> API docs, manuals or wikis mention this. The only TMail doc I could
> find
> (from 2004) mentions that the author is not planning to add i17n (sic)
> support because it's too complex for him. I don't know what that means.

I have not actually done it, but check out the Mailr project.  I believe
Luben has code in it to properly handle other encodings.

> As an aside, internationalisation and hi-ascii character sets seem to
> be
> an afterthought in Rails. Am I alone in this?

That debate has been hashed out numerous times on the list; search the
archives for the various points of view.

--
Scott B.
Lunchbox Software
http://lunchboxsoftware.com
http://lunchroom.lunchboxsoftware.com
http://rubyi.st
Andreas S. (Guest)
on 2005-12-10 02:16
joost wrote:
> My ActionMailer receives emails fine. I can access the headers, body and
> attachment(s). However, this only works flawlessly when the emails are
> in the US-ASCII charset:
>
> Content-Type: text/plain; charset=US-ASCII; delsp=yes; format=flowed
>
>
> When they're either in windows-1252 or iso-8859-1, the non-ascii
> characters are misrepresented or simply printed as question marks.

The mailer converts all received mails to UTF-8.

> Has
> anyone successfully received emails in charsets other than US-ASCII?

Yes. www.ruby-forum.com receives all kinds of weird charsets.

> No
> API docs, manuals or wikis mention this. The only TMail doc I could find
> (from 2004) mentions that the author is not planning to add i17n (sic)
> support because it's too complex for him. I don't know what that means.

I'm afraid the TMail delivered with Rails doesn't have too much in
common with the original TMail anymore. Unfortunately I haven't seen any
efforts to merge the changes into the original TMail.

> As an aside, internationalisation and hi-ascii character sets seem to be
> an afterthought in Rails. Am I alone in this?

There is no problem with character sets.
Joost B. (Guest)
on 2005-12-10 03:06
andreas wrote:
> The mailer converts all received mails to UTF-8.

Ouch! The blinding flash of insight just hit me. Thanks so much, with
this info I could solve the issue immediately. Since my tables/model are
in iso-8859-1 and there is no way for me to change that, I simply convert
the UTF-8 to iso-8859-1 and all is good.

The relevant snippet:

require 'iconv'
# Receive a message
def receive(email)
  # Transcode to iso-8859-1
  body = Iconv.new('iso-8859-1', 'utf-8').iconv(email.body)

  # handle the email...
end
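
For readers on a newer Ruby, where Iconv is no longer part of the standard library (it was removed in Ruby 2.0), the same transcoding step can be sketched with String#encode. This is only a minimal sketch and not the code from the post above: the helper name to_latin1 is hypothetical, and replacing unrepresentable characters with "?" is an assumption about what you want to happen.

```ruby
# Hypothetical helper (not from the thread): transcode UTF-8 text to ISO-8859-1.
# Assumption: characters with no ISO-8859-1 equivalent become "?" instead of raising.
def to_latin1(utf8_text)
  utf8_text.encode('ISO-8859-1', 'UTF-8',
                   invalid: :replace, undef: :replace, replace: '?')
end

latin1 = to_latin1("caf\u00E9")   # "café" written with an escape so the source encoding does not matter
puts latin1.encoding              # ISO-8859-1
puts latin1.bytes.inspect         # [99, 97, 102, 233]
```

Inside a receive method this would be used on the body in place of the Iconv call, for example body = to_latin1(email.body), assuming the body really arrives as UTF-8 as described above.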
takahashimm (Guest)
on 2005-12-10 05:58
(Received via mailing list)
2005/12/10, Andreas S. <removed_email_address@domain.invalid>:
> > No
> > API docs, manuals or wikis mention this. The only TMail doc I could find
> > (from 2004) mentions that the author is not planning to add i17n (sic)
> > support because it's too complex for him. I don't know what that means.

I have talked with Minero about I18N in TMail. He said
he would support I18N when Ruby supports it.
For example, when you encode a long subject, you have to split
the string correctly. But in current Ruby you can do that only for
ASCII, UTF-8, EUC-JP and Shift_JIS, using NKF (I guess
Iconv cannot be used to split a string on correct character
boundaries, right?).
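
To illustrate the splitting problem described above: a long header value has to be cut into pieces that fit a byte limit, and no piece may end in the middle of a multibyte character. The following is a minimal sketch for UTF-8 only, assuming a Ruby with encoding-aware strings (1.9 or later), which did not exist when this was written. The helper name split_on_char_boundaries and the byte limit are hypothetical and are not part of TMail or NKF.

```ruby
# Hypothetical helper (not part of TMail): split text into chunks of at most
# max_bytes bytes each, never cutting a multibyte character in half.
# A single character larger than max_bytes still gets a chunk of its own.
def split_on_char_boundaries(text, max_bytes)
  chunks = []
  current = +''                       # unary plus gives a mutable empty string
  text.each_char do |ch|
    # Close the current chunk if this character would push it over the limit.
    if !current.empty? && current.bytesize + ch.bytesize > max_bytes
      chunks << current
      current = +''
    end
    current << ch
  end
  chunks << current unless current.empty?
  chunks
end

# Each of these characters is 3 bytes in UTF-8, so a 6-byte limit fits two per chunk.
parts = split_on_char_boundaries("\u65E5\u672C\u8A9E\u306E\u30E1\u30FC\u30EB", 6)
puts parts.length                     # 4
```

Each chunk could then be encoded separately as its own header word, which is the step that needs the character boundaries to be right.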

> I'm afraid the TMail delivered with Rails doesn't have too much in
> common with the original TMail anymore. Unfortunately I haven't seen any
> efforts to merge the changes into the original TMail.

At RubyConf 2005, Minero and DHH talked about TMail
(mainly the license, not merging).

> > As an aside, internationalisation and hi-ascii character sets seem to be
> > an afterthought in Rails. Am I alone in this?

gorou-san has developed the ActiveHeart plugin, a Japanization module
for Rails (ActiveRecord and ActionMailer). It's something of a hack, but
very useful for Japanese Railers.

http://svn.rails2u.com/public/plugins/trunk/active_heart/

If you only want to handle some encodings (not every encoding), the
plug-in approach is not so bad. A complete I18N framework is not easy
(or fun) to build, much like a complete login framework.

Regards,

Masayoshi T.
Joost B. (Guest)
on 2005-12-12 12:45
Iconv has its limits, but working internally with UTF-8 is probably the
best way to go. Thanks everyone.