Decoding Unicode URLs Properly

Please help,

I have an application I switched over to UTF-8, but I am unable to
decode URLs with Unicode characters in them properly. For instance, if
I have “item=Pur%E9ed” as a parameter in my URL, when it’s decoded by
Rails, it returns “Pur?ed” and not “Puréed”.

Is there an easy way to fix my controller to accept these characters
properly?

Or is there perhaps a plugin someone wrote that is less extreme than
unicode_hacks?

Thanks in advance,

Gregg P.
Orlando Ruby U. Group
http://www.ORUG.org

On Wed Jun 28, 2006 at 01:23:42AM +0200, Gregg P. wrote:

Or is there perhaps a plugin someone wrote that is less extreme than
unicode_hacks?

I think you have to wait for Ruby to support UTF-8, or resort to various
hacks/patches/plugins. I made a local file browser in Rails the other
day and ran into this same issue…

But there has to be a way to simply decode URL-encoded parameters.

The problem with doing this in a controller action is that it’s already
too late. Rails has already attempted to decode the value, and a ? appears
instead of the %E9.

I guess I’ll have to dive into the Rails ActionController code?

On Jun 27, 2006, at 8:11 PM, Gregg P. wrote:

But there has to be a way to simply decode URL-encoded parameters.

The problem with doing this in a controller action is that it’s already
too late. Rails has already attempted to decode the value, and a ? appears
instead of the %E9.

I guess I’ll have to dive into the Rails ActionController code?

But %E9 isn’t a valid UTF-8 byte sequence. It is how é is encoded in
ISO-8859-1 (Latin-1). To encode é (U+00E9) in UTF-8, you’d need %C3%A9

$KCODE='u'
=> "u"

CGI::unescape("item=Pur%E9ed")
=> "item=Pur?ed"

parm = "item=Pur%C3%A9ed"
=> "item=Pur%C3%A9ed"

CGI::unescape(parm)
=> "item=Puréed"

CGI::escape(CGI::unescape(parm))
=> "item%3DPur%C3%A9ed"

Hmm… That works. Perhaps the problem is the user agent that encodes
the parameters into the URL, or the assumption that the original
string is UTF-8?

-Rob
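
For anyone hitting the same thing on a current Ruby (one with per-string encodings, which postdate this thread), here is a minimal sketch of a fallback decoder along the lines Rob suggests. It assumes that any value which is not valid UTF-8 after percent-decoding was sent as ISO-8859-1. `decode_param` is a hypothetical helper name, not something Rails or CGI provides.

```ruby
require 'cgi'

# Hypothetical helper (not part of Rails or CGI): percent-decode a value,
# keep it as UTF-8 when the bytes are valid UTF-8, and otherwise ASSUME
# the client sent ISO-8859-1 and transcode those bytes to UTF-8.
def decode_param(raw)
  # Percent-decode, then take a binary copy so no encoding is assumed yet.
  bytes = CGI.unescape(raw).b

  # First guess: the bytes are already UTF-8.
  utf8 = bytes.dup.force_encoding(Encoding::UTF_8)
  return utf8 if utf8.valid_encoding?

  # Fallback (an assumption): reinterpret the bytes as Latin-1 and convert.
  bytes.force_encoding(Encoding::ISO_8859_1).encode(Encoding::UTF_8)
end

puts decode_param("Pur%C3%A9ed")  # UTF-8 input, prints Puréed
puts decode_param("Pur%E9ed")     # Latin-1 input, prints Puréed
```

This is only a heuristic: some Latin-1 byte sequences also happen to be valid UTF-8, so a value can be misread, and the Latin-1 fallback is a guess about the client. Getting the user agent to submit UTF-8 in the first place is the more reliable fix.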