Safari utf8 fix - why?

Hello!

Could somebody direct me to the explanation of Safari UTF8 fix in
browser_filters plugin, namely WHY we need that? For what version of
Safari etc…

There is a mention of a Safari bug on the wiki, but nothing specific;
Google turns up nothing, and the plugin code is silent, too…

I ask because I am working on an app which sometimes returns some
JavaScript code in an XMLHttpRequest response, and this Safari fix
breaks the strings inside the code.

This application also uses form_remote with Ajax helpers, and I had
no problem with various strange characters before the fix (MacOSX
10.4.3, Safari 2.0.2). I know for sure, because these characters
are now entity-encoded and break my JavaScript code, but not the
results of my forms (without the plugin both work OK).

Does an application need this fix if users are guaranteed to run
latest Safari?

izidor

On 3-dec-2005, at 11:34, Izidor J. wrote:

Hello!

Could somebody direct me to the explanation of Safari UTF8 fix in
browser_filters plugin, namely WHY we need that? For what version
of Safari etc…

I think it’s for versions of Safari older than 2.0. What it does is
replace all the “higher” characters (those above the ASCII range) in
the UTF-8 response with their decimal entity equivalents (for example,
थ becomes &#2341;).
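
For illustration, the conversion described above could look roughly like this in Ruby. This is only a sketch for current Ruby versions with encoding-aware strings; the helper name utf8_to_entities is made up here and is not the plugin's actual code.

```ruby
# Hypothetical helper, NOT the browser_filters plugin's real code.
# Replaces every character outside the ASCII range with its decimal
# HTML entity and leaves ASCII characters untouched.
def utf8_to_entities(text)
  text.encode("UTF-8").gsub(/[^\x00-\x7F]/) do |char|
    format("&#%d;", char.ord)
  end
end

puts utf8_to_entities("caf\u00E9 \u0925")  # prints: caf&#233; &#2341;
```

Because the result is pure ASCII, it no longer matters which charset the browser guesses for the response, which is presumably why the fix takes this route.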

I assume the JS parser in Safari barks on those, but I need to run
some tests to verify it. I think the problem boils down to the fact
that older versions of Safari always assume AJAX responses to be
ISO-encoded UNLESS they contain an XML prolog with a charset in it.
However, inserting an XML prolog will most likely make Safari try to
parse the response as XML. I also had problems with this on Firefox
(albeit of a different kind): XMLHttpRequest looks to be meant to
transport, well, XML, and browsers look for the prolog. So there are a
few fixes to try, I think:

  1. try to exclude all script tags from the entity encoding
  2. try to wrap the response into a bogus XML container or prepend a
    prolog to it (that’s what I was doing recently instead of reencoding
    the response):
    response.body = '<?xml version="1.0" encoding="utf-8"?>' + response.body
  3. and of course test thoroughly on Panther
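
The first option in the list above (leaving script tags out of the entity encoding) might be sketched like this. The helper name encode_outside_scripts and the regular expression are assumptions made for illustration; a regex is a naive way to find script blocks and will not catch inline event handlers such as onclick attributes.

```ruby
# Hypothetical helper: entity-encode non-ASCII characters everywhere
# EXCEPT inside <script>...</script> blocks, which are passed through.
SCRIPT_BLOCK = %r{(<script\b[^>]*>.*?</script>)}mi

def encode_outside_scripts(html)
  # split keeps the captured script blocks as separate chunks
  html.split(SCRIPT_BLOCK).map do |chunk|
    if chunk.match?(/\A<script\b/i)
      chunk                                   # script block: leave as is
    else
      chunk.gsub(/[^\x00-\x7F]/) { |char| format("&#%d;", char.ord) }
    end
  end.join
end
```

With this, markup around the script is still made ASCII-safe, while string literals inside the script keep their original characters, so the old Safari charset problem would still apply to them.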

This application also uses form_remote with Ajax helpers, and I had
no problem with various strange characters before the fix (MacOSX
10.4.3, Safari 2.0.2). I know for sure, because these characters
are now entity-encoded and break my JavaScript code, but not the
results of my forms (without the plugin both work OK).

Does an application need this fix if users are guaranteed to run
latest Safari?

I think it doesn’t, but not that many users will install Tiger (which
is the only way to get to Safari 2).

On 4-dec-2005, at 20:45, Izidor J. wrote:

If you can guarantee that your Apple users have the latest Panther or
later (or Safari 1.3 or later), then you do not need this fix…

Or you can add a check to the fix code: besides looking for the string
‘AppleWebKit’, also check the AppleWebKit version number according to
<http://developer.apple.com/internet/safari/uamatrix.html>.

Thanks Izidor, this looks like lovely research! I wonder which is the
better fix for the old Safaris, though: prepending the prolog or
converting everything into entities. Can you test these two
approaches if you have access to a Panther box somewhere?


Julian ‘Julik’ Tarkhanov
me at julik.nl

On 3-dec-2005, at 11:34, Izidor J. wrote:

Yes, I forgot to mention that the bug itself is that Safari 1.x
ignored the charset directive embedded in the header of the response
received via AJAX.


Julian ‘Julik’ Tarkhanov
me at julik.nl

On Dec 4, 2005, at 4:41 AM, Julian ‘Julik’ Tarkhanov wrote:

Yes, I forgot to mention that the bug itself is that Safari 1.x
ignored the charset directive embedded in the header of the
response received via AJAX.

Thank you for the answers. They really helped to start and focus my
research.

And this is what I found through testing: Safari 2.0 and 1.3 both
behave correctly (since they share the same WebKit, it seems like
a quite obvious result, but I did the testing anyway).

This means that Apple users with Mac OS X 10.3.9 or greater are not
affected by this bug, because Safari was upgraded to 1.3 in Mac OS X
10.3.9.

But if you have Mac OS X 10.3.8 with the default Safari, which is
1.2.4, or anything older, this bug still bites you (verified for 10.3.8
with Safari 1.2.4; it probably holds for earlier versions, too). So
users with the latest Panther and Tiger updates are OK; Jaguar and
earlier are not.

If you can guarantee that your Apple users have the latest Panther or
later (or Safari 1.3 or later), then you do not need this fix…

Or you can add a check to the fix code: besides looking for the string
‘AppleWebKit’, also check the AppleWebKit version number according to
<http://developer.apple.com/internet/safari/uamatrix.html>.
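
A version check along those lines could be sketched as follows. The method name and the cutoff constant are hypothetical. The cutoff of 312 rests on the assumption that WebKit builds 312 and up correspond to Safari 1.3 and later, so please verify it against the matrix linked above before relying on it. The sample User-Agent strings are illustrative only.

```ruby
# ASSUMPTION: AppleWebKit builds >= 312 are Safari 1.3 or later,
# which tested fine; anything lower is treated as affected.
FIRST_FIXED_WEBKIT_BUILD = 312

# Hypothetical helper: true only for WebKit browsers older than the cutoff.
def needs_safari_utf8_fix?(user_agent)
  match = user_agent.to_s.match(%r{AppleWebKit/(\d+)})
  return false unless match                 # not a WebKit browser at all
  match[1].to_i < FIRST_FIXED_WEBKIT_BUILD  # compare the major build number
end

old_safari = "Mozilla/5.0 (Macintosh; U; PPC Mac OS X; en) " \
             "AppleWebKit/125.5.5 (KHTML, like Gecko) Safari/125.12"
new_safari = "Mozilla/5.0 (Macintosh; U; PPC Mac OS X; en) " \
             "AppleWebKit/416.12 (KHTML, like Gecko) Safari/416.13"

puts needs_safari_utf8_fix?(old_safari)  # prints: true
puts needs_safari_utf8_fix?(new_safari)  # prints: false
```

The filter could then be skipped whenever this returns false, so newer Safari versions and other browsers get the response untouched.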

izidor

On Dec 4, 2005, at 10:44 PM, Julian ‘Julik’ Tarkhanov wrote:

Thanks Izidor, this looks like lovely research! I wonder which is the
better fix for the old Safaris, though: prepending the prolog or
converting everything into entities. Can you test these two
approaches if you have access to a Panther box somewhere?

Well, for the test to be meaningful, you need to find the lowest
version of Safari that works (or the highest that doesn’t work) with
a given fix. Since I only have access to Mac OS X 10.3.8, the result
would not be useful: what about 10.3.0 or 10.2 users?

I think that the convert-to-entities fix is a good solution. If
somebody sends pure JavaScript code in responseText and then eval()s
it (that’s what I do), then they just need to exclude the
after_filter for that particular method (or make sure the JavaScript
code is ASCII only).
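
One way to keep the JavaScript ASCII only, sketched here under the assumption that the non-ASCII characters sit inside string literals, is to rewrite them as \uXXXX escapes before sending the code. The helper name js_ascii_escape is made up for this example and is not part of Rails or the plugin.

```ruby
# Hypothetical helper: rewrite non-ASCII characters as JavaScript
# \uXXXX escapes so the code sent to the browser is pure ASCII.
def js_ascii_escape(js)
  js.gsub(/[^\x00-\x7F]/) do |char|
    # JavaScript strings are UTF-16, so a character outside the BMP
    # becomes a surrogate pair, i.e. two escapes.
    char.encode("UTF-16BE").unpack("n*").map { |unit| format('\u%04X', unit) }.join
  end
end

puts js_ascii_escape("alert('caf\u00E9');")  # prints: alert('caf\u00E9');
```

The browser turns the escapes back into the original characters when it evaluates the string literal, so the response charset no longer matters for that code.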

izidor