JSON.parse and unicode escape?

The documentation for the Ruby JSON classes (http://json.rubyforge.org/)
implies that it handles Unicode escaping fine. But I’m having trouble
parsing JSON with a Unicode escape sequence in it. I am using the
‘ext’ parser (JSON::Ext::Parser), not the ‘pure’ parser, version 1.1.2,
which appears to still be the latest.

Here is some test JSON, actually an excerpt from some JSON returned to
me by a third-party web service, which I finally boiled down to the
simplest demonstration case. I saved it in a file, but here’s what’s
in the text file:

=====
{ "key": 'something \x26 more' }

I believe that is valid JSON containing an escaped Unicode character?
But JSON.parse on that string raises an error, complaining:

JSON::ParserError: unexpected token at '{ "summary": ' \u0026 ' }

I have verified it is the \x26 that’s doing it. It doesn’t like \x
escaped Unicode.

Am I doing something wrong? Is the JSON I am receiving from the third
party bad somehow? This is such a widely used library that I’d be
surprised if it’s broken and can’t parse input including unicode escape
sequences… but that’s what it looks like to me. Feedback?
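
For what it's worth, the JSON grammar only defines these escapes inside a string: \" \\ \/ \b \f \n \r \t and \uXXXX. There is no \x form, and string values have to use double quotes. Below is a minimal sketch, not anything the json library provides: `invalid_escapes` and `fix_hex_escapes` are hypothetical helper names, and the sample assumes the value is double-quoted. It lists the escapes JSON does not define and rewrites \xNN as \u00NN before parsing.

```ruby
require 'json'

# Escapes that the JSON grammar defines after a backslash.
VALID_ESCAPE = /\A\\(?:["\\\/bfnrt]|u\h{4})\z/

# Hypothetical helper: lists every backslash escape in +text+ that JSON
# does not define.
def invalid_escapes(text)
  text.scan(/\\(?:u\h{4}|.)/m).reject { |esc| esc.match?(VALID_ESCAPE) }
end

# Hypothetical helper: rewrites \xNN as \u00NN and leaves every other
# escape (including an escaped backslash) untouched.
def fix_hex_escapes(text)
  text.gsub(/\\(?:x(\h{2})|.)/m) { $1 ? "\\u00#{$1}" : $& }
end

# Single quotes keep the backslash literal, as it would be in the file.
raw = '{ "key": "something \x26 more" }'

p invalid_escapes(raw)       # the \x escape is reported
fixed = fix_hex_escapes(raw)
puts fixed                   # same text with \u0026 in place of \x26
p JSON.parse(fixed)
```

Whether rewriting the input like this is acceptable depends on the data; if the third party emits other non-JSON constructs, this sketch will not catch them.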

That’s not valid Unicode. See:

You can only have that code point in UTF-16

-Rob

I am running into what seems to be a related problem with the
following code:

irb

require 'json'
=> true
JSON.parse('{"s":"\uddb0"}')
JSON::ParserError: source sequence is illegal/malformed near uddb0"}
from /Library/Ruby/Gems/1.8/gems/json-1.1.3/lib/json/common.rb:122:in `parse'
from /Library/Ruby/Gems/1.8/gems/json-1.1.3/lib/json/common.rb:122:in `parse'
from (irb):2
from :0

I don’t know enough about unicode to really understand what is being
escaped here, but the following unicode characters, very close in
range (I assume) do not throw an error:
"\ucdb0", "\uedb0", "\ud7b0"
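
A likely explanation for the difference: U+D800 through U+DFFF are reserved for UTF-16 surrogates (high halves U+D800..U+DBFF, low halves U+DC00..U+DFFF), and a surrogate is only meaningful as part of a high+low pair. The small sketch below classifies the four code points from this post; `classify` is a hypothetical helper written for illustration, not part of any JSON library.

```ruby
# UTF-16 surrogate ranges as defined by the Unicode standard.
HIGH_SURROGATES = 0xD800..0xDBFF
LOW_SURROGATES  = 0xDC00..0xDFFF

# Hypothetical helper: says whether a code point is a surrogate half
# or an ordinary scalar value that can stand on its own.
def classify(code_point)
  if HIGH_SURROGATES.cover?(code_point)
    :high_surrogate
  elsif LOW_SURROGATES.cover?(code_point)
    :low_surrogate
  else
    :scalar_value
  end
end

[0xDDB0, 0xCDB0, 0xEDB0, 0xD7B0].each do |cp|
  puts format('U+%04X %s', cp, classify(cp))
end
```

Only U+DDB0 falls inside the surrogate block (it is a low half with no preceding high half); the other three sit outside it, which would be consistent with them parsing without an error.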

I also validated the JSON string ('{"s":"\uddb0"}') successfully at
http://www.jsonlint.com/ and in Python.

Any ideas of what might be the problem?
Are there any alternative JSON parsers for ruby?

Thank you very much // pascal

That makes a lot of sense. Thanks for the clarification regarding the
unicode range.

Since I don’t have control over the JSON source, I would like to try to
parse the JSON even if it results in a malformed unicode string. So
today I tried switching from ‘json’ to the ‘ruby-json’ library. After
some searching online, I didn’t find any documentation on how to use it
though. Primarily I don’t know how to include or require it.

require 'ruby-json'
require 'rubyjson'

don’t seem to work?
Any ideas are appreciated.
Thank you very much
// pascal
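
One possible workaround when the JSON source can't be changed, sketched here under assumptions: instead of producing a malformed string, replace every unpaired surrogate escape with \ufffd (the Unicode replacement character) before handing the text to the parser. This is a different approach from switching libraries, and `scrub_lone_surrogates` is a hypothetical helper name, not an API of the json gem.

```ruby
require 'json'

# Matches, in order of preference:
LONE_SURROGATE = /
    \\\\                                        # an escaped backslash, skipped as-is
  | \\u[dD][89abAB]\h{2}\\u[dD][c-fC-F]\h{2}    # a well-formed high+low pair, kept
  | (\\u[dD][89a-fA-F]\h{2})                    # any other surrogate escape is unpaired
/x

# Hypothetical helper: swaps unpaired \uD800..\uDFFF escapes for \ufffd
# and leaves everything else in the JSON text unchanged.
def scrub_lone_surrogates(text)
  text.gsub(LONE_SURROGATE) { $1 ? '\ufffd' : $& }
end

source = '{"s":"\uddb0"}'          # single quotes keep the escape literal
cleaned = scrub_lone_surrogates(source)
puts cleaned
p JSON.parse(cleaned)
```

The original character is lost with this approach, so it only makes sense if a placeholder is acceptable for whatever the third party meant to send.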