I am writing code to parse a JSON response coming from easy interface.
JSON.parse(e.response_body)
throws the error below:
JSON::ParserError: 376: unexpected token at
Any clue about this? I tried calling force_encoding on e.response_body with
UTF-8, but no success. I really don’t know if I am looking in the right place
for the solution, so I am posting here.
By the way, e.response_body.encoding gives ASCII-8BIT.
Hi,
- You really need to know ahead of time what encoding an external
source of data uses to read it properly.
- An encoding determines how many bytes are used to store a Unicode
integer (which represents a character).
- Forcing the encoding is probably not going to work 99% of the time.
When you force the encoding, you are telling ruby, “Hey, you made a
mistake, please treat this string as being encoded in some other
encoding.” What that means is this: suppose your original encoding
uses three bytes to store the Unicode integer:
0000 0000 0000 0000 0000 0001
and you tell ruby to read the string with an encoding that uses only two
bytes to store each Unicode integer, then ruby is going to read the
first two bytes of that Unicode integer and get:
0000 0000 0000 0000
which is 0. See the problem?
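A minimal Ruby sketch of the same point, assuming Ruby 1.9 or later. The sample character and variable names are only illustrative, and real encodings such as UTF-8 are variable width, so the picture is a little different from the fixed 3-byte example above. What carries over is that force_encoding changes the label on the string and leaves the bytes alone:

```ruby
# "é" in UTF-8 is stored as the two bytes 0xC3 0xA9.
utf8 = "\u00E9".encode("UTF-8")
original_bytes = utf8.bytes.to_a

# force_encoding only relabels the string; it does not touch the bytes.
relabeled = utf8.dup.force_encoding("ISO-8859-1")
same_bytes = (relabeled.bytes.to_a == original_bytes)

# Under the wrong label the same two bytes are read as two characters.
char_count_before = utf8.length
char_count_after  = relabeled.length

puts "bytes unchanged: #{same_bytes}"
puts "characters before: #{char_count_before}, after: #{char_count_after}"
```

So relabeling a string with an encoding it was not written in gives you different characters from the same bytes.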
- Transcoding is a more likely solution. Transcoding involves trying
to convert the string from one encoding to another encoding. If
JSON.parse is expecting strings to be encoded in UTF-8, then that may
work. In the example above, transcoding from a 3 byte encoding to a 2
byte encoding would turn this:
0000 0000 0000 0000 0000 0001
into this:
0000 0000 0000 0001
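Roughly the same thing in Ruby, as a sketch that assumes Ruby 1.9 or later and a made-up one-byte sample. String#encode transcodes: the bytes change so that the character stays the same in the new encoding:

```ruby
# "é" as a single byte (0xE9) in ISO-8859-1. pack gives a binary string,
# so the label has to be set by hand first.
latin1 = [0xE9].pack("C*").force_encoding("ISO-8859-1")

# encode rewrites the bytes for the target encoding.
utf8 = latin1.encode("UTF-8")

latin1_bytes = latin1.bytes.to_a   # one byte
utf8_bytes   = utf8.bytes.to_a     # two bytes
same_character = (utf8 == "\u00E9")

puts "ISO-8859-1 bytes: #{latin1_bytes.inspect}"
puts "UTF-8 bytes:      #{utf8_bytes.inspect}"
puts "same character:   #{same_character}"
```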
- I have no idea what library the method JSON.parse() comes from, but
perhaps you could at least state what library you are using, and you
might consider reading that library’s docs or contacting the author.
Hi 7stud,
Thanks for the info. Really helpful.
Let me give some more info.
- You really need to know ahead of time what encoding an external
source of data uses to read it properly.
I am checking this one. The encoding reported for the data received from
the external source is ASCII-8BIT, but it should be UTF-8, as confirmed when I
make a similar request from the frontend page of our website.
- Forcing the encoding
I agree with you that it’s not a good idea, as it messes up the data.
- Transcoding is a more likely solution
Yes, my JSON parser expects the data to be UTF-8.
- I am using the json gem (1.5.1), pure variant.
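If the bytes really are UTF-8 and only the label says ASCII-8BIT, one way to check is to relabel a copy and ask Ruby whether the bytes are valid in that encoding. This is only a sketch, assuming Ruby 2.0 or later (for String#b); the sample strings are hypothetical stand-ins for e.response_body:

```ruby
# Hypothetical stand-in for the response body: UTF-8 bytes with a binary label.
raw = "{\"city\": \"Z\u00FCrich\"}".b

# Relabel a copy as UTF-8 and check that the bytes form valid UTF-8.
labeled = raw.dup.force_encoding("UTF-8")
utf8_ok = labeled.valid_encoding?

# For contrast: "Zü" in ISO-8859-1 (bytes 0x5A 0xFC) is not valid UTF-8.
latin1_raw = [0x5A, 0xFC].pack("C*")
latin1_ok  = latin1_raw.dup.force_encoding("UTF-8").valid_encoding?

puts "binary label at first:      #{raw.encoding == Encoding::BINARY}"
puts "valid as UTF-8:             #{utf8_ok}"
puts "Latin-1 bytes valid as UTF-8: #{latin1_ok}"
```

If valid_encoding? comes back false, the data is probably in some other encoding and the label alone is not the whole problem.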
Rajesh H. wrote in post #1002455:
Hi 7stud,
Thanks for the info.
- What version of ruby are you using?
- What encoding is used for the data that is sent to you in JSON
format?
- What output do you get for this:
$ ruby -e 'puts Encoding.default_external.name'
7stud – wrote in post #1002417:
- Forcing the encoding is probably not going to work 99% of the time.
This pertains to ruby 1.9.
I don’t think that is true anymore. As far as I can tell, ruby will
tell you a string from an external source is encoded in ASCII-8BIT
whenever it encounters a string that does not use the ASCII
encoding (depending on your settings). In other words, if any
single byte contains an integer greater than 127, then ruby will tell
you the string is ASCII-8BIT, which simply seems to be a synonym for
“NOT-ASCII”, which in my opinion would be a much clearer indication of
what is going on.
A string labeled as “NOT-ASCII” cannot be transcoded to UTF-8 because
to transcode strings you have to specify both encodings: the original
encoding and the target encoding, and “NOT-ASCII” is not an encoding.
In that case, you would have to try forcing the encoding of the string,
and then transcode the string to the target encoding. For instance, if you
know the incoming data is encoded in Latin-1 (ISO-8859-1), then
depending on your settings ruby could label the encoding as “NOT-ASCII”.
If the JSON.parse() method expects UTF-8 data, then you could first force
the encoding of the incoming data to ISO-8859-1, and then transcode the data
to UTF-8.
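Something like this, as a rough sketch. It assumes Ruby 2.0 or later and the json library, the sample body is made up, and to_utf8 is a hypothetical helper name, not something from the gem. You would replace "ISO-8859-1" with whatever encoding the sender actually uses:

```ruby
require "json"

# Hypothetical response body: JSON whose bytes are ISO-8859-1 ("é" is 0xE9),
# but which Ruby has labeled ASCII-8BIT (binary).
raw = "{\"name\": \"caf\xE9\"}".b

# Hypothetical helper: step 1 relabels the bytes with their real encoding,
# step 2 transcodes them to UTF-8. dup keeps the original string untouched.
def to_utf8(raw, source_encoding)
  raw.dup.force_encoding(source_encoding).encode("UTF-8")
end

body = to_utf8(raw, "ISO-8859-1")
data = JSON.parse(body)

puts body.encoding
puts data["name"]
```

The order matters: encode needs to know the source encoding, and force_encoding is what supplies it.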