JSON::ParserError in controller

Hi all,
I’m trying to build an application that needs to scrape information
from a web page. When I run it, I get an error while trying to convert
the HTML data to JSON. Has anyone experienced this before, and if so,
can you please tell me how to solve it?
Please see the code snippet and error log below.

Thanks in advance
Anush

require 'net/http'
require 'open-uri'
require 'uri'
require 'json'
require 'pp'

class Merchant < ActiveRecord::Base

  def self.grab_original_content
    ## EXAMPLE USING ZED451.COM
    uri = URI("http://www.zed451.com")
    response = Net::HTTP.get_response(uri)
    @hash = JSON(response.body)
    puts "#{@hash}"
  end

end

I call the above method in my controller and send @hash to view.
In my browser I see the below error:

JSON::ParserError in Original contentController#index

706: unexpected token at '

And the rest of the page is printed without error in html format.

Hai,

On Mon, Jan 7, 2013 at 3:01 AM, Anush J. wrote:
I call the above method in my controller and send @hash to view.
In my browser I see the below error:

JSON::ParserError in Original contentController#index

706: unexpected token at 'Transitional//EN"
http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd”>

And the rest of the page is printed without error in html format.

It’s printed out as HTML because it is HTML. HTML is not JSON, and vice
versa. If you want to parse the page as it is, you need to use something
like Nokogiri so that it gets tokenized. If you expected JSON, you should
contact them and ask what went wrong.


Jordon B.

https://twitter.com/envygeeks
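
To illustrate the point above, here is a minimal sketch using only Ruby's standard `json` library. The two sample strings and the helper name `parse_json_or_nil` are made up for illustration; they are not from the original code or the real site. The sketch shows that the parser accepts JSON text and raises `JSON::ParserError` when handed HTML.

```ruby
require 'json'

# Hypothetical sample inputs, not taken from the real site.
html_body = '<!DOCTYPE html><html><head><title>Example</title></head></html>'
json_body = '{"name": "Example", "items": [1, 2, 3]}'

# Returns the parsed data, or nil when the text is not valid JSON.
def parse_json_or_nil(text)
  JSON.parse(text)
rescue JSON::ParserError
  nil
end

parsed_html = parse_json_or_nil(html_body) # nil, because HTML is not JSON
parsed_json = parse_json_or_nil(json_body) # a Hash with "name" and "items"
```

Rescuing the error like this is only one possible way to handle a response whose format is uncertain; it does not turn HTML into data.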

Hi Jordon,
Thanks for your response.
I thought JSON(response.body) performed the HTML->JSON conversion.
I also tried response.body.to_json, which gave me the same error.
It would be great if you could explain a bit. Meanwhile, I will also try
using Nokogiri.

Thanks
Anush
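
As a rough illustration of what `to_json` does with a string, the sketch below uses only the standard `json` library and a made-up HTML fragment in place of `response.body`. `to_json` goes in the opposite direction from `JSON(...)`: it serializes a Ruby object into JSON text. For a String that means the same markup wrapped as a JSON string literal, not extracted data.

```ruby
require 'json'

# Hypothetical stand-in for response.body.
html = '<h1>Hello</h1>'

# Serializing a String yields a quoted JSON string; the markup is not
# analysed or split into fields.
encoded = html.to_json

# Wrapping the literal in an array and parsing it back returns the
# original markup unchanged.
decoded = JSON.parse("[#{encoded}]").first
```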

Dheeraj K. wrote in post #1091355:

You cannot convert HTML to JSON and vice versa. HTML is a markup
language, while JSON is a data interchange format.

You need to parse your HTML with Nokogiri or Hpricot, extract whatever
data you want from it and put it in a Hash, then call .to_json on it to
get the JSON response.


Dheeraj K.

Hi Dheeraj,
Ahh…I see. Got it now. Thanks, helps a lot in understanding.

Thanks
Anush
