Net::HTTP: incorrect content retrieved

Hi All,

I am trying to fetch data from different websites for study and
testing. I can retrieve web content successfully from some sites that
use UTF-8 encoding.

However, when I try to access

http://bet.hkjc.com/football/index.aspx?lang=ch&pageno=1

I get the content below from response.body (even though I call
force_encoding):

QtwpU(QutVP7vw qQQ03P)J+,KwSRP())///+7/J ee K)IQId&$PRZXYf[XWWbjkETnIe}e
E%!nH%eV#i}m @"KI-N.,]i5@(uAVbY"DTR,H!5/;bgQE9 i9i X5!Krl3RrQW\T\RaR SHOPWHMzI;L#!`AL0oKRKs2Rz>yJ-ILN:0=5/(1’l^bqA}Nb^mju-Pyr0!D\HXrC%BcDIbgRrbrFjXjQ1(}nfcddn0$$'vl9w+|9s5OwC(*eg%*$+de+d3AIA|Z~~IRbNNpI%( %C%[b\qFj*c$sJSRA*3!EnPx:!B- .,WJ&Ru v<yZ0’;>?%8
(34B.D"|#tSRKsJ(4\[email protected]’M(3!))y e_WPZB^N)BuMp52 =‘Pj"Sf<r(3)5)5
[email protected],[email protected]^M6hlfIjn1"gP$A=qURWDzeU! 1X m8zPT#F&[email protected]$c a$t
v=]\[email protected]:+RYTn!!r23J-TH?NGA~,HLO!L)qG-1 Q^ T:=gl0/5^[email protected],/V\XRZ[4'& %] pxQR[AL+*Y)[ /?0F$KTU(#-3/5EFr$\}PV_ZY\@h gZ"[email protected] xR)(@XPT [email protected])</! &5/9#!N}}LP>#1_GPPj8t,C'h+(+)hf2pZwIS* @!Z^mad6p|P&x) +
M/-PBW J: 8y(450Ks!tk$APh %g1?13Xs-P(R+wO*vAU/=3MIPQ& !95kOPJJ iz
K)$9998&&AP2V"t0 0R)%IYPS]<20514121UlDH5<H,[email protected]#E``U&j
U:[email protected]@[email protected]{j]2^%%qd#}q93’fSQ~yq*(1h%e’[z4p(VE& lx5* ,S5VPCO’%[email protected]@9,
5931’<5IXGEA `4,rD}xBR

What I expect (I checked in a browser) is the following:

try {var enableAccessControl =false;if (enableAccessControl) {var tmp = window.location.href.substr(7) ;var SERVER_NAME = tmp.substr(0, tmp.indexOf("/"));var domainName = SERVER_NAME.substr(SERVER_NAME.indexOf(".")+1) ;document.domain = domainName;try {if (!top.betSlipFrame.isLogon())window.location.replace("/general_index.aspx?lang=en");}catch (e) {window.location.replace("/general_index.aspx?lang=en");}}}

Does anyone have an idea?

On Sat, May 17, 2014 at 10:28 PM, Nick T. wrote:

I am just trying to get some data from different website for studying
and testing. I could get web content successfully from some sites with
UTF-8 encoding.

  1. Please post to the ruby mailing list - this is not a Rails question.

  2. You’ll get more usable responses if you include the code and
    the Ruby version you’re using to fetch the document.

FWIW,

Hassan S.
http://about.me/hassanschroeder
twitter: @hassan

On Sunday, 18 May 2014 01:28:24 UTC-4, Ruby-Forum.com User wrote:

I got below content from response.body (even though I use
force_encoding)

QtwpU(QutVP7vw qQQ03P)J+,KwSRP())///+7/J ee K)IQId&$PRZXYf[XWWbjkETnIe}e

This is not a UTF-8 encoding issue - the server is sending the response
with Content-Encoding set to gzip, so response.body contains compressed
bytes rather than text.

On Ruby 2.0 and up, Net::HTTP will automatically decode gzip; on earlier
versions you will need to decode the body yourself:

http://stackoverflow.com/questions/13397119/ruby-nethttp-not-decoding-gzip

http://pushandpop.blogspot.com.au/2011/05/handling-gzip-responses-in-ruby-nethttp.html
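For reference, a minimal sketch of decoding the body by hand, using only the standard library. The names decode_body and fetch_decoded are hypothetical helpers made up for this example, not part of Net::HTTP. As far as I know, when a newer Ruby has already decompressed the response it also drops the Content-Encoding header, so the helper should then return the body unchanged.

```ruby
require 'net/http'
require 'stringio'
require 'uri'
require 'zlib'

# Hypothetical helper (not part of Net::HTTP): returns the body as-is unless
# the Content-Encoding header value mentions gzip, in which case it is
# decompressed with Zlib::GzipReader.
def decode_body(body, content_encoding)
  return body unless content_encoding.to_s.downcase.include?('gzip')

  Zlib::GzipReader.new(StringIO.new(body)).read
end

# Hypothetical helper: fetches a URL and returns the decoded body.
def fetch_decoded(url)
  response = Net::HTTP.get_response(URI(url))
  decode_body(response.body, response['Content-Encoding'])
end
```

Something like fetch_decoded('http://bet.hkjc.com/football/index.aspx?lang=ch&pageno=1') should then give readable text. Calling force_encoding only makes sense after this step, because it relabels the bytes and does not decompress them.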

–Matt J.

Matt, thanks a lot.

I will post the question in the proper Ruby forum.
