Hello,
I'm trying to retrieve search results from the internet using nokogiri
and open-uri. Apparently 'open-uri' can't handle UTF-8 directly, so I'm
trying to convert the string to ASCII, but I still get an error.
Here is the chunk of code:
----------------------------------------
# encoding: UTF-8
require "nokogiri"
require "open-uri"
word = "Ελληνικά"
ascii_word = word.force_encoding("ASCII").to_s
result = open("http://search.lycos.com/web?q=#{ascii_word}",
"User-Agent" => "HTTP_USER_AGENT:Mozilla/5.0 (Windows; U; Windows NT
6.0; en-US) AppleWebKit/534.13 (KHTML, like Gecko) Chrome/9.0.597.47 S
doc = Nokogiri::HTML(result)
----------------------------------------
And the error I get is:
----------------------------------------
[...]:in `open': invalid byte sequence in US-ASCII (ArgumentError)
from lycos.rb:8:in `<main>'
----------------------------------------
I'm on MacOSX ML, using ruby (rvm) 1.9.3.
I tried using 'force_encoding("US-ASCII")' but it's not a recognized
format. The word is Greek and uses UTF-8. Any ideas would be welcome.
Thanks for your time,
Best Regards
Panagiotis (atmosx) Atmatzidis
email: atma@convalesco.org
URL: http://www.convalesco.org
GnuPG ID: 0xE736C6A0
gpg --keyserver x-hkp://pgp.mit.edu --recv-keys 0xE736C6A0
on 2012-11-07 17:08
on 2012-11-07 22:06
On Thu, 8 Nov 2012 01:07:41 +0900, Panagiotis Atmatzidis wrote:
> ascii_word = word.force_encoding("ASCII").to_s

As per RFC (2396?), you need to encode the non-ASCII bit, thusly:
----------------------------------------
#!/usr/bin/ruby
# encoding: UTF-8
require "nokogiri"
require "open-uri"
word = URI.encode("Ελληνικά")
result = open("http://search.lycos.com/web?q=#{word}",
"User-Agent" => "HTTP_USER_AGENT:Mozilla/5.0 (Windows; U; Windows NT 6.0; en-US) AppleWebKit/534.13 (KHTML, like Gecko) Chrome/9.0.597.47")
doc = Nokogiri::HTML(result)
puts doc
----------------------------------------
-jh
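For what it's worth, here is a minimal sketch of the same idea that does no network access. It assumes URI.encode_www_form_component from the standard library is an acceptable stand-in for URI.encode (which is no longer available in newer Ruby versions), and it also shows why force_encoding alone does not help: it only relabels the bytes, it does not convert them. The variable names are only for illustration.

```ruby
# encoding: UTF-8
require "uri"

word = "Ελληνικά"

# force_encoding only relabels the existing bytes; it does not convert them,
# so the relabelled string is not valid US-ASCII.
relabelled = word.dup.force_encoding("US-ASCII")
puts relabelled.valid_encoding?   # prints false

# Percent-encode the UTF-8 bytes so the query contains only ASCII characters.
# Assumption: form-style encoding is suitable for this search query parameter.
encoded = URI.encode_www_form_component(word)
url = "http://search.lycos.com/web?q=#{encoded}"

puts encoded.ascii_only?          # prints true
puts url
```

The resulting url can then be handed to open-uri in place of the raw Greek string.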
on 2012-11-08 02:41
If one were to examine the ruby URI docs here: http://www.ruby-doc.org/stdlib-1.9.3/libdoc/uri/rd... how would one know that there is a method named URI.escape?
on 2012-11-08 04:35
On Thu, Nov 8, 2012 at 9:41 AM, 7stud -- <lists@ruby-forum.com> wrote:
> If one were to examine the ruby URI docs, how would one know that there
> is a method named URI.escape?
>

try eg,

ri URI::Escape.escape

best regards -botp
on 2012-11-09 00:58
botp wrote in post #1083491:
> On Thu, Nov 8, 2012 at 9:41 AM, 7stud -- <lists@ruby-forum.com> wrote:
>> If one were to examine the ruby URI docs, how would one know that there
>> is a method named URI.escape?
>>
>
> try eg,
>
> ri URI::Escape.escape
>

But to write that, you already have to know there is an escape() method
in some namespace somewhere. How come when I look at the docs, the docs
don't list the methods that I can call on URI? Isn't that the purpose of
the docs?
on 2012-11-09 03:26
On Fri, Nov 9, 2012 at 7:58 AM, 7stud -- <lists@ruby-forum.com> wrote:
> How come when I look at the docs, there isn't a list of methods that I can call on URI?
>

URI is big. as of now, we'll have to dig down further.

$ ri URI | grep "* URI::"
* URI::Generic (in uri/generic.rb)
* URI::FTP - (in uri/ftp.rb)
* URI::HTTP - (in uri/http.rb)
* URI::HTTPS - (in uri/https.rb)
* URI::LDAP - (in uri/ldap.rb)
* URI::LDAPS - (in uri/ldaps.rb)
* URI::MailTo - (in uri/mailto.rb)
* URI::Parser - (in uri/common.rb)
* URI::REGEXP - (in uri/common.rb)
* URI::REGEXP::PATTERN - (in uri/common.rb)
* URI::Util - (in uri/common.rb)
* URI::Escape - (in uri/common.rb)
* URI::Error - (in uri/common.rb)
* URI::InvalidURIError - (in uri/common.rb)
* URI::InvalidComponentError - (in uri/common.rb)
* URI::BadURIError - (in uri/common.rb)

you can create a tiny script to drill down on those.. or you can use
the html ruby-doc, which has clickable links to the above..

ri isn't perfect, like all other docs, but i guess you already knew that ;-)

best regards -botp
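A tiny drill-down script along those lines might look like the sketch below. It asks the URI module itself which methods it responds to, rather than going through ri. The /enc|esc/ filter is just an illustrative guess at what to search for, and the exact list printed depends on the Ruby version installed.

```ruby
require "uri"

# Methods callable directly on the URI module, filtered by a name pattern.
# The pattern is only an example; adjust it to whatever you are hunting for.
candidates = URI.singleton_methods.sort.grep(/enc|esc/)
puts candidates

# Show where each candidate is defined, to know which doc page to open.
candidates.each do |name|
  puts "#{name} is defined in #{URI.method(name).owner}"
end
```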