Hi. I’m trying to scrape scripts from the following news site.
As you can see, this site is unusual in that a news video plays
together with the corresponding script. I want to scrape the
scripts, so I wrote this simple code:
require 'nokogiri'
require 'open-uri'

if ARGV.size > 0
  url = ARGV[0]  # first command-line argument is the page URL
  html = Nokogiri::HTML(URI.open(url))
  p html.search("//span[@class='segment sec5']").text
end
When I run it from the command line as
$> ruby code.rb URL
then it does obtain part of the script (eventually I want to get all
of it, of course), which in this case is “NBC”. But the code behaves
strangely: the process starts, and after a while it prints the result,
but it does not seem to terminate on its own. By hitting “RET”, it
manipulate the video.
Do you think we can avoid this behavior and get the scripts as quickly
as possible?
Thanks in advance.