Re: Current Temperature (#68)

davidtl · March 2, 2006, 3:39pm

Thanks, James, for pointing out the legal issue.

My solution is just a Ruby exercise to show how to apply REXML
and open-uri.
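
For anyone who only wants to see how the two libraries fit together, here is a minimal sketch. It is not the quiz solution itself: the XML layout, the element names, and the helper names `parse_temperature` and `fetch_temperature` are all made up for illustration, and the example parses an inline sample instead of contacting any real service.

```ruby
require 'rexml/document'
require 'open-uri'

# Hypothetical sample payload; the element names are invented for this sketch
# and do not match any particular weather service.
SAMPLE_XML = <<~XML
  <weather>
    <location>Sample City</location>
    <temperature unit="F">72</temperature>
  </weather>
XML

# Pull the location and temperature out of an XML string with REXML.
def parse_temperature(xml)
  doc = REXML::Document.new(xml)
  node = doc.elements['weather/temperature']
  {
    location: doc.elements['weather/location'].text,
    degrees: node.text.to_i,
    unit: node.attributes['unit']
  }
end

# Fetch XML over HTTP with open-uri, then parse it. The caller supplies the
# URL; only point this at a source whose terms allow automated access.
# (Ruby versions from 2006 used Kernel#open(url) instead of URI.open.)
def fetch_temperature(url)
  parse_temperature(URI.open(url).read)
end

reading = parse_temperature(SAMPLE_XML)
puts "#{reading[:location]}: #{reading[:degrees]} #{reading[:unit]}"
```

Keeping the parsing separate from the fetching means the REXML part can be tried out offline against a saved document.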

Please consider my solution only as an example of how to use REXML and
open-uri to answer the Quiz, and do NOT use my solution, because it has
a legal issue.

I declare here that I am not responsible for any legal problems arising
from the use of my solution.

Sorry for the trouble.

davidtl · March 2, 2006, 8:39pm

It was good of James to mention it in the summary, but in reality I
think it is “fair use” to scrape a page, regardless of what the site
owner says. People use ad-filters all the time. Just because someone
has a page with ads does not mean people should be forced to view
them. Providing publicly accessible information on the internet
carries the risk that people will use it in ways you did not intend
(much like publishing a book).

I think scraping only becomes a problem when you are doing it as part
of a commercial enterprise (which maybe is what James meant). Much
like copyright infringement is really only an issue when profit is
involved for the copier.

But I’d love to see Google sue a single person for some little
scraping utility that they use. It seems ridiculous to me.

Ryan

davidtl · March 2, 2006, 9:35pm

Ryan L. wrote:

Much like copyright infringement is really only an issue when profit is
involved for the copier.

Or loss of profit to the original publisher.

If I buy a book and make 100 copies and give them away for free, that’s
still not legal because the publisher was entitled to sell those copies.

But overall I agree, if you’re not making money on it and you’re not
causing anyone else to lose revenue, then it’s probably fair use.

Jeff

davidtl · March 2, 2006, 11:23pm

Jeff C. writes:

causing anyone else to lose revenue, then it’s probably fair use.

Now, let’s consider screen-scraping really is “quoting”. With proper
attribution, that’s fair use… IANAL.

davidtl · March 3, 2006, 12:03am

Jeff C. writes:

One reason I decided not to have ads on my hockey site
(www.hawksfans.com) is because I’m scraping RSS feeds, and even though
everything is clearly attributed with a link back to the original
articles, I just wanted to be on the safe side.

Let’s sue Technorati! And Google Feeds!

If you provide feeds, you want them scraped…

davidtl · March 2, 2006, 11:38pm

Christian N. wrote:

Jeff C. writes:

causing anyone else to lose revenue, then it’s probably fair use.
Now, let’s consider screen-scraping really is “quoting”. With proper
attribution, that’s fair use… IANAL.

Um, I don’t know about that, actually. You have to make sure you’re not
taking money away from the original publisher.

For example, I could easily write a site which lets you type a Google
search into a text box. I could make the Google HTTP GET call myself,
scrape the HTML, and then present the results without all of the ads
that usually appear in the right-hand margin.

Even if I say at the top, “Results provided by Google”, do you think
it’s ok for me to do that? I’m using Google’s hard work for my benefit
and they would lose money because of it.

One reason I decided not to have ads on my hockey site
(www.hawksfans.com) is because I’m scraping RSS feeds, and even though
everything is clearly attributed with a link back to the original
articles, I just wanted to be on the safe side.
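
To make the attribution idea concrete, here is a rough sketch of listing feed items with a link back to the source. It is not the code behind my site: the sample feed, its titles and URLs, and the helper name `attributed_items` are assumptions made up for illustration.

```ruby
require 'rexml/document'

# Hypothetical RSS 2.0 sample; titles and URLs are placeholders.
SAMPLE_RSS = <<~XML
  <rss version="2.0">
    <channel>
      <title>Example Hockey News</title>
      <item>
        <title>Sample headline</title>
        <link>http://example.com/articles/1</link>
      </item>
    </channel>
  </rss>
XML

# Return one line per feed item, each naming the source feed and
# linking back to the original article.
def attributed_items(rss)
  doc = REXML::Document.new(rss)
  source = doc.elements['rss/channel/title'].text
  doc.elements.to_a('rss/channel/item').map do |item|
    title = item.elements['title'].text
    link = item.elements['link'].text
    "#{title} (via #{source}: #{link})"
  end
end

puts attributed_items(SAMPLE_RSS)
```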

By the way, I really do think most of us doing HTML scrapes are doing it
fairly… my apologies if I took us too far off-topic.

Jeff