FeedTools

varac · July 27, 2006, 10:12pm

Can someone verify for me that feedtools is incorrectly parsing
http://torrentspy.com/rss.asp ? In my tests, feedtools can only find 1
entry for that rss. Other sites seem to work, so far.

TIA

varac · July 28, 2006, 1:02am

Ray C. wrote:

Can someone verify for me that feedtools is incorrectly parsing
http://torrentspy.com/rss.asp ? In my tests, feedtools can only find 1
entry for that rss. Other sites seem to work, so far.

TIA

Ray, it seems to be working for me, though if you picked up a new
version in the last few days, note that it was updated this afternoon
to fix a couple of issues.

FeedTools::FEED_TOOLS_VERSION::STRING
=> "0.2.27"
f = FeedTools::Feed.open('http://torrentspy.com/rss.asp')
=> #<FeedTools::Feed:0x-241bce82 URL:http://torrentspy.com/rss.asp>
f.items.size
=> 25
f.entries.size
=> 25

varac · July 28, 2006, 4:06am

Thanks a lot you guys. It seems that specific problem was solved
when I updated feedtools through gems. I was using 0.2.25 and now I’m
on 0.2.26. I notice that the previous poster is on 0.2.27, but I am
trying to stick to gem installs.

For a follow-up question, anyone know of a good ruby feed validator?

Currently I crawl what I think are feeds using my own custom crawler:

feed = FeedTools::Feed.new
feed.feed_data = _my_data_returned_from_crawler

I would like to 1) validate the feed before submitting feed contents to
feedtools 2) try to extract a feed link out of
_my_data_returned_from_crawler in the case that it is not a valid feed.

I have seen the ruby extension for feedvalidator.org, but unfortunately
it won’t work for me since I want to keep the validation on my local
machine.
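
For the local pre-check asked about here, one cheap option is to test whether the crawled data is well-formed XML with a feed-like root element before handing it to FeedTools. The following is only a sketch using REXML from the Ruby standard library; `looks_like_feed?` and `FEED_ROOTS` are hypothetical names, and this is a sanity check rather than a substitute for a real validator.

```ruby
require 'rexml/document'

# Root element names of the common feed formats:
# RSS 0.9x/2.0 (<rss>), Atom (<feed>) and RSS 1.0 (<rdf:RDF>).
# REXML's Element#name returns the local name, so "rdf:RDF" is "RDF".
FEED_ROOTS = %w[rss feed RDF].freeze

# Returns true when +data+ is well-formed XML whose root element looks
# like a feed. It does not check the feed against any specification.
def looks_like_feed?(data)
  root = REXML::Document.new(data).root
  !root.nil? && FEED_ROOTS.include?(root.name)
rescue REXML::ParseException
  # Not well-formed XML (typical for HTML pages), so not a usable feed.
  false
end
```

Data that fails this check could then be treated as a possible HTML page and searched for a feed link instead of being passed to `feed.feed_data`.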

Thanks a lot again.

varac · July 28, 2006, 9:01am

Hi,

Ray C. wrote:

Thanks a lot you guys. It seems that specific problem was solved
when I updated feedtools through gems. I was using 0.2.25 and now I’m
on 0.2.26. I notice that the previous poster is on 0.2.27, but I am
trying to stick to gem installs.

This is not a problem with version 0.2.25:

irb(main):001:0> require 'rubygems'
=> true
irb(main):002:0> require_gem 'feedtools'
=> true
irb(main):003:0> FeedTools::FEED_TOOLS_VERSION::STRING
=> "0.2.25"
irb(main):004:0> f = FeedTools::Feed.open("http://torrentspy.com/rss.asp")
=> #<FeedTools::Feed:0x-2439239a URL:http://torrentspy.com/rss.asp>
irb(main):005:0> f.items.size
=> 25
irb(main):006:0> f.entries.size
=> 25

Regards
Lutz

varac · July 28, 2006, 5:06am

Thanks a lot you guys. It seems that specific problem was solved
when I updated feedtools through gems. I was using 0.2.25 and now I’m
on 0.2.26. I notice that the previous poster is on 0.2.27, but I am
trying to stick to gem installs.

For a follow-up question, anyone know of a good ruby feed validator?

There isn’t one. You need Python for that.

Currently I crawl what I think are feeds using my own custom crawler:

feed = FeedTools::Feed.new
feed.feed_data = _my_data_returned_from_crawler

I would like to 1) validate the feed before submitting feed contents to
feedtools 2) try to extract a feed link out of
_my_data_returned_from_crawler in the case that it is not a valid feed.

If you’re thinking about auto-discovery, FeedTools already has that
built-in. It just doesn’t work in some cases (for example, on
Engadget). This should be fixed in the next release, hopefully.

I have seen the ruby extension for feedvalidator.org, but unfortunately
it won’t work for me since I want to keep the validation on my local
machine.

What ruby extension? Please send me a link.

Cheers,
Bob A.

varac · July 28, 2006, 10:09am

Hi Bob,

I was referring to this:
http://feedvalidator.rubyforge.org/

I’m not sure exactly how to use auto-discovery since I use my own
crawler for http retrieval. Can you point me at the method that does
the auto-discovery?

Perhaps the http://torrentspy.com/rss.asp case was broken for me because
my crawler returned different output than FeedTools would have fetched
with its own internals.

Ray

varac · July 28, 2006, 4:01pm

Hi Bob,

I was referring to this:
http://feedvalidator.rubyforge.org/

I’m not sure exactly how to use auto-discovery since I use my own
crawler for http retrieval. Can you point me at the method that does
the auto-discovery?

If FeedTools fails to parse what it retrieved, it checks the mime-type,
and if that is an (x)html mime-type, it looks for the auto-discovery
links in the page. This logic mostly happens in the update! method,
but also partly in the HtmlHelper module.
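
For anyone who already has the HTML in hand from their own crawler, the auto-discovery step can be approximated outside FeedTools. This is a rough sketch and not FeedTools' own implementation: `discover_feed_links` and `FEED_TYPES` are made-up names, the regular expressions assume quoted attribute values, and HTML entities in `href` are not decoded.

```ruby
require 'uri'

# MIME types conventionally used to advertise RSS and Atom feeds.
FEED_TYPES = ['application/rss+xml', 'application/atom+xml'].freeze

# Returns the absolute URLs of feeds advertised through
# <link rel="alternate" type="..." href="..."> tags in +html+.
# +base_url+ is the URL the page was fetched from; it is used to
# resolve relative href values.
def discover_feed_links(html, base_url)
  links = html.scan(/<link\b[^>]*>/i).map do |tag|
    # Collect the tag's quoted attributes into a Hash.
    attrs = {}
    tag.scan(/([\w-]+)\s*=\s*(?:"([^"]*)"|'([^']*)')/) do |name, dq, sq|
      attrs[name.downcase] = dq || sq
    end

    rel  = attrs['rel'].to_s.downcase.split
    type = attrs['type'].to_s.downcase.strip
    next unless rel.include?('alternate') && FEED_TYPES.include?(type) && attrs['href']

    URI.join(base_url, attrs['href']).to_s
  end
  links.compact
end
```

A regex scan is used here instead of an XML parser because real-world HTML is often not well-formed; a proper HTML parser would be more robust.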

That said, I’d recommend against using your own crawler for dealing
with feeds unless you’re only pulling the feed once. FeedTools has a
lot of functionality built in that’s designed to prevent feeds from
being unnecessarily polled, thus wasting people’s bandwidth. I
suppose you might have actually implemented all that functionality
yourself, but I’m going to go out on a limb and guess that you haven’t
added ETag/Last-Modified/If-None-Match/feed time-to-live support into
your crawler. Just be aware that failing to implement that stuff can
and will cost other people hundreds of dollars in bandwidth if you
poll too often without having that code in place. It’s not exactly
polite.
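
To make the ETag/Last-Modified point concrete, here is a minimal sketch of a conditional GET with Net::HTTP from the standard library. It is not how FeedTools does it internally; `conditional_headers`, `fetch_if_changed` and the `cache` Hash are hypothetical, redirects are not followed, and feed time-to-live handling is left out entirely.

```ruby
require 'net/http'
require 'uri'

# Builds conditional request headers from the validators saved after
# the previous fetch. +cache+ is a Hash with optional :etag and
# :last_modified entries (an assumed shape for this sketch).
def conditional_headers(cache)
  headers = {}
  headers['If-None-Match'] = cache[:etag] if cache[:etag]
  headers['If-Modified-Since'] = cache[:last_modified] if cache[:last_modified]
  headers
end

# Fetches +url+ and returns [body, new_cache]. body is nil when the
# server answers 304 Not Modified, meaning the stored copy is current.
def fetch_if_changed(url, cache = {})
  uri = URI.parse(url)
  request = Net::HTTP::Get.new(uri.request_uri, conditional_headers(cache))
  response = Net::HTTP.start(uri.host, uri.port, use_ssl: uri.scheme == 'https') do |http|
    http.request(request)
  end

  case response
  when Net::HTTPNotModified
    [nil, cache]
  when Net::HTTPSuccess
    [response.body, { etag: response['ETag'], last_modified: response['Last-Modified'] }]
  else
    response.error! # raises for everything else, including redirects
  end
end
```

The returned cache would be stored per feed URL and passed back in on the next poll, so an unchanged feed costs the server only a header exchange.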

Perhaps the http://torrentspy.com/rss.asp case was broken for me because
my crawler returned different output than FeedTools would have fetched
with its own internals.

Very possible.

Cheers,
Bob A.