Forum: Ruby How come I can't grab all RSS items from a feed?

Adam A. (Guest)
on 2009-04-04 14:28
When I try to access the BBC feed, it only returns the latest 40
results. However, if I use the same feed in Google Reader, it can return
a lot more. Why is this, and how do I modify the code below so that I can
return more results?

Here's some code:

require 'rss/1.0'
require 'rss/2.0'
require 'open-uri'
require 'rss/parser'

# URL or local file (the URL is shown truncated here)
source = "http://newsrss.bbc.co.uk/rss/newsonline_world_edit..."

# Read the raw content of the RSS feed.
# On Ruby older than 2.5, use open(source) instead of URI.open(source).
content = URI.open(source) { |s| s.read }
rss = RSS::Parser.parse(content, false) # false = skip validation

puts rss.items.length # roughly 40
Ben L. (Guest)
on 2009-04-04 15:20
(Received via mailing list)
I'd suggest you check which headers your browser is sending, particularly
Last-Modified and ETag. Replicate those and you will see the same number
of items.
Ben
Adam A. (Guest)
on 2009-04-04 18:37
Hi Ben, thanks for the reply. I'm a bit confused though.

When I type the address into the browser, it shows 40 items, just as the
code above does.

When I access the feed via Google Reader, it shows more. I'm wondering
what magic Google is using to get more items than my browser or my RSS
code can.
Alexey B. (Guest)
on 2009-04-04 19:05
Adam A. wrote:
> When I access the feed via Google Reader, it shows more. I'm wondering
> what magic Google is using to get more items than my browser or my RSS
> code can.

I think this could be because Google caches items internally, from the
moment someone subscribes to the feed for the first time. Then it just
periodically updates the feed and caches new items. To simulate Google's
behavior you would need to go back in time and start fetching items from
the moment you need.

Unless there's some API on the feed itself, in which case you'd have to
ask the owners of the feed.
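
A rough sketch of that caching idea in Ruby, in case it helps. FeedArchive
and merge are names made up for illustration, not part of the rss library,
and the sketch assumes each item has a stable link to deduplicate on. It
only collects items from the moment you start polling, so it cannot recover
anything older.

```ruby
require 'set'

# Hypothetical helper (not from the rss library): remembers every item it
# has been shown, so the archive can grow past the feed's 40-item window.
class FeedArchive
  attr_reader :items

  def initialize
    @items = []       # every distinct item seen so far, in merge order
    @seen  = Set.new  # keys of items already stored
  end

  # Merge one snapshot of the feed's items into the archive.
  # Items are identified by `link` unless a block supplies another key.
  # Returns the number of items that were new.
  def merge(snapshot)
    added = 0
    snapshot.each do |item|
      key = block_given? ? yield(item) : item.link
      # Set#add? returns nil when the key is already present.
      next if key.nil? || !@seen.add?(key)
      @items << item
      added += 1
    end
    added
  end
end
```

With the code from the first post, each scheduled run would call
archive.merge(rss.items) on the same FeedArchive. Saving the archive
between runs (to a file or a database, say) is left out here.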
Ben L. (Guest)
on 2009-04-04 20:39
(Received via mailing list)
> When I access the feed via Google Reader, it shows more. I'm wondering
> what magic Google is using to get more items than my browser or my RSS
> code can.

Ah, I have to admit I glossed over the fact you were using Google Reader.
As Snaury said, they would certainly employ caching at their end, which
should display results beyond the ones you are seeing.

Ben
Adam A. (Guest)
on 2009-04-05 07:17
Ahhh, thank you for confirming my suspicions. Customized DeLoreans with
flux capacitors are a bit hard to come by these days, so I guess I'll have
to settle for 40 a day.

Thank you for your help!
Ben L. (Guest)
on 2009-04-05 21:25
(Received via mailing list)
If it is utterly important, you _could_ possibly use the Google Reader
API to get at those entries. Might be a little easier than going back in
time ;)