Forum: Ruby on Rails
Getting the most out of our caches when dealing with external HTTP services

Matthijs Langenberg (mlangenberg)
on 2011-11-21 22:56
(Received via mailing list)
Dear all,

I am dealing with a group of pages that display data from a few
different HTTP resources. To make these pages perform well, I want to
cache as much as possible at the front of my Rails application.

Another requirement is that we never show stale data to the user.
These pages aren't getting huge amounts of requests/second, but when
there are no changes I want everything to feel snappy. If it takes a
bit more time for the first requests to set up the caches it's okay.

Fortunately HTTP ships with something awesome since the 80's:
conditional HTTP requests. Just send an 'If-Modified-Since' or
'If-None-Match' header along with the request and the server returns
either a '304 Not Modified' or the full response.
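
The conditional request described above could be sketched in Ruby roughly like this. It is only an illustration using the standard library's Net::HTTP; the helper names `conditional_get` and `interpret`, and the example URL, are made up for this sketch and are not part of the proposal.

```ruby
require 'net/http'
require 'uri'

# Build a GET request that carries whatever validators were stored earlier.
# `etag` and `last_modified` are hypothetical values remembered from a
# previous response to the same resource.
def conditional_get(uri, etag: nil, last_modified: nil)
  request = Net::HTTP::Get.new(uri)
  request['If-None-Match'] = etag if etag
  request['If-Modified-Since'] = last_modified if last_modified
  request
end

# Returns :not_modified on a 304, otherwise [etag, body] of the full response.
def interpret(response)
  return :not_modified if response.code == '304'
  [response['ETag'], response.body]
end

# Usage (performs a real network call, so it is left commented out):
# uri = URI('http://example.com/resource')
# response = Net::HTTP.start(uri.host, uri.port) do |http|
#   http.request(conditional_get(uri, etag: '"abc"'))
# end
# interpret(response)
```

Everything expensive (parsing, rendering) can then be skipped whenever `interpret` returns `:not_modified`.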

It is simple to use the Rails cache store to cache HTTP responses. But
the most time-consuming part is actually building an object model from
the response and generating the HTML fragments.

Therefore I am looking for an API that allows me to conditionally
execute render code, based on the response of external HTTP requests.

[browser] --[GET /page]--> [Rails app][view][controller][model] --[GET /resource]--> [external service]

I tried to come up with a first proposal:
https://gist.github.com/1383983

What do you guys think? Does this make any sense? Are there any other
approaches that I could try?

At least it provides a way through the different layers without leaking
knowledge. The downside is of course that every finder needs to support
a block, where it normally would just return a result value.

This approach makes it impossible to cache different HTML fragments of
the same resource, but I think I can mitigate that in a followup
proposal.


Thank you for providing feedback,

Matthijs
Frederick Cheung (Guest)
on 2011-11-22 00:05
(Received via mailing list)
On Nov 21, 9:54pm, Matthijs Langenberg <mlangenb...@gmail.com> wrote:
> bit more time for the first requests to set up the caches it's okay.
>
> Fortunately HTTP ships with something awesome since the 80's:
> conditional HTTP request. Just send an 'If-Modified-Since' or 'If-None-
> Match' header along with the request and the server returns a '304 Not
> Modified' or the full response.

Nit picker's corner: the first version of HTTP, 0.9, dates from 1991,
and if my reading is correct HTTP 1.0 is the one that added
If-Modified-Since etc.
>
> I tried to come up with a first proposal: https://gist.github.com/1383983
>
> What do you guys think? Does this make any sense? Are there any other
> approaches that I could try?
>

Could you use Action Controller's stale? / fresh_when methods?
If so, you're probably going to want to use something like rack-cache,
Varnish etc. in front of Rails, since otherwise you'd only be able to
return a 304 if that particular client had already requested the data
(which may or may not be a problem). If you only want to cache bits of
the page, you could also use bog-standard fragment caching, using the
ETag / Last-Modified etc. of the remote response as part of the
cache key.
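
As a rough illustration of that last suggestion, here is a plain-Ruby sketch of fragment caching keyed by the remote ETag. It is not Rails itself: `TinyCache` and `cached_fragment` are made-up names that stand in for `Rails.cache` and the view's `cache` helper, and the key layout is an assumption.

```ruby
# A stand-in for Rails.cache, just enough to show the idea (hypothetical).
class TinyCache
  def initialize
    @store = {}
  end

  # Same contract as a cache store's fetch with a block: return the cached
  # value, or compute it, store it and return it.
  def fetch(key)
    @store.fetch(key) { @store[key] = yield }
  end
end

FRAGMENTS = TinyCache.new

# `etag` comes from the remote response; `name` distinguishes fragments of
# the same resource. When the remote ETag changes the key changes too, so
# the old fragment is simply never read again.
def cached_fragment(etag, name, &render)
  FRAGMENTS.fetch("views/#{name}/#{etag}", &render)
end

renders = 0
html1 = cached_fragment('"v1"', 'body') { renders += 1; '<p>Hallo!</p>' }
html2 = cached_fragment('"v1"', 'body') { renders += 1; '<p>Hallo!</p>' }
html3 = cached_fragment('"v2"', 'body') { renders += 1; '<p>Hoi!</p>' }
```

The second call reuses the fragment rendered by the first; only the changed ETag triggers a new render. In a Rails view the same idea would be written with the `cache` helper and an array key that includes the ETag.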

Fred
Matthijs Langenberg (mlangenberg)
on 2011-11-22 13:41
(Received via mailing list)
Hi Fred,

Thanks for getting back to me.

On Nov 22, 12:04am, Frederick Cheung <frederick.che...@gmail.com>
wrote:
> > Dear all,
>
> > Fortunately HTTP ships with something awesome since the 80's:
> > conditional HTTP request. Just send an 'If-Modified-Since' or 'If-None-
> > Match' header along with the request and the server returns a '304 Not
> > Modified' or the full response.
>
> Nit picker's corner: first version of http was 0.9 was in 1991, and if
> my reading is correct http 1.0 is the one that added if-modified-since
> etc.
>
>

You are totally correct. What I was trying to say is that we often
reinvent the wheel while there are beautiful gems inside existing
standards such as HTTP that already provide a lot of functionality. I
was exaggerating the history of HTTP.
Thanks for putting that right.

> > I tried to come up with a first proposal:https://gist.github.com/1383983
> etag/last modified since etc. of the remote response as part of the
> cache key
>

I wonder how that would look.

To be clear, I am not interested in sending a '304 Not Modified' to the
user's browser. For my application that would not be worth the effort.
Different users view the same page once. So the second user should be
served a cached response from Rails if possible. And by cached
response I mean little HTML fragments stitched together.

The request would still go through Rails. But based on a '304 Not
Modified' from the external HTTP services, it would skip parsing XML
responses and rendering expensive partials.

Am I right that in your approach, an ActiveResource finder would
attach the returned ETag or Last-Modified date to the returned object,
so it can be used inside a view?

I agree with you that the cache key should be chosen from the Rails
view. There should be another place that stores the last fetched ETag
for that particular resource. And it actually needs to know if there
is a cached HTML fragment for that resource.

Looks like a Catch-22 situation to me.
Matthijs Langenberg (mlangenberg)
on 2011-11-23 22:18
(Received via mailing list)
Alright, I think I can solve the issues by having two separate caches
and implementing lazy loading.

In the view layer I want to be able to do:

<% cache [@post.cache_key, 'author'] do %>
  <h1><%= @post.author %></h1>
<% end %>

Hello, <%= current_user.name %>, this is not cached.

<% cache [@post.cache_key, 'body'] do %>
  <p><%= @post.body %></p>
<% end %>

Then the api model can look something like this:

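require 'nokogiri'

# Assumes the Api module already exists and that $cache responds to
# read/write (for example an ActiveSupport cache store).
class Api::Post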
  def self.first
    etag = $cache.read('data:etag')
    response = fetch_first(etag)
    if response == :not_modified
      puts 'cache HIT'
    else
      puts 'cache MISS'
      # Use the fresh ETag from the response, not the stale (or nil) one.
      etag, xml = response
      $cache.write 'data:etag', etag
      $cache.write 'data', xml
    end
    new(etag)
  end

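  # Stub standing in for the real conditional HTTP request: returns
  # [etag, xml] for a full response, or :not_modified when an ETag was sent.
  def self.fetch_first(etag)
    if etag.nil?
      puts 'Fetch XML'
      ["1449ee0ec320e5bf5ed7a9949d4771d9", "<post>Hallo!</post>"]
    else
      :not_modified
    end
  end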

  attr_reader :etag
  def initialize(etag)
    @etag = etag
    @document = nil
  end

  def body
    parse if @document.nil? # Lazy-loading
    @document.children.first.text
  end

  def parse
    @document = Nokogiri.parse($cache.read('data'))
  end
end

Two questions:

1. What can I do when the API returns a collection? I cannot return an
Array.
2. How can I wrap a domain layer around the Api::Post class?
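
For question 1, one direction I could imagine is a small Enumerable wrapper that carries the ETag and only loads its items on first iteration. This is just a sketch under assumptions: `LazyCollection`, its `cache_key` format and the loader block are all hypothetical names, and the loader here returns plain strings instead of parsed posts.

```ruby
# Hypothetical wrapper: behaves like a read-only list, but only runs the
# loader (e.g. parsing the cached XML) when somebody actually iterates.
class LazyCollection
  include Enumerable

  attr_reader :etag

  # `loader` is any block returning an Array of items.
  def initialize(etag, &loader)
    @etag = etag
    @loader = loader
    @items = nil
  end

  # Usable as a fragment cache key without touching the items.
  def cache_key
    "collection/#{etag}"
  end

  def each(&block)
    return enum_for(:each) unless block
    @items ||= @loader.call # Lazy-loading, memoized after the first call
    @items.each(&block)
    self
  end
end

loads = 0
posts = LazyCollection.new('"1449ee"') { loads += 1; %w[Hallo Hoi] }
posts.cache_key      # does not trigger the loader
posts.map(&:upcase)  # first iteration runs the loader
posts.first          # served from the memoized items
```

A view could then wrap the whole list in a fragment keyed by `posts.cache_key`, and the loader would never run on a cache hit.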

- Matthijs