Rails bug? Metadata lost between page invocations?

Ok, so I posted this on the ‘ruby on rails’ newsgroup
(http://groups.google.com/group/rubyonrails/browse_frm/thread/cfce770d3fbfbd1f/a51aad47e46e2adf#a51aad47e46e2adf)
but didn’t get very far.

Hopefully this community will be able to help or let me know whether
this is a genuine rails bug. This is a lengthy post, but please bear
with me.

So, I am trying to cache some data in memory across page invocations. I
am using a simple class ‘MyCache’, which simply holds a class variable
‘@@saved_obj’. A problem arises when you try to put a model object (an
ActiveRecord object) in @@saved_obj. If your model has ANY method
(besides those in ActiveRecord::Base), the cached object will NOT have
these methods when you get the object back from the cache. Yes, really.
It looks like the metadata for the cached object was reset between
invocations.

(If you have read my previous post on the ‘ruby on rails’ newsgroup, you
will have noticed that rails reloads model and controller classes between
invocations. That’s why the class variable has to be in a separate
class, one that won’t be reloaded between invocations.)

Let me simply give a code example to show what I mean:

Here is the SimpleObj model (notice the new method ‘extra_attr’):
# file app/models/simple_obj.rb
class SimpleObj < ActiveRecord::Base
  def extra_attr
    "hello"
  end
end

Here is the controller
# file app/controllers/my_test_controller.rb
class MyTestController < ApplicationController
  def put
    MyCache.put
  end

  def get
    MyCache.get
  end
end

Here is the cache object:
# file app/my_cache.rb
class MyCache
  @@saved_obj = nil

  def self.put
    @@saved_obj = SimpleObj.new
    p @@saved_obj.extra_attr
  end

  def self.get
    p @@saved_obj.extra_attr
  end
end

To recreate a full setup:
rails test
cd test
./script/generate model SimpleObj
./script/generate controller MyTest
Create a simple db with a table ‘simple_objs’. No need to put anything
in the table… My sqlite schema is:
drop table simple_objs;
create table simple_objs (
id integer primary key autoincrement
);
Again, the schema doesn’t matter, but it needs to be there.

That’s it.

Now start your app: ./script/server
and point your browser to:
http://localhost:3000/my_test/put
(forget about the missing view for ‘put’)
then
http://localhost:3000/my_test/get
(undefined method `extra_attr' exception)

Ok, so what is happening is:
put: after accessing my_test/put, a new SimpleObj is stored in
@@saved_obj. The value of ‘extra_attr’ is printed in the
logs/development.log file (“hello”). Perfect.

get: after accessing my_test/get, the value is retrieved from
@@saved_obj. Now, for some reason, this time @@saved_obj does NOT
have the method ‘extra_attr’!!! HOW BIZARRE.

If you simply print @@saved_obj, you will see that the object is indeed
of type ‘SimpleObj’ but, for some reason, it doesn’t seem to have the
method ‘extra_attr’.
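The symptom can be reproduced in plain Ruby with no Rails involved. This is only a sketch under an assumption that this thread does not confirm: that the development-mode reload strips the methods from the old class object and then defines a fresh class under the same name. The undef_method and remove_const calls below are a stand-in for whatever Rails actually does.

```ruby
# Plain-Ruby stand-in for the model; no ActiveRecord involved.
class SimpleObj
  def extra_attr
    "hello"
  end
end

# An object that outlives the "reload", like the one held in @@saved_obj.
cached = SimpleObj.new
before = cached.extra_attr

# ASSUMPTION: simulate a reload by stripping the old class and dropping
# its constant. This is not Rails code, just an imitation of the effect.
old_class = SimpleObj
old_class.send(:undef_method, :extra_attr)
Object.send(:remove_const, :SimpleObj)

# The "reloaded" definition: a brand new class object with the same name.
class SimpleObj
  def extra_attr
    "hello"
  end
end

fresh = SimpleObj.new

stale_name       = cached.class.name                # still reports the old name
stale_has_method = cached.respond_to?(:extra_attr)  # method is gone on the old class
fresh_has_method = fresh.respond_to?(:extra_attr)   # new instances are fine
same_class       = cached.class.equal?(SimpleObj)   # old and new class differ
```

Under that assumption, the cached object still prints as a SimpleObj because the old class keeps its name, yet it is no longer an instance of the class the constant now points to.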

I know this is a lengthy post… I tried my best to summarize everything
and isolate the problem. Thanks for reading and let me know if this
makes any sense to you.

-Didier

Let me just add one more thing (as if my first post was not long enough
already) :-)

Two more things I found interesting:

1 - of course, if you set ‘config.cache_classes = true’ in your
config/environments/development.rb, then everything works fine (I forgot
to mention that before)

2 - in the MyCache class, if you put a ‘p @@saved_obj.hash’ in the
‘self.get’ method, you end up with a ‘stack level too deep’ exception!!
(And of course, it works fine in the ‘self.put’ method…) This is
INSANE :-)

-Didier

What you are seeing is correct behavior for Rails.

What you need to persist objects between requests is the Session object.
More info can be found here:
http://wiki.rubyonrails.com/rails/pages/UnderstandingSessions
http://wiki.rubyonrails.com/rails/pages/HowtoWorkWithSessions

Rails uses a ‘Shared Nothing’ pattern, and in this pattern, global
variables are useless for any data you need between requests. In a
production environment (i.e. not WEBrick), your requests will be handled
in separate processes, so global variables will never work.
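A small plain-Ruby sketch of why process-per-request breaks class-variable caches. It assumes a Unix-like system where fork is available; ProcessLocalCache and run_in_child are hypothetical names used only for this illustration, and each forked child stands in for one request-handling process.

```ruby
# A cache held in a class variable, like @@saved_obj in the original post.
class ProcessLocalCache
  @@store = {}

  def self.put(key, value)
    @@store[key] = value
  end

  def self.get(key)
    @@store[key]
  end
end

# Runs the block in a forked child and returns the inspected result
# as a string, sent back to the parent through a pipe.
def run_in_child
  reader, writer = IO.pipe
  pid = fork do
    reader.close
    writer.write(yield.inspect)
    writer.close
    exit!(0) # leave the child immediately, skipping at_exit handlers
  end
  writer.close
  result = reader.read
  reader.close
  Process.wait(pid)
  result
end

# "Request 1" stores a value and can read it back in its own process.
first = run_in_child do
  ProcessLocalCache.put("ACME", 42)
  ProcessLocalCache.get("ACME")
end

# "Request 2" runs in a different process and sees an empty cache.
second = run_in_child { ProcessLocalCache.get("ACME") }
```

Each child gets its own copy of the class variable, so whatever one child writes is never visible to its sibling or to the parent.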

In addition, the reload behavior you are seeing is correct. As you’ve
seen, this is configurable, and is turned on for development (so you
don’t have to restart your web server), and turned off for production
(for performance).

You could use Fragment caching to store this information, keyed on your
stock symbol (or whatever makes your data unique). Note that caching is
turned off, by default, in development mode, but you can turn it on in
your development.rb file to see it work.

http://rails.rubyonrails.com/classes/ActionController/Caching/Fragments.html

If you want a global object cache, then Memcache will also work (you can
use Memcache as your fragment cache too).

http://wiki.rubyonrails.com/rails/pages/MemCached

For Rails apps, lighttpd+FCGI will generate processes, not threads, to
handle requests. This is key. Whenever you think about a Rails app,
forget that threads even exist; it makes life a lot simpler.

You may have found a bug, but it doesn’t matter as much because there is
a better way to do what you want - Fragment caching.

Tom, first, thanks for your reply.

What you are seeing is correct behavior for Rails.

hum… let me address this point later.

What you need to persist objects between requests is the Session object.
More info can be found here:
http://wiki.rubyonrails.com/rails/pages/UnderstandingSessions
http://wiki.rubyonrails.com/rails/pages/HowtoWorkWithSessions

Tom, I do understand sessions but the reason I can’t use them here is
that I want a global cache, not a user level cache. As a simple example,
think about a simple stock market app. Once a user has looked at a given
stock, you want to cache the data so that other requests to look at the
same stock (by ANY other user) will not have to go to the database…

Again, that’s simple example, but that’s the idea behind the cache I am
trying to implement (get something from db, perform some simple
calculation, and cache it so that next user/request for the same data
will not have to go to the db and perform the same calculations
again…)
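The idea can be sketched as a read-through cache keyed on the stock symbol. This is a minimal in-memory version with hypothetical names (ReadThroughCache, load_quote), and the lambda is only a stand-in for "go to the db and perform the calculation"; it says nothing about how the cache is shared between processes.

```ruby
# Minimal read-through cache: compute a value once per key, then reuse it.
class ReadThroughCache
  attr_reader :misses

  def initialize
    @entries = {}
    @misses = 0
  end

  # Returns the cached value for key, or runs the block to compute,
  # store, and return it on the first request.
  def fetch(key)
    return @entries[key] if @entries.key?(key)

    @misses += 1
    @entries[key] = yield(key)
  end
end

# Hypothetical stand-in for the database lookup plus calculation.
load_quote = lambda do |symbol|
  { symbol: symbol, average: [10.0, 12.0, 14.0].sum / 3 }
end

cache = ReadThroughCache.new
first  = cache.fetch("ACME") { |symbol| load_quote.call(symbol) } # computed
second = cache.fetch("ACME") { |symbol| load_quote.call(symbol) } # reused
```

The second fetch for the same symbol returns the stored entry without calling the loader again, which is the behavior wanted for any user asking for the same stock.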

Rails uses a ‘Shared Nothing’ pattern, and in this pattern, global
variables are useless for any data you need between requests. In a production
environment (i.e. not WEBrick), your requests will be handled in separate
processes, so global variables will never work.

Hum… lighttpd/FCGI uses threads, but that’s not my point.
And even if I end up deploying in a multi-server environment, that’s
fine. Eventually each server will build a cache of the most recently
accessed stocks, which is exactly what I am trying to achieve here.

In addition, the reload behavior you are seeing is correct. As you’ve
seen, this is configurable, and is turned on for development (so you don’t
have to restart your web server), and turned off for production (for
performance).

I suspect there is a real issue here. I understand that the
models/controllers are reloaded, but how do you explain that when I get
my @@saved_obj back (in the ‘MyCache::get’ method in my example), it has
the right class and the right member attributes (at least those that were
part of my db schema), but any additional methods I added (like
‘extra_attr’ in my example) are gone!!

And I am not even mentioning the ‘stack level too deep’ exception I get
when doing a @@saved_obj.hash!!

Wouldn’t you agree this seems like an issue?

-Didier

Tom F. wrote:

You could use Fragment caching to store this information, keyed on your
stock symbol (or whatever makes your data unique). Note that caching is
turned off, by default, in development mode, but you can turn it on in
your development.rb file to see it work.

http://rails.rubyonrails.com/classes/ActionController/Caching/Fragments.html

That’s a great idea. Especially since it is already included in rails…
The only problem is that you can only cache strings, not rails objects
(sure, sure, I can always marshal everything into a string…)
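For reference, marshalling an object into a string and back looks like this in plain Ruby. StockQuote is a hypothetical class used only for illustration; Marshal.dump and Marshal.load are from the Ruby core library.

```ruby
# Hypothetical value object standing in for whatever needs caching.
class StockQuote
  attr_reader :symbol, :price

  def initialize(symbol, price)
    @symbol = symbol
    @price = price
  end

  def label
    "#{symbol} @ #{price}"
  end
end

# Marshal.dump serializes the object into a binary String,
# which is something a string-only cache can hold.
payload = Marshal.dump(StockQuote.new("ACME", 12.5))

# Marshal.load rebuilds an instance, looking the class up by name
# at load time, so the class must be defined when loading.
restored = Marshal.load(payload)
restored_label = restored.label
```

Because the class is looked up by name when loading, the restored object is an instance of whichever StockQuote definition is current at that moment.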

If you want a global object cache, then Memcache will also work (you can
use Memcache as your fragment cache too).

http://wiki.rubyonrails.com/rails/pages/MemCached

True, but it looks like overkill here. My goal was very simple
initially :-) But thanks…

For Rails apps, lighttpd+FCGI will generate processes, not threads, to
handle requests. This is key. Whenever you think about a Rails app,
forget that threads even exist, it makes life a lot simpler.

Correct. I was wrong about fcgi.

The mindshift is not that easy… I mean, if somebody asks me to
implement a connection pool, the first thing I’ll think of is a static
array of connections (actually I would probably do that for any quick
cache)

I guess you have to start rethinking these once you switch to a
process-based dispatcher… hum…

You may have found a bug, but it doesn’t matter as much because there is
a better way to do what you want - Fragment caching.

Ok, so I guess I am left with either keeping everything in the db or
using fragments and marshalling everything. I guess I’ll try to use
marshalling here… I actually just tried it… It seems to work fine.

Thanks for your help.

-dda

For those remotely interested in my solution (thanks to Tom), here it
is:

The cache object now uses fragment_cache_store to store a marshalled
version of the object to save:
# file app/my_cache.rb
class MyCache
  def self.put
    saved_obj = SimpleObj.new
    # store the marshalled object as a string in the fragment cache
    ActionController::Base.fragment_cache_store.write('saved_obj',
      Marshal.dump(saved_obj))
  end

  def self.get
    # read the string back and rebuild the object from it
    saved_obj =
      Marshal.load(ActionController::Base.fragment_cache_store.read('saved_obj'))
    p saved_obj.extra_attr
  end
end

In the controller, I am just adding
model :simple_obj
(otherwise there are some class loading issues)

In config/environment.rb, I uncommented the line:
config.action_controller.fragment_cache_store = :file_store, "#{RAILS_ROOT}/cache"

I created $RAILS_ROOT/cache

In config/environments/development.rb, I changed the caching option:
config.action_controller.perform_caching = true
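Outside Rails, the same write/read round trip can be sketched with a small file-backed store. MarshalFileStore and SavedThing are hypothetical names for this illustration; the class only imitates the general idea of a file store holding marshalled objects and is not the Rails fragment cache API.

```ruby
require "tmpdir"

# Hypothetical file-backed store: one file per key, contents marshalled.
class MarshalFileStore
  def initialize(dir)
    @dir = dir
  end

  # Serializes the object and writes it to the file for this key.
  def write(key, object)
    File.binwrite(path_for(key), Marshal.dump(object))
  end

  # Returns the rebuilt object, or nil if the key was never written.
  def read(key)
    path = path_for(key)
    File.exist?(path) ? Marshal.load(File.binread(path)) : nil
  end

  private

  # ASSUMPTION: keys are simple names that are safe to use as file names.
  def path_for(key)
    File.join(@dir, "#{key}.cache")
  end
end

# Hypothetical value type standing in for the cached model object.
SavedThing = Struct.new(:name, :value)

store = MarshalFileStore.new(Dir.mktmpdir("cache_sketch"))
store.write("saved_obj", SavedThing.new("ACME", 42))

loaded  = store.read("saved_obj")      # rebuilt from the file
missing = store.read("never_written")  # nil, no such file
```

Since the data lives on disk rather than in a class variable, any process that can read the directory can load the object.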

That’s it… And now it works.
Thanks a lot for your help Tom.

-Didier