[This is mostly an update on
ruby - PostgreSQL queries are gobbling up VM - Stack Overflow (thanks Fred!), but I’ll
take my answer from wherever I can get it…]
Short form: does the postgresql db adaptor cache queries? I’m passing
lots of large raw SQL queries (160k bytes each), and my private virtual
memory is growing and growing and growing…
Long form:
I’m launching external Rails tasks to fetch data from the web and store
said data into a database. These tasks grow without apparent bound, and
eventually bring my machine to its knees when its 8GB of physical RAM
fills up and it starts to swap.
I haven’t been able to track down the source of the bloat. At each
Delayed::Job#after callback, I do an explicit
ObjectSpace.each_object(ActiveRecord::Relation).each(&:reset)
GC.start
report_object_use
In report_object_use, I print out ObjectSpace.each_object(cls).count for
Object, String, Array, Hash, and a few other classes. I also print out
Symbol.all_symbols.count. NONE of those numbers seem to be growing
much.
BUT I’m beginning to wonder if the db connection is caching the queries.
In particular, I’m constructing my own SQL queries and executing them
via:
ActiveRecord::Base.connection.select_all(query)
And the queries can be really really big – about 160K bytes per
transaction, and I’m doing about a transaction per second (on a good
day). Note that if the db connection IS caching the queries, it’s
happening under the Ruby level, since
ObjectSpace.each_object(String).count isn’t increasing.
So the questions:
- Does the DB adaptor somehow cache SQL queries passed to it? Or is
there a way to find out? - Um, I guess that’s all the questions.
- ff
P.S.: Rails 1.9.3, Ruby 3.2.1, Postgresql 9.1.2, pgserver 8.3.14
P.P.S.: I know the bloat isn’t happening on the network query side since
I created a mock object to simulate fetching the data.