Out of memory generating huge CSV from Active Record

I am trying to generate really large CSV files from Active Record. As
an edge-case test, I am running this against a set of 700,000 records.
The type of find() below is supposed to work in pages and not hold all
the records in memory, but I get an out-of-memory error (see the stack
dump below). The count is also never printed.

I got that info from here in the first part of “nailing down the root
cause”

I am also using faster_csv, which, according to some Google searches,
is supposed to use less memory.

I am using JRuby 1.6.7 (Ruby 1.8.7) and Active Record 2.3 (which I
assume corresponds to Rails 2.3).

On the call here, attr =
{:conditions=>["cdr_guid_id = :cdr_guid_id", {:cdr_guid_id=>30}]}

def self.export_to_csv(attr)
  self.set_client(attr[:client])
  # client is our own thing,
  # and not part of active record find()
  attr.delete(:client)

  cnt = 0
  if hrec = self.find(:first)
    CSV.generate(path) do |ofil|
      ofil << hrec.visible_attributes.keys
      self.find(:all, attr).each do |rec|
        cnt += 1
        puts cnt.to_s if cnt % 100 == 0
        ofil << rec.visible_attributes.keys.map{|col| rec.send(col) }
      end
    end
  end
end

===================

[2012-05-07 17:06:35] ERROR Java::JavaLang::OutOfMemoryError: Java heap space
java.nio.channels.spi.AbstractInterruptibleChannel.begin(Unknown Source)

May 7, 2012 5:06:35 PM com.microsoft.sqlserver.jdbc.TDSParser throwUnexpectedTokenException
SEVERE: ConnectionID:2: getNextResult: Encountered unexpected unknown token (0x0)
May 7, 2012 5:06:35 PM com.microsoft.sqlserver.jdbc.TDSReader throwInvalidTDS
SEVERE: ConnectionID:2 got unexpected value in TDS response at offset: 0
ActiveRecord::StatementInvalid - Java::JavaLang::OutOfMemoryError: Java heap space: SELECT * FROM record_set_3764 WHERE (cdr_guid_id = 30) :
C:/Users/lgu/vendor/activerecord-2.3.8/lib/active_record/connection_adapters/abstract_adapter.rb:221:in `log'
C:/Users/lgu/vendor/activerecord-jdbc-adapter-0.9.7-java/lib/active_record/connection_adapters/jdbc_adapter.rb:655:in `select'
C:/Users/lgu/vendor/activerecord-jdbc-adapter-0.9.7-java/lib/active_record/connection_adapters/jdbc_adapter.rb:567:in `jdbc_select_all'
C:/Users/lgu/vendor/activerecord-2.3.8/lib/active_record/connection_adapters/abstract/query_cache.rb:62:in `select_all_with_query_cache'
C:/Users/lgu/vendor/activerecord-2.3.8/lib/active_record/base.rb:664:in `find_by_sql'
C:/Users/lgu/vendor/activerecord-2.3.8/lib/active_record/base.rb:1578:in `find_every'
C:/Users/lgu//vendor/activerecord-2.3.8/lib/active_record/base.rb:618:in `find'
./recordset_models.rb:465:in `export_to_csv'
c:/jruby-1.6.7/lib/ruby/1.8/csv.rb:330:in `open_writer'
c:/jruby-1.6.7/lib/ruby/1.8/csv.rb:677:in `generate'
c:/jruby-1.6.7/lib/ruby/1.8/csv.rb:329:in `open_writer'
c:/jruby-1.6.7/lib/ruby/1.8/csv.rb:111:in `generate'
./recordset_models.rb:459:in `export_to_csv'
./recordset_models.rb:368:in `export_to_csv'
recordset.rb:72:in `__file__'
org/jruby/RubyProc.java:270:in `call'
org/jruby/RubyMethod.java:129:in `call'
c:/jruby-1.6.7/lib/ruby/gems/1.8/gems/sinatra-1.2.6/lib/sinatra/base.rb:1151:in `compile!'
org/jruby/RubyKernel.java:2045:in `instance_eval'
c:/jruby-1.6.7/lib/ruby/gems/1.8/gems/sinatra-1.2.6/lib/sinatra/base.rb:724:in `route_eval'
c:/jruby-1.6.7/lib/ruby/gems/1.8/gems/sinatra-1.2.6/lib/sinatra/base.rb:708:in `route!'
c:/jruby-1.6.7/lib/ruby/gems/1.8/gems/sinatra-1.2.6/lib/sinatra/base.rb:758:in `process_route'
org/jruby/RubyKernel.java:1183:in `catch'
c:/jruby-1.6.7/lib/ruby/gems/1.8/gems/sinatra-1.2.6/lib/sinatra/base.rb:755:in `process_route'
c:/jruby-1.6.7/lib/ruby/gems/1.8/gems/sinatra-1.2.6/lib/sinatra/base.rb:707:in `route!'
org/jruby/RubyArray.java:1615:in `each'
c:/jruby-1.6.7/lib/ruby/gems/1.8/gems/sinatra-1.2.6/lib/sinatra/base.rb:706:in `route!'
c:/jruby-1.6.7/lib/ruby/gems/1.8/gems/sinatra-1.2.6/lib/sinatra/base.rb:843:in `dispatch!'
c:/jruby-1.6.7/lib/ruby/gems/1.8/gems/sinatra-1.2.6/lib/sinatra/base.rb:644:in `call!'
org/jruby/RubyKernel.java:2045:in `instance_eval'
c:/jruby-1.6.7/lib/ruby/gems/1.8/gems/sinatra-1.2.6/lib/sinatra/base.rb:808:in `invoke'
org/jruby/RubyKernel.java:1183:in `catch'
c:/jruby-1.6.7/lib/ruby/gems/1.8/gems/sinatra-1.2.6/lib/sinatra/base.rb:808:in `invoke'
c:/jruby-1.6.7/lib/ruby/gems/1.8/gems/sinatra-1.2.6/lib/sinatra/base.rb:644:in `call!'
c:/jruby-1.6.7/lib/ruby/gems/1.8/gems/sinatra-1.2.6/lib/sinatra/base.rb:629:in `call'
C:/Users/lgu/nakajima-rack-flash-0.1.0/lib/rack/flash.rb:154:in `call'
c:/jruby-1.6.7/lib/ruby/gems/1.8/gems/rack-1.3.1/lib/rack/session/abstract/id.rb:195:in `context'
c:/jruby-1.6.7/lib/ruby/gems/1.8/gems/rack-1.3.1/lib/rack/session/abstract/id.rb:190:in `call'
c:/jruby-1.6.7/lib/ruby/gems/1.8/gems/rack-1.3.1/lib/rack/head.rb:9:in `call'
c:/jruby-1.6.7/lib/ruby/gems/1.8/gems/rack-1.3.1/lib/rack/commonlogger.rb:20:in `call'
c:/jruby-1.6.7/lib/ruby/gems/1.8/gems/sinatra-1.2.6/lib/sinatra/showexceptions.rb:21:in `call'
c:/jruby-1.6.7/lib/ruby/gems/1.8/gems/rack-1.3.1/lib/rack/methodoverride.rb:24:in `call'
c:/jruby-1.6.7/lib/ruby/gems/1.8/gems/sinatra-1.2.6/lib/sinatra/base.rb:1272:in `call'
c:/jruby-1.6.7/lib/ruby/gems/1.8/gems/sinatra-1.2.6/lib/sinatra/base.rb:1303:in `synchronize'
c:/jruby-1.6.7/lib/ruby/gems/1.8/gems/sinatra-1.2.6/lib/sinatra/base.rb:1272:in `call'
c:/jruby-1.6.7/lib/ruby/gems/1.8/gems/rack-1.3.1/lib/rack/handler/webrick.rb:59:in `service'
c:/jruby-1.6.7/lib/ruby/1.8/webrick/httpserver.rb:104:in `service'
c:/jruby-1.6.7/lib/ruby/1.8/webrick/httpserver.rb:65:in `run'
c:/jruby-1.6.7/lib/ruby/1.8/webrick/server.rb:173:in `start_thread'
org/jruby/RubyProc.java:270:in `call'
org/jruby/RubyProc.java:224:in `call'
127.0.0.1 - - [07/May/2012:17:05:20 EDT] "GET /csv_file/3764/e09d5eeee791ac74ff325e2163d1712226e466e3.csv HTTP/1.1" 500 120496
http://localhost:4567/ -> /csv_file/3764/e09d5eeee791ac74ff325e2163d1712226e466e3.csv

On Monday, May 7, 2012 6:10:43 PM UTC-3, Jedrin wrote:

No, your code will load all records; find(:all).each does that.

Switch to either find_in_batches or find_each.

See the documentation:

http://api.rubyonrails.org/classes/ActiveRecord/Batches.html
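
As a rough sketch of the difference: find(:all) builds one array holding every row, while find_in_batches hands one array per batch to a block, so only a batch's worth of records needs to be alive at a time. The FakeModel class below is a hypothetical stand-in so the snippet runs without a database. Its find_in_batches only imitates the block-yielding shape of the Active Record method and keeps its sample rows in memory, which the real method does not do.

```ruby
# Hypothetical stand-in for an Active Record model, so this runs without a database.
class FakeModel
  ROWS = (1..2500).map { |i| { :id => i, :name => "row#{i}" } }

  # Imitates the block-yielding shape of find_in_batches: each call to the
  # block receives an Array of at most :batch_size records.
  def self.find_in_batches(options = {})
    batch_size = options[:batch_size] || 1000
    ROWS.each_slice(batch_size) { |batch| yield batch }
  end
end

largest_batch = 0
total = 0

# The block is attached directly to find_in_batches; there is no .each here.
FakeModel.find_in_batches(:batch_size => 1000) do |records|
  largest_batch = [largest_batch, records.size].max
  records.each { |rec| total += 1 }
end

puts "processed #{total} rows, never more than #{largest_batch} at once"
```

With a real model, find_each follows the same idea but yields one record at a time instead of an array.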


Luis L.

Apparently I did not want the each():

self.find_in_batches(attr).each do |recs|

but rather

self.find_in_batches(attr) do |recs|
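
The reason the first form fails can be reproduced in plain Ruby. In `find_in_batches(attr).each do ... end` the block belongs to `each`, so find_in_batches itself is called with no block, and its `yield` has nothing to call. A minimal sketch, where each_batch is a hypothetical helper that yields batches the same way:

```ruby
# Hypothetical helper that hands each batch to the caller's block
# with a plain `yield`, as a batching method typically does.
def each_batch(items, batch_size)
  items.each_slice(batch_size) { |batch| yield batch }
end

# Wrong: no block is attached to each_batch itself. The block belongs to
# .each, which never runs because each_batch raises first.
error = nil
begin
  each_batch([1, 2, 3, 4, 5], 2).each { |batch| p batch }
rescue LocalJumpError => e
  error = e
end
puts "without a block: #{error.class}"

# Right: the block goes directly on the batching method.
seen = []
each_batch([1, 2, 3, 4, 5], 2) { |batch| seen << batch }
p seen
```

The exact wording of the error message varies between Ruby versions, but the exception class is LocalJumpError in both cases.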

Thanks, I have tried find_each and find_in_batches. I seem to get a
LocalJumpError, "yield called out of block". Why would that be?

The approach here is to open the CSV for appending after the first
batch, but it doesn't work so far because of the error.
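
One possible alternative to switching the mode from 'w' to 'a' between batches is to open the file once, write the header once, and run the batch loop inside that single block. The sketch below uses the CSV library from a current Ruby and hypothetical sample data in place of the batches that find_in_batches would yield, so it is an illustration of the file handling only, not of the Ruby 1.8 setup in this thread.

```ruby
require 'csv'
require 'tmpdir'

# Hypothetical sample data standing in for the batches that
# find_in_batches would yield.
header  = %w[id name]
batches = [[[1, 'a'], [2, 'b']], [[3, 'c']]]

path = File.join(Dir.mktmpdir, 'export.csv')

# Open the file once, write the header once, then write every batch
# through the same handle. No switching from 'w' to 'a' is needed.
CSV.open(path, 'w') do |csv|
  csv << header
  batches.each do |rows|
    rows.each { |row| csv << row }
  end
end

lines = File.readlines(path).map(&:chomp)
p lines
```

Keeping the header outside the batch loop also avoids repeating it once per batch.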

def self.export_to_csv(attr)
  self.set_client(attr[:client])
  # client is our own thing,
  # and not part of active record find()
  attr.delete(:client)

  cnt = 0
  path = attr[:path]
  attr.delete(:path)

  if hrec = self.find(:first)
    mode = 'w'
    self.find_in_batches(attr).each do |recs|
      CSV.open(path,mode) do |ofil|
        # CSV.open(path, 'w') do |ofil|
        # ofil << rec.class.column_names
        p attr
        ofil << hrec.visible_attributes.keys
        cnt += 1
        puts cnt.to_s if cnt % 100 == 0
        recs.each do |rec|
          # ofil << rec.class.column_names.map{|col| rec.send(col) }
          ofil << rec.visible_attributes.keys.map{|col| rec.send(col) }
        end
      end
      mode = 'a'
    end
  end
end

LocalJumpError - yield called out of block:
C:/Users/lgu/tools/recordset/vendor/activerecord-2.3.8/lib/active_record/batches.rb:66:in `find_in_batches'
./recordset_models.rb:461:in `export_to_csv'
./recordset_models.rb:368:in `export_to_csv'
recordset.rb:72:in `__file__'
org/jruby/RubyProc.java:270:in `call'
org/jruby/RubyMethod.java:129:in `call'
c:/jruby-1.6.7/lib/ruby/gems/1.8/gems/sinatra-1.2.6/lib/sinatra/base.rb:1151:in `compile!'
org/jruby/RubyKernel.java:2045:in `instance_eval'
c:/jruby-1.6.7/lib/ruby/gems/1.8/gems/sinatra-1.2.6/lib/sinatra/base.rb:724:in `route_eval'
c:/jruby-1.6.7/lib/ruby/gems/1.8/gems/sinatra-1.2.6/lib/sinatra/base.rb:708:in `route!'
c:/jruby-1.6.7/lib/ruby/gems/1.8/gems/sinatra-1.2.6/lib/sinatra/base.rb:758:in `process_route'
org/jruby/RubyKernel.java:1183:in `catch'
c:/jruby-1.6.7/lib/ruby/gems/1.8/gems/sinatra-1.2.6/lib/sinatra/base.rb:755:in `process_route'
c:/jruby-1.6.7/lib/ruby/gems/1.8/gems/sinatra-1.2.6/lib/sinatra/base.rb:707:in `route!'
org/jruby/RubyArray.java:1615:in `each'
c:/jruby-1.6.7/lib/ruby/gems/1.8/gems/sinatra-1.2.6/lib/sinatra/base.rb:706:in `route!'
c:/jruby-1.6.7/lib/ruby/gems/1.8/gems/sinatra-1.2.6/lib/sinatra/base.rb:843:in `dispatch!'
c:/jruby-1.6.7/lib/ruby/gems/1.8/gems/sinatra-1.2.6/lib/sinatra/base.rb:644:in `call!'
org/jruby/RubyKernel.java:2045:in `instance_eval'
c:/jruby-1.6.7/lib/ruby/gems/1.8/gems/sinatra-1.2.6/lib/sinatra/base.rb:808:in `invoke'
org/jruby/RubyKernel.java:1183:in `catch'
c:/jruby-1.6.7/lib/ruby/gems/1.8/gems/sinatra-1.2.6/lib/sinatra/base.rb:808:in `invoke'
c:/jruby-1.6.7/lib/ruby/gems/1.8/gems/sinatra-1.2.6/lib/sinatra/base.rb:644:in `call!'
c:/jruby-1.6.7/lib/ruby/gems/1.8/gems/sinatra-1.2.6/lib/sinatra/base.rb:629:in `call'
C:/Users/lguild/lguild_BED-L-LGUILD_361/lguild_BED-L-LGUILD_361/TAAS/Trunk/tools/recordset/vendor/nakajima-rack-flash-0.1.0/lib/rack/flash.rb:154:in `call'
c:/jruby-1.6.7/lib/ruby/gems/1.8/gems/rack-1.3.1/lib/rack/session/abstract/id.rb:195:in `context'
c:/jruby-1.6.7/lib/ruby/gems/1.8/gems/rack-1.3.1/lib/rack/session/abstract/id.rb:190:in `call'
c:/jruby-1.6.7/lib/ruby/gems/1.8/gems/rack-1.3.1/lib/rack/head.rb:9:in `call'
c:/jruby-1.6.7/lib/ruby/gems/1.8/gems/rack-1.3.1/lib/rack/commonlogger.rb:20:in `call'
c:/jruby-1.6.7/lib/ruby/gems/1.8/gems/sinatra-1.2.6/lib/sinatra/showexceptions.rb:21:in `call'
c:/jruby-1.6.7/lib/ruby/gems/1.8/gems/rack-1.3.1/lib/rack/methodoverride.rb:24:in `call'
c:/jruby-1.6.7/lib/ruby/gems/1.8/gems/sinatra-1.2.6/lib/sinatra/base.rb:1272:in `call'
c:/jruby-1.6.7/lib/ruby/gems/1.8/gems/sinatra-1.2.6/lib/sinatra/base.rb:1303:in `synchronize'
c:/jruby-1.6.7/lib/ruby/gems/1.8/gems/sinatra-1.2.6/lib/sinatra/base.rb:1272:in `call'
c:/jruby-1.6.7/lib/ruby/gems/1.8/gems/rack-1.3.1/lib/rack/handler/webrick.rb:59:in `service'
c:/jruby-1.6.7/lib/ruby/1.8/webrick/httpserver.rb:104:in `service'
c:/jruby-1.6.7/lib/ruby/1.8/webrick/httpserver.rb:65:in `run'
c:/jruby-1.6.7/lib/ruby/1.8/webrick/server.rb:173:in `start_thread'
org/jruby/RubyProc.java:270:in `call'
org/jruby/RubyProc.java:224:in `call'
127.0.0.1 - - [08/May/2012:10:37:07 EDT] "GET /csv_file/3764/e09d5eeee791ac74ff325e2163d1712226e466e3.csv HTTP/1.1" 500 91568
http://localhost:4567/ -> /csv_file/3764/e09d5eeee791ac74ff325e2163d1712226e466e3.csv