Memory leak problem


#1

I have a text file with some data that I need to import into the database.
I'm using Ruby to import the data, but I have a major problem: when there
is a large amount of data, I run out of memory.

File.open("#{RAILS_ROOT}/public/files/Neighborsville.TXT").each() do |line|
  @stringArray = line.split("|")
  @i += 1
  puts @i
  @pid = @stringArray[0]
  @chain_id = @stringArray[1]
  @business = Business.find_by_pid_and_chain_id(@pid,@chain_id);
  #Check PID + CHAIN_ID
  @business.pid = @stringArray[0]
  @business.chain_id = @stringArray[1]
  @business.cityname = @stringArray[17]
  @business.state = @stringArray[18]
  @business.business = Business.find_by_pid_and_chain_id(@pid,@chain_id);
  @business.city = City.new
  @business.business_category = get_category_id(@stringArray[40])
  @business.address = @stringArray[8] + " " + @stringArray[9] + " " +
    @stringArray[10] + " " + @stringArray[11] + " " + @stringArray[12] + " " +
    @stringArray[13] + " " + @stringArray[14]
  if @chain_id == nil
    @chain_id = ""
  end
  business.save
end
end

I believe that on every iteration of the loop Ruby allocates new blocks of
memory for my instances of Business. Can someone help me, please?

Thanks,

Elioncho


#2

Elias O. wrote:

I believe that on every iteration of the loop Ruby allocates new blocks of
memory for my instances of Business.

Yes that’s right. You’ve possibly run into an infamous “ruby’s GC is
broken!” bug. Then again, I could be wrong.
Question:
don’t you want @business.save?

Anyway, some ways to avoid this:
you may be able to use ar-extensions, which allows for multiple inserts
into the DB. Oh wait, except that you are doing multiple updates. Never
mind.
In that case, as gross as it seems, you could try forking once per loop
[or once every x lines of the input file]. That way the forked process
will die [along with its high RAM consumption], allowing the parent
process to continue [and fork more].

require 'forkoff'
File.open("#{RAILS_ROOT}/public/files/Neighborsville.TXT").each() do |line|
  [1].forkoff {
    # do your stuff
  }
end
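If you would rather not pull in a gem, roughly the same idea can be sketched with plain Kernel#fork, forking once per batch of lines instead of once per line. The helper name each_batch_in_child and the batch size are made up for illustration, this only works on platforms that support fork, and with ActiveRecord the child would most likely need to open its own database connection.

```ruby
# Sketch only: run the block for each batch of lines inside a short-lived
# child process, so whatever memory the batch uses is released when the
# child exits. `each_batch_in_child` is a hypothetical helper name.
def each_batch_in_child(path, batch_size)
  # File.foreach without a block returns an Enumerator, so lines are read
  # lazily and grouped into arrays of at most `batch_size` lines.
  File.foreach(path).each_slice(batch_size) do |lines|
    pid = fork do
      yield lines   # child: do the memory-hungry work here
      exit!(0)      # leave at once, skipping the parent's at_exit handlers
    end
    Process.wait(pid) # parent: wait, so only one child runs at a time
  end
end

# Hypothetical usage:
#   each_batch_in_child("Neighborsville.TXT", 500) do |lines|
#     lines.each { |line| ... import one line ... }
#   end
```

Nothing the child changes in memory is visible to the parent, so any results have to go through the database or a file.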

Maybe it won’t help. I know with huge data sets it helps me avoid the
GC. I’ve never tried it with sql and rails.
Good luck.
-=R


#3

On Oct 9, 3:39 am, Elias O. removed_email_address@domain.invalid wrote:

  @pid = @stringArray[0]
  @business.business_category = get_category_id(@stringArray[40])

I believe that on every iteration of the loop Ruby allocates new blocks of
memory for my instances of Business. Can someone help me, please?

Thanks,

Elioncho

I would suggest that you don't use member variables in this situation
if possible (i.e. 'business' rather than '@business', 'i' rather
than '@i'). It doesn't look like you need member variables here, and
using them could mean that the GC won't collect some memory that it
otherwise could.
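As a rough sketch of what I mean, here is the parsing part of the loop written with block-local variables only. The helper name each_record is made up, and the Business calls in the comment are just copied from your post, so treat this as an outline rather than tested Rails code.

```ruby
# Sketch only: split each pipe-delimited line using block-local variables.
# Locals go out of scope when the block returns, so nothing here keeps a
# reference to the previous line's data. `each_record` is a hypothetical name.
def each_record(io)
  io.each_line do |line|
    fields   = line.chomp.split("|", -1) # -1 keeps trailing empty fields
    pid      = fields[0]
    chain_id = fields[1]
    yield pid, chain_id, fields
  end
end

# Hypothetical usage inside the Rails app:
#   File.open("#{RAILS_ROOT}/public/files/Neighborsville.TXT") do |file|
#     each_record(file) do |pid, chain_id, fields|
#       business = Business.find_by_pid_and_chain_id(pid, chain_id)
#       business.cityname = fields[17]
#       business.save
#     end
#   end
```

Passing a block to File.open also closes the file when the import finishes.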

You could also try a patched version of ruby with better memory
collection (http://lloydforge.org/projects/ruby/,
http://blog.pluron.com/2008/01/ruby-on-rails-i/comments/page/2/ or
http://www.rubyenterpriseedition.com/)

Also try explicitly calling GC.start at the beginning of every loop
(or every few loops). This will slow your code down a lot but I’ve
occasionally seen cases where it has helped.
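
Something along these lines, as a sketch: the helper name each_with_periodic_gc and the interval of 1000 are arbitrary choices of mine, so tune the interval for your data.

```ruby
# Sketch only: yield every item, forcing a full garbage collection once
# every `gc_every` items. `each_with_periodic_gc` is a hypothetical name.
def each_with_periodic_gc(items, gc_every = 1000)
  items.each_with_index do |item, index|
    # GC.start runs a collection immediately; doing it on every pass is slow,
    # so it only happens at the start of every gc_every-th iteration.
    GC.start if index > 0 && (index % gc_every).zero?
    yield item
  end
end

# Hypothetical usage:
#   each_with_periodic_gc(File.foreach("Neighborsville.TXT")) do |line|
#     ... import one line ...
#   end
```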

All that said, the Ruby GC is a bit crap and you might just have to use
the fork approach suggested by Roger.

Dan