Parsing large JSON files without slurping?

I have working code that uses the JSON gem. It does what I need, but it is
very heavy on memory and isn’t all that fast.

require 'rubygems'
require 'json'

json_file = File.read('Data.json')   # slurps the entire file into memory
data_hash = JSON.parse(json_file)

docs = data_hash.length
(1...docs).each do |docnum|
  puts JSON.pretty_generate(data_hash[docnum])
end

But when dealing with 500 MB files this is a bit painful, so I’d like to
do something like this instead:

require 'json'

IO.foreach('Data.json') do |line|   # reads one line at a time, no slurping
  doc = JSON.parse(line)
  puts JSON.pretty_generate(doc)
end

The problem is that this doesn’t work, and I’m not sure why, other than
that this approach may simply not be compatible with Ruby’s JSON parser.

Can someone point me in the right direction, or am I stuck slurping files
into RAM?

Austin

I think the 'oj' gem has the functionality that you are looking for:

http://www.ohler.com/oj/
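
For example, Oj’s SAJ parser streams the input through callbacks instead of
building the whole object tree, so memory stays flat. Here is a minimal
sketch, assuming the file is a single top-level array of objects; the
DocCounter handler below is just an illustration, not part of Oj:

require 'rubygems'
require 'oj'

# Oj invokes these callbacks as it streams the input, so the 500 MB file
# never has to fit in memory at once. This handler only counts the
# top-level documents; a real one would accumulate each document's keys
# and values and flush them whenever a depth-1 object closes.
class DocCounter < Oj::Saj
  attr_reader :docs

  def initialize
    @depth = 0
    @docs  = 0
  end

  def hash_start(key)
    @depth += 1
  end

  def hash_end(key)
    @depth -= 1
    @docs += 1 if @depth == 1   # an element of the top-level array just closed
  end

  def array_start(key)
    @depth += 1
  end

  def array_end(key)
    @depth -= 1
  end

  def add_value(value, key)
    # leaf values (strings, numbers, booleans, nil) arrive here
  end

  def error(message, line, column)
    warn "parse error at #{line}:#{column}: #{message}"
  end
end

handler = DocCounter.new
File.open('Data.json') do |io|
  Oj.saj_parse(handler, io)   # streams from the IO rather than slurping it
end
puts "#{handler.docs} documents"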

Cameron

Hi Cameron,
Thanks for responding. I ended up using YAJI, which, as it happens, was
written by a Couchbase developer and which I had somehow overlooked. I
have attached my program, and the blog post below describes the task.

http://couchbaseallday.blogspot.com/2014/11/bulk-loading-documents-in-couchbase.html
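
The heart of it is just a few lines. This is only a sketch of the YAJI
pattern rather than the attached program itself, and the '/' path selector
is an assumption about both YAJI’s selector syntax and the file being one
top-level array of documents:

require 'rubygems'
require 'yaji'
require 'json'

# YAJI wraps the yajl C library and yields one object at a time as it
# pulls tokens off the stream, so nothing close to 500 MB is ever
# resident in memory.
File.open('Data.json') do |io|
  parser = YAJI::Parser.new(io)
  parser.each('/') do |doc|
    puts JSON.pretty_generate(doc)   # handle each document as it arrives
  end
end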

I hope it’s useful!
Austin