Hi

What is the correct way to code the following in Ruby?

    open file
    read field from file
    determine from field the type of file
    case file type in
      type 1) do this (lots of stuff)
      type 2) do that (even more stuff)
      type 3) do something else (you get the picture)
      *) error
    esac
    close file

I presently have this coded as one big method, but the number of cases is large and the complexity of the processing is increasing. This is going to end up as one hell of a large method.

I then thought about breaking it into a number of methods, but that resulted in opening and closing the file a number of times (once in each method), which feels bad.

Would it be right to use a global variable as a file descriptor: open the file in one method which returns an fd, have a number of methods do their processing using the fd, and finally have a method that closes the file?

Sorry if I seem to be grasping for the correct Ruby language to express this. I'm very new to Ruby, coming from a C and shell background.

Daveh
on 2007-01-30 19:28
on 2007-01-30 19:35
Recently coded something similar. I just slurp the file into an array (File.readlines) and then split the array into separate arrays of each record type.

cheers
Chris
on 2007-01-30 19:40
Hi Chris,

> I just slurp the file into an array (File.readlines) and then
> split the array into separate arrays of each record type.

I guess I should have mentioned that these files are in a binary format and can be quite large, anywhere from 3 MB to 50 MB, so I was hoping to process them on disk. Would reading into an array still be appropriate in Ruby?
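For a binary file of that size, one option is to read only the type field up front and then walk the rest in fixed-size chunks, so memory use stays bounded. The following is a minimal sketch, not a definitive implementation: the 4-byte big-endian header, the chunk size, and the names `read_type_code` and `each_chunk` are all assumptions made for illustration, and a `StringIO` stands in for `File.open(path, 'rb')`.

```ruby
require 'stringio'

# Hypothetical layout: the first 4 bytes are a big-endian unsigned
# integer identifying the file type. This layout is an assumption.
HEADER_SIZE = 4

# Reads only the header from an already-open IO and returns the type
# code, or nil if the stream is shorter than the header.
def read_type_code(io)
  header = io.read(HEADER_SIZE)
  return nil if header.nil? || header.bytesize < HEADER_SIZE
  header.unpack('N').first
end

# Walks the rest of the stream in fixed-size chunks and yields each
# chunk to the block, so the whole file is never held in memory.
def each_chunk(io, chunk_size = 64 * 1024)
  while (chunk = io.read(chunk_size))
    yield chunk
  end
end

# Usage with an in-memory stream standing in for a real binary file.
io = StringIO.new([2].pack('N') + 'payload bytes')
type_code = read_type_code(io)
total = 0
each_chunk(io, 4) { |chunk| total += chunk.bytesize }
puts "type=#{type_code} payload=#{total} bytes"
```

With a real file, the same two methods would be called inside a `File.open(path, 'rb') do |file| ... end` block, which keeps the data on disk apart from the current chunk.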
on 2007-01-30 19:49
On 1/30/07, Dave H. <email@example.com> wrote:
>

If you are on a Unix system, look into the Ruby mmap library.

<http://raa.ruby-lang.org/project/mmap/0.2.6>

This will only read in the parts of the file you need. There is a similar library for the Windows platform (but I have not used that one).

My solution to handling stuff like this is to create a parser class.

    require 'mmap'

    class FooParser
      # Map the file read-only; pages are loaded only when touched.
      def initialize( filename )
        @mmap = Mmap.new(filename, 'r')
      end

      # Release the mapping; safe to call more than once.
      def close
        return if @mmap.nil?
        @mmap.unmap
        @mmap = nil
      end

      def parse_info_type1
        # do stuff here to parse one type of information from the mmap object
      end

      def parse_info_type2
        # etc ...
      end
    end

It works very well, and mmap allows me to handle gigabyte-sized files without hogging all the system memory.

Blessings,
TwP
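The same parser-class shape also works with a plain `File` handle when the mmap library is not available. The sketch below is only an illustration under stated assumptions: `RecordParser`, `record_type`, `read_at`, and the one-byte type field at offset 0 are hypothetical and not part of the class above. The class-level `open` yields the parser and closes the handle in an `ensure`, so the file is opened and closed exactly once.

```ruby
require 'tempfile'

# Hypothetical parser that owns a single file handle for its lifetime.
class RecordParser
  # Opens the file, yields the parser, and always closes the handle.
  def self.open(filename)
    parser = new(filename)
    begin
      yield parser
    ensure
      parser.close
    end
  end

  def initialize(filename)
    @file = File.open(filename, 'rb')
  end

  # Closes the handle; safe to call more than once.
  def close
    return if @file.nil?
    @file.close
    @file = nil
  end

  def closed?
    @file.nil?
  end

  # Assumed layout: a single byte at offset 0 holds the record type.
  def record_type
    @file.seek(0)
    byte = @file.read(1)
    byte && byte.unpack('C').first
  end

  # Reads `length` bytes starting at `offset` without touching the rest.
  def read_at(offset, length)
    @file.seek(offset)
    @file.read(length)
  end
end

# Usage: write a small sample file, then parse it through one handle.
Tempfile.create('sample') do |tmp|
  tmp.binmode
  tmp.write([7].pack('C') + 'hello world')
  tmp.flush
  RecordParser.open(tmp.path) do |parser|
    puts parser.record_type
    puts parser.read_at(1, 5)
  end
end
```

Seeking and reading on demand keeps memory use small, though unlike mmap every access goes through an explicit `seek` and `read`.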
on 2007-01-30 19:49
Dave H. wrote:
> What is the correct way to code the following in Ruby?
>
> open file
> read field from file
> determine from field the type of file
> case file type in
> type 1) do this (lots of stuff)
> type 2) do that (even more stuff)
> type 3) do something else (you get the picture)
> *) error
> esac
> close file

You shouldn't need to open/close the file more than once. Unless I'm misunderstanding something, you should be able to just do something like this:

    # Define the handlers first so they exist when the block below runs.
    # process_tab and determine_file_type would be written the same way.
    def process_csv(file)
      file.each do |line|
        # do something with the line
      end
    end

    File.open('huge.txt') do |file|
      first_line = file.gets
      file_type = determine_file_type(first_line)

      case file_type
      when 'csv' then process_csv(file)
      when 'tab_delimited' then process_tab(file)
      else puts "Error! OMG!"
      end
    end

The main File.open block handles the opening and closing of the file, and you just pass the file handle to the methods that do the actual processing. Nice and simple.
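Since the original question mentions a large number of cases, a lookup table from type name to handler method is one alternative to a long case statement. This is a rough sketch rather than the approach from the post above: the `HANDLERS` hash, the `process` method, the handler bodies, and the convention that the first line names the file type are all assumptions made for illustration.

```ruby
require 'stringio'

# Hypothetical handlers; each receives the already-open IO object.
def process_csv(io)
  io.each_line.map { |line| line.chomp.split(',') }
end

def process_tab(io)
  io.each_line.map { |line| line.chomp.split("\t") }
end

# Maps a type name to the method that handles it. Supporting a new
# type means adding one method and one entry here.
HANDLERS = {
  'csv' => :process_csv,
  'tab_delimited' => :process_tab
}.freeze

# Assumed convention: the first line of the stream names the file type.
def process(io)
  file_type = io.gets.to_s.chomp
  handler = HANDLERS[file_type]
  raise ArgumentError, "unknown file type: #{file_type.inspect}" if handler.nil?
  send(handler, io)
end

# Usage with an in-memory stream standing in for an open File.
rows = process(StringIO.new("csv\na,b\nc,d\n"))
puts rows.inspect
```

With a real file, `process` would be called inside the `File.open` block, so the handle is still opened and closed exactly once.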