Generating Row objects from tab-delimited text lines

This is nothing ground-breaking, but I thought I’d share it anyway. It
was a fun little exercise, and I’ve found it pretty useful.

RowParser#parse takes a String (or reads from an IO) and assumes the
text is \n- and \t-delimited, with the first line holding the
\t-delimited column headers. It generates a Row class with attributes
named for the columns and returns an array of populated Row objects.
Column names are munged a little (first letter downcased, punctuation
and spaces stripped), and empty columns are allowed.
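
To make the name munging concrete, here is a minimal sketch of that one step on its own. The helper name munge_header is made up for illustration and is not part of RowParser; the logic mirrors what the source further down does to each header.

```ruby
# Sketch of the header clean-up step: lower-case the first letter, then
# drop every character that is not a letter, digit, or underscore.
# munge_header is a hypothetical helper name, not part of RowParser.
def munge_header(header)
  name = header[0, 1].downcase + header[1..-1].to_s
  name.gsub(/\W/, '')
end

headers = "Brewery/Brand\tBeer\tAlcohol %\tCalories\tCarbs".split("\t")
p headers.map { |h| munge_header(h) }
# => ["breweryBrand", "beer", "alcohol", "calories", "carbs"]
```

So a column headed "Alcohol %" ends up as the attribute `alcohol`, and "Brewery/Brand" as `breweryBrand`.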

It’s been useful during a data conversion I’m working on – both SQL
output and MS Word/Excel tables save easily to tab-delimited text files.

Example: here’s a table of nutritional info for different beers (from
http://www.realbeer.com/edu/health/calories.php). I hope the tabs come
out alright – notice that some of the columns are empty:

Brewery/Brand	Beer	Alcohol %	Calories	Carbs
Amstel Light	Amstel Light	3.5	95	5
Alaskan Brewing	Alaskan Amber	5		
Alaskan Brewing	Alaskan Pale Ale	4.6		
Alaskan Brewing	Alaskan Stout	5.7		
Alaskan Brewing	Alaskan ESB	5		
Alaskan Brewing	Alaskan Smoked Porter	6.1		
Alaskan Brewing	Alaskan Winter Ale	6.2		
Anchor	Anchor Steam	4.9	152	
Anchor	Liberty Ale	6	188	
Anchor	Anchor Porter	5.6	205	

A bit of Ruby:
rp = RowParser.new
data = rp.parse(File.open("beer.txt"))
data.each { |d|
  puts "#{d.beer}, #{d.calories}"
}

The output:
Amstel Light, 95
Alaskan Amber,
Alaskan Pale Ale,
Alaskan Stout,
Alaskan ESB,
Alaskan Smoked Porter,
Alaskan Winter Ale,
Anchor Steam, 152
Liberty Ale, 188
Anchor Porter, 205

Here’s the source:

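class String
  # Split on delim, keeping empty fields (including trailing ones).
  # Relies on delim being whitespace (e.g. "\t"), since strip removes it.
  def conservative_split(delim)
    str = self.dup  # Work on a copy; slice! below is destructive
    bits = []
    while str.include? delim
      # Inclusive range: cut off the field plus its delimiter
      bits << str.slice!(0..str.index(delim)).strip
    end
    bits << str
    return bits
  end
end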
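
A note on why a custom split is needed at all, as far as I can tell: String#split with no limit argument throws away trailing empty fields, which would leave rows with too few values. Passing a negative limit keeps them, so the built-in could probably stand in for conservative_split. A quick check, using one of the beer lines with its two empty trailing columns:

```ruby
line = "Alaskan Brewing\tAlaskan Amber\t5\t\t"

# Default split discards trailing empty fields...
p line.split("\t")      # => ["Alaskan Brewing", "Alaskan Amber", "5"]
# ...while a negative limit keeps them, one per column.
p line.split("\t", -1)  # => ["Alaskan Brewing", "Alaskan Amber", "5", "", ""]
```

Unlike conservative_split, this does not strip whitespace from the individual fields.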

class RowParser
attr_reader :code

def parse(data)
	# If it's an IO, read it in.  Else, assume it's a String.
	data = data.read if data.kind_of? IO
	data = data.split("\n") # Lines!

	# Create the Row class, based on the header columns
	headers = data.shift.split("\t")  # First line is the column names
	headers.collect! do |h|
		h = h.slice!(0).chr.downcase + h  # Downcase the first letter...
		h.gsub(/[^\w\d_]/, '')  # ...and purge problematic punctuation.
	end

	symbs = headers.collect { |h| ":#{h}" }.join(', ') # For attr_reader
	args = headers.join(', ') # For the constructor
	attrs = headers.collect { |h| "@#{h}" }.join(', ') # For the attrs
	@code = <<CODE
		class Row
			attr_reader #{symbs}
			def initialize(#{args})
				#{attrs} = #{args}
			end
			def to_s
				[#{attrs}].join(', ')
			end
		end

CODE
	eval @code  # Define the Row class from the generated source

	# Now, create a Row from each remaining line
	rows = []

	data.each { |line|
		line = line.conservative_split("\t")
		rows << Row.new(*line)
	}

	return rows
end

end
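
For anyone wary of eval, the same idea can be sketched with Struct, which builds a class with named attributes from a list of symbols. This is an alternative take, not the code above: parse_rows is a hypothetical name, it only accepts a String, and it uses the built-in split with a negative limit instead of conservative_split.

```ruby
# Hypothetical alternative to RowParser#parse: same input format, but the
# row class is built with Struct.new instead of eval.
def parse_rows(text)
  lines = text.split("\n")

  # First line holds the column names; munge them into attribute symbols.
  headers = lines.shift.split("\t").map do |h|
    (h[0, 1].downcase + h[1..-1].to_s).gsub(/\W/, '').to_sym
  end
  row_class = Struct.new(*headers)

  # A limit of -1 keeps trailing empty columns.
  lines.map do |line|
    fields = line.split("\t", -1).map { |f| f.strip }
    row_class.new(*fields)
  end
end

text = "Brewery/Brand\tBeer\tCalories\n" \
       "Anchor\tAnchor Steam\t152\n" \
       "Alaskan Brewing\tAlaskan Amber\t\n"
parse_rows(text).each { |row| puts "#{row.beer}, #{row.calories}" }
```

Each call builds a fresh anonymous Struct class, so parsing two files with different columns does not reopen a shared Row class.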

Hope it’s useful…
Dan
