shinji
September 12, 2007, 4:05pm
1
I have some CSV data which looks like the following (it is the output
from the RMTrack defect management tool):
Column titles:
Issue # Date & Time Opened Summary Created by User Assigned To Resolution Date & Time Closed
example data:
1074 16/05/2006 Something is broken import bob Ignore 26/03/2007
1807 17/07/2006 Another thing doesn’t work rsmith hmaguire Ignore 27/03/2007
Basically, I’m finding the CSV documentation
http://www.ruby-doc.org/stdlib/libdoc/csv/rdoc/index.html
of very little help…
What is the best way of going about parsing the data to get the row
count?
From that I feel I could work out the unique item counts I’m looking
for.
I was thinking about something like this…
rowcount = 0
CSV::Reader.parse(filehandle) do |row|
rowcount =+ 1
return rowcount
end
but I’m getting tangled up here.
shinji
September 12, 2007, 4:58pm
2
If all you want to do is count rows in a CSV file, you’re just counting
lines in a file and don’t need the CSV library. That’s as easy as:
$ cat test.txt
1
2
3
4
5
6
$ irb
irb(main):001:0> File.open('test.txt', 'r') do |file|
irb(main):002:1* lines = 0
irb(main):003:1> file.each_line do |line|
irb(main):004:2* lines += 1
irb(main):005:2> end
irb(main):006:1> puts "Total lines: #{lines}"
irb(main):007:1> end
Total lines: 6
=> nil
If you do want to do further operations on each row that require the
row to be parsed into its fields, you can use CSV like so:
irb(main):008:0> require 'csv'
=> true
irb(main):009:0> lines = 0
=> 0
irb(main):010:0> CSV.open('test.txt', 'r') do |row|
irb(main):011:1* lines += 1
irb(main):012:1> # do something else that requires the CSV library
irb(main):013:1* end
=> nil
irb(main):014:0> puts "Total lines: #{lines}"
Total lines: 6
=> nil
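A note for readers on newer Rubies: the session above relies on the Ruby 1.8 CSV library, where CSV.open with a block yields each row. In Ruby 1.9 and later, CSV.open yields the CSV object itself, and CSV.foreach is the row-by-row call. A minimal sketch of the same count, assuming a modern Ruby; the temporary file is only a stand-in for test.txt:

```ruby
require 'csv'
require 'tempfile'

# Hypothetical stand-in for test.txt: six one-field rows.
sample = Tempfile.new(['test', '.txt'])
sample.write("1\n2\n3\n4\n5\n6\n")
sample.close

lines = 0
CSV.foreach(sample.path) do |row|
  lines += 1
  # row is an Array of field strings here, e.g. ["1"]
end
puts "Total lines: #{lines}"
```

CSV.foreach reads one row at a time, so it stays cheap on large exports.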
Hope that helps,
Felix
shinji
September 12, 2007, 5:11pm
3
On Sep 12, 2007, at 9:57 AM, Felix W. wrote:
If all you want to do is count rows in a CSV file, you’re just
counting
lines in a file and don’t need the CSV library.
CSV fields can contain line ending characters, which could throw off
your counts. Use a CSV parser unless you are sure about the data
content.
James Edward G. II
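To illustrate the point about embedded line endings, here is a small sketch. The data is made up for the example (comma-separated, with one quoted Summary field that spans two physical lines); it is not taken from a real RMTrack export.

```ruby
require 'csv'

# Hypothetical data: a title row plus two records. The second record's
# Summary field contains a line break inside the quotes.
data = "Issue,Summary\n" \
       "1074,Something is broken\n" \
       "1807,\"Another thing\ndoesn't work\"\n"

physical_lines = data.lines.size      # counts newline-terminated lines
csv_rows       = CSV.parse(data).size # counts parsed CSV records

puts "physical lines: #{physical_lines}"
puts "CSV rows:       #{csv_rows}"
```

Counting lines reports one more than the parser does, because the quoted field is split across two lines but is still a single record.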
shinji
September 12, 2007, 5:17pm
4
On Sep 12, 9:05 am, Max R. wrote:
27/03/2007
CSV stands for comma-separated values. Where are the commas?
To get the number of lines in a file:
IO.readlines('my_file').size
shinji
September 12, 2007, 8:16pm
5
On 12.09.2007 16:05, Max R. wrote:
27/03/2007
I was thinking about something like this…
rowcount = 0
CSV::Reader.parse(filehandle) do |row|
rowcount =+ 1
return rowcount
end
but I’m getting tangled up here.
You have the return statement in the wrong place. If you just want to
count lines in a file then you can just do "wc -l ". If you want
to do it in Ruby you can do "ruby -ne 'END{puts $.}' ". If you
want to do it inside a script, an efficient variant is this:
count = File.open(f) { |io| c = 0; io.each { c += 1 }; c }
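For the CSV version of the loop, a corrected sketch might look like the following. It assumes a modern Ruby (CSV.new rather than the 1.8-era CSV::Reader), and count_rows is a hypothetical helper name. Besides moving the return value out of the block, it also swaps =+ for +=, since rowcount =+ 1 just assigns positive 1 on every pass.

```ruby
require 'csv'

# count_rows is a hypothetical helper; io may be an open IO or a String.
def count_rows(io)
  rowcount = 0
  CSV.new(io).each do |_row|
    rowcount += 1 # increment once per parsed row
  end
  rowcount        # the method's value, produced after the loop finishes
end

puts count_rows("a,b\n1,2\n3,4\n")
```

With a file, the call would be wrapped as File.open(path) { |fh| count_rows(fh) } so the handle is closed afterwards.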
Kind regards
robert
shinji
September 12, 2007, 5:38pm
6
James G. wrote:
On Sep 12, 2007, at 9:57 AM, Felix W. wrote:
If all you want to do is count rows in a CSV file, you’re just
counting
lines in a file and don’t need the CSV library.
CSV fields can contain line ending characters, which could throw off
your counts. Use a CSV parser unless you are sure about the data
content.
James Edward G. II
require 'csv'

# Returns the number of data rows in the CSV file at path,
# not counting the titles row.
def parsecv(path)
  csvarrays = CSV.read(path) # array of arrays, one per row
  csvarrays.length - 1
end
I ended up using the CSV.read method because it returns an array of
arrays. This makes it quite easy to lop off the titles row and return
a row count.
Now to start getting the count of individual terms from a column…
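One possible way to approach that next step, sketched under some assumptions: a modern Ruby, comma-separated input, and a column title of "Resolution" taken from the export described at the top of the thread. The sample rows (including issue 1901 and the "Fixed" value) are invented for the example. If the real export is tab-separated, passing col_sep: "\t" to the parser should be the only change needed.

```ruby
require 'csv'

# Hypothetical comma-separated sample; only the column titles come from
# the thread, the rows are made up.
data = <<~ROWS
  Issue #,Created by User,Resolution
  1074,import,Ignore
  1807,rsmith,Ignore
  1901,rsmith,Fixed
ROWS

# Count how often each distinct value appears in the Resolution column.
counts = Hash.new(0)
CSV.parse(data, headers: true) do |row|
  counts[row['Resolution']] += 1
end

counts.each { |value, n| puts "#{value}: #{n}" }
```

With headers: true the titles row is consumed as column names, so there is no need to subtract one or skip the first row by hand.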