Forum: Ruby faster_csv vs File+split, why it is not faster?

Pablo Q. (Guest)
on 2008-11-21 19:59
(Received via mailing list)
Hi folks,

Why am I getting this result? Is it due just to this specific problem?

The file has 293858 records; here are some sample records:


"MARCOS, LUIS","547 N LAKE ST","","MUNDELEIN","IL","000000000"
"BALDWIN, T & S","4732 NE 203RD ST","","LAKE FOREST
PARK","WA","000000000"
"RYBOLT, C","401 CEDAR DR","","CLINTON","IL","000000000"
"WELDT, KRISTINA","1945 N ORLEANS ST","","MCHENRY","IL","000000000"
.....

CODE

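require 'benchmark'
require 'faster_csv'  # the FasterCSV gem

# Benchmark 1: full CSV parsing with FasterCSV
Benchmark.bm do |x|
  x.report do
    FasterCSV.foreach("data_test/match.csv") do |row|
    end
  end
end

# Benchmark 2: naive parsing; assumes every field is quoted
# and no field contains a line break
Benchmark.bm do |x|
  x.report do
    File.foreach("data_test/match.csv") { |line|
       row = line.split("\",\"", -1)
       row[0].gsub!('"', '')   # strip the leading quote of the first field
       row[-1].gsub!('"', '')  # was row[a.length-1], but `a` is undefined
    }
  end
end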


RESULTS

      user     system      total        real
 16.180000   0.740000  16.920000 ( *17.246190*)
      user     system      total        real
  5.830000   0.120000   5.950000 (  *6.028469*)

is this true?
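
For anyone who wants to reproduce the comparison without the original file, here is a minimal sketch. It assumes a modern Ruby, where FasterCSV was adopted as the standard csv library (Ruby 1.9 onward). The data is synthetic: the SAMPLE constant, the 20,000-row count, and the temp file are made up for illustration, so the timings will not match the numbers above.

```ruby
require 'benchmark'
require 'csv'
require 'tempfile'

# Hypothetical test data: one record from the post, repeated 20,000 times.
SAMPLE = %("MARCOS, LUIS","547 N LAKE ST","","MUNDELEIN","IL","000000000"\n)

file = Tempfile.new(['match', '.csv'])
file.write(SAMPLE * 20_000)
file.close

csv_rows = 0
split_rows = 0

# Both reports run in one Benchmark.bm call, with labels for readability.
Benchmark.bm(12) do |x|
  x.report('CSV.foreach') do
    CSV.foreach(file.path) { |_row| csv_rows += 1 }
  end

  x.report('File + split') do
    File.foreach(file.path) do |line|
      # Naive parse: only valid when every field is quoted and single-line.
      row = line.chomp.split('","', -1)
      row[0] = row[0].delete('"')
      row[-1] = row[-1].delete('"')
      split_rows += 1
    end
  end
end

file.unlink
```

Counting rows in both branches is a cheap sanity check that the two approaches at least agree on the number of records for this particular input.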
James G. (Guest)
on 2008-11-21 20:12
(Received via mailing list)
On Nov 21, 2008, at 11:55 AM, Pablo Q. wrote:

> RESULTS
>
>      user     system      total        real
> 16.180000   0.740000  16.920000 ( *17.246190*)
>      user     system      total        real
>  5.830000   0.120000   5.950000 (  *6.028469*)
>
> is this true?

Is it true that reading with File and String#split is faster than
FasterCSV?  Yeah, I bet it is.  Likely reasons are:

* String#split is written in C, while FasterCSV is pure Ruby
* It doesn't handle all types of CSV data, so it has less work to do

To give some examples, your split code doesn't parse this valid CSV data:

   no,quotes

Or this:

   "embedded
newlines"
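
Both failure cases can be checked directly. The following is a rough sketch, assuming a modern Ruby where FasterCSV lives on as the standard csv library; naive_split is a hypothetical helper that wraps the split logic from the original post, and the second input has an extra "x" field added for illustration.

```ruby
require 'csv'

# Hypothetical helper: the hand-rolled parser from the original post.
def naive_split(line)
  row = line.chomp.split('","', -1)
  row[0] = row[0].delete('"')
  row[-1] = row[-1].delete('"')
  row
end

unquoted  = "no,quotes\n"
multiline = "\"embedded\nnewlines\",\"x\"\n"

# Unquoted fields: there is no "," sequence to split on,
# so the whole line comes back as a single field.
naive_unquoted = naive_split(unquoted)      # ["no,quotes"]
csv_unquoted   = CSV.parse_line(unquoted)   # ["no", "quotes"]

# Embedded newline: reading line by line cuts one record into two rows.
naive_multiline = multiline.each_line.map { |l| naive_split(l) }
csv_multiline   = CSV.parse(multiline)

p naive_unquoted, csv_unquoted
p naive_multiline   # [["embedded"], ["newlines", "x"]]
p csv_multiline     # [["embedded\nnewlines", "x"]]
```

Handling these cases is part of the extra work a real CSV parser does on every record.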

Hope that explains things a bit.

James Edward G. II
Pablo Q. (Guest)
on 2008-11-21 20:28
(Received via mailing list)
I thought so...

I'm just comparing a single case of what FasterCSV handles against the
library's whole implementation.

Thank you for your time!

