FasterCSV 0.1.3--CSV parsing without the wait!

FasterCSV 0.1.3 Released

FasterCSV grew out of my involvement in a discussion on the Ruby Core
mailing list. The heart of this new CSV parser is a single,
hyper-optimized Regular Expression from “Mastering Regular
Expressions, Second Edition,” which I have further enhanced.

This library would not exist if it weren’t for Ara T. Howard’s
tireless attempts to break it. ;) I’m very grateful for all the help
and insight Ara gave, which made a robust solution possible. Thank
you!

If you use this library, don’t be shy with the feedback! Tell me
what’s working and what’s not:

[email protected]

What is FasterCSV?

(from the README)

FasterCSV is intended as a replacement for Ruby’s standard CSV
library. It was designed to address concerns users of that library
had, and it has three primary goals:

  1. Be significantly faster than CSV while remaining a pure Ruby
    library.
    • There are unit tests ensuring this very thing. (Thanks for
      the idea, Rob!)

    • It really is faster. I mean it:
      $ rake benchmark
      (in /Users/james/Documents/Ruby/faster_csv)
      time ruby -r csv -e 'CSV.foreach("test/test_data.csv") { |row| }'

      real 0m25.758s
      user 0m25.652s
      sys 0m0.076s
      time ruby -r lib/faster_csv -e 'FasterCSV.foreach("test/test_data.csv") { |row| }'

      real 0m2.902s
      user 0m2.866s
      sys 0m0.030s

  2. Use a smaller and easier-to-maintain code base.
    • I still want it to include all the CSV functionality you need
      though!
      (Do tell me if you find something missing.)
    • …And nice shiny new features!
    • Again, it really is smaller:
      $ rake stats
      (in /Users/james/Documents/Ruby/faster_csv)
      +----------------------+-------+-------+---------+---------+-----+-------+
      | Name                 | Lines |   LOC | Classes | Methods | M/C | LOC/M |
      +----------------------+-------+-------+---------+---------+-----+-------+
      | FasterCSV            |   400 |   133 |       2 |      12 |   6 |     9 |
      | Units                |   439 |   334 |       5 |      23 |   4 |    12 |
      +----------------------+-------+-------+---------+---------+-----+-------+
      | Total                |   839 |   467 |       7 |      35 |   5 |    11 |
      +----------------------+-------+-------+---------+---------+-----+-------+
      Code LOC: 133 Test LOC: 334 Code to Test Ratio: 1:2.5
  3. Improve on the CSV interface.
    • My opinion, of course.
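
The timings under goal 1 come from the shell’s time command. For
anyone who would rather measure inside Ruby, here is a rough sketch
of the same idea. It is only an illustration, not the project’s
benchmark task: time_pass is a hypothetical helper, the sample file
is generated on the spot, and only the standard CSV library is timed
because that is what ships with Ruby.

```ruby
require "benchmark"
require "csv"        # Ruby's standard CSV library
require "tempfile"

# Hypothetical helper (not part of either library): time one full pass
# over a CSV file using any iterator that takes a path and yields rows.
def time_pass(path, iterator)
  rows = 0
  seconds = Benchmark.realtime do
    iterator.call(path) { |_row| rows += 1 }
  end
  [rows, seconds]
end

# Build a small throwaway data file so the sketch is self-contained.
data = Tempfile.new(["sample", ".csv"])
1_000.times { |i| data.puts %Q{#{i},"quoted, field",plain} }
data.close

rows, seconds = time_pass(data.path, CSV.method(:foreach))
puts "CSV: #{rows} rows in #{format('%.4f', seconds)}s"
```

With the gem installed and loaded, FasterCSV.method(:foreach) could
be passed to the same helper to compare the two on identical data.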

Migrating from CSV to FasterCSV?

The README includes a section on the differences and you can read
that here:

http://fastercsv.rubyforge.org/

You can also see general usage in the documentation of the
interface, right here:

http://fastercsv.rubyforge.org/classes/FasterCSV.html

If FasterCSV isn’t meeting your needs, I want to hear about it:

[email protected]

Coming soon!

(from the TODO)

  • Add support for accessing fields by headers (from first row of
    document).
    (I promise Ara, it’s next on the list!)
  • Add “convertors” for switching numbers to Integers or Floats, dates
    to Date or Time objects, etc.
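
To make the two TODO items concrete, here is a minimal sketch in
plain Ruby of what they could mean for rows that have already been
parsed. This is not FasterCSV’s interface (neither feature exists in
0.1.3); convert_numeric and the sample rows are made up purely for
illustration.

```ruby
# Hypothetical converter: try Integer first, then Float, and fall back
# to the original field when neither applies.
def convert_numeric(field)
  return field unless field.is_a?(String)
  Integer(field, 10)            # base 10, so "010" is ten, not octal
rescue ArgumentError
  begin
    Float(field)
  rescue ArgumentError
    field                       # not a number; leave it untouched
  end
end

# Already-parsed rows, with the headers in the first row (sample data).
rows = [
  ["name",   "qty", "price"],
  ["widget", "3",   "4.50"],
  ["gadget", "10",  "n/a"],
]

# Header access: pair each converted field with its column header.
headers, *records = rows
records_by_header = records.map do |fields|
  headers.zip(fields.map { |f| convert_numeric(f) }).to_h
end

first = records_by_header.first
puts "#{first["name"]}: #{first["qty"]} at #{first["price"]}"
```

Dates would follow the same pattern: attempt the conversion and hand
back the original String if it fails.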

Where can I learn more?

FasterCSV is hosted on RubyForge.

Project page: http://rubyforge.org/projects/fastercsv/
Documentation: http://fastercsv.rubyforge.org/
Downloads: http://rubyforge.org/frs/?group_id=1102

How do I get FasterCSV?

FasterCSV is a gem, so as long as you have RubyGems installed it’s as
simple as:

$ sudo gem install fastercsv

If you need to install RubyGems, you can download it from:

http://rubyforge.org/frs/?group_id=126&release_id=2471

FasterCSV can also be installed manually. Just download the latest
release and follow the instructions in INSTALL:

http://rubyforge.org/frs/?group_id=1102&release_id=3545

James Edward G. II