Forum: Ruby FasterCSV RCR?

James G. (Guest)
on 2006-05-27 03:11
(Received via mailing list)
I'm considering submitting my first RCR to add FasterCSV to the
Standard Library.

It's a pretty mature library now, has a CSV compatibility mode, is
very feature rich (including many CSV lacks), and is wicked fast in
comparison.  I see it recommended regularly and get lots of positive
feedback.

What do others think?  Worth adding?

James Edward G. II
pat eyler (Guest)
on 2006-05-27 03:18
(Received via mailing list)
On 5/26/06, James Edward G. II <removed_email_address@domain.invalid> wrote:
> I'm considering submitting my first RCR to add FasterCSV to the
> Standard Library.

Sweet.

>
> It's a pretty mature library now, has a CSV compatibility mode, is
> very feature rich (including many CSV lacks), and is wicked fast in
> comparison.  I see it recommended regularly and get lots of positive
> feedback.
>
> What do others think?  Worth adding?
>

Yes, absolutely.  I brought it in house here, and we used it pretty
widely until someone made an issue out of the fact that it's not in the
standard library.

Since we run through fairly large CSVs multiple times a day, I enjoy the
speed FasterCSV gives us and I really don't want to have to go back.
Daniel B. (Guest)
on 2006-05-27 03:46
(Received via mailing list)
James Edward G. II wrote:
> I'm considering submitting my first RCR to add FasterCSV to the Standard
> Library.

It should *replace* the current CSV library. :)

Regards,

Dan
James G. (Guest)
on 2006-05-27 03:59
(Received via mailing list)
On May 26, 2006, at 6:46 PM, Daniel B. wrote:

> James Edward G. II wrote:
>> I'm considering submitting my first RCR to add FasterCSV to the
>> Standard Library.
>
> It should *replace* the current CSV library. :)

I assume we have to keep CSV for backwards compatibility.  We still
have ftools, even though fileutils is preferred.  runit too.

James Edward G. II
Hal F. (Guest)
on 2006-05-27 09:34
(Received via mailing list)
pat eyler wrote:
> On 5/26/06, James Edward G. II <removed_email_address@domain.invalid> wrote:
>
>> It's a pretty mature library now, has a CSV compatibility mode, is
>> very feature rich (including many CSV lacks), and is wicked fast in
>> comparison.  I see it recommended regularly and get lots of positive
>> feedback.
>>
>> What do others think?  Worth adding?
>>

I'd suggest changing the name to CSV. And possibly defaulting to
compat-mode or perhaps issuing a warning if it's detected that
the user is trying to use the Old Library.


Hal
James G. (Guest)
on 2006-05-27 19:28
(Received via mailing list)
On May 27, 2006, at 12:31 AM, Hal F. wrote:

> I'd suggest changing the name to CSV. And possibly defaulting to
> compat-mode or perhaps issuing a warning if it's detected that
> the user is trying to use the Old Library.

FasterCSV loses much of its speed in compatibility mode.  I think
we want to encourage people to use the new interface, especially
since I think it's superior.  ;)

James Edward G. II
Yukihiro M. (Guest)
on 2006-05-27 20:23
(Received via mailing list)
Hi,

In message "Re: FasterCSV RCR?"
    on Sat, 27 May 2006 14:31:50 +0900, Hal F.
<removed_email_address@domain.invalid> writes:

|I'd suggest changing the name to CSV. And possibly defaulting to
|compat-mode or perhaps issuing a warning if it's detected that
|the user is trying to use the Old Library.

I agree.  I don't want to have two independent CSV readers in the
distribution.  It's OK if compatibility mode is slow, or gives an
obsolescence warning.  But we have to discuss when it should
happen - during 1.8.x or for 1.9.

							matz.
Gregory B. (Guest)
on 2006-05-28 02:39
(Received via mailing list)
On 5/26/06, James Edward G. II <removed_email_address@domain.invalid> wrote:
> I'm considering submitting my first RCR to add FasterCSV to the
> Standard Library.

I bugged you about doing this off list, which may be why you posted
this, but just so people know, I use FasterCSV a lot in my work (and
in Ruport) and it has been very pleasant to work with! :)
James G. (Guest)
on 2006-05-28 04:35
(Received via mailing list)
On May 27, 2006, at 11:20 AM, Yukihiro M. wrote:

> I agree.  I don't want to have two independent CSV readers in the
> distribution.  It's OK that compatible mode is slow, or gives
> obsoletion warning.

Alright, let me take another crack at the compatibility mode then.  I
can probably speed it up since I know it's about to gain importance.

> But we have to discuss about when it should
> happen - during 1.8.x or for 1.9.

I trust your judgement on what is best.

I guess I should warn you that the compatibility mode is not a 100%
CSV replacement.  It works for the majority of applications using
just the CSV.* methods, but I don't even try to support all the
reader and writer objects.  I've never seen code that uses those, but
it could exist.

James Edward G. II
NAKAMURA, Hiroshi (Guest)
on 2006-05-28 05:27
(Received via mailing list)

Hi all,

Long time no post.

James Edward G. II wrote:
>> I agree.  I don't want to have two independent CSV readers in the
>> distribution.  It's OK that compatible mode is slow, or gives
>> obsoletion warning.
>
> Alright, let me take another crack at the compatibility mode then.  I
> can probably speed it up since I know it's about to gain importance.

Please do not waste your time any more. (sorry for writing this.  I know
you are taking much time to support users for using CSV in Ruby).
 Cracks are from difference of our CSV standpoints so it must not be
100% compatible.  Just replace csv.rb with faster_csv.rb.

Replacement (in my opinion):
  On 1.9: replace csv.rb with faster_csv.rb.

  On 1.8: Never mind.  replace csv.rb with faster_csv.rb, with no
          compatibility mode.

As a bundled library (in my opinion):

  One thing I don't like about faster_csv.rb is String#parse_csv and
  Array#to_csv.  Please do not bring pollution to standard classes.

  Kernel.CSV should be discussed well before introducing it.  Needed?
  (We already have Kernel.URI though...)

Regards,
// NaHi
James G. (Guest)
on 2006-05-28 05:54
(Received via mailing list)
On May 27, 2006, at 8:25 PM, NAKAMURA, Hiroshi wrote:

>>> obsoletion warning.
>>
>> Alright, let me take another crack at the compatibility mode then.  I
>> can probably speed it up since I know it's about to gain importance.
>
> Please do not waste your time any more. (sorry for writing this.  I
> know you are taking much time to support users for using CSV in Ruby).
>  Cracks are from difference of our CSV standpoints so it must not be
> 100% compatible.

Do we have different standpoints?  I hope not too different.  We're
just using different parsing techniques, right?

Other than to_csv() and parse_csv(), are there things you don't like
about FasterCSV?  I'm open to suggestions.

> Just replace csv.rb with faster_csv.rb.

I just don't want to break a lot of software.  :(

> As a bundled library (in my opinion):
>
>   One thing I don't like about faster_csv.rb is String#parse_csv and
>   Array#to_csv.  Please do not bring pollution to standard classes.
>
>   Kernel.CSV should be discussed well before introducing it.  Needed?
>   (We already have Kernel.URI though...)

Maybe I'm alone in this thinking, but I'm not bothered by conversion
methods like this.  It's also fairly common (to_set(), to_yaml(), etc.).
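
For readers who have not used the two conversion methods under
discussion, here is a minimal sketch of what they do.  It assumes Ruby
1.9 or later, where require "csv" loads the library descended from
FasterCSV and defines both methods; the sample data is made up.

```ruby
require "csv"

# String#parse_csv: parse a single CSV line into an Array of fields.
fields = %Q{1,2,"3,4",5}.parse_csv
p fields

# Array#to_csv: serialize an Array of fields into one CSV line.
# Fields containing the separator are quoted, and a row ending is appended.
line = [1, 2, "3,4", 5].to_csv
p line
```

Both methods are thin conveniences over the class-level parsing and
generating calls, which is why the objection above is about where they
live (String and Array) rather than what they do.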

James Edward G. II
NAKAMURA, Hiroshi (Guest)
on 2006-05-28 06:09
(Received via mailing list)

Hi James,

James Edward G. II wrote:
>> Please do not waste your time any more. (sorry for writing this.  I know
>> you are taking much time to support users for using CSV in Ruby).
>>  Cracks are from difference of our CSV standpoints so it must not be
>> 100% compatible.
>
> Do we have different standpoints?  I hope not too different.  We're just
> using different parsing techniques, right?

As you wrote in your document, I think the following come from a difference in standpoint.
 * streaming
 * record terminator handling

I don't think faster_csv is wrong.  I just wrote csv.rb from a (slightly)
different viewpoint 6 years ago.

> Other than to_csv() and parse_csv(), are there things you don't like
> about FasterCSV?  I'm open to suggestions.

No.  That's all for now.  (Sorry, I have not yet looked into the new CSV
features)

>> Just replace csv.rb with faster_csv.rb.
>
> I just don't want to break a lot of software.  :(

I understand that it's a trade-off against speed.

>> As a bundled library (in my opinion):
>>
>>   One thing I don't like about faster_csv.rb is String#parse_csv and
>>   Array#to_csv.  Please do not bring pollution to standard classes.
>>
>>   Kernel.CSV should be discussed well before introducing it.  Needed?
>>   (We already have Kernel.URI though...)
>
> Maybe I'm alone in this thinking, but I'm not bothered by conversion
> methods like this.  It's also fairly common (to_set(), to_yaml(), etc.).

I don't like those either.  We should wait for selector namespaces. (IMO)

Regards,
// NaHi
James G. (Guest)
on 2006-05-28 07:01
(Received via mailing list)
On May 27, 2006, at 9:09 PM, NAKAMURA, Hiroshi wrote:

>>> Just replace csv.rb with faster_csv.rb.
>>
>> I just don't want to break a lot of software.  :(
>
> I understand that it's a compensation of speed.

Only if you go through that interface.  The FasterCSV interface is
still quite quick.

Let me rethink it a little.  It was optimized for developer
productivity when I built it.  I might be able to do better looking
at it from the idea of easy transitioning for the users.

Of course, I handle open() quite differently, so we're going to have
problems merging both models of that method in the CSV class.  Hmm...

James Edward G. II
Daniel -. (Guest)
on 2006-05-29 05:33
(Received via mailing list)
Sorry to stray off topic here, but are there any tutorials for FasterCSV?

I could not find one from Mr Google and I would like to use it for a
small project I'm working on.

Thanx
Dan
Logan C. (Guest)
on 2006-05-29 06:16
(Received via mailing list)
On May 28, 2006, at 9:30 PM, Daniel N wrote:

> Sorry to stray off topic here, but are there any tutorials for
> FasterCSV?
>
> I could not find one from Mr Google and I would like to use it for
> a small project I'm working on.
>
> Thanx
> Dan

It's not exactly a tutorial, but the examples in the docs [1] should
be enough to get you started.

[1] http://fastercsv.rubyforge.org/classes/FasterCSV.html
Daniel -. (Guest)
on 2006-05-29 07:42
(Received via mailing list)
Thanx Logan,

Sorry I should have been a bit clearer.  I read those but what I had
trouble with was when I receive the csv file from a web form.  It gives
me a StringIO object and I don't know what to do with it.

Any help is greatly appreciated.
James G. (Guest)
on 2006-05-29 08:09
(Received via mailing list)
On May 28, 2006, at 10:39 PM, Daniel N wrote:

> Thanx Logan,
>
> Sorry I should have been a bit clearer.  I read those but what I
> had trouble with was when I receive the csv file from a web form.
> It gives me a StringIO object and I don't know what to do with it.
>
> Any help is greatly appreciated.

FasterCSV handles StringIO objects just fine:

 >> require "stringio"
=> true
 >> require "fastercsv"
=> true
 >> data = StringIO.new(%Q{1,2,"3,4",5})
=> #<StringIO:0x6ce300>
 >> FasterCSV.parse(data)
=> [["1", "2", "3,4", "5"]]

Hope that helps.
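
For a larger upload it may be preferable to read row by row rather than
parse everything at once.  A minimal sketch, assuming Ruby 1.9 or later
where this interface is loaded with require "csv"; the `upload` variable
and its contents are hypothetical stand-ins for whatever the web
framework hands over.

```ruby
require "csv"
require "stringio"

# Stand-in for the StringIO a web framework provides for an uploaded file.
upload = StringIO.new(%Q{name,qty\nwidget,3\n"gadget, large",5\n})

# Wrap the IO in a CSV object and read it one row at a time.
rows = []
CSV.new(upload).each do |row|
  rows << row
end
p rows
```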

James Edward G. II
Daniel -. (Guest)
on 2006-05-29 08:13
(Received via mailing list)
Cheers.  thanx so much for that.

I'll get out of your thread now
James G. (Guest)
on 2006-05-31 04:16
(Received via mailing list)
On May 27, 2006, at 11:20 AM, Yukihiro M. wrote:

> I agree.  I don't want to have two independent CSV readers in the
> distribution.  It's OK that compatible mode is slow, or gives
> obsoletion warning.  But we have to discuss about when it should
> happen - during 1.8.x or for 1.9.

Alright, I've thought a lot about this and there is really one big
issue here:  CSV and FasterCSV are not 100% compatible.  If it was
just the method arguments, we could get pretty close to perfect, but
CSV does some odd things like confuse open() with foreach() that I
chose to avoid in FasterCSV.  Because of that, I can't always be sure
what to do when user code calls a given method.

That leaves two options, in my opinion:

1.  FasterCSV's compatibility mode handles most of the issues very well and
I'm pretty sure I can remove most of the speed penalty.  If we go
with that, we have a pretty workable solution right now with one big
gotcha:  you can require a file named csv.rb and use CSV just fine,
but the good stuff will actually be hiding under FasterCSV (in the
same file).  I have to keep them separate, because of the
compatibility issues mentioned above.  This, to me, is the only sane
way to go if we want to target the 1.8.x branch.  It would still
break some software, if they use the unusual features of CSV, but I
suspect this is quite rare.
2.  We could drop compatibility and rename FasterCSV to CSV.  This
way people get all the good stuff where they expect it.  However,
this would break a lot of CSV software (most of it, in fact), so it
only seems reasonable when targeting 1.9.x and up.

My thought is that the second option seems preferable.  If we train
people to use FasterCSV, then we just have to switch them again down
the road if we want to revert to CSV.  We don't really gain many big
advantages for the switch either (speed, if I can eliminate the
penalty, but not header parsing or the other good FasterCSV
features).  That doesn't sound like it's worth breaking software over.

In summary, I recommend targeting 1.9.x with no compatibility mode
and renaming FasterCSV to CSV.  Am I making sense here?

James Edward G. II
unknown (Guest)
on 2006-05-31 04:35
(Received via mailing list)
On Wed, 31 May 2006, James Edward G. II wrote:

> get all the good stuff where they expect it.  However, this would break a lot
> In summary, I recommend targeting 1.9.x with no compatibility mode and
> renaming FasterCSV to CSV.  Am I making sense here?
>
> James Edward G. II

i know matz is against it, but i really think we should have both.  we have

   ftools and fileutils

   date and date2

   getoptlong, getopts, parsearg, and optparse

   monitor, mutex, and sync

   runit and test/unit

and so on.

i have quite a bit of code that does things like

   CSV::Row

   CSV::Cell

etc.  i'd be very upset if i had to re-write all of it.  it does little
to help people love ruby when scripts that worked stop working after an
upgrade.  that said, i'm 100% for having faster csv in the dist.  i just
don't see what's wrong with a few extra pure ruby files in there - they
are very tiny.

cheers.

-a
Yukihiro M. (Guest)
on 2006-05-31 05:27
(Received via mailing list)
Hi,

In message "Re: FasterCSV RCR?"
    on Wed, 31 May 2006 09:34:32 +0900, removed_email_address@domain.invalid 
writes:

|i know matz is against it, but i really think we should have both.  we have
|
|   ftools and fileutils
|
|   date and date2
|
|   getoptlong, getopts, parsearg, and optparse
|
|   monitor, mutex, and sync
|
|   runit and test/unit
|
|and so on.

They are the mistakes that I try to avoid making again.

|   ftools and fileutils
|   getoptlong, getopts, parsearg, and optparse

They are unfortunate mistakes I (we) made.

|   date and date2

date2 = date + extra libraries.

|   monitor, mutex, and sync

They are (somewhat) different.

|   runit and test/unit

runit is a compatibility library based on test/unit.

							matz.
James G. (Guest)
on 2006-06-02 21:41
(Received via mailing list)
On May 30, 2006, at 7:13 PM, James Edward G. II wrote:

> 2.  We could drop compatibility and rename FasterCSV to CSV.  This
> way people get all the good stuff where they expect it.  However,
> this would break a lot of CSV software (most of it, in fact), so it
> only seems reasonable when targeting 1.9.x and up.

I have created an RCR for this option:

http://www.rcrchive.net/rcr/show/338

Those in favor (or against) may wish to vote.

James Edward G. II
NAKAMURA, Hiroshi (Guest)
on 2006-06-04 14:32
(Received via mailing list)

Hi,

I have started to think that csv.rb itself should just be made faster.

James Edward G. II wrote:
> method arguments, we could get pretty close to perfect, but CSV does
> some odd things like confuse open() with foreach() that I chose to avoid
> in FasterCSV.  Because of that, I can't always be sure what to do when

Can you please explain what is "odd"?  FasterCSV.build_csv_interface
seems to be a simple delegator.

> 2.  We could drop compatibility and rename FasterCSV to CSV.  This way
> people get all the good stuff where they expect it.  However, this would

Can you please explain what is "good"?  I'll introduce those features
into csv.rb.  Do those features depend on faster_csv.rb specific
behavior?

Regards,
// NaHi
James G. (Guest)
on 2006-06-04 20:45
(Received via mailing list)
On Jun 4, 2006, at 5:30 AM, NAKAMURA, Hiroshi wrote:

> James Edward G. II wrote:
>> method arguments, we could get pretty close to perfect, but CSV does
>> some odd things like confuse open() with foreach() that I chose to
>> avoid
>> in FasterCSV.  Because of that, I can't always be sure what to do
>> when
>
> Can you please explain what is "odd"?

My biggest complaint with CSV is that open() behaves "oddly" and thus
defeats all my normal expectations:

 >> File.open("example.csv", "w") do |csv|
?>   csv.puts "1,2,3"
 >>   csv.puts "a,b,c"
 >> end
=> nil
 >> require "csv"
=> true
 >> # typical Ruby style reading...
?> File.open("example.csv") do |file|
?>   file.each { |row| p row }
 >> end
"1,2,3\n"
"a,b,c\n"
=> #<File:example.csv (closed)>
 >> # or...
?> File.foreach("example.csv") do |row|
?>   p row
 >> end
"1,2,3\n"
"a,b,c\n"
=> nil
 >> # CSV's "odd" open() method...
 >> CSV.open("example.csv", "r") do |row|  # "r" required
?>   p row  # we get rows, not the file object
 >> end
["1", "2", "3"]
["a", "b", "c"]
=> nil

Of course, if you open in a writing mode, you do get a file-like
object.  It's inconsistent.

I'm confused about why CSV does this, since it offers the foreach()
method, which normally fills this role.
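
For contrast, here is a sketch of the split I am arguing for, where
foreach() yields rows and open() yields the object, mirroring File.  It
assumes Ruby 1.9 or later, where the FasterCSV-style interface is what
require "csv" loads; the temporary file is only there to keep the
example self-contained.

```ruby
require "csv"
require "tempfile"

# Throwaway file so the example does not depend on anything on disk.
file = Tempfile.new(["example", ".csv"])
file.write("1,2,3\na,b,c\n")
file.close

# foreach() yields one parsed row at a time, as File.foreach yields lines.
rows_from_foreach = []
CSV.foreach(file.path) { |row| rows_from_foreach << row }

# open() yields the CSV object itself, as File.open yields the File.
yielded_class = nil
rows_from_open = nil
CSV.open(file.path, "r") do |csv|
  yielded_class = csv.class
  rows_from_open = csv.read
end

p rows_from_foreach
p yielded_class
p rows_from_open

file.unlink
```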

Other CSV oddities (my opinion):

*  I always have to think, "Now do I want the *_line() method or the
*_row() method here..."
*  Most methods take a field separator and a row separator, but
foreach() and readlines() only take the row separator.
*  I have to set a field separator when I really just want to set a
row separator.
*  A method called "generate_line()" doesn't involve a line ending.

>> 2.  We could drop compatibility and rename FasterCSV to CSV.  This
>> way
>> people get all the good stuff where they expect it.  However, this
>> would
>
> Can you please explain what is "good"?  I'll introduce those features
> into csv.rb.

Here's a selection of some features from my CHANGELOG that I am not
aware of in CSV:

* Added built-in and custom data converters.  Built-in handle numbers
and dates.
* Added auto-discovery for <tt>:row_sep</tt> (now the default).
* Added FasterCSV::filter() for easy Unix-like CSV filters.
* Added support for accessing fields by headers.
   * Headers can have their own converters.
   * Headers can be skipped or returned as needed.
   * FasterCSV::Row allows index or header access while retaining
     order and allowing for duplicate headers.
* <tt>:headers</tt> can now be set to an Array of headers to use.
* <tt>:headers</tt> can now be set to an external CSV String of
headers to use.
* Provided support for the serialization of custom Ruby objects using
CSV.
* Added FasterCSV::instance and FasterCSV()/FCSV() shortcuts for easy
output.
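
A short sketch of two of the items above, header access and the
built-in numeric converter.  It assumes Ruby 1.9 or later, where these
features are reached through require "csv" and the CSV constant (on 1.8
the constant would be FasterCSV); the sample data is invented.

```ruby
require "csv"

data = "name,qty,price\nwidget,3,1.50\ngadget,10,20\n"

# headers: true treats the first row as headers; converters: :numeric
# turns fields that look like integers or floats into numbers.
table = CSV.parse(data, headers: true, converters: :numeric)

first = table[0]        # rows can be fetched by position...
name  = first["name"]   # ...and fields by header
qty   = first["qty"]
price = first["price"]

p name
p qty
p price
p table.headers
```

Fields that do not look numeric, such as the names here, are left as
Strings.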

James Edward G. II
NAKAMURA, Hiroshi (Guest)
on 2006-06-05 04:56
(Received via mailing list)

Hi,

James Edward G. II wrote:
>>> File.open("example.csv", "w") do |csv|
> "1,2,3\n"
>>> CSV.open("example.csv", "r") do |row|  # "r" required
> ?>   p row  # we get rows, not the file object
>>> end
> ["1", "2", "3"]
> ["a", "b", "c"]
> => nil
>
> Of course, if you open in a writing mode, you do get a file-like
> object.  It's inconsistent.

I can understand your frustration about this point.  When I first wrote
csv.rb, I thought all CSV users would write the following if I defined
a reader style.

  CSV.open("filename.csv", "r") do |reader|
    reader.each do |row|
      ...do something...
    end
  end

Why don't we just write like this;

  CSV.open("filename.csv", "r") do |row|
    ...do something...
  end

I know you are considering that IO-ish methods are important.  But I
don't think CSV object should handle IO methods like fcntl, fileno,
seek, tell, tty?, and so on.  Would you please tell me typical and
pragmatic examples of reader style, except 'each'?

> I'm confused about why CSV does this, since it offers the foreach()
> method, which normally fills this role.

foreach and readlines were added recently from IO.  Now I think it was a
bad choice though...

> Other CSV oddities (my opinion):

Thanks!

> *  I always have to think, "Now do I want the *_line() method or the
> *_row() method here..."

I think users don't need to use the *_line and *_row methods.  When do you
use generate_line?

> *  Most methods take a field separator and a row separator, but
> foreach() and readlines() only take the row separator.

See IO.foreach and IO.readlines.  But as I wrote above, CSV should not
have these methods...

> *  I have to set a field separator when I really just want to set a row
> separator.

csv.rb in the svn repository supports a pseudo-keyword method-argument
style.  I'll merge it into Ruby's csv repository before the next release.
http://dev.ctor.org/csv/browser/trunk/lib/csv.rb

# I defined keywords :fs and :rs but it should be :col_sep and :row_sep
# in conformity with faster_csv.

> *  A method called "generate_line()" doesn't involve a line ending.

Do not use it. :-)  At least, I think users rarely use it.

I hope that
  csv.rb's open + read + block does not work as you expected
is the one big point of frustration with csv.rb (...if csv.rb is
fast enough :-)

>>> 2.  We could drop compatibility and rename FasterCSV to CSV.  This way
>>> people get all the good stuff where they expect it.  However, this would
>>
>> Can you please explain what is "good"?  I'll introduce those features
>> into csv.rb.
>
> Here's a selection of some features from my CHANGELOG that I am not
> aware of in CSV:

Thanks.  I'll look into this.  I hope those features are pluggable into
csv.rb and other modules like DBI, spreadsheet related things, HTML
table formatters, etc.  I think some of these features are table
specific, not CSV.

Regards,
// NaHi
James G. (Guest)
on 2006-06-05 18:51
(Received via mailing list)
On Jun 4, 2006, at 7:55 PM, NAKAMURA, Hiroshi wrote:

>
> Why don't we just write like this;
>
>   CSV.open("filename.csv", "r") do |row|
>     ...do something...
>   end

That's why we have foreach().  Better to use that and gain all the
familiarity of Ruby programmers who are used to things working that way.

> I know you are considering that IO-ish methods are important.  But I
> don't think CSV object should handle IO methods like fcntl, fileno,
> seek, tell, tty?, and so on.  Would you please tell me typical and
> pragmatic examples of reader style, except 'each'?

If people only did what I could think of, programming would be very
boring.  ;)  It took me five or ten minutes to make all those methods
available and now they are there if someone needs them.

I can tell you that it has already come in handy.  I got a bug report
that the line numbers in errors were off, because CSV allows embedded
\n characters in fields.  To fix it, I overrode IO's lineno() method
with correct behavior.  This seems very natural and the added bonus
is that you can now get a CSV aware line number.

>> I'm confused about why CSV does this, since it offers the foreach()
>> method, which normally fills this role.
>
> foreach and readlines were added recently from IO.  Now I think it
> was a bad choice though...

That makes me sad to hear.  foreach() is easily my most used method
with CSV and FasterCSV.  I like readlines() too.

I still can't think of any good reason not to just follow Ruby's
interface as much as is possible and natural.  To do anything else
forces programmers to adapt their expectations for no reason I can
understand.

>> *  I always have to think, "Now do I want the *_line() method or the
>> *_row() method here..."
>
> Users don't need to use *_line and *_row methods I think.  When do you
> use generate_line?

I'm pretty sure we want to have our CSV library support data not in
files.  Am I missing something?  Is there a better way to get a CSV
string with your library?
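
For comparison, a sketch of building CSV in memory with the
FasterCSV-style interface.  It assumes Ruby 1.9 or later, where that
interface is loaded with require "csv".  Note that this generate_line()
does append the row ending, unlike the older method discussed above.

```ruby
require "csv"

# generate_line() turns one Array into one CSV line, row ending included.
line = CSV.generate_line([1, 2, "x,y"])

# generate() builds a multi-row String; rows are appended with <<.
text = CSV.generate do |csv|
  csv << [1, 2, 3]
  csv << ["a", "b", "c"]
end

p line
p text
```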

>> *  Most methods take a field separator and a row separator, but
>> foreach() and readlines() only take the row separator.
>
> See IO.foreach and IO.readlines.

That's comparing apples and oranges.  IO.foreach() doesn't need to be
aware of fields, but CSV.foreach() does.  IO.open() doesn't support a
field separator or a row separator, but your CSV.open() does because
it is needed.
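
A sketch of what accepting both separators in foreach() looks like.  It
assumes Ruby 1.9 or later, where the FasterCSV-style options :col_sep
and :row_sep are available through require "csv"; the file and its
semicolon/pipe layout are made up for illustration.

```ruby
require "csv"
require "tempfile"

# Semicolon-separated fields, pipe-terminated rows (invented sample data).
file = Tempfile.new(["odd", ".csv"])
file.write("1;2;3|a;b;c|")
file.close

# foreach() takes the field separator and the row separator together.
rows = []
CSV.foreach(file.path, col_sep: ";", row_sep: "|") { |row| rows << row }
p rows

file.unlink
```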

> # in conformity with faster_csv.
:fs and :rs are fine with me.  It's consistent with your interface.

>> Here's a selection of some features from my CHANGELOG that I am not
>> aware of in CSV:
>
> Thanks.  I'll look into this.  I hope those features are pluggable
> into
> csv.rb and other modules like DBI, spreadsheet related things, HTML
> table formatters, etc.  I think some of these features are table
> specific, not CSV.

This leads me naturally to the question:  is there any good reason to
reinvent FasterCSV, when we could just use FasterCSV?  ;)

James Edward G. II
NAKAMURA, Hiroshi (Guest)
on 2006-06-06 05:43
(Received via mailing list)

Hi,

James Edward G. II wrote:
> That's why we have foreach().  Better to use that and gain all the
> familiarity of Ruby programmers who are used to things working that way.

> I still can't think of any good reason not to just follow Ruby's
> interface as much as is possible and natural.  To do anything else
> forces programmers to adapt their expectations for no reason I can
> understand.

I think I still have not explained the difference in our viewpoints
well.  You think a CSV object is an IO.  But I don't think so, and I
defined Writer and Reader in csv.rb.  It's not 'natural' from my
viewpoint.  That's why I think 'foreach' and 'readlines' should not be
added.

The sentence "Comma Separated Values is an IO" sounds strange to me.
What do you think about it?  FasterCSV should be CSVIO or CSV::IO, no?

>> I know you are considering that IO-ish methods are important.  But I
>> don't think CSV object should handle IO methods like fcntl, fileno,
>> seek, tell, tty?, and so on.  Would you please tell me typical and
>> pragmatic examples of reader style, except 'each'?
>
> If people only did what I could think of, programming would be very
> boring.  ;)  It took me five or ten minutes to make all those methods
> available and now they are there if someone needs them.

I agree with the first sentence.  But I don't think we should do
everything we can just because it's easy.

> I can tell you that it has already come in handy.  I got a bug report
> that the line numbers in errors were off, because CSV allows embedded \n
> characters in fields.  To fix it, I overrode IO's lineno() method with
> correct behavior.  This seems very natural and the added bonus is that
> you can now get a CSV aware line number.

Thank you for the example.  CSVIO#lineno or CSV::IO#lineno seems
reasonable to me.

But half of the methods you defined as delegators still do not seem
meaningful to me.

  # * binmode()
  # * close()
  # * close_read()
  # * close_write()
  # * closed?()
  # * eof()
  # * eof?()
  # * fcntl()
  # * fileno()
  # * flush()
  # * fsync()
  # * ioctl()
  # * isatty()
  # * pid()
  # * pos()
  # * reopen()
  # * rewind()
  # * seek()
  # * stat()
  # * sync()
  # * sync=()
  # * tell()
  # * to_i()
  # * to_io()
  # * tty?()

# above is excerpted from faster_csv.rb/0.2.0

>>> *  I always have to think, "Now do I want the *_line() method or the
>>> *_row() method here..."
>>
>> Users don't need to use *_line and *_row methods I think.  When do you
>> use generate_line?
>
> I'm pretty sure we want to have our CSV library support data not in
> files.  Am I missing something?  Is there a better way to get a CSV
> string with your library?

Please use CSV::Writer for that.

str = ''
writer = CSV::Writer.create(str)
writer << [1,2,3]
...
writer << [x,y,z]
writer.close
puts str

>>> *  Most methods take a field separator and a row separator, but
>>> foreach() and readlines() only take the row separator.
>>
>> See IO.foreach and IO.readlines.
>
> That's comparing apples and oranges.  IO.foreach() doesn't need to be
> aware of fields, but CSV.foreach() does.  IO.open() doesn't support a
> field separator or a row separator, but your CSV.open() does because it
> is needed.

Hmm.  I think "same name and different method arguments" is a bad design
because it confuses users.  But you already use (pseudo) keyword
argument style so you are thinking "but just adding arguments could be a
good design", right?

It could be.  I need more time to think about it.

>>> Here's a selection of some features from my CHANGELOG that I am not
>>> aware of in CSV:
>>
>> Thanks.  I'll look into this.  I hope those features are pluggable into
>> csv.rb and other modules like DBI, spreadsheet related things, HTML
>> table formatters, etc.  I think some of these features are table
>> specific, not CSV.
>
> This leads me naturally to the question:  is there any good reason to
> reinvent FasterCSV, when we could just use FasterCSV?  ;)

I wrote 'introduce' and meant 'I won't reinvent table specific
implementations.  I'll just get it from faster_csv, if it is pluggable'.

Regards,
// NaHi
James G. (Guest)
on 2006-06-06 17:51
(Received via mailing list)
On Jun 5, 2006, at 8:42 PM, NAKAMURA, Hiroshi wrote:

> I think I still have not explained the difference in our viewpoints
> well.  You think a CSV object is an IO.  But I don't think so, and I
> defined Writer and Reader in csv.rb.  It's not 'natural' from my
> viewpoint.  That's why I think 'foreach' and 'readlines' should not
> be added.

Yeah, to me CSV is just another data source I want to read from/write
to with slightly special handling of the lines.

The good news is that our users probably don't care what we think.
If we give them a quick and convenient way to read and write CSV, I
think they'll be happy.  ;)

Best of luck with your upgrades!

James Edward G. II