Forum: Ruby FasterCSV: preserving quoted strings

Bil Kleb (Guest)
on 2009-02-26 16:19
(Received via mailing list)
Hi,

Google et al are failing me: How do I preserve quoted
CSV strings on output?

% cat > csv_quotes.rb << EOF
require 'rubygems'
require 'faster_csv'
require 'test/unit'

class ConversionTest < Test::Unit::TestCase
   def test_preserve_quoted_strings
     csv_data = '"string",2,0.3'
     assert_equal( csv_data, csv_data.parse_csv*',' )
   end
end
EOF

% ruby -ws csv_quotes.rb
Loaded suite csv_quotes
Started
F
Finished in 0.005189 seconds.

   1) Failure:
test_preserve_quoted_strings(ConversionTest) [csv_quotes.rb:8]:
<"\"string\",2,0.3"> expected but was
<"string,2,0.3">.

1 tests, 1 assertions, 1 failures, 0 errors

Thanks,
James G. (Guest)
on 2009-02-26 19:14
(Received via mailing list)
On Feb 26, 2009, at 8:14 AM, Bil Kleb wrote:

> Google et al are failing me: How do I preserve quoted
> CSV strings on output?

I'm not totally sure I understand the question, but your test made it
look like your data was one field you wanted to be able to read in.
If so, you'll need to write it out properly escaped first:

require 'rubygems'
require 'faster_csv'
require 'test/unit'

class ConversionTest < Test::Unit::TestCase
  def test_preserve_quoted_strings
    field = '"string",2,0.3'
    csv   = [field].to_csv  # => "\"\"\"string\"\",2,0.3\"\n"
    assert_equal(field, csv.parse_csv.first)
  end
end

If I'm wrong and you meant for that to be three separate fields, then
just bust them up to get the valid CSV:

require 'rubygems'
require 'faster_csv'
require 'test/unit'

class ConversionTest < Test::Unit::TestCase
  def test_preserve_quoted_strings
    fields = '"string",2,0.3'.split(",")
    csv    = fields.to_csv  # => "\"\"\"string\"\"\",2,0.3\n"
    assert_equal(fields, csv.parse_csv)
  end
end

Hope that helps.

James Edward G. II
Bil K. (Guest)
on 2009-02-26 20:46
(Received via mailing list)
On Feb 26, 12:11 pm, James G. <removed_email_address@domain.invalid> wrote:
> On Feb 26, 2009, at 8:14 AM, Bil Kleb wrote:
>
> > Google et al are failing me: How do I preserve quoted
> > CSV strings on output?
>
> I'm not totally sure I understand the question, but your test made it  
> look like your data was one field you wanted to be able to read in.  

Sorry for the confusion, my simplified test isn't close enough
to my problem domain ... I'll try again in long form:

I have a CSV file with headers and rows like

 "scheme","time_steps","dt"
 "1storder",2,0.5
 "4thorder",5,1.0

and I am using FasterCSV to read this CSV file to get
a hash of header=>value pairs for each row.

For each row worth of data, I create an output file of the form

&some_weird_name
  scheme = "1storder",
  time_steps = 2,
  dt = 0.5
/

What I'm currently getting is "1storder" without the quotation marks.
I need the data fields to retain their quotation marks like they
have in the original CSV file.

Regards,
James G. (Guest)
on 2009-02-26 21:10
(Received via mailing list)
On Feb 26, 2009, at 12:44 PM, Bil K. wrote:

>
> &some_weird_name
>  scheme = "1storder",
>  time_steps = 2,
>  dt = 0.5
> /
>
> What I'm currently getting is "1storder" without the quotation marks.
> I need the data fields to retain their quotation marks like they
> have in the original CSV file.

Well, then you don't really want a CSV parser.

Quotes in CSV data are used to indicate field grouping.  In other
words, they are metadata about the content and it doesn't make sense
for a parser to return those to you.  It's like how an XML parser
wouldn't give you the equals sign used to set a tag attribute.
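
To make the "metadata" point concrete, here is a minimal sketch. It assumes a modern Ruby where the standard `csv` library (the successor to the faster_csv gem used in this thread) is available; the sample strings are only illustrations.

```ruby
require 'csv'  # assumption: stdlib CSV (Ruby 1.9+), the successor to the faster_csv gem

# Enclosing quotes only delimit the field; the parser strips them.
fields = CSV.parse_line('"1storder",2,0.5')
p fields   # => ["1storder", "2", "0.5"]

# Their real job: keep an embedded comma inside a single field.
grouped = CSV.parse_line('"a,b",c')
p grouped  # => ["a,b", "c"]
```

Note that every field comes back as a String unless converters are requested, so the parser does not record which fields were quoted in the source.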

The way I see it you have two choices:

1.  Fix your data file so it's proper CSV (making the quotes a part of
the field data).  For example, the first row would become:

"""scheme""","""time_steps""","""dt"""

A quote is doubled to escape it in CSV and another set is added to
enclose each field, which is why they are tripled here.

2.  Decide that your data is not CSV and hand roll a parser to handle
it.  If you are sure fields won't contain commas, that may be as simple
as:  fields = row.split(",").
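
The two choices above can be sketched roughly as follows. This assumes the standard `csv` library in place of the faster_csv gem, and the `namelist` helper is a hypothetical name made up for this illustration; it simply mimics the output layout Bil described.

```ruby
require 'csv'  # assumption: stdlib CSV (Ruby 1.9+) in place of the faster_csv gem

# Choice 1: the quotes are data, so each is doubled and the field enclosed.
escaped_row = '"""1storder""",2,0.5'
choice1 = CSV.parse_line(escaped_row)
p choice1  # => ["\"1storder\"", "2", "0.5"]

# Choice 2: treat the file as plain comma-separated text and split by hand.
# Only safe when no field contains a comma.
raw = <<~DATA
  "scheme","time_steps","dt"
  "1storder",2,0.5
  "4thorder",5,1.0
DATA

lines   = raw.lines.map(&:chomp)
headers = lines.first.split(',').map { |h| h.delete('"') }  # header quotes are not wanted
rows    = lines.drop(1).map { |line| headers.zip(line.split(',')).to_h }

# Hypothetical helper: formats one row in the output layout from the thread.
def namelist(name, row)
  body = row.map { |key, value| "  #{key} = #{value}" }.join(",\n")
  "&#{name}\n#{body}\n/\n"
end

puts namelist('some_weird_name', rows.first)
```

With the split approach the values keep whatever quoting they had in the file, which is why "1storder" is written out with its quotation marks while 2 and 0.5 are not.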

Hope that helps.

James Edward G. II
Bil K. (Guest)
on 2009-02-26 23:51
(Received via mailing list)
On Feb 26, 2:08 pm, James G. <removed_email_address@domain.invalid> wrote:
>
> Well, then you don't really want a CSV parser.
>
> Quotes in CSV data are used to indicate field grouping.  In other  
> words, they are metadata about the content and it doesn't make sense  
> for a parser to return those to you.  It's like how an XML parser  
> wouldn't give you the equals sign used to set a tag attribute.

Ah, OK.  That clears things up.

Thanks,
Marcus M. (Guest)
on 2009-07-09 14:59
Bil K. wrote:
> On Feb 26, 2:08 pm, James G. <removed_email_address@domain.invalid> wrote:
>>
>> Well, then you don't really want a CSV parser.
>>
>> Quotes in CSV data are used to indicate field grouping.  In other
>> words, they are metadata about the content and it doesn't make sense
>> for a parser to return those to you.  It's like how an XML parser
>> wouldn't give you the equals sign used to set a tag attribute.
>
> Ah, OK.  That clears things up.
>
> Thanks,

Or use :force_quotes => true when calling FasterCSV.open or FasterCSV.new.
James G. (Guest)
on 2009-07-09 18:32
(Received via mailing list)
On Jul 9, 2009, at 5:59 AM, Marcus Mitchell wrote:

>>>
>>> wouldn't give you the equals sign used to set a tag attribute.
>>
>> Ah, OK.  That clears things up.
>>
>> Thanks,
>
> Or use :force_quotes => true when calling FasterCSV.open or FasterCSV.new.

That option causes FasterCSV to always quote fields on output.  Bil
was asking if he could have the quotes left in his fields on input.
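
A small sketch of that difference, assuming the standard `csv` library (which accepts the same `force_quotes` option as a keyword argument) rather than the faster_csv gem:

```ruby
require 'csv'  # assumption: stdlib CSV (Ruby 1.9+) in place of the faster_csv gem

row = ['1storder', 2, 0.5]

# Output side: force_quotes wraps every field in quotes, numbers included.
plain  = CSV.generate_line(row)
forced = CSV.generate_line(row, force_quotes: true)
print plain   # 1storder,2,0.5
print forced  # "1storder","2","0.5"

# Input side: both lines parse to the same unquoted fields.
p CSV.parse_line(forced) == CSV.parse_line(plain)  # => true
```

Because the numeric fields get quoted as well, the forced output would not match the mixed quoting in Bil's original file.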

James Edward G. II