FasterCSV: preserving quoted strings


#1

Hi,

Google et al are failing me: How do I preserve quoted
CSV strings on output?

% cat > csv_quotes.rb << EOF
require 'rubygems'
require 'faster_csv'
require 'test/unit'

class ConversionTest < Test::Unit::TestCase
  def test_preserve_quoted_strings
    csv_data = '"string",2,0.3'
    assert_equal( csv_data, csv_data.parse_csv*',' )
  end
end
EOF

% ruby -ws csv_quotes.rb
Loaded suite csv_quotes
Started
F
Finished in 0.005189 seconds.

  1. Failure:
    test_preserve_quoted_strings(ConversionTest) [csv_quotes.rb:8]:
    <"\"string\",2,0.3"> expected but was
    <"string,2,0.3">.

1 tests, 1 assertions, 1 failures, 0 errors

Thanks,


#2

On Feb 26, 2009, at 8:14 AM, Bil Kleb wrote:

Google et al are failing me: How do I preserve quoted
CSV strings on output?

I’m not totally sure I understand the question, but your test made it
look like your data was one field you wanted to be able to read in.
If so, you’ll need to write it out properly escaped first:

require 'rubygems'
require 'faster_csv'
require 'test/unit'

class ConversionTest < Test::Unit::TestCase
  def test_preserve_quoted_strings
    field = '"string",2,0.3'
    csv = [field].to_csv # => "\"\"\"string\"\",2,0.3\"\n"
    assert_equal(field, csv.parse_csv.first)
  end
end

If I’m wrong and you meant for that to be three separate fields, then
just bust them up to get the valid CSV:

require 'rubygems'
require 'faster_csv'
require 'test/unit'

class ConversionTest < Test::Unit::TestCase
  def test_preserve_quoted_strings
    fields = '"string",2,0.3'.split(",")
    csv = fields.to_csv # => "\"\"\"string\"\"\",2,0.3\n"
    assert_equal(fields, csv.parse_csv)
  end
end

Hope that helps.

James Edward G. II


#3

On Feb 26, 12:11 pm, James G. removed_email_address@domain.invalid wrote:

On Feb 26, 2009, at 8:14 AM, Bil Kleb wrote:

Google et al are failing me: How do I preserve quoted
CSV strings on output?

I’m not totally sure I understand the question, but your test made it
look like your data was one field you wanted to be able to read in.

Sorry for the confusion, my simplified test isn’t close enough
to my problem domain … I’ll try again in long form:

I have a CSV file with headers and rows like

"scheme","time_steps","dt"
"1storder",2,0.5
"4thorder",5,1.0

and I am using FasterCSV to read this CSV file to get
a hash of header=>value pairs for each row.

For each row worth of data, I create an output file of the form

&some_weird_name
scheme = "1storder",
time_steps = 2,
dt = 0.5
/

What I’m currently getting is “1storder” without the quotation marks.
I need the data fields to retain their quotation marks like they
have in the original CSV file.

Regards,


#4

On Feb 26, 2009, at 12:44 PM, Bil K. wrote:

&some_weird_name
scheme = "1storder",
time_steps = 2,
dt = 0.5
/

What I’m currently getting is “1storder” without the quotation marks.
I need the data fields to retain their quotation marks like they
have in the original CSV file.

Well, then you don’t really want a CSV parser.

Quotes in CSV data are used to indicate field grouping. In other
words, they are metadata about the content and it doesn’t make sense
for a parser to return those to you. It’s like how an XML parser
wouldn’t give you the equals sign used to set a tag attribute.
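
A minimal sketch of that point, assuming Ruby's standard csv library (the successor to FasterCSV in Ruby 1.9 and later) instead of the faster_csv gem used in this thread. The sample row is taken from the file described above; the variable names are illustrative only.

```ruby
require 'csv'

# One data row from the file described above: only the first field is quoted.
line = '"1storder",2,0.5'

# Parsing strips the enclosing quotes; every field comes back as a plain String.
fields = CSV.parse_line(line)

# Writing the fields back out quotes only what needs quoting, so the
# original quotes do not reappear.
round_trip = fields.to_csv.chomp

p fields
p round_trip
```

After parsing, nothing in the returned fields records which ones were quoted in the file, so the parser has no way to hand the quotes back.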

The way I see it you have two choices:

  1. Fix your data file so it’s proper CSV (making the quotes a part of
    the field data). For example, the first row would become:

"""scheme""","""time_steps""","""dt"""

A quote is doubled to escape it in CSV and another set is added to
enclose each field, which is why they are tripled here.

  2. Decide that your data is not CSV and hand roll a parser to handle
    it. If you are sure fields won’t contain commas, that may be as simple
    as: fields = row.split(",").
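
The hand-rolled route could look roughly like the sketch below. It assumes the three-line sample file and the namelist layout shown earlier in the thread, and it assumes that no field contains a comma or a newline. The variable names (input, headers, namelists) are hypothetical.

```ruby
# Hypothetical input, copied from the sample file earlier in the thread.
input = <<~DATA
  "scheme","time_steps","dt"
  "1storder",2,0.5
  "4thorder",5,1.0
DATA

# Naive split: safe only if no field contains a comma or a newline.
rows = input.lines.map { |line| line.chomp.split(',') }
header_row, *data_rows = rows

# Header names lose their quotes; values keep theirs exactly as written.
headers = header_row.map { |h| h.delete('"') }

# Build one namelist-style block per data row.
namelists = data_rows.map do |values|
  body = headers.zip(values).map { |name, value| "#{name} = #{value}" }
  "&some_weird_name\n#{body.join(",\n")}\n/\n"
end

puts namelists.first
```

Because the values are never run through a CSV parser, "1storder" keeps its quotation marks in the output. The trade-off is that a quoted field with an embedded comma would be split in the wrong place.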

Hope that helps.

James Edward G. II


#5

Bil K. wrote:

On Feb 26, 2:08 pm, James G. removed_email_address@domain.invalid wrote:

Well, then you don’t really want a CSV parser.

Quotes in CSV data are used to indicate field grouping. In other
words, they are metadata about the content and it doesn’t make sense
for a parser to return those to you. It’s like how an XML parser
wouldn’t give you the equals sign used to set a tag attribute.

Ah, OK. That clears things up.

Thanks,

or use :force_quotes => true when calling FasterCSV.open or FasterCSV.new


#6

On Feb 26, 2:08 pm, James G. removed_email_address@domain.invalid wrote:

Well, then you don’t really want a CSV parser.

Quotes in CSV data are used to indicate field grouping. In other
words, they are metadata about the content and it doesn’t make sense
for a parser to return those to you. It’s like how an XML parser
wouldn’t give you the equals sign used to set a tag attribute.

Ah, OK. That clears things up.

Thanks,


#7

On Jul 9, 2009, at 5:59 AM, Marcus Mitchell wrote:

wouldn’t give you the equals sign used to set a tag attribute.

Ah, OK. That clears things up.

Thanks,

or use :force_quotes => true when calling FasterCSV.open or FasterCSV.new

That option causes FasterCSV to always quote fields on output. Bil
was asking if he could have the quotes left in his fields on input.
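
A short sketch of the difference, again assuming Ruby's standard csv library in place of the faster_csv gem. The row is a hypothetical set of already-parsed values.

```ruby
require 'csv'

# Already-parsed values; the parser has stripped any quotes from the input.
row = ['1storder', 2, 0.5]

# Default output quotes a field only when its content requires it.
default_line = CSV.generate_line(row)

# With force_quotes every field is quoted on output, including the numbers.
forced_line = CSV.generate_line(row, force_quotes: true)

print default_line
print forced_line
```

The forced line quotes 2 and 0.5 as well, which does not match the original file, where only the scheme column was quoted.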

James Edward G. II