Forum: Ruby Changing the quote-character in csv parsing

Jens Auer (Guest)
on 2006-03-28 16:04
(Received via mailing list)
Hi,
I have a bunch of files containing lines as comma-separated values.
Unfortunately, the character used for quoting is a single quote (') and
not the double quote ("). How can I tell the csv library (or fastercsv
or any other csv-parsing library) which character is used for quoting?
Some of the files contain fields like 'quoted, but with comma', which
are split into two fields at the comma:
irb(main):003:0> line = "one, 'quoted', 'quoted, but with comma'"
=> "one, 'quoted', 'quoted, but with comma'"
irb(main):006:0> CSV::parse_line('some words "some quoted text" some more words', ' ')
=> ["some", "words", "some quoted text", "some", "more", "words"]
irb(main):001:0> require 'rubygems'
=> true
irb(main):002:0> require_gem 'fastercsv'
=> true
irb(main):004:0> line.parse_csv
=> ["one", " 'quoted'", " 'quoted", " but with comma'"]

The output should be ["one", "'quoted'", "'quoted, but with comma'"]
James Gray (bbazzarrakk)
on 2006-03-28 18:04
(Received via mailing list)
On Mar 28, 2006, at 8:03 AM, Jens Auer wrote:

> I have a bunch of files containing lines as comma-separated values.
> Unfortunately, the character used for quoting is a single quote (')
> and not the double quote ("). How can I tell the csv library (or
> fastercsv or any other csv-parsing library) which character is used
> for quoting?

I'm not aware of a way to change it for either library, sadly.  If
your data is simple, you might get away with manipulating it into shape:

 >> require "csv"
=> true
 >> line = "one,'quoted','quoted, but with comma'"
=> "one,'quoted','quoted, but with comma'"
 >> CSV.parse_line(line.tr("'", '"'))
=> ["one", "quoted", "quoted, but with comma"]
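
A note for later Ruby versions, stated as an assumption about your environment rather than something either library offered when this was written: the csv library that ships with Ruby 1.9 and newer accepts a quote_char option, so the tr step may not be needed there. A minimal sketch, reusing the line from above:

```ruby
require "csv"

# Assumption: Ruby 1.9 or newer, where CSV accepts :quote_char.
# The 2006-era csv library discussed in this thread did not offer this.
line = "one,'quoted','quoted, but with comma'"

# Treat the single quote as the quoting character instead of the double quote.
fields = CSV.parse_line(line, quote_char: "'")

p fields
# => ["one", "quoted", "quoted, but with comma"]
```

The line here has no space after the commas. With a space in front of an opening quote, the parser may reject the line as malformed, so data shaped like the original example would probably need cleaning first.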

Unfortunately, one of the examples you showed is more complicated than
that:

 >> line = 'some words "some quoted text" some more words'
=> "some words \"some quoted text\" some more words"
 >> CSV.parse_line('"' + line.gsub('"', '""').tr("'", '"') + '"')
=> ["some words \"some quoted text\" some more words"]

Those transforms need to be done on a field-by-field basis, though, to
correctly convert data like that.
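
One possible way to work at the field level is to do the splitting by hand and only break on commas that sit outside single quotes. This is just a sketch under simple assumptions: split_single_quoted is a hypothetical helper name (not part of csv or fastercsv), it does not handle escaped or doubled quotes inside a field, and it keeps the quote characters in the result, which matches the output the original post asked for.

```ruby
# Hypothetical helper, not part of any CSV library.
# Splits on commas that are outside single quotes and keeps the quotes.
# Assumes quotes are balanced and never escaped inside a field.
def split_single_quoted(line)
  fields = []
  current = String.new
  in_quotes = false

  line.each_char do |ch|
    if ch == "'"
      # Entering or leaving a quoted section; keep the quote itself.
      in_quotes = !in_quotes
      current << ch
    elsif ch == "," && !in_quotes
      # A comma outside quotes ends the current field.
      fields << current.strip
      current = String.new
    else
      current << ch
    end
  end

  # Whatever is left after the last comma is the final field.
  fields << current.strip
  fields
end

result = split_single_quoted("one, 'quoted', 'quoted, but with comma'")
p result
# => ["one", "'quoted'", "'quoted, but with comma'"]
```

Each field can then be transformed on its own, for example by stripping the outer quotes, before it goes anywhere else.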

Hope that helps.

James Edward Gray II