Converting data to utf8 to load into an ActiveRecord?

Hi,
I’ve got a text file of data that I need to parse and load into my Rails
app. I’ve created a migration script that loads and parses the file, but
some of the data contains non-UTF-8 strings, causing an exception.

I’ve found two Windows-only ways to convert the strings to UTF-8, but I
would like to deploy my app on Heroku. Any suggestions for an OS-independent
way to convert?

This is my current implementation using chilkat:

require 'db/chilkat'

class AddDataSources < ActiveRecord::Migration
  def self.encode(str)
    strObj = Chilkat::CkString.new()
    strObj.appendAnsi(str)
    strObj.urlDecode("utf-8")
  end

  def self.up
    DataSource.delete_all
    data = IO.readlines('db/data/data.txt')
    data.each { |line|
      cells = line.split(',')
      DataSource.create(:code => cells[0],
                        :authors => encode(cells[1]))
    }
  end

  def self.down
    DataSource.delete_all
  end
end


Henry
http://www.henrywagner.org/
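
Editor's note: before converting anything, it can help to see which lines actually trigger the exception. The following is only a sketch, assuming Ruby 1.9 or later (where strings carry an encoding; the Ruby 1.8 of this thread has no equivalent call). The helper name non_utf8_lines and the sample rows are hypothetical, not part of the original migration.

```ruby
# Hypothetical helper (not from the thread): collect [line_number, line]
# pairs for every line whose bytes are not valid UTF-8.
def non_utf8_lines(lines)
  bad = []
  lines.each_with_index do |line, index|
    # Tag a copy of the raw bytes as UTF-8 and ask Ruby whether they are well formed.
    valid = line.dup.force_encoding("UTF-8").valid_encoding?
    bad << [index + 1, line] unless valid
  end
  bad
end

# Made-up sample rows; the second contains the single byte 0xFC,
# which is "ü" in Windows-1252 but is not valid UTF-8.
sample = ["A1,Smith\n", "B2,Mu\xFCller\n".b]
non_utf8_lines(sample).each do |number, line|
  puts "line #{number}: #{line.inspect}"
end
```

With the real file, the input would presumably come from something like File.readlines('db/data/data.txt', :mode => 'rb'), so that the bytes are read without any transcoding.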

2008/3/20, Henry W. [email protected]:

Hi,

I’ve got a text file of data that I need to parse and load into my Rails
app. I’ve created a migration script that loads and parses the file, but some
of the data contains non-UTF-8 strings, causing an exception.

I’ve found two Windows-only ways to convert the strings to UTF-8, but I would
like to deploy my app on Heroku. Any suggestions for an OS-independent way
to convert?

Iconv?

irb(main):001:0> require "iconv"
=> true
irb(main):002:0> Iconv.conv("utf-8", "windows-1252", "\200")
=> "€"

Stefan
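
Editor's note: Iconv was dropped from Ruby's standard library in Ruby 2.0, so the irb session above only runs as-is on older Rubies (or with the separate iconv gem). On Ruby 1.9 and later the same conversion can be written with String#encode. This is a sketch of that swapped-in technique rather than what was used in the thread; the helper name to_utf8 is hypothetical, and Windows-1252 is assumed as the source encoding only because Stefan's example uses it.

```ruby
# Hypothetical helper (not from the thread): reinterpret the raw bytes as
# the assumed source encoding, then transcode them to UTF-8.
def to_utf8(raw, source_encoding = "Windows-1252")
  raw.dup.force_encoding(source_encoding).encode("UTF-8")
end

# "\x80" is the same single byte as "\200" in the irb session above.
puts to_utf8("\x80".b)
```

In the migration, encode(cells[1]) could then become to_utf8(cells[1]), provided the file really is Windows-1252. Bytes that Windows-1252 leaves unassigned should raise a conversion error instead of being passed through silently.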

Thanks, that works.

On Thu, Mar 20, 2008 at 12:12 PM, Stefan L. <[email protected]> wrote:

Iconv.conv("utf-8", "windows-1252", "\200")
=> "€"

Stefan


Henry
http://www.henrywagner.org/