Hi!
I have problem with legacy database in MSSQL encoding.
The new app in Rails and most of the database is Utf8, while
tables from legacy MSSQL are encoded in ISO-8859-1.
I am not sure how to approach this problem. I would imagine
creation of some kind of filter with Iconv.conv("UTF-8", "ISO-8859-1",
input)
for reading sql statements and the reverse for write methods.
Any suggestions how this could be done?
Cheers,
Tomasz Bak
on 28.08.2007 10:55
on 28.08.2007 12:15
Hi,
So far I have come come with poor solution:
class String
require 'iconv'
def iso1_to_utf8
Iconv.conv("UTF-8", "ISO-8859-1", self) rescue self
end
def utf8_to_iso1
Iconv.conv("ISO-8859-1", "UTF-8", self) rescue self
end
end
class NCustomer < ActiveRecord::Base
...
def after_initialize
self.attributes.each do |k,v|
if v.class.to_s == "String" and !v.empty?
self[k] = v.iso1_to_utf8
end
end
end
def before_save
self.attributes.each do |k,v|
if v.class.to_s == "String" and !v.empty?
self[k] = v.utf8_to_iso1
end
end
end
...
end
Cheers,
Tomasz Bak
on 28.08.2007 21:31
Tom Bak wrote: > I have problem with legacy database in MSSQL encoding. > > The new app in Rails and most of the database is Utf8, while > tables from legacy MSSQL are encoded in ISO-8859-1. > > I am not sure how to approach this problem. I would imagine > creation of some kind of filter with Iconv.conv("UTF-8", "ISO-8859-1", > input) > for reading sql statements and the reverse for write methods. > > Any suggestions how this could be done? I think that everyone who's ever been cornered by this can attest that it can be a royal pain and that the options are all choosing between lesser evils. Last time I found myself there I exported the database to a plain text SQL file and then ran recode [1] on it. It worked well enough but I still found myself scanning over the file to correct tiny bits here and there. I'll admit that your solution has a ring to it -- being able to do it on the fly improves the time to deploy. I like it. If however speed or strict migration paths are an issue for you then recode may be an option to consider. [1] http://www.gnu.org/software/recode/ -- Roderick van Domburg http://www.nedforce.nl
on 03.09.2007 11:02
i'm having a quite similar issue atm and i'm unsure what to do. same thing, Legacy MSSQL Database with windows-1252 charset (so iso compatible) on the legacy tables, and the Rails app purely using UTF-8. i had no problems with encodings while using the sqlserver driver in ado mode and setting: WIN32OLE.codepage = WIN32OLE::CP_UTF8 but unfortunately the ado driver is leaking memory so badly that the mongrels need to be restarted every 2 hours. when i switch to odbc mode, the memory leak is gone but with ODBC the codepage setting doesn't work anymore, so i get messed up texts from those old tables. so i have to choose between the 2 evils: 1. memory leaking app with proper encoding or 2. stable app with garbled text
on 03.09.2007 11:33
I solved my problem with own plugin: recode_legacy_database :from => "ISO-8859-1", :to => "UTF-8" Which translates fields in models on read and before write to database. Cheers, Tomasz
on 03.09.2007 12:09
Tom Bak wrote: > I solved my problem with own plugin: > > recode_legacy_database :from => "ISO-8859-1", :to => "UTF-8" > > Which translates fields in models on read and before write to database. > > Cheers, > Tomasz so this is for a one time conversion of the whole db? i don't have the luxury to do that because the legacy tables still get accessed by the legacy app :(
on 03.09.2007 12:15
neongrau __ wrote: > Tom Bak wrote: >> I solved my problem with own plugin: >> >> recode_legacy_database :from => "ISO-8859-1", :to => "UTF-8" >> >> Which translates fields in models on read and before write to database. >> >> Cheers, >> Tomasz > > so this is for a one time conversion of the whole db? Exacly the oposit. I probably should change naming. I recodes db "on fly". Check the code. Cheers, Tomasz Bak
on 05.11.2007 16:39
Tom Bak wrote: > I solved my problem with own plugin: > > recode_legacy_database :from => "ISO-8859-1", :to => "UTF-8" > > Which translates fields in models on read and before write to database. > > Cheers, > Tomasz Tom, How is the usage of this plugin? I have to run this command manual? When I have to put this code? Thanks, Julio
on 18.12.2007 14:46
Julio Damasceno wrote: > Tom Bak wrote: >> I solved my problem with own plugin: >> >> recode_legacy_database :from => "ISO-8859-1", :to => "UTF-8" >> >> Which translates fields in models on read and before write to database. >> >> Cheers, >> Tomasz > > Tom, > > How is the usage of this plugin? I have to run this command manual? When > I have to put this code? > > > Thanks, > > Julio I did it this way : - Unzip the zip-file into you rails vendor/plugins directory - Add the following line to environments.rb ActiveRecord::Base.recode_legacy_database :from=>"ISO-8859-1",:to=>"UTF-8"