Hi all,
Just wondering… Are there any plans to include i18n support in Rails
anytime soon?
I guess this is about the only feature I’m really missing in Rails.
Any thoughts?
Regards,
Harm de Laat
second that!
Just wondering… Are there any plans to include i18n support in Rails
anytime soon?
Take a look at http://www.globalize-rails.org/wiki/
Kasper
Nicolai Reuschling wrote:
David stated in the “Snakes and Rubies” video
(Snakes and Rubies downloads | Django), I18N is not difficult in a
technical way. It’s mostly string replacement.
No, it’s not that easy, as I18N is a bit more complicated, and requires
enforcement of some rules about code and data organization, see
FAQ - Basic Questions ,
FAQ - Basic Questions for example.
At present time Rails can’t even handle UTF8 properly, and you ask I18N?
I18N will be in Rails only once you implement it yourself. Don’t
rely on the core team: they have a lot more things to improve and fix, and
I18N isn’t what they are paid for.
dseverin wrote:
At present time Rails can’t even handle UTF8 properly, and you ask I18N?
Could you elaborate on this? It seems to be working alright for me.
Jeroen
Jeroen H. wrote:
dseverin wrote:
At present time Rails can’t even handle UTF8 properly, and you ask I18N?
Could you elaborate on this? It seems to be working alright for me.
Jeroen
It also seemed to work alright for me, but once it began to fail
http://www.fngtps.com/2006/01/encoding-in-rails#comment24
Could you just review the code of the components mentioned there and tell me
that it will never break on UTF8?
E.g. validates_length_of is definitely broken,
String#blank? doesn’t take into account ALL Unicode space chars,
active_support/core_ext/string/access.rb is completely broken,
action_view/helpers/text_helper.rb has “fixed” truncate(), but broken
excerpt()
– and that list can be still incomplete…
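To illustrate the String#blank? point, here is a minimal sketch in present-day Ruby (2.4 or later), not the 2006 Rails code. The helper names ascii_blank? and unicode_blank? are hypothetical and exist only for this example. It shows how a whitespace check that only knows ASCII misses a Unicode space such as U+3000.

```ruby
# Hypothetical helpers, named for this sketch only.

# Treats only ASCII whitespace as blank: String#strip leaves Unicode spaces alone.
def ascii_blank?(str)
  str.strip.empty?
end

# Treats any Unicode whitespace as blank: the POSIX class [[:space:]]
# is Unicode-aware for UTF-8 strings in Ruby's regexp engine.
def unicode_blank?(str)
  str.match?(/\A[[:space:]]*\z/)
end

ideographic_space = "\u3000" # full-width space, common in Japanese text

puts ascii_blank?("  \t")               # true
puts ascii_blank?(ideographic_space)    # false
puts unicode_blank?(ideographic_space)  # true
```

A field containing only a full-width space therefore passes the ASCII check as "not blank", which is the kind of failure described above.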
dseverin wrote:
– and that list can be still incomplete…
Yes, I’m afraid you’re absolutely right. There is also a wiki entry about
this: Peak Obsession
You would think a Japanese language would handle this properly though?
Couldn’t you globally alias all string methods to use their UTF-8
equivalents once kcode=‘utf-8’ ? I still know very little about rails
internals and ruby so I don’t really know if it’s hard to fix.
Jeroen
Jeroen H. wrote:
You would think a Japanese language would handle this properly though?
I’m no expert, but apparently UTF-8 is far from the most popular
Japanese encoding, and there are allegedly some very cogent arguments
against it that aren’t a matter of NIH syndrome. That’s why Ruby
doesn’t do Unicode well - not because it’s not been thought about, but
because it’s been thought about too much.
Couldn’t you globally alias all string methods to use their UTF-8
equivalents once kcode=‘utf-8’ ? I still know very little about rails
internals and ruby so I don’t really know if it’s hard to fix.
It is. Once you lose the assumption that a codepoint is one byte,
the concept of String#length gets really murky, for example.
Alex Y. wrote:
Couldn’t you globally alias all string methods to use their UTF-8
equivalents once kcode=‘utf-8’ ? I still know very little about rails
internals and ruby so I don’t really know if it’s hard to fix.
It is. Once you lose the assumption that a codepoint is one byte,
the concept of String#length gets really murky, for example.
Yes, and you can have up to three lengths: String#byte_length,
String#codepoint_length, String#grapheme_cluster_length (e.g. NFD in
MacOS filenames).
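As a rough illustration of the three lengths, here is a sketch in present-day Ruby (2.5 or later). The method names in the post (String#byte_length and so on) are not real Ruby methods; mapping them to bytesize, length and grapheme_clusters is an assumption made for this example.

```ruby
# "é" in decomposed form (NFD): "e" followed by U+0301 COMBINING ACUTE ACCENT.
decomposed = "e\u0301"

puts decomposed.bytesize                  # 3 (1 byte for "e", 2 for U+0301 in UTF-8)
puts decomposed.length                    # 2 (code points)
puts decomposed.grapheme_clusters.length  # 1 (user-perceived characters)

# Normalizing to NFC composes the pair into the single code point U+00E9.
composed = decomposed.unicode_normalize(:nfc)
puts composed.length                      # 1
```

Which of these a length validation should use depends on what the limit is protecting: storage size, column width, or what the user sees.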
Besides:
Most of Rails’ internal processing expects ASCII strings (routing and
template path magic, SQL model_table_name <-> ModelTableName etc), and
assumes that one byte is one ASCII char.
Overriding String methods globally with their UTF8 equivalents is
risky, and can cause unpredictable faults (Julian’s unicode_hacks caused
WEBrick to work improperly, and in my application ActionMailer failed), as
different parts, and the libraries they reference, may expect exactly
the byte-oriented String methods.
So, I think, as it will hardly be fixed (YAGNI, men, YAGNI, web 2.0 is
ASCII!) i18n is still in distant perspective.
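One way to sidestep the global-override risk described in this post is to leave String untouched and have character-aware code opt in through a wrapper. The sketch below assumes present-day Ruby; CharView is a hypothetical name for this example, not a Rails or Ruby API.

```ruby
# CharView is a hypothetical wrapper for this sketch only.
# It never modifies String, so byte-oriented libraries keep working.
class CharView
  def initialize(string)
    @string = string
  end

  # Length in characters rather than bytes.
  def length
    @string.each_char.count
  end

  # First `limit` characters, never cutting a multibyte sequence in half.
  def first(limit)
    @string.each_char.first(limit).join
  end
end

text = "日本語のテキスト"
view = CharView.new(text)

puts text.bytesize  # 24: byte-oriented callers still see the raw size
puts view.length    # 8: character-aware code asks the wrapper
puts view.first(3)  # 日本語
```

The trade-off is that every call site has to ask for the wrapper explicitly, but nothing that relies on byte semantics changes behaviour behind its back.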
dseverin wrote:
Yes, and you can have up to three lengths: String#byte_length,
String#codepoint_length, String#grapheme_cluster_length (e.g. NFD in
MacOS filenames).
My point precisely. And then what do you do about the byte order mark?
Besides:
So, I think, as it will hardly be fixed (YAGNI, men, YAGNI, web 2.0 is
ASCII!) i18n is still in distant perspective.
Maybe not that far off. According to this:
http://redhanded.hobix.com/inspect/futurismUnicodeInRuby.html
YARV should handle strings a little more sensibly. Once that’s in
place, the string replacing aspect should be a little simpler.
Hi,
On Mon, 30 Jan 2006 14:31:57 +0100
dseverin wrote:
It also seemed to work alright for me, but once it began to fail
Could you just review the code of the components mentioned there and tell me
that it will never break on UTF8?
E.g. validates_length_of is definitely broken,
I’ve used a patch below.
It works well, at least for Japanese/UTF-8.
# You need to call $KCODE="u" first.
--- validations.rb.old 2006-01-31 02:22:42.000000000 +0900
+++ validations.rb 2006-01-31 02:41:03.000000000 +0900
@@ -459,7 +459,7 @@
           message = (options[:message] || options[message_options[option]]) % option_value
           validates_each(attrs, options) do |record, attr, value|
-            record.errors.add(attr, message) unless !value.nil? and value.size.method(validity_checks[option])[option_value]
+            record.errors.add(attr, message) unless !value.nil? and value.split(//).size.method(validity_checks[option])[option_value]
           end
         end
       end
It may be better to add a String#char_length, something like:
class String
  def char_length
    split(//).size
  end
end
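For comparison, here is the same helper as a runnable sketch in present-day Ruby, where strings carry their own encoding and no $KCODE setting is needed. This is an assumption about a modern interpreter, not the Ruby 1.8 setup the patch was written for.

```ruby
class String
  # Character count via an empty-pattern split, as proposed in the post.
  def char_length
    split(//).size
  end
end

word = "日本語"

puts word.char_length  # 3 characters
puts word.bytesize     # 9 bytes in UTF-8
puts word.length       # 3: modern String#length already counts characters
```

On such an interpreter the helper is redundant, since String#length and String#bytesize already separate the two notions.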
On 30-jan-2006, at 14:44, Jeroen H. wrote:
You would think a Japanese language would handle this properly
though? Couldn’t you globally alias all string methods to use their
UTF-8 equivalents once kcode=‘utf-8’ ? I still know very little
about rails internals and ruby so I don’t really know if it’s hard
to fix.
I did exactly that in my Unicode Hacks plugin, however it breaks
other software in a nasty, unpredictable way.
The article by Thijs on Fingertips (incl. the comments) analyzes the
breakage more thoroughly.
–
Julian ‘Julik’ Tarkhanov
me at julik.nl
On Mon, 30 Jan 2006, Alex Y. wrote:
Jeroen H. wrote:
You would think a Japanese language would handle this properly though?
I’m no expert, but apparently UTF-8 is far from the most popular
Japanese encoding, and there are allegedly some very cogent arguments
against it that aren’t a matter of NIH syndrome.
I feel that I’m pretty familiar with this. I live in Japan, speak, read
and write some Japanese, and have done a fair amount of I18N work for
English+Japanese web sites.
The arguments Japanese critics have against Unicode are, for the most
part, complete rubbish. They sum up to more or less the following:
1. Unicode 1.0 has issue X. Answer: err... try using maybe only a
   five-year-old spec., like Unicode 2.0? Or get really genki and
   upgrade to 3.0, maybe even before 4.0 comes out!
2. Important characters are missing. Answer: they're missing from
   JIS character sets, too. And more are missing from JIS character
   sets than Unicode. Unicode has *every single character* available
   in Shift-JIS, EUC-JP and ISO-2022-JP, so if you’re switching from
   those (which is what almost every system in Japan uses), you lose
   not one single character you had before.
3. You can't tell the difference between Chinese and Japanese
   in Unicode. Answer: in the same sense that you can't tell the
   difference between French and English in ISO-8859-1. We have other
   methods for doing that that don't involve adding characters to a
   character set. (Some Japanese apparently feel that Unicode should
   do the equivalent of having different “French” and “English”
   versions of the letter “a”.)
4. UTF-8 takes more space. Answer: on a typical web page, it takes
   7% more space than Shift-JIS or EUC-JP. It's not a big deal, and
   is actually a very small price to pay for the benefits you get.
   However, there are other encodings available.
The Japanese critics tend to ignore other things that benefit them.
Some, they just don’t care about, such as the ability to have French and
Japanese on the same page. Others, such as having “generic” web-based
message board systems and other programs “just work” without any extra
effort on the part of a foreign developer, they really ought to care
about, because it will save them money in a very direct way.
Anyway, there’s my rant for the day.
Curt S. +81 90 7737 2974
The power of accurate observation is commonly called cynicism
by those who have not got it. --George Bernard Shaw
On 30-jan-2006, at 15:13, dseverin wrote:
So, I think, as it will hardly be fixed (YAGNI, men, YAGNI, web 2.0 is
ASCII!) i18n is still in distant perspective.
Julian ‘Julik’ Tarkhanov
me at julik.nl
Rgds,
–Siva J.
http://www.varcasa.com/
My First Rails Project.
Education Through Collabration