ICU4R 0.1.0 - initial release


#1

== ICU4R v.0.1.0 - initial release ==

= Abstract

ICU4R is an attempt to provide better Unicode support for Ruby, based
on the ICU library.

Project Site: http://rubyforge.org/projects/icu4r/

Download: http://rubyforge.org/frs/download.php/8116/icu4r-0.1.0.tar.gz

RDoc: http://icu4r.rubyforge.org/

= Install Notes

To build ICU4R you'll need GCC and the ICU v3.4 libraries, which can be
downloaded from
http://ibm.com/software/globalization/icu/downloads.jsp

Build and install:
ruby extconf.rb && make && make check && make install

= Features

ICU4R is a Ruby C-extension binding for the ICU library.
It does NOT mirror the full ICU object hierarchy; rather, it is a set of
simple interfaces for some practically useful functionality. It provides:

- UString: String-like class with internal UTF-16 storage;
- UCA rules for UString comparisons (<=>, casecmp);
- Unicode regular expressions;
- encoding (codepage) conversion;
- Unicode normalization;
- access to resource bundles, including ICU locale data;
- transliteration, including rule-based transliteration.

A set of locale-sensitive functions:
- upcase/downcase;
- string collation;
- string search;
- iterators over text line/word/char/sentence breaks;
- message formatting (number/currency/string/time);
- date and number parsing.
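
To make the normalization item concrete, here is a minimal sketch. It assumes a modern Ruby (2.2 or later) and uses the built-in String#unicode_normalize, not ICU4R's own API, which this announcement does not show. The variable names are illustrative only.

```ruby
# Two ways to write "é": one precomposed code point, or "e" plus a combining accent.
composed   = "\u00E9"    # U+00E9
decomposed = "e\u0301"   # U+0065 followed by U+0301

# They render the same but differ code point by code point.
same_before = (composed == decomposed)

# Normalizing both to NFC makes them compare equal.
same_after = (composed.unicode_normalize(:nfc) == decomposed.unicode_normalize(:nfc))

# NFD splits the precomposed form into base letter plus combining mark.
nfd_codepoints = composed.unicode_normalize(:nfd).codepoints
```

Without a normalization step, two strings that look identical can fail a plain equality check, which is why the feature matters for comparison and search.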

== DISCLAIMER ==

The code is still slow and inefficient, and it may have security holes,
memory leaks, bugs, inconsistent documentation, and an incomplete test
suite. Use it at your own risk.

Criticism, bug reports, and feature requests are welcome :)

WBR, Nikolai L. removed_email_address@domain.invalid


#2

On 1/19/06, Lugovoi N. removed_email_address@domain.invalid wrote:

> - Unicode normalization;

Great work. I'll check it out next week.


#3

On 1/19/06, Lugovoi N. removed_email_address@domain.invalid wrote:

> == ICU4R v.0.1.0 - initial release ==
>
> = Abstract
>
> ICU4R is an attempt to provide better Unicode support for Ruby, based
> on the ICU library.

What are we missing in Ruby now?

> = Features
>
> ICU4R is a Ruby C-extension binding for the ICU library.
> It does NOT mirror the full ICU object hierarchy; rather, it is a set of
> simple interfaces for some practically useful functionality. It provides:
>
> - UString: String-like class with internal UTF-16 storage;

What is so cool about UTF-16? You can still get multiword characters,
but the encoding is no longer byte-order independent.
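
Both halves of that objection can be checked with a short sketch. It uses only Ruby's built-in String#encode and String#unpack (Ruby 1.9 or later), not ICU4R's UString, whose API is not shown in this thread. The sample character is an arbitrary choice.

```ruby
# A character outside the Basic Multilingual Plane: U+1D11E (musical G clef).
clef = "\u{1D11E}"

utf16_be = clef.encode("UTF-16BE")
utf16_le = clef.encode("UTF-16LE")

# One code point, but two 16-bit code units (a surrogate pair) in UTF-16.
code_units_be = utf16_be.unpack("n*")   # read as big-endian 16-bit units
code_units_le = utf16_le.unpack("v*")   # read as little-endian 16-bit units

# The same character serializes to different bytes depending on byte order.
bytes_be = utf16_be.bytes
bytes_le = utf16_le.bytes
```

The code units are the same in both cases, but the byte sequences differ, so a UTF-16 byte stream needs either a declared byte order or a byte order mark.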

> - UCA rules for UString comparisons (<=>, casecmp);
> - Unicode regular expressions;

I guess we do not have this in Ruby yet, but I never tried :)

> - encoding (codepage) conversion;

I thought this was already there somewhere: some iconv thingy or something.
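
For comparison, here is a small sketch of codepage conversion in plain Ruby. It assumes Ruby 1.9 or later, where String#encode covers this job (the old iconv standard library was removed in Ruby 2.0). It is not ICU4R's converter interface, which this thread does not show.

```ruby
# "café" in ISO-8859-1: the "é" is the single byte 0xE9.
latin1 = [0x63, 0x61, 0x66, 0xE9].pack("C*").force_encoding("ISO-8859-1")

# Transcode to UTF-8, where "é" becomes the two bytes 0xC3 0xA9.
utf8 = latin1.encode("UTF-8")
utf8_bytes = utf8.bytes

# Characters with no mapping in the target can be replaced instead of raising.
ascii = utf8.encode("US-ASCII", undef: :replace, replace: "?")
```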

> - Unicode normalization;
> - access to resource bundles, including ICU locale data;
> - transliteration, including rule-based transliteration.

Wow, does this mean I could read Russian in Latin characters? That way
I could probably understand about half of it :)

> A set of locale-sensitive functions:
> - upcase/downcase;
> - string collation;
> - string search;
> - iterators over text line/word/char/sentence breaks;
> - message formatting (number/currency/string/time);
> - date and number parsing.

I suspect there are people who use this. I only recently switched from
the C locale to en_US.UTF-8 so that I can read some funny characters and
still enjoy interfaces not clobbered by translation :)

It looks like some features can be useful. I should try it when I get
to something that needs some of those funny characters.

Thanks

Michal

