UnicodeUtils 1.2.2 - case conversion, normalization and more

UnicodeUtils implements Unicode algorithms for case conversion,
normalization, text segmentation and more in pure Ruby code.

New in this release:

Added the following UnicodeUtils methods:

  • east_asian_width
  • display_width
  • default_ignorable_char_q
  • gc
  • graphic_char_q
  • general_category
  • char_type
  • char_display_width
  • debug

Usage

Ruby 1.9.1 or higher is required.

$ gem install unicode_utils

require “unicode_utils/display_width”
UnicodeUtils.display_width(“Matz$B$K$C$-(B”) # => 10

$ irb -r unicode_utils/u
irb(main):001:0> U.debug(“Matz$B$K$C$-(B”)
Char | Ordinal | Name | General Category | UTF-8
------±--------±-------------------------±-----------------±---------
“M” | 4D | LATIN CAPITAL LETTER M | Uppercase_Letter | 4D
“a” | 61 | LATIN SMALL LETTER A | Lowercase_Letter | 61
“t” | 74 | LATIN SMALL LETTER T | Lowercase_Letter | 74
“z” | 7A | LATIN SMALL LETTER Z | Lowercase_Letter | 7A
“$B$K(B” | 306B | HIRAGANA LETTER NI | Other_Letter | E3
81 AB
“$B$C(B” | 3063 | HIRAGANA LETTER SMALL TU | Other_Letter | E3
81 A3
“$B$-(B” | 304D | HIRAGANA LETTER KI | Other_Letter | E3
81 8D

Documentation & Source

http://unicode-utils.rubyforge.org
GitHub - lang/unicode_utils: Unicode algorithms for Ruby 1.9

Issues

It should work on all implementations of Ruby 1.9.1 or higher,
independently of the operating system. If it does not, please
report it on GitHub.

2011/11/28 Stefan L.:

> UnicodeUtils implements Unicode algorithms for case conversion,
> normalization, text segmentation and more in pure Ruby code.

uh, very handy. thanks stefan
kind regards -botp