UnicodeUtils implements Unicode algorithms for case conversion,
normalization, text segmentation and more in pure Ruby code.
New in this release:
Added the following UnicodeUtils methods:
- east_asian_width
- display_width
- default_ignorable_char?
- gc
- graphic_char?
- general_category
- char_type
- char_display_width
- debug
Usage
Ruby 1.9.1 or higher is required.
$ gem install unicode_utils
require “unicode_utils/display_width”
UnicodeUtils.display_width(“Matz$B$K$C$-(B”) # => 10
$ irb -r unicode_utils/u
irb(main):001:0> U.debug(“Matz$B$K$C$-(B”)
Char | Ordinal | Name | General Category | UTF-8
------±--------±-------------------------±-----------------±---------
“M” | 4D | LATIN CAPITAL LETTER M | Uppercase_Letter | 4D
“a” | 61 | LATIN SMALL LETTER A | Lowercase_Letter | 61
“t” | 74 | LATIN SMALL LETTER T | Lowercase_Letter | 74
“z” | 7A | LATIN SMALL LETTER Z | Lowercase_Letter | 7A
“$B$K(B” | 306B | HIRAGANA LETTER NI | Other_Letter | E3
81 AB
“$B$C(B” | 3063 | HIRAGANA LETTER SMALL TU | Other_Letter | E3
81 A3
“$B$-(B” | 304D | HIRAGANA LETTER KI | Other_Letter | E3
81 8D
Documentation & Source
http://unicode-utils.rubyforge.org
https://github.com/lang/unicode_utils
Issues
UnicodeUtils should work on every Ruby implementation compatible
with 1.9.1 or higher, independent of operating system. If it does
not, please report the problem on GitHub.