UnicodeUtils 1.3.0 - case conversion, normalization and more

UnicodeUtils - Unicode algorithms in pure Ruby code.

New in this release:

Updated to Unicode 6.1.0.

New methods: code_point_type, name_aliases, sid (string identifier)

New constants: UNICODE_VERSION

Usage

Ruby 1.9.1 or higher is required.

$ gem install unicode_utils

require "unicode_utils/display_width"
UnicodeUtils.display_width("にっき") # => 6

$ irb -r unicode_utils/u

irb(main):001:0> U.sid 0xfeff
=> "BYTE ORDER MARK"

irb(main):002:0> U.debug "Matz\u{2029}にっき"
 Char | Ordinal | Sid                      | General Category    | UTF-8
------+---------+--------------------------+---------------------+----------
 "M"  |      4D | LATIN CAPITAL LETTER M   | Uppercase_Letter    | 4D
 "a"  |      61 | LATIN SMALL LETTER A     | Lowercase_Letter    | 61
 "t"  |      74 | LATIN SMALL LETTER T     | Lowercase_Letter    | 74
 "z"  |      7A | LATIN SMALL LETTER Z     | Lowercase_Letter    | 7A
 N/A  |    2029 | PARAGRAPH SEPARATOR      | Paragraph_Separator | E2 80 A9
 "に" |    306B | HIRAGANA LETTER NI       | Other_Letter        | E3 81 AB
 "っ" |    3063 | HIRAGANA LETTER SMALL TU | Other_Letter        | E3 81 A3
 "き" |    304D | HIRAGANA LETTER KI       | Other_Letter        | E3 81 8D
=> nil
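
The Ordinal and UTF-8 columns of the U.debug output can be reproduced with core Ruby alone, which may help when reading the table. The following is only a sketch: the helper name ordinal_and_utf8 is hypothetical and is not part of UnicodeUtils, and the Sid and General Category columns are left out because they need the Unicode data that UnicodeUtils ships.

```ruby
# encoding: utf-8
# Core Ruby only; ordinal_and_utf8 is a hypothetical helper, not a UnicodeUtils method.
def ordinal_and_utf8(str)
  str.each_char.map do |ch|
    ordinal = ch.ord.to_s(16).upcase                      # code point in hex
    utf8 = ch.bytes.map { |b| "%02X" % b }.join(" ")      # encoded bytes in hex
    [ordinal, utf8]
  end
end

rows = ordinal_and_utf8("Matz\u{2029}にっき")
rows.each { |ordinal, utf8| puts "#{ordinal.rjust(7)} | #{utf8}" }
```

Each row pairs a code point with the bytes of its UTF-8 encoding, so the three Hiragana letters and the paragraph separator each show three bytes.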

irb(main):003:0> U.casefold("Straße") == U.casefold("STRASSE")
=> true
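
To see why case folding rather than plain lowercasing is used for this comparison, here is a small sketch that uses only core Ruby. It assumes Ruby 2.4 or later, where String#downcase accepts the :fold option; that option belongs to Ruby itself, not to UnicodeUtils, and is shown only as a point of comparison.

```ruby
# encoding: utf-8
# Assumes Ruby 2.4+; String#downcase(:fold) is core Ruby, not UnicodeUtils.
lower_equal = "Straße".downcase == "STRASSE".downcase
fold_equal  = "Straße".downcase(:fold) == "STRASSE".downcase(:fold)

puts lower_equal # lowercasing leaves "ß" as is, so the strings differ
puts fold_equal  # case folding maps "ß" to "ss", so the strings match
```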

irb(main):004:0> U.titlecase("istanbul", :tr)
=> "İstanbul"

irb(main):005:0> U.nfkc "ﬁnland"
=> "finland"

Documentation & Source

http://unicode-utils.rubyforge.org
https://github.com/lang/unicode_utils

Issues

It should work on any Ruby implementation compatible with 1.9.1
or higher, independently of the operating system. If it does not,
please report it at https://github.com/lang/unicode_utils/issues