Text-hyphen 1.0.2 Released

text-hyphen version 1.0.2 has been released!

Text::Hyphen will hyphenate words using modified versions of TeX
hyphenation patterns.

NOTE:
This version is NOT compatible with Ruby 1.9. Text::Hyphen version 2
(which will be started soon) will convert entirely to UTF-8 and will
not be compatible with Ruby 1.8.

Text::Hyphen will properly hyphenate various words according to the
rules of the language the word is written in. The algorithm is based on
that of the TeX typesetting system by Donald E. Knuth. The library is
based on the Perl implementation of TeX::Hyphen[1] and the Ruby port[2]. The
language hyphenation pattern files are based on the sources available
from CTAN[3] as of 2004.12.19 and have been translated by Austin
Ziegler.
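
For readers unfamiliar with TeX-style pattern hyphenation, the following is a
minimal sketch of the general technique, not Text::Hyphen's actual code. The
helper names (parse_pattern, hyphenation_points) and the SAMPLE_PATTERNS
constant are made up for illustration, and the nine patterns are the small
textbook subset commonly used to explain the word "hyphenation", not the
pattern files shipped with the library. It is written for a current Ruby.

```ruby
# Split a TeX-style pattern such as "hy3ph" into its letters ("hyph") and
# the scores sitting in the gaps around those letters ([0, 0, 3, 0, 0]).
def parse_pattern(pattern)
  letters = pattern.delete("0-9")
  scores = [0]
  pattern.each_char do |ch|
    if ch =~ /\d/
      scores[-1] = ch.to_i   # a digit scores the gap before the next letter
    else
      scores << 0            # each letter opens a new, unscored gap
    end
  end
  [letters, scores]
end

# Return the character offsets at which `word` may be broken.
# Every matching pattern contributes its scores; the highest score in each
# gap wins, and an odd final score marks an allowed break.
def hyphenation_points(word, patterns)
  table = patterns.map { |pat| parse_pattern(pat) }.to_h
  padded = ".#{word.downcase}."          # dots mark the word boundaries
  scores = Array.new(padded.length + 1, 0)

  (0...padded.length).each do |start|
    (1..padded.length - start).each do |len|
      pattern_scores = table[padded[start, len]]
      next unless pattern_scores
      pattern_scores.each_with_index do |score, i|
        scores[start + i] = [scores[start + i], score].max
      end
    end
  end

  # scores[j + 1] is the gap just before word[j] (shifted by the leading dot).
  (1...word.length).select { |j| scores[j + 1].odd? }
end

# Illustrative pattern subset (an assumption, not the library's data files).
SAMPLE_PATTERNS = %w[hy3ph he2n hena4 hen5at 1na n2at 1tio 2io o2n]

p hyphenation_points("hyphenation", SAMPLE_PATTERNS)   #=> [2, 6]
```

The offsets 2 and 6 correspond to hy-phen-ation. A real hyphenator also
applies minimum left and right widths and an exception list, which this
sketch leaves out.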

This release is 1.0.2. It is a minor bugfix for the RubyGem release of
Text::Hyphen to enable the hyphen command-line program. Text::Hyphen
represents a significant improvement over its predecessor, TeX::Hyphen.

Synopsis:

require 'text/hyphen'
hh = Text::Hyphen.new(:language => 'en_us', :left => 2, :right => 2)
# The arguments above are the defaults, so this is equivalent:
hh = Text::Hyphen.new

word = "representation"
points = hh.hyphenate(word)  #=> [3, 5, 8, 10]
puts hh.visualize(word)      #=> rep-re-sen-ta-tion
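
To make the relationship between the two results concrete: each number
returned for "representation" appears to be the character offset before which
a hyphen may be placed. The sketch below rebuilds the visualized string from
those offsets by hand. The insert_hyphens helper is hypothetical and does not
depend on the library.

```ruby
# Insert a hyphen before each listed character offset.
# Working from the right keeps the earlier offsets valid as the string grows.
def insert_hyphens(word, points, hyphen = "-")
  result = word.dup
  points.sort.reverse_each { |offset| result.insert(offset, hyphen) }
  result
end

puts insert_hyphens("representation", [3, 5, 8, 10])   #=> rep-re-sen-ta-tion
```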

Text::Hyphen is truly multilingual[4]. As an example, consider the
difference between the following:

require 'text/hyphen'
# Using left and right minimum values of 0 ensures that you will
# see all possible hyphenation points, not just those that meet
# the minimum width requirements.
en = Text::Hyphen.new(:left => 0, :right => 0)
fr = Text::Hyphen.new(:language => "fr", :left => 0, :right => 0)

puts en.visualise("organiser")      #=> or-gan-iser
puts fr.visualise("organiser")      #=> or-ga-ni-ser
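
The effect of the left and right minimums can be pictured as a filter over
the candidate break points. The sketch below assumes that :left is the
minimum number of characters kept before the first hyphen and :right the
minimum kept after the last one. The apply_minimums helper is hypothetical,
is not the library's implementation, and uses keyword arguments, so it needs
a current Ruby.

```ruby
# Keep only the break points that leave at least `left` characters before
# the first hyphen and at least `right` characters after the last one.
# Illustrative helper only; the semantics are an assumption.
def apply_minimums(word, points, left: 2, right: 2)
  points.select { |offset| offset >= left && offset <= word.length - right }
end

word = "representation"
all_points = [3, 5, 8, 10]   # break points quoted in the synopsis

p apply_minimums(word, all_points, left: 0, right: 0)   #=> [3, 5, 8, 10]
p apply_minimums(word, all_points, left: 4, right: 5)   #=> [5, 8]
```

With minimums of 0 nothing is filtered out, which is why the comparison
above uses them to show every possible hyphenation point.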

As you can see, the hyphenation is distinct between the two hyphenators.
Additional improvements over TeX::Hyphen include thread safety (except
for debug control) and (minimal) support for UTF-8.

Bugs should be reported on the RubyForge project or on my GitHub
repository. I do not regularly monitor ruby-talk.

Changes:

1.0.2 / unreleased

  • Moved to ‘hoe’ and GitHub.
  • Preparing for 2.0 which will be Ruby 1.9-only for UTF-8.
  • Fixing German support (RubyForge 28498):
    • Choosing ‘de’ as a language will load ‘de1’. Choosing ‘de1’ or ‘de2’
      will load properly now, but they will be reported with an ISO
      language code of ‘de’ (new optional #isocode attribute on a language
      definition that will override the #iso_language setting of a
      Text::Hyphen instance if set).
    • Both ‘de1’ and ‘de2’ can be loaded simultaneously now, but the first
      one loaded will claim the Text::Hyphen::Language::DE constant.
  • Added test cases for bugs:
    • RubyForge 9807 (cannot reproduce)
    • RubyForge 28128 (cannot reproduce)
    • RubyForge 28498

Very nice! Is there a roadmap for 2.0 anywhere?