Good News: Performance improvement.
Bad News: Memory leak problem still exists.
rmmseg version 0.1.2
RMMSeg is an implementation of the MMSEG Chinese word segmentation
algorithm, which is based on two variants of the maximum matching
algorithm. Two algorithms are available:
- simple algorithm that uses only forward maximum matching.
- complex algorithm that uses three-word chunk maximum matching and 3
  additional rules to resolve ambiguities.
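To make the difference between the two algorithms listed above concrete, here is a minimal sketch in Python. It is not RMMSeg's code or API: the function names, the toy dictionary, and the `max_word_len` parameter are all assumptions made for illustration. The complex variant below applies only two tie-breaking rules after chunk length (largest average word length, then smallest variance of word lengths); a further rule based on character frequency is left out because it needs frequency data.

```python
def candidate_words(text, start, dictionary, max_word_len):
    """Dictionary words beginning at `start`; a single character is always allowed."""
    limit = min(max_word_len, len(text) - start)
    return [text[start:start + n] for n in range(1, limit + 1)
            if n == 1 or text[start:start + n] in dictionary]


def simple_segment(text, dictionary, max_word_len=4):
    """Simple algorithm: forward maximum matching, longest word first."""
    words, pos = [], 0
    while pos < len(text):
        # Candidates come back in increasing length, so the last one is the longest.
        word = candidate_words(text, pos, dictionary, max_word_len)[-1]
        words.append(word)
        pos += len(word)
    return words


def chunks(text, start, dictionary, max_word_len, depth=3):
    """All chunks of up to `depth` consecutive words beginning at `start`."""
    if depth == 0 or start >= len(text):
        return [[]]
    return [[word] + rest
            for word in candidate_words(text, start, dictionary, max_word_len)
            for rest in chunks(text, start + len(word), dictionary,
                               max_word_len, depth - 1)]


def chunk_score(chunk):
    """Sort key: total length, then average word length, then low variance."""
    lengths = [len(word) for word in chunk]
    total = sum(lengths)
    average = total / len(lengths)
    variance = sum((n - average) ** 2 for n in lengths) / len(lengths)
    return (total, average, -variance)


def complex_segment(text, dictionary, max_word_len=4):
    """Complex algorithm (sketch): pick the best three-word chunk, keep its first word."""
    words, pos = [], 0
    while pos < len(text):
        best = max(chunks(text, pos, dictionary, max_word_len), key=chunk_score)
        words.append(best[0])
        pos += len(best[0])
    return words


# Hypothetical toy dictionary, for illustration only.
toy_dictionary = {"研究", "研究生", "生命", "起源"}
print(simple_segment("研究生命起源", toy_dictionary))   # ['研究生', '命', '起源']
print(complex_segment("研究生命起源", toy_dictionary))  # ['研究', '生命', '起源']
```

The simple algorithm greedily takes 研究生 and is left with a stray 命, while the chunk-based variant looks three words ahead and prefers the segmentation whose word lengths are most even.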
For more information about the algorithm, please refer to:
- MMSEG: A Word Identification System for Mandarin Chinese Text Based on Two Variants of the Maximum Matching Algorithm
Changes in this version:
- Add cache to find_match_words: performance improved.
- Implement Chunk as a module instead of a class: performance improved.
- Don’t store unnecessary data in dictionary: memory usage reduced.