JS library for Chinese/Japanese morphological analysis (segmentation + PoS tagging)
Rakuten MA is a pure JavaScript library for morphological analysis (word segmentation and Part-of-Speech tagging) of Chinese and Japanese text. It is designed for both browser and Node.js environments, offering online learning capabilities for model updates and various optimizations for compact model representation, making it suitable for web-based NLP applications.
How It Works
Rakuten MA implements a language-independent character tagging model using the Soft Confidence Weighted (SCW) learning algorithm. It supports customizable feature sets, including character unigrams, bigrams, and character type features, with optional feature hashing and quantization for model size reduction. The library allows for incremental training, enabling users to adapt pre-trained models or build new ones from scratch.
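The character tagging idea and the hashed feature set described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not Rakuten MA's actual code: the helper names (tokensToCharTags, charType, extractFeatures, hashFeature), the B/I/E/S position scheme, the feature templates, and the FNV-1a-style hash are all hypothetical choices made for this example. The SCW weight update itself is not shown.

```javascript
// Hypothetical helpers for illustration only; none of these are rakutenma APIs.

// Turn [surface, pos] tokens into one "position-PoS" tag per character.
// Assumed position scheme: B = word start, I = inside, E = word end, S = single-character word.
function tokensToCharTags(tokens) {
  const tags = [];
  for (const [surface, pos] of tokens) {
    const chars = Array.from(surface); // split by code point, not UTF-16 unit
    chars.forEach((ch, i) => {
      let position;
      if (chars.length === 1) position = 'S';
      else if (i === 0) position = 'B';
      else if (i === chars.length - 1) position = 'E';
      else position = 'I';
      tags.push({ char: ch, tag: position + '-' + pos });
    });
  }
  return tags;
}

// Coarse character type used as a feature (assumed categories).
function charType(ch) {
  if (/[\u3040-\u309F]/.test(ch)) return 'H'; // hiragana
  if (/[\u30A0-\u30FF]/.test(ch)) return 'K'; // katakana
  if (/[\u4E00-\u9FFF]/.test(ch)) return 'C'; // CJK ideograph
  if (/[0-9]/.test(ch)) return 'N';           // ASCII digit
  if (/[A-Za-z]/.test(ch)) return 'A';        // ASCII letter
  return 'O';                                 // anything else, including padding
}

// Assumed feature templates at position i: character unigrams,
// one character bigram, and the type of the current character.
function extractFeatures(chars, i) {
  const at = (k) => (k >= 0 && k < chars.length ? chars[k] : '_'); // '_' pads the edges
  return [
    'c0:' + at(i),
    'c-1:' + at(i - 1),
    'c1:' + at(i + 1),
    'b-1:' + at(i - 1) + at(i),
    't0:' + charType(at(i)),
  ];
}

// Feature hashing: map a feature string to an integer in [0, 2^bits),
// so the model stores weights by bucket index instead of by full string.
// Uses an FNV-1a-style 32-bit hash over UTF-16 code units.
function hashFeature(feature, bits) {
  let h = 0x811c9dc5;
  for (let i = 0; i < feature.length; i++) {
    h ^= feature.charCodeAt(i);
    h = Math.imul(h, 0x01000193);
  }
  return (h >>> 0) & ((1 << bits) - 1);
}

const tagged = tokensToCharTags([['東京', 'N'], ['に', 'P']]);
const features = extractFeatures(tagged.map((t) => t.char), 0);
const buckets = features.map((f) => hashFeature(f, 15));
console.log(tagged, features, buckets);
```

In this sketch, a smaller bit width gives a smaller model at the cost of more collisions between unrelated features, which is the trade-off feature hashing makes in general.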
Quick Start & Requirements
Install with npm install rakutenma, then load the library with require('rakutenma') in Node.js or by including rakutenma.js in HTML for browser use.
Highlighted Details
Maintenance & Community
Licensing & Compatibility
Limitations & Caveats
6 years ago
Inactive