karukan  by togatoga

Neural Japanese Kana-Kanji conversion for Linux input systems

Created 1 year ago
262 stars

Top 97.0% on SourcePulse

Project Summary

Summary

Karukan is a Japanese input method editor (IME) for Linux, built around a neural Kana-Kanji conversion engine and integrated with fcitx5. It targets Linux users who want context-aware Japanese text input, and it learns from the user's past conversion choices.

How It Works

The core karukan-engine library converts Romaji to Hiragana and then uses a GPT-2 based model, run via llama.cpp, for neural Kana-Kanji conversion. The model takes the surrounding text into account, which makes its suggestions context-aware. The system also learns from conversions: it remembers the user's selections, prioritizes them in future conversions, and offers predictive (prefix-matching) suggestions.
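
To make the Romaji-to-Hiragana step concrete, here is a minimal sketch of one common way to do it: greedy longest-match lookup against a romaji table. This is an illustration only and not karukan-engine's actual code; the table below is a tiny hypothetical subset, and the names ROMAJI_TABLE and romaji_to_hiragana are invented for this example.

```python
# Tiny illustrative table (an assumption for this sketch); a real romaji
# table has several hundred entries.
ROMAJI_TABLE = {
    "a": "あ", "i": "い", "u": "う", "e": "え", "o": "お",
    "ka": "か", "ki": "き", "ku": "く", "ke": "け", "ko": "こ",
    "na": "な", "ni": "に", "te": "て", "ha": "は", "ji": "じ",
    "kyo": "きょ",
}
MAX_KEY_LEN = max(len(key) for key in ROMAJI_TABLE)


def romaji_to_hiragana(text: str) -> str:
    """Convert romaji to hiragana with greedy longest-match lookup."""
    out = []
    i = 0
    while i < len(text):
        # A doubled consonant such as "tt" becomes a small tsu.
        if (i + 1 < len(text) and text[i] == text[i + 1]
                and text[i] not in "aiueon"):
            out.append("っ")
            i += 1
            continue
        # Try the longest table key first, then shorter ones.
        for length in range(min(MAX_KEY_LEN, len(text) - i), 0, -1):
            chunk = text[i:i + length]
            if chunk in ROMAJI_TABLE:
                out.append(ROMAJI_TABLE[chunk])
                i += length
                break
        else:
            # No key matched: a lone "n" is the moraic nasal; anything
            # else is passed through unchanged.
            out.append("ん" if text[i] == "n" else text[i])
            i += 1
    return "".join(out)


print(romaji_to_hiragana("kanji"))
print(romaji_to_hiragana("kitte"))
```

The Hiragana string produced by a step like this is what the neural model would then convert to mixed Kana-Kanji text.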

Quick Start & Requirements

Installation instructions are in the karukan-im README. The AI model is downloaded from Hugging Face on first launch, which may delay the first conversion. Later launches use the cached model.
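
The download-once, reuse-afterwards behavior can be sketched as follows. This is a generic pattern under stated assumptions, not karukan's implementation: the function name ensure_model, the cache layout, and the file name model.gguf are hypothetical, and the actual download is passed in as a callable so the sketch makes no claims about how karukan talks to Hugging Face.

```python
from pathlib import Path
from typing import Callable


def ensure_model(cache_dir: Path, filename: str,
                 fetch: Callable[[Path], None]) -> tuple[Path, bool]:
    """Return (model_path, downloaded). `fetch` runs only on a cache miss."""
    target = cache_dir / filename
    if target.exists():
        # Cache hit: later launches skip the download entirely.
        return target, False
    cache_dir.mkdir(parents=True, exist_ok=True)
    # Download to a temporary name first, so an interrupted download is
    # never mistaken for a complete model file.
    partial = target.with_name(target.name + ".part")
    fetch(partial)
    partial.replace(target)
    return target, True


if __name__ == "__main__":
    import tempfile

    def fake_fetch(destination: Path) -> None:
        # Stand-in for the real network download (hypothetical).
        destination.write_bytes(b"model weights")

    with tempfile.TemporaryDirectory() as tmp:
        cache = Path(tmp) / "models"
        print(ensure_model(cache, "model.gguf", fake_fetch)[1])  # first launch
        print(ensure_model(cache, "model.gguf", fake_fetch)[1])  # later launch
```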

Highlighted Details

  • Neural Conversion: Runs GPT-2 model inference through llama.cpp for Japanese Kana-Kanji conversion.
  • Context Awareness: Considers surrounding text to improve conversion accuracy.
  • User Learning: Learns user conversion preferences and supports predictive, prefix-matching suggestions.
  • System Dictionary: Built from SudachiDict data.
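
The user-learning behavior in the list above (remembering selections, prioritizing them, and suggesting by prefix) can be sketched like this. It is a minimal illustration under assumptions: the README excerpt does not describe how karukan stores or ranks learned conversions, so the LearningCache class, the most-recent-first ordering, and the rerank helper are all hypothetical.

```python
from collections import defaultdict


class LearningCache:
    """Remembers which surface form the user picked for each reading."""

    def __init__(self):
        # reading -> {surface: logical time of the most recent selection}
        self._history = defaultdict(dict)
        self._clock = 0

    def record(self, reading: str, surface: str) -> None:
        self._clock += 1
        self._history[reading][surface] = self._clock

    def exact(self, reading: str) -> list[str]:
        """Learned surfaces for this exact reading, most recent first."""
        entries = self._history.get(reading, {})
        return sorted(entries, key=entries.get, reverse=True)

    def predict(self, prefix: str) -> list[str]:
        """Learned surfaces whose reading starts with `prefix`."""
        hits = [(stamp, surface)
                for reading, entries in self._history.items()
                if reading.startswith(prefix)
                for surface, stamp in entries.items()]
        return [surface for _, surface in sorted(hits, reverse=True)]


def rerank(candidates: list[str], learned: list[str]) -> list[str]:
    """Move learned candidates to the front, keeping the rest in order."""
    front = [c for c in learned if c in candidates]
    return front + [c for c in candidates if c not in front]


cache = LearningCache()
cache.record("きしゃ", "記者")
cache.record("きしゃ", "汽車")
print(rerank(["貴社", "記者", "汽車"], cache.exact("きしゃ")))
print(cache.predict("き"))
```

Here rerank stands in for whatever step merges learned choices with the candidates produced by the model and the system dictionary.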

Maintenance & Community

No specific details regarding maintainers, community channels (like Discord/Slack), or roadmap were provided in the README excerpt.

Licensing & Compatibility

The project is offered under a dual license: MIT OR Apache-2.0. These permissive licenses generally allow for broad compatibility, including commercial use and linking within closed-source projects.

Limitations & Caveats

The first launch requires downloading the AI model, which may take a while. Installation instructions live in a separate README (karukan-im), which adds an extra step for users.

Health Check

  • Last Commit: 4 weeks ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 2
  • Issues (30d): 1
  • Star History: 36 stars in the last 30 days
