LlmKira
Fast language detection powered by FastText
Top 99.1% on SourcePulse
fast-langdetect is a fast, accurate language detection library built on Facebook's FastText. It targets developers who need efficient language identification; its speed and offline operation suit high-throughput applications and resource-constrained environments.
How It Works
The library uses pre-trained FastText models and achieves up to 95% accuracy. It provides a memory-friendly 'lite' model for offline use (~45-60 MB RSS) and a more accurate 'full' model (~170-210 MB RSS). An 'auto' mode tries the full model first and falls back to 'lite' if loading it raises MemoryError.
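The 'auto' fallback described above can be sketched as follows. This is a minimal illustration of the pattern, not the library's actual code: `load_with_fallback` and the two loader callables are hypothetical names introduced here, and the only behavior taken from the summary is that MemoryError alone triggers the switch to the lite model.

```python
from typing import Callable, Tuple, TypeVar

M = TypeVar("M")


def load_with_fallback(
    load_full: Callable[[], M],
    load_lite: Callable[[], M],
) -> Tuple[str, M]:
    """Hypothetical helper: try the full model, fall back to lite on MemoryError.

    Returns a (mode, model) pair so the caller can tell which model was loaded.
    """
    try:
        return "full", load_full()
    except MemoryError:
        # Only memory exhaustion triggers the fallback. Any other exception
        # (missing file, corrupt model, ...) propagates to the caller.
        return "lite", load_lite()
```

Returning the mode alongside the model is a choice made for this sketch; it makes the fallback visible to callers that want to log or report which model is in use.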
Quick Start & Requirements
pip install fast-langdetect
Cache directory: FTLANG_CACHE or LangDetectConfig(cache_dir=...).
Highlighted Details
langcodes/pycountry).
Maintenance & Community
Builds upon zafercavdar/fasttext-langdetect with packaging enhancements. Mentions contributions from @dalf and github@JackyHe398. No specific community channels or active maintenance signals are detailed.
Licensing & Compatibility
Limitations & Caveats
Accuracy may decrease for very short or excessively long inputs (default max_input_length is 80 characters; truncation logs a warning). The 'auto' mode fallback is triggered only by MemoryError; other errors propagate. User-provided cache directories must already exist.
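Two of the caveats above, input truncation and the pre-existing cache directory, can be illustrated with a short sketch. Both helpers (`truncate_input`, `require_existing_dir`), the logger name, and the choice of FileNotFoundError are assumptions made for illustration; only the 80-character default and the warning on truncation come from the summary.

```python
import logging
import os
from typing import Optional

logger = logging.getLogger("langdetect_sketch")  # hypothetical logger name


def truncate_input(text: str, max_input_length: Optional[int] = 80) -> str:
    """Cut text to max_input_length characters and log a warning when it is cut."""
    if max_input_length is not None and len(text) > max_input_length:
        logger.warning(
            "Input of %d characters truncated to %d", len(text), max_input_length
        )
        return text[:max_input_length]
    return text


def require_existing_dir(cache_dir: str) -> str:
    """Return cache_dir unchanged, or raise if it does not exist yet.

    The exception type is an assumption; the summary only says the
    directory must exist beforehand.
    """
    if not os.path.isdir(cache_dir):
        raise FileNotFoundError(f"cache directory does not exist: {cache_dir}")
    return cache_dir
```

A caller that wants a custom cache location would create the directory first (for example with `os.makedirs(path, exist_ok=True)`) before passing it on.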