Multilingual NLP library for research/industry, built on PyTorch and TensorFlow
Top 0.9% on sourcepulse
HanLP is a comprehensive multilingual NLP library designed for researchers and enterprises, offering advanced deep learning techniques for tasks like tokenization, POS tagging, NER, and dependency parsing across 130 languages. It provides both lightweight RESTful APIs for agile development and native Python APIs for deeper integration, aiming to deliver state-of-the-art performance efficiently and with ease of use.
How It Works
HanLP leverages PyTorch and TensorFlow 2.x, building upon open-access corpora like Universal Dependencies and OntoNotes. It supports multi-task learning (MTL) for joint task performance and offers mono-lingual models that often outperform multilingual ones for specific languages. The library emphasizes reproducibility, guaranteeing that reported scores can be replicated.
Quick Start & Requirements
pip install hanlp_restful
pip install hanlp
(Requires Python 3.6+)Highlighted Details
Maintenance & Community
Licensing & Compatibility
Limitations & Caveats
Multi-task learning models may underperform single-task models, and mono-lingual models generally outperform multilingual ones. Users targeting high accuracy should prioritize single-task mono-lingual models.
2 months ago
Inactive