HanLP by hankcs

Multilingual NLP library for research/industry, built on PyTorch and TensorFlow

Created 11 years ago
35,642 stars

Top 0.9% on SourcePulse

Project Summary

HanLP is a multilingual NLP library for researchers and enterprises. It applies deep learning to tasks such as tokenization, POS tagging, NER, and dependency parsing across 130 languages. It provides a lightweight RESTful API for agile development and a native Python API for deeper integration, and it aims to deliver state-of-the-art accuracy while staying efficient and easy to use.

How It Works

HanLP leverages PyTorch and TensorFlow 2.x, building upon open-access corpora like Universal Dependencies and OntoNotes. It supports multi-task learning (MTL) for joint task performance and offers mono-lingual models that often outperform multilingual ones for specific languages. The library emphasizes reproducibility, guaranteeing that reported scores can be replicated.
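As a rough illustration of the joint-model idea, the sketch below asks a multi-task pipeline for only a subset of its tasks. It assumes that a loaded HanLP MTL model is a callable accepting a `tasks` keyword; `run_selected_tasks` and `EchoModel` are hypothetical names introduced for this example, and the stub replaces a real model so the snippet runs without downloading any weights.

```python
def run_selected_tasks(model, sentences, tasks):
    """Call a joint (multi-task) pipeline, requesting only the listed tasks.

    `model` is any callable that accepts `tasks=...`. This mirrors how a
    loaded HanLP MTL model is assumed to be invoked (an assumption, not a
    confirmed signature): model(sentences, tasks="tok").
    """
    if isinstance(tasks, str):
        tasks = [tasks]  # accept a single task name as well as a list
    return model(sentences, tasks=list(tasks))


class EchoModel:
    """Hypothetical stand-in for a loaded model: one empty slot per sentence, per task."""

    def __call__(self, sentences, tasks=None):
        return {task: [[] for _ in sentences] for task in (tasks or [])}


result = run_selected_tasks(EchoModel(), ["HanLP is a toolkit."], "tok")
print(sorted(result))
```

With a real installation, `model` would be the object returned by the library's model loader rather than the stub; requesting fewer tasks is one way to trade coverage for speed.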

Quick Start & Requirements

  • RESTful API: pip install hanlp_restful
  • Native Python API: pip install hanlp (requires Python 3.6+)
  • Hardware: GPU/TPU acceleration recommended but not mandatory.
  • Documentation: docs
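The two installation routes above might be used roughly as follows. This is a minimal sketch, not the project's official quick start. It assumes the `HanLPClient` class from `hanlp_restful`, the public endpoint `https://www.hanlp.com/api` with anonymous access, the pretrained identifier `CLOSE_TOK_POS_NER_SRL_DEP_SDP_CON_ELECTRA_SMALL_ZH`, and the output keys `tok/fine` and `pos/ctb`; check each of these against the documentation. The function names are hypothetical. Imports are done lazily so the file loads even when neither package is installed.

```python
def parse_with_restful(text, language="zh"):
    """Parse text through the hosted RESTful service (needs network access)."""
    from hanlp_restful import HanLPClient  # pip install hanlp_restful

    # auth=None is assumed to mean anonymous, rate-limited access.
    client = HanLPClient("https://www.hanlp.com/api", auth=None, language=language)
    return client.parse(text)


def parse_natively(sentences):
    """Parse a list of sentences with a locally loaded joint model."""
    import hanlp  # pip install hanlp

    # The first call downloads the model; the identifier is an assumption.
    model = hanlp.load(
        hanlp.pretrained.mtl.CLOSE_TOK_POS_NER_SRL_DEP_SDP_CON_ELECTRA_SMALL_ZH
    )
    return model(sentences)


def pair_tokens_with_tags(doc, tok_key="tok/fine", pos_key="pos/ctb"):
    """Zip tokens with POS tags, sentence by sentence, in a dict-like result.

    Expects batched output: one token list and one tag list per sentence.
    """
    return [list(zip(toks, tags)) for toks, tags in zip(doc[tok_key], doc[pos_key])]


# Offline demonstration on a hand-written result of the assumed shape.
sample = {"tok/fine": [["我", "爱", "NLP"]], "pos/ctb": [["PN", "VV", "NN"]]}
pairs = pair_tokens_with_tags(sample)
print(pairs)
```

The helper only relies on the result behaving like a dictionary of per-sentence lists, so it should work the same way on output from either route, provided the key names match the model in use.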

Highlighted Details

  • Supports 10 joint NLP tasks across 130 languages.
  • Offers both multilingual models and mono-lingual models; the mono-lingual ones are often more accurate for their language.
  • Guarantees reproducible performance scores.
  • Includes functionality to train custom models.

Maintenance & Community

  • Active development with a focus on reproducibility.
  • Community forum available.

Licensing & Compatibility

  • Library licensed under Apache License 2.0, allowing commercial use.
  • Models are licensed under CC BY-NC-SA 4.0, restricting commercial use.

Limitations & Caveats

Multi-task learning models may underperform single-task models, and mono-lingual models generally outperform multilingual ones. Users targeting high accuracy should prioritize single-task mono-lingual models.

Health Check

  • Last commit: 1 week ago
  • Responsiveness: 1 day
  • Pull requests (30d): 1
  • Issues (30d): 1
  • Star history: 156 stars in the last 30 days
