fancy-nlp  by boat-group

NLP toolkit for rapid prototyping and deployment

created 6 years ago
284 stars

Top 93.1% on sourcepulse

Project Summary

Fancy-NLP is a Python toolkit for Natural Language Processing (NLP) tasks, aimed primarily at Chinese text. It lets users apply pre-trained models directly or train their own for applications such as entity extraction, text classification, and sentence similarity matching. It targets both novice and advanced users working on business scenarios.

How It Works

Fancy-NLP provides a high-level, application-oriented interface that hides preprocessing and model deployment steps. It is built on TensorFlow 2.x, supports several model architectures (e.g., BiLSTM-CNN, Siamese CNN), and integrates with BERT models for improved performance. The toolkit emphasizes ease of use: it installs with a single command, and pre-trained models can be applied directly to common NLP tasks.

Quick Start & Requirements

  • Install via pip: pip install fancy-nlp
  • Requires Python 3.6+ and TensorFlow 2.x.
  • Pre-trained models are downloaded on first run.
  • Detailed tutorials and example code are available for custom model training and BERT integration.
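
The two stated requirements (Python 3.6+ and TensorFlow 2.x) can be checked before installing. The following is a minimal sketch, not part of Fancy-NLP: the helper name `check_requirements` is hypothetical, it assumes TensorFlow is installed under the distribution name `tensorflow` (variants such as `tensorflow-cpu` would not be detected), and it relies on `importlib.metadata`, which is only in the standard library from Python 3.8 onward.

```python
import sys
from importlib import metadata


def check_requirements(python_version=None, tf_version=None):
    """Return a list of problems; an empty list means both requirements are met.

    Both arguments are optional overrides (useful for testing). When omitted,
    the running interpreter and the installed `tensorflow` package are inspected.
    """
    problems = []

    # Requirement 1: Python 3.6 or newer.
    if python_version is None:
        python_version = sys.version_info[:2]
    if tuple(python_version) < (3, 6):
        problems.append("Python 3.6+ is required")

    # Requirement 2: TensorFlow with major version 2.
    if tf_version is None:
        try:
            tf_version = metadata.version("tensorflow")
        except metadata.PackageNotFoundError:
            tf_version = ""
    if not tf_version:
        problems.append("TensorFlow is not installed")
    elif tf_version.split(".")[0] != "2":
        problems.append("TensorFlow 2.x is required")

    return problems
```

Calling `check_requirements()` with no arguments inspects the current environment; any returned strings describe what needs fixing before running `pip install fancy-nlp`.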

Highlighted Details

  • Supports Named Entity Recognition (NER), Text Classification, and Sentence Pair Matching (SPM).
  • Offers seamless integration with pre-trained BERT models for fine-tuning or feature extraction.
  • Provides utilities for data loading and model saving/loading.
  • Achieved top rankings and a "Technical Innovation Award" in the CCKS 2019 competition for Chinese short text entity linking.

Maintenance & Community

The project originated from a Tencent advertising research initiative and involves contributors from Tencent and Tongji University. It received an award in the 2019 Tencent AI Code Culture Festival. Contribution guidelines follow PEP8 and Conventional Commits.

Licensing & Compatibility

The repository does not explicitly state a license in the README. Compatibility for commercial use or closed-source linking is not specified.

Limitations & Caveats

The absence of a stated license may hinder commercial adoption. BERT integration is supported, but the README notes that BERT models can only be combined with character vectors, not word vectors, and that learning rates need careful configuration when fine-tuning.
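
To make the character-versus-word distinction concrete, the sketch below splits a Chinese sentence into single-character tokens, which is the granularity character vectors are indexed by. It is an illustration only and does not reproduce Fancy-NLP's own preprocessing: the function `char_tokens` is hypothetical, and the word-level list is segmented by hand for contrast.

```python
def char_tokens(text):
    """Split text into single-character tokens, skipping whitespace."""
    return [ch for ch in text if not ch.isspace()]


sentence = "同济大学位于上海"

# Character-level view: one token per character.
characters = char_tokens(sentence)

# Word-level view, segmented by hand for comparison. Producing this
# automatically would need a separate word segmenter.
words_by_hand = ["同济大学", "位于", "上海"]

# Both views cover exactly the same text, at different granularities.
assert "".join(characters) == "".join(words_by_hand) == sentence
```

The character-level sequence has eight tokens, while the hand-segmented word-level sequence has three, so per-token embeddings from one view do not line up with the other.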

Health Check

  • Last commit: 2 years ago
  • Responsiveness: Inactive
  • Pull requests (30d): 0
  • Issues (30d): 0
  • Star history: 0 stars in the last 90 days
