Pytorch-NLU by yongzhuo

Pytorch toolkit for text classification, sequence labeling, and text summarization

Created 4 years ago
349 stars

Top 79.6% on SourcePulse

Project Summary

This toolkit provides a minimalist, PyTorch-based solution for Chinese Natural Language Understanding tasks, primarily text classification and sequence labeling, with extractive text summarization also supported. It supports more than ten pre-trained models and several loss functions, which makes it suitable for researchers and developers who work with Chinese NLP data and need a flexible, well-annotated codebase.

How It Works

The library builds on PyTorch and Hugging Face's transformers library to support models such as BERT, ERNIE, and RoBERTa. It offers several loss functions, including BCE, Focal Loss, Circle Loss, and Label Smoothing, so users can choose the loss that best fits a given task. The architecture is designed for simplicity, clarity, and ease of extension.
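
To make the difference between these losses concrete, here is a minimal, framework-free sketch of binary cross-entropy, binary focal loss, and label smoothing. It is not the toolkit's own implementation: the function names, the defaults gamma=2.0 and alpha=0.25, and the smoothing factor epsilon=0.1 are assumptions chosen for illustration.

```python
import math


def _clamp(p: float, eps: float = 1e-12) -> float:
    """Keep a probability away from 0 so that log() stays finite."""
    return min(max(p, eps), 1.0)


def binary_cross_entropy(p: float, y: int) -> float:
    """BCE for one prediction p = P(y = 1) and a label y in {0, 1}."""
    p_t = p if y == 1 else 1.0 - p
    return -math.log(_clamp(p_t))


def focal_loss(p: float, y: int, gamma: float = 2.0, alpha: float = 0.25) -> float:
    """Binary focal loss: BCE scaled by alpha_t * (1 - p_t) ** gamma.

    The (1 - p_t) ** gamma factor shrinks the loss of examples the model
    already classifies confidently, so hard examples dominate the total.
    """
    p_t = p if y == 1 else 1.0 - p
    alpha_t = alpha if y == 1 else 1.0 - alpha
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(_clamp(p_t))


def smooth_labels(one_hot: list[float], epsilon: float = 0.1) -> list[float]:
    """Label smoothing: mix the one-hot target with a uniform distribution."""
    k = len(one_hot)
    return [(1.0 - epsilon) * v + epsilon / k for v in one_hot]


if __name__ == "__main__":
    for p in (0.6, 0.9, 0.99):
        print(p, binary_cross_entropy(p, 1), focal_loss(p, 1))
    print(smooth_labels([0.0, 1.0, 0.0]))
```

With gamma set to 0 and alpha set to 0.5, the focal loss reduces to half the ordinary BCE, which is a quick sanity check when comparing configurations.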

Quick Start & Requirements

  • Install via pip: pip install Pytorch-NLU (or, using the Tsinghua PyPI mirror: pip install -i https://pypi.tuna.tsinghua.edu.cn/simple Pytorch-NLU)
  • Requires PyTorch, transformers, numpy, and tensorboardX.
  • Pre-trained model weights must be downloaded separately or referenced through a local path in the configuration.

Highlighted Details

  • Supports 10+ pre-trained models including BERT, ERNIE, RoBERTa, ALBERT, XLNet, ELECTRA, GPT-2, TinyBERT, XLM, T5.
  • Implements 6+ loss functions such as BCE, Focal Loss, Circle Loss, Prior Loss, Dice Loss, and Label Smoothing.
  • Offers functionalities for multi-class, multi-label classification, Named Entity Recognition (NER), Part-of-Speech (POS) tagging, word segmentation, and extractive text summarization.
  • Provides extensive datasets for text classification and sequence labeling tasks.
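
The multi-class and multi-label settings listed above differ mainly in how raw model scores are turned into predictions. The sketch below illustrates that difference in plain Python; it does not use the toolkit's API, and the label names, logits, and the 0.5 threshold are made up for the example.

```python
import math


def softmax(logits: list[float]) -> list[float]:
    """Turn logits into probabilities that sum to 1 (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(x: float) -> float:
    """Independent probability for a single label."""
    return 1.0 / (1.0 + math.exp(-x))


def predict_multiclass(logits: list[float], labels: list[str]) -> str:
    """Multi-class: exactly one label, the one with the highest probability."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return labels[best]


def predict_multilabel(logits: list[float], labels: list[str],
                       threshold: float = 0.5) -> list[str]:
    """Multi-label: every label whose own sigmoid score passes the threshold."""
    return [lab for x, lab in zip(logits, labels) if sigmoid(x) >= threshold]


if __name__ == "__main__":
    labels = ["sports", "finance", "tech"]   # hypothetical label set
    logits = [2.0, 0.5, -1.0]                # hypothetical model outputs
    print(predict_multiclass(logits, labels))
    print(predict_multilabel(logits, labels))
```

This is also why the loss differs between the two settings: softmax outputs pair naturally with cross-entropy over classes, while independent sigmoid outputs pair with a per-label loss such as BCE.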

Maintenance & Community

The project is maintained by Yongzhuo Mo. Further community engagement details are not explicitly provided in the README.

Licensing & Compatibility

The repository does not explicitly state a license. Users should verify licensing for commercial use or integration into closed-source projects.

Limitations & Caveats

The absence of a stated license could be a barrier for commercial adoption. Some example configurations point to local Windows paths (D:/pretrain_models/pytorch), so users on Linux or macOS will likely need to edit these paths before the examples run.
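
One way to avoid hard-coded, platform-specific paths is to resolve the model directory at run time. The following is a minimal sketch, not part of the toolkit: the helper name resolve_model_dir, the environment variable PRETRAINED_MODEL_DIR, and the fallback directory under the user's home are all hypothetical.

```python
import os
from pathlib import Path
from typing import Optional, Union


def resolve_model_dir(model_name: str,
                      env_var: str = "PRETRAINED_MODEL_DIR",
                      default_root: Optional[Union[str, Path]] = None) -> Path:
    """Build a path to a pre-trained model directory without a fixed drive letter.

    Lookup order: the environment variable (hypothetical name), then
    default_root if given, then ~/pretrain_models as a last resort.
    """
    root = os.environ.get(env_var)
    if root is None:
        root = default_root if default_root is not None else Path.home() / "pretrain_models"
    return Path(root).expanduser() / model_name


if __name__ == "__main__":
    # The same call works on Windows, Linux, and macOS.
    print(resolve_model_dir("bert-base-chinese"))
```

The resulting path can then be passed wherever a configuration expects a local model directory, in place of a literal such as D:/pretrain_models/pytorch.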

Health Check

  • Last commit: 1 year ago
  • Responsiveness: Inactive
  • Pull requests (30d): 0
  • Issues (30d): 0
  • Star history: 1 star in the last 30 days
