bert_seq2seq by 920232796

PyTorch toolkit for sequence-to-sequence and other NLP tasks

Created 5 years ago

1,304 stars

Top 30.5% on SourcePulse

Project Summary

This repository provides a lightweight PyTorch framework for various Natural Language Processing (NLP) tasks, leveraging BERT and similar models. It targets researchers and developers needing a flexible tool for sequence-to-sequence generation (e.g., poetry, summarization), text classification, and sequence labeling (e.g., NER, POS tagging), with support for multiple pre-trained models like BERT, RoBERTa, GPT2, T5, and BART.

How It Works

The framework handles different NLP tasks through a single interface: a pre-trained backbone model is loaded and a task-specific head is attached on top of it. Users switch between pre-trained models by setting model_name, and select the task via the model_class parameter, with values such as seq2seq, cls_classifier, sequence_labeling, and sequence_labeling_crf. This modular design simplifies experimentation across models and tasks.
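The two-switch design described above can be sketched in plain Python. This is an illustration only, not the library's code: build_spec, ModelSpec, the lowercase backbone spellings, and the head descriptions are all hypothetical; only the model_name and model_class parameter names and the four task values come from the summary.

```python
from dataclasses import dataclass

# Task values as listed in the summary; the head descriptions are
# illustrative assumptions, not taken from the library.
TASK_HEADS = {
    "seq2seq": "language-model head over the vocabulary",
    "cls_classifier": "linear classifier on a pooled representation",
    "sequence_labeling": "per-token linear classifier",
    "sequence_labeling_crf": "per-token linear classifier plus CRF layer",
}

# Assumed lowercase spellings of the backbones named in the summary.
BACKBONES = {"bert", "roberta", "gpt2", "t5", "bart", "nezha"}


@dataclass(frozen=True)
class ModelSpec:
    """Hypothetical record of which backbone and head were selected."""
    model_name: str
    model_class: str
    head: str


def build_spec(model_name: str, model_class: str) -> ModelSpec:
    """Validate the two switches and describe the resulting model."""
    name = model_name.lower()
    if name not in BACKBONES:
        raise ValueError(f"unknown model_name: {model_name!r}")
    if model_class not in TASK_HEADS:
        raise ValueError(f"unknown model_class: {model_class!r}")
    return ModelSpec(name, model_class, TASK_HEADS[model_class])
```

Changing the task then means changing one argument, for example build_spec("roberta", "seq2seq") versus build_spec("roberta", "sequence_labeling_crf"), while the backbone stays the same.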

Quick Start & Requirements

Install via pip: pip install bert-seq2seq tqdm
Requires PyTorch.
Pre-trained model weights need to be downloaded separately from provided links (e.g., Hugging Face, Baidu Pan).
Official examples demonstrate usage for specific tasks.
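Because the weights are a separate manual download, a small preflight check before training can catch a missing package or a wrong path early. The sketch below is an assumption-laden helper, not part of the project: preflight is a hypothetical name, and the import name bert_seq2seq is assumed from the pip package name.

```python
import importlib.util
from pathlib import Path
from typing import List, Sequence


def preflight(
    weights_path: str,
    modules: Sequence[str] = ("torch", "bert_seq2seq", "tqdm"),
) -> List[str]:
    """Return a list of human-readable problems; an empty list means ready."""
    problems = []
    for name in modules:
        # find_spec returns None when a top-level module cannot be located.
        if importlib.util.find_spec(name) is None:
            problems.append(f"missing package: {name}")
    if not Path(weights_path).is_file():
        problems.append(f"weights not found: {weights_path}")
    return problems
```

Calling preflight with the path to the downloaded weights file and printing the returned list gives a quick summary of what still needs to be installed or downloaded.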

Highlighted Details

Supports a wide range of NLP tasks including poetry generation, couplet generation, automatic summarization, text classification, sentiment analysis, NER, POS tagging, and relation extraction.
Integrates with popular pre-trained models like BERT, RoBERTa, GPT2, T5, BART, and Huawei's Nezha.
Offers task-specific implementations such as sequence labeling with a CRF loss, which scores transitions between adjacent tags rather than each token independently.
Includes SimBERT examples for sentence-similarity tasks.
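To make the CRF loss mentioned in the list above concrete, here is a minimal sketch of the negative log-likelihood of a linear-chain CRF in pure Python. It is a generic textbook formulation, not the project's implementation: crf_nll and logsumexp are hypothetical names, start and end transitions are omitted for brevity, and a real training loop would use batched PyTorch tensors instead of lists.

```python
import math
from typing import List, Sequence


def logsumexp(values: Sequence[float]) -> float:
    """Numerically stable log(sum(exp(v))) for a non-empty sequence."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def crf_nll(
    emissions: List[List[float]],    # emissions[t][j]: score of tag j at step t
    transitions: List[List[float]],  # transitions[i][j]: score of tag i -> tag j
    tags: List[int],                 # gold tag index for each step
) -> float:
    """Negative log-likelihood of the gold tag path under a linear-chain CRF."""
    n_tags = len(transitions)

    # Unnormalized score of the gold path.
    gold = emissions[0][tags[0]]
    for t in range(1, len(tags)):
        gold += transitions[tags[t - 1]][tags[t]] + emissions[t][tags[t]]

    # Forward algorithm: alpha[j] is the log-sum of scores of all
    # partial paths that end in tag j at the current step.
    alpha = list(emissions[0])
    for t in range(1, len(emissions)):
        alpha = [
            logsumexp([alpha[i] + transitions[i][j] for i in range(n_tags)])
            + emissions[t][j]
            for j in range(n_tags)
        ]

    # log Z minus the gold score.
    return logsumexp(alpha) - gold
```

The loss is the log of the total score over all possible tag paths minus the score of the gold path, so minimizing it raises the gold path's probability relative to every alternative labeling of the sentence.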

Maintenance & Community

The changelog records frequent updates up to its last mentioned entry (Nov 12, 2021); no later development activity is noted.
QQ group available for community discussion and support (975907202).
Links to the author's personal blog for detailed explanations of tasks and code.

Licensing & Compatibility

The README does not explicitly state a license. Code snippets reference Hugging Face Transformers and bert4keras, which are permissively licensed, but without a clear license file for this project, commercial use requires caution.

Limitations & Caveats

The project's last recorded update was in late 2021, so it may no longer be maintained and may not support newer models or techniques.
Pre-trained model weights must be manually downloaded and configured, adding an extra setup step.
Some specific features like rhyme enforcement in poetry generation were noted as temporarily unsupported in past updates.

Health Check

Last Commit: 3 years ago
Responsiveness: 1 day
Pull Requests (30d): 0
Issues (30d): 0

Star History

2 stars in the last 30 days
