pytorch-transformers-classification by ThilinaRajapakse

Deprecated starter for Transformer-based text classification tasks

Created 6 years ago

312 stars

Top 86.5% on SourcePulse

Project Summary

This repository provides a starting point for text classification tasks using HuggingFace's PyTorch-Transformers library, targeting researchers and developers needing to quickly implement BERT, XLNet, RoBERTa, and XLM models. It offers code for training and evaluation, simplifying the process of applying these advanced NLP models.

How It Works

The project uses the PyTorch-Transformers library to fine-tune several transformer architectures for text classification. It provides pre-written notebooks and scripts that handle data preparation, model training, and evaluation, abstracting away many of the low-level implementation details of the HuggingFace library.
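The data preparation step can be pictured as turning labeled text into a tab-separated file with one example per line. The following is only a minimal sketch using the Python standard library. The three-column layout (id, label, text), the `to_tsv` helper, and the sample reviews are assumptions made for illustration; the repository's README defines the format it actually expects.

```python
import csv
import io

# Made-up examples, purely for illustration: (label, text) pairs.
examples = [
    (1, "Great food and friendly staff."),
    (0, "Waited an hour\tand the order was wrong.\nNever again."),
]


def to_tsv(rows):
    """Serialize (label, text) pairs as headerless TSV lines: id, label, text.

    The column layout is a hypothetical one chosen for this sketch.
    """
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    for index, (label, text) in enumerate(rows):
        # Collapse tabs and newlines so each example stays on a single line.
        clean_text = " ".join(text.split())
        writer.writerow([index, label, clean_text])
    return buffer.getvalue()


tsv_text = to_tsv(examples)
print(tsv_text, end="")

# Read the data back to confirm that every row has the same three columns.
parsed = list(csv.reader(io.StringIO(tsv_text), delimiter="\t"))
```

Normalizing whitespace before writing matters because a stray tab or newline inside a review would otherwise split one example across several columns or lines.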

Quick Start & Requirements

Install (run in order):
  conda create -n transformers python pandas tqdm jupyter
  conda activate transformers
  conda install pytorch cudatoolkit=10.0 -c pytorch   (or install the CPU version of PyTorch instead)
  conda install scipy scikit-learn
  pip install pytorch-transformers tensorboardX
Prerequisites: Python, Conda, PyTorch (with CUDA 10.0 recommended), pandas, tqdm, jupyter, scipy, scikit-learn, tensorboardX.
Setup: Requires cloning the repository and potentially downloading datasets (e.g., Yelp Reviews).
Links: Google Colab Notebook (Note: This link points to Simple Transformers, the recommended successor).

Highlighted Details

Supports multiple transformer architectures including BERT, XLNet, RoBERTa, and XLM.
Includes a demo using the Yelp Reviews dataset with data preparation and model training notebooks.
Provides a clear table of supported pretrained models with their specifications.
Outlines the required TSV format for custom datasets and mentions evaluation metrics such as the confusion matrix and the Matthews correlation coefficient.
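For readers unfamiliar with the two metrics named above, here is a minimal standard-library sketch of how they relate for a binary task. The Matthews correlation coefficient is computed from the four confusion-matrix cells as (TP·TN − FP·FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN)). The helper names and the sample labels are hypothetical and are not taken from the repository.

```python
import math
from collections import Counter


def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, fp, tn, fn) for binary labels."""
    counts = Counter()
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            counts["tp" if truth == positive else "fp"] += 1
        else:
            counts["fn" if truth == positive else "tn"] += 1
    return counts["tp"], counts["fp"], counts["tn"], counts["fn"]


def mcc_from_counts(tp, fp, tn, fn):
    """Matthews correlation coefficient from confusion-matrix cells."""
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, MCC is reported as 0 when any marginal total is zero.
    return (tp * tn - fp * fn) / denominator if denominator else 0.0


# Made-up labels, purely for illustration.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 0, 1, 1, 0]

tp, fp, tn, fn = confusion_counts(y_true, y_pred)
mcc = mcc_from_counts(tp, fp, tn, fn)
print(f"tp={tp} fp={fp} tn={tn} fn={fn} mcc={mcc:.3f}")
# prints: tp=3 fp=1 tn=3 fn=1 mcc=0.500
```

In practice, scikit-learn (listed among the prerequisites) provides `sklearn.metrics.confusion_matrix` and `sklearn.metrics.matthews_corrcoef`, which also handle the multiclass case; the hand-rolled version here only shows what the numbers mean.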

Maintenance & Community

This repository is deprecated and will not be updated. The author recommends using simpletransformers, a successor library that is actively maintained and easier to use.

Licensing & Compatibility

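The repository does not explicitly state a license. It is built upon the HuggingFace pytorch-transformers library, which is released under the Apache 2.0 license, but that license covers the library rather than this repository's own code. Without a stated license for the repository itself, its suitability for commercial or closed-source projects is unclear and cannot be inferred from the underlying library's license alone.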

Limitations & Caveats

The project is deprecated and may not be compatible with current versions of the HuggingFace Transformers library. Users are strongly advised to migrate to the simpletransformers library for ongoing support and features.
