PyTorch implementation of OpenAI's Transformer LM
This repository provides a PyTorch implementation of OpenAI's finetuned transformer language model, enabling users to leverage pre-trained weights for language understanding tasks. It's designed for researchers and practitioners familiar with transformer architectures and PyTorch, offering a direct translation of OpenAI's TensorFlow code for easier adoption and experimentation.
How It Works
The implementation closely mirrors OpenAI's original TensorFlow code, including a modified Adam optimizer with fixed weight decay and scheduled learning rates. It provides `TransformerModel` and `LMHead` classes for language modeling, and `ClfHead` for classification tasks, allowing users to add decoders or classifiers on top of the transformer's hidden states, as sketched below.
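A minimal sketch of composing these classes (class names come from `model_pytorch.py` and `opt.py`; the constructor arguments, default config, and input shape are assumptions and should be checked against the repository):

```python
import torch
from model_pytorch import TransformerModel, LMHead, DEFAULT_CONFIG
from opt import OpenAIAdam

cfg = DEFAULT_CONFIG                      # assumed default hyperparameter set
model = TransformerModel(cfg, vocab=40990, n_ctx=512)
lm_head = LMHead(model, cfg)              # language-modeling decoder on top of the hidden states

# Assumed input layout: LongTensor of shape (batch, n_ctx, 2) holding
# token ids and position ids, as prepared by the training script.
x = torch.zeros(4, 512, 2, dtype=torch.long)
hidden_states = model(x)                  # transformer hidden states
lm_logits = lm_head(hidden_states)        # next-token logits for the LM task

# The modified Adam optimizer with fixed weight decay and a scheduled
# learning rate (argument names assumed from opt.py).
optimizer = OpenAIAdam(model.parameters(), lr=6.25e-5,
                       schedule='warmup_linear', warmup=0.002, t_total=1000)
```

For classification, `ClfHead` can be attached to the same hidden states in place of (or alongside) `LMHead`.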
Quick Start & Requirements
- Install PyTorch (version >= 0.4): `pip install torch`
- Additional dependencies: `tqdm`, `sklearn`, `spacy`, `ftfy`, `pandas`
- Use `encode_dataset()` from `utils.py` to encode your own datasets (see the sketch below).
- See `__main__` in `train.py` for an end-to-end fine-tuning run.
- To fine-tune on ROCStories: `python train.py --dataset rocstories --desc rocstories --submit --analysis --data_dir [path]`
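A rough sketch of the data preparation step (file and function names from `text_utils.py`, `utils.py`, and `datasets.py`; the paths and exact signatures are assumptions, so check `train.py` for the reference usage):

```python
from text_utils import TextEncoder
from utils import encode_dataset
from datasets import rocstories

# Assumed paths to OpenAI's released BPE vocabulary files.
text_encoder = TextEncoder('model/encoder_bpe_40000.json', 'model/vocab_40000.bpe')

# rocstories() loads the (train, validation, test) splits from the data directory;
# encode_dataset() BPE-encodes every text field of each split.
splits = rocstories('data/')              # 'data/' is a placeholder for --data_dir
encoded_splits = encode_dataset(*splits, encoder=text_encoder)
```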
Maintenance & Community
No specific community links or active maintenance signals are present in the README.
Licensing & Compatibility
The README does not explicitly state a license. Compatibility for commercial use or closed-source linking is not specified.
Limitations & Caveats
The implementation targets single-GPU training, which limits batch sizes and may reduce accuracy compared to multi-GPU setups. The README does not mention support for newer PyTorch versions or hardware accelerators beyond GPUs.