unlikelihood_training by facebookresearch

PyTorch code for neural text generation research

created 5 years ago
309 stars

Top 88.0% on sourcepulse

Project Summary

This repository provides a PyTorch implementation of unlikelihood training for neural text generation, as described in the paper "Neural Text Generation with Unlikelihood Training." It offers tools for researchers and practitioners to fine-tune language models to avoid generating undesirable tokens or sequences, leading to more controlled and higher-quality text outputs.

How It Works

The core of the implementation is a custom fairseq module that integrates unlikelihood training objectives. This approach modifies the standard maximum likelihood estimation (MLE) training by adding a penalty term that discourages the model from assigning high probability to specific "unlikely" tokens or sequences. This allows for fine-grained control over generation, such as reducing repetition or avoiding specific phrases.
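
For intuition, below is a minimal PyTorch sketch of the token-level objective. It assumes, as in the paper, that tokens already seen in the prefix serve as negative candidates; the function name, the alpha weight, and the padding index are illustrative and do not mirror the repository's fairseq criterion.

    # Minimal sketch of token-level unlikelihood (illustrative, not the repo's
    # fairseq module): MLE loss plus a penalty on tokens already seen in the
    # prefix, so the model learns not to repeat them.
    import torch
    import torch.nn.functional as F

    def token_unlikelihood_loss(logits, targets, alpha=1.0, pad_idx=1):
        # logits: (batch, seq, vocab); targets: (batch, seq)
        lprobs = F.log_softmax(logits, dim=-1)

        # Standard MLE term: negative log-likelihood of the target tokens.
        mle_loss = F.nll_loss(
            lprobs.view(-1, lprobs.size(-1)), targets.view(-1),
            ignore_index=pad_idx, reduction="sum",
        )

        # Negative candidates: for each position t, mark every token that
        # already appeared in the prefix targets[:, :t].
        cand_mask = torch.zeros_like(lprobs)
        for t in range(1, targets.size(1)):
            cand_mask[:, t, :].scatter_(1, targets[:, :t], 1.0)
        cand_mask.scatter_(2, targets.unsqueeze(-1), 0.0)  # never penalize the true target
        cand_mask[targets == pad_idx] = 0.0                # ignore padding positions

        # Unlikelihood term: -log(1 - p(c)) summed over candidate tokens c.
        one_minus_p = (1.0 - lprobs.exp()).clamp(min=1e-5)
        ul_loss = -(one_minus_p.log() * cand_mask).sum()

        return mle_loss + alpha * ul_loss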

Quick Start & Requirements

  • Install fairseq: git clone https://github.com/pytorch/fairseq.git && cd fairseq && git checkout 2b68e91f231a2b7997664e1418f30b808d889963 && pip install --editable .
  • Other dependencies: pip install nltk pandas pytorch-transformers tensorflow==1.14 tensorboardX torch==1.4.0
  • Integration: Copy the custom directory from this repo into your fairseq installation.
  • Dataset: Download and unpack wikitext-103_v0.tar.gz.
  • Pre-trained models: Download and unpack checkpoints_v0.tar.gz (16GB); a quick load test is sketched after this list.
  • Hardware: Tested with Tesla V100 GPUs.
  • Docs: fairseq
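
Once fairseq, the data, and the checkpoints are unpacked, one way to sanity-check the setup is to load a model through fairseq's hub interface and sample a continuation. This is a hedged sketch: the checkpoint directory, checkpoint file name, and data path below are assumptions about how the archives unpack, and the hub API may behave differently at the pinned fairseq commit.

    # Hypothetical smoke test of an unpacked checkpoint (all paths are assumptions).
    from fairseq.models.transformer_lm import TransformerLanguageModel

    lm = TransformerLanguageModel.from_pretrained(
        "checkpoints/ul_token",                      # assumed checkpoint directory
        checkpoint_file="checkpoint_best.pt",        # assumed checkpoint file name
        data_name_or_path="data-bin/wikitext-103",   # assumed binarized WikiText-103 path
    )
    lm.eval()  # disable dropout for generation

    # Greedy continuation of a short prompt.
    print(lm.sample("The meaning of life is", beam=1, max_len_b=50))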

Highlighted Details

  • Implements both token-level and sequence-level unlikelihood training.
  • Includes scripts for fine-tuning GPT-2 models using unlikelihood objectives.
  • Provides evaluation scripts with various decoding strategies (greedy, beam search, top-k, top-p); a generic filtering sketch follows this list.
  • Offers pre-trained models and evaluation results for comparison.
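
For reference, the top-k and top-p strategies listed above amount to filtering the next-token logits before sampling. The snippet below is a generic top-k / top-p (nucleus) filter, not the repository's evaluation code.

    # Generic top-k / top-p (nucleus) filtering of next-token logits; the
    # evaluation scripts compare decoders of this kind, but this is not their code.
    import torch
    import torch.nn.functional as F

    def filter_logits(logits, top_k=0, top_p=0.0):
        # logits: (vocab,) scores for the next token; returns a filtered copy.
        logits = logits.clone()
        if top_k > 0:
            # Keep only the k highest-scoring tokens.
            kth_best = torch.topk(logits, top_k).values[-1]
            logits[logits < kth_best] = float("-inf")
        if top_p > 0.0:
            # Keep the smallest set of top tokens whose cumulative probability
            # reaches top_p (the single best token is always kept).
            sorted_logits, sorted_idx = torch.sort(logits, descending=True)
            cumprobs = torch.cumsum(F.softmax(sorted_logits, dim=-1), dim=-1)
            remove = cumprobs > top_p
            remove[1:] = remove[:-1].clone()
            remove[0] = False
            logits[sorted_idx[remove]] = float("-inf")
        return logits

    # Example: nucleus sampling of the next token from hypothetical logits.
    # probs = F.softmax(filter_logits(next_token_logits, top_p=0.9), dim=-1)
    # next_token = torch.multinomial(probs, num_samples=1)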

Maintenance & Community

This project is from Facebook AI Research (FAIR). The README does not mention dedicated community channels.

Licensing & Compatibility

  • License: CC-BY-NC 4.0 (Creative Commons Attribution-NonCommercial 4.0 International).
  • Restrictions: Non-commercial use only.

Limitations & Caveats

The CC-BY-NC 4.0 license limits the code to non-commercial use. The setup pins old versions of PyTorch (1.4.0) and TensorFlow (1.14), which may conflict with other environments, and the pre-trained checkpoint archive is a 16GB download.

Health Check

  • Last commit: 3 years ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 0 stars in the last 90 days
