naacl_transfer_learning_tutorial by huggingface

NLP transfer learning tutorial code

Created 6 years ago · 723 stars

Top 48.6% on sourcepulse

Project Summary

This repository provides code for a 2019 NAACL tutorial on transfer learning in NLP, targeting researchers and practitioners. It offers a simplified, self-contained implementation of key transfer learning techniques, enabling users to understand and experiment with pre-training and fine-tuning transformer models for NLP tasks.

How It Works

The codebase implements a GPT-2-like transformer and pre-trains it on WikiText-103 or SimpleBooks-92 with a causal language-modeling objective. It then provides scripts for fine-tuning the pre-trained model on downstream tasks like text classification (IMDb), incorporating architectural variations such as adapters. The design prioritizes ease of use and understanding over state-of-the-art performance.
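
To make the pre-training objective concrete, here is a minimal sketch of causal language modeling in PyTorch: the model predicts each next token, and the loss is cross-entropy between shifted logits and targets. This is not the repo's actual code; the class, names, and hyperparameters are all invented for illustration, and the stack is built from PyTorch's stock Transformer layers rather than a faithful GPT-2 block.

```python
import torch
import torch.nn as nn

# Illustrative sketch only: a tiny GPT-2-like causal LM. All names and
# sizes are assumptions, not taken from the tutorial repository.
class TinyCausalLM(nn.Module):
    def __init__(self, vocab_size=50_000, d_model=256, n_heads=8, n_layers=4):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        self.pos = nn.Embedding(512, d_model)  # learned positions, max len 512
        layer = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
        self.blocks = nn.TransformerEncoder(layer, n_layers)
        self.lm_head = nn.Linear(d_model, vocab_size)

    def forward(self, input_ids):
        seq_len = input_ids.size(1)
        # additive causal mask: -inf above the diagonal blocks attention
        # to future positions, which is what makes the model "GPT-2-like"
        mask = torch.triu(
            torch.full((seq_len, seq_len), float("-inf")), diagonal=1
        )
        positions = torch.arange(seq_len)
        hidden = self.blocks(self.embed(input_ids) + self.pos(positions),
                             mask=mask)
        return self.lm_head(hidden)

model = TinyCausalLM()
input_ids = torch.randint(0, 50_000, (2, 16))  # (batch, sequence)
logits = model(input_ids)
# next-token prediction: position t's logits are scored against token t+1
loss = nn.functional.cross_entropy(
    logits[:, :-1].reshape(-1, logits.size(-1)),
    input_ids[:, 1:].reshape(-1),
)
loss.backward()
```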

Quick Start & Requirements

  • Install via pip install -r requirements.txt after cloning the repository.
  • Requires Python and PyTorch; pre-training uses PyTorch's distributed training to scale across multiple GPUs.
  • Pre-training on WikiText-103 with 8 V100 GPUs takes approximately 15 hours to reach a validation perplexity of ~29.
  • Tutorial slides: https://tinyurl.com/NAACLTransfer
  • Colab notebook: https://tinyurl.com/NAACLTransferColab

Highlighted Details

  • Implements a GPT-2-like transformer for pre-training.
  • Includes scripts for both pre-training (language modeling) and fine-tuning (classification).
  • Supports distributed training for pre-training and fine-tuning.
  • Offers fine-tuning architectures with classification heads and adapters (see the sketch after this list).
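
For the adapter variant, the general idea (Houlsby et al., 2019) is a small bottleneck MLP with a residual connection inserted into each transformer block; during fine-tuning only the adapters and the task head are trained while the pre-trained weights stay frozen. A minimal sketch follows, with all identifiers invented for illustration rather than taken from the repo:

```python
import torch
import torch.nn as nn

# Hedged sketch of an adapter layer: down-project, nonlinearity, up-project,
# plus a residual connection so the block is near-identity at initialization.
class Adapter(nn.Module):
    def __init__(self, d_model=256, bottleneck=64):
        super().__init__()
        self.down = nn.Linear(d_model, bottleneck)
        self.up = nn.Linear(bottleneck, d_model)

    def forward(self, hidden):
        return hidden + self.up(torch.relu(self.down(hidden)))

# Freezing everything except adapters and the head might look like this
# (parameter-name matching is a common convention, assumed here):
# for name, param in model.named_parameters():
#     param.requires_grad = ("adapter" in name) or ("classifier" in name)
```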

Maintenance & Community

  • Tutorial presented by Sebastian Ruder, Matthew Peters, Swabha Swayamdipta, and Thomas Wolf.
  • No explicit community links (Discord/Slack) or roadmap are provided in the README.

Licensing & Compatibility

  • The README does not state a license for this repository. Hugging Face libraries typically use Apache 2.0, but that cannot be assumed to apply here.

Limitations & Caveats

The code is designed for educational purposes and does not aim for state-of-the-art performance; its pre-training perplexity (~29 on WikiText-103) is higher than that of comparable full-scale models. The tutorial dates from 2019, and the NLP transfer learning landscape has evolved significantly since then.

Health Check

  • Last commit: 5 years ago
  • Responsiveness: 1 week
  • Pull requests (30d): 0
  • Issues (30d): 0
  • Star history: 0 stars in the last 90 days
