naacl_transfer_learning_tutorial by huggingface

NLP transfer learning tutorial code

Created 6 years ago · 723 stars

Top 48.6% on sourcepulse

Project Summary

This repository provides code for a 2019 NAACL tutorial on transfer learning in NLP, targeting researchers and practitioners. It offers a simplified, self-contained implementation of key transfer learning techniques, enabling users to understand and experiment with pre-training and fine-tuning transformer models for NLP tasks.

How It Works

The codebase implements a GPT-2-like transformer and pre-trains it on WikiText-103 or SimpleBooks-92 with a causal language-modeling objective. It then provides scripts for fine-tuning the pre-trained model on downstream tasks like text classification (IMDb), incorporating architectural variations such as adapters. The design prioritizes ease of use and understanding over state-of-the-art performance.
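
To make the pre-training objective concrete, here is a minimal sketch of causal language modeling in PyTorch: the model predicts each next token, and the loss is cross-entropy between shifted logits and targets. This is not the repo's actual code; the class, names, and hyperparameters are all invented for illustration, and the stack is built from PyTorch's stock Transformer layers rather than a faithful GPT-2 block.

```python
import torch
import torch.nn as nn

# Illustrative sketch only: a tiny GPT-2-like causal LM. All names and
# sizes are assumptions, not taken from the tutorial repository.
class TinyCausalLM(nn.Module):
    def __init__(self, vocab_size=50_000, d_model=256, n_heads=8, n_layers=4):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        self.pos = nn.Embedding(512, d_model)  # learned positions, max len 512
        layer = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
        self.blocks = nn.TransformerEncoder(layer, n_layers)
        self.lm_head = nn.Linear(d_model, vocab_size)

    def forward(self, input_ids):
        seq_len = input_ids.size(1)
        # additive causal mask: -inf above the diagonal blocks attention
        # to future positions, which is what makes the model "GPT-2-like"
        mask = torch.triu(
            torch.full((seq_len, seq_len), float("-inf")), diagonal=1
        )
        positions = torch.arange(seq_len)
        hidden = self.blocks(self.embed(input_ids) + self.pos(positions),
                             mask=mask)
        return self.lm_head(hidden)

model = TinyCausalLM()
input_ids = torch.randint(0, 50_000, (2, 16))  # (batch, sequence)
logits = model(input_ids)
# next-token prediction: position t's logits are scored against token t+1
loss = nn.functional.cross_entropy(
    logits[:, :-1].reshape(-1, logits.size(-1)),
    input_ids[:, 1:].reshape(-1),
)
loss.backward()
```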

Quick Start & Requirements

  • Install via pip install -r requirements.txt after cloning the repository.
  • Requires Python and PyTorch; pre-training uses PyTorch's distributed training to scale across multiple GPUs.
  • Pre-training on WikiText-103 with 8 V100 GPUs takes approximately 15 hours to reach a validation perplexity of ~29.
  • Tutorial slides: https://tinyurl.com/NAACLTransfer
  • Colab notebook: https://tinyurl.com/NAACLTransferColab

Highlighted Details

  • Implements a GPT-2-like transformer for pre-training.
  • Includes scripts for both pre-training (language modeling) and fine-tuning (classification).
  • Supports distributed training for pre-training and fine-tuning.
  • Offers fine-tuning architectures with classification heads and adapters (see the sketch after this list).
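
For the adapter variant, the general idea (Houlsby et al., 2019) is a small bottleneck MLP with a residual connection inserted into each transformer block; during fine-tuning only the adapters and the task head are trained while the pre-trained weights stay frozen. A minimal sketch follows, with all identifiers invented for illustration rather than taken from the repo:

```python
import torch
import torch.nn as nn

# Hedged sketch of an adapter layer: down-project, nonlinearity, up-project,
# plus a residual connection so the block is near-identity at initialization.
class Adapter(nn.Module):
    def __init__(self, d_model=256, bottleneck=64):
        super().__init__()
        self.down = nn.Linear(d_model, bottleneck)
        self.up = nn.Linear(bottleneck, d_model)

    def forward(self, hidden):
        return hidden + self.up(torch.relu(self.down(hidden)))

# Freezing everything except adapters and the head might look like this
# (parameter-name matching is a common convention, assumed here):
# for name, param in model.named_parameters():
#     param.requires_grad = ("adapter" in name) or ("classifier" in name)
```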

Maintenance & Community

  • Tutorial presented by Sebastian Ruder, Matthew Peters, Swabha Swayamdipta, and Thomas Wolf.
  • No explicit community links (Discord/Slack) or roadmap are provided in the README.

Licensing & Compatibility

  • The README does not state a license for this repository. Hugging Face libraries typically use Apache 2.0, but that cannot be assumed to apply here.

Limitations & Caveats

The code is designed for educational purposes and does not aim for state-of-the-art performance; its pre-training perplexity (~29 on WikiText-103) is higher than that of comparable full-scale models. The tutorial dates from 2019, and the NLP transfer learning landscape has evolved significantly since then.

Health Check

  • Last commit: 5 years ago
  • Responsiveness: 1 week
  • Pull requests (30d): 0
  • Issues (30d): 0
  • Star history: 0 stars in the last 90 days
