naacl_transfer_learning_tutorial by huggingface

NLP transfer learning tutorial code

Created 6 years ago
721 stars

Top 47.7% on SourcePulse

Project Summary

This repository provides code for a 2019 NAACL tutorial on transfer learning in NLP, targeting researchers and practitioners. It offers a simplified, self-contained implementation of key transfer learning techniques, enabling users to understand and experiment with pre-training and fine-tuning transformer models for NLP tasks.

How It Works

The codebase implements a GPT-2-like transformer architecture for pre-training on large datasets (WikiText-103, SimpleBooks-92) using a language modeling objective. It then provides scripts for fine-tuning this pre-trained model on downstream tasks like text classification (IMDb), incorporating architectural variations such as adapters. The design prioritizes ease of use and understanding over state-of-the-art performance.
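To make the pre-training objective concrete, the sketch below shows a GPT-2-style decoder block and a causal language-modeling loss in plain PyTorch. This is a minimal illustration, not the repository's code; the class names (TinyGPT, Block) and hyperparameters are invented for this example.

```python
# Minimal GPT-2-style decoder with a causal language-modeling loss (illustrative only).
import torch
import torch.nn as nn
import torch.nn.functional as F

class Block(nn.Module):
    def __init__(self, dim=410, heads=10, dropout=0.1):
        super().__init__()
        self.ln1 = nn.LayerNorm(dim)
        self.attn = nn.MultiheadAttention(dim, heads, dropout=dropout)
        self.ln2 = nn.LayerNorm(dim)
        self.mlp = nn.Sequential(nn.Linear(dim, 4 * dim), nn.GELU(), nn.Linear(4 * dim, dim))

    def forward(self, x, attn_mask):
        h = self.ln1(x)
        a, _ = self.attn(h, h, h, attn_mask=attn_mask, need_weights=False)
        x = x + a
        return x + self.mlp(self.ln2(x))

class TinyGPT(nn.Module):
    def __init__(self, vocab_size, dim=410, n_layers=16, n_ctx=256):
        super().__init__()
        self.tok = nn.Embedding(vocab_size, dim)
        self.pos = nn.Embedding(n_ctx, dim)
        self.blocks = nn.ModuleList([Block(dim) for _ in range(n_layers)])
        self.head = nn.Linear(dim, vocab_size, bias=False)

    def forward(self, ids):
        # ids: (seq_len, batch), matching nn.MultiheadAttention's default layout
        seq_len = ids.size(0)
        pos = torch.arange(seq_len, device=ids.device).unsqueeze(1)
        x = self.tok(ids) + self.pos(pos)
        # Causal mask: each position may only attend to itself and earlier positions
        mask = torch.triu(torch.full((seq_len, seq_len), float("-inf"), device=ids.device), diagonal=1)
        for block in self.blocks:
            x = block(x, mask)
        logits = self.head(x)
        # Language-modeling loss: predict token t+1 from tokens up to t
        return F.cross_entropy(logits[:-1].reshape(-1, logits.size(-1)), ids[1:].reshape(-1))
```

A typical training step would call the model on a batch of token ids, e.g. `loss = model(ids); loss.backward()`.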

Quick Start & Requirements

  • Install via pip install -r requirements.txt after cloning the repository.
  • Requires Python and PyTorch. Pre-training uses PyTorch's distributed training capabilities (see the sketch after this list).
  • Pre-training on WikiText-103 with 8 V100 GPUs takes approximately 15 hours to reach a validation perplexity of ~29.
  • Tutorial slides: https://tinyurl.com/NAACLTransfer
  • Colab notebook: https://tinyurl.com/NAACLTransferColab
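The repository's training scripts handle multi-GPU training themselves; the snippet below only illustrates the standard PyTorch DistributedDataParallel pattern they rely on. The helper name (setup_and_wrap) is invented, and a torchrun / torch.distributed.launch environment is assumed.

```python
# Generic PyTorch DistributedDataParallel pattern (illustrative; not the repo's exact code).
import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP

def setup_and_wrap(model: torch.nn.Module, local_rank: int) -> DDP:
    # Assumes the process-group environment variables set by torchrun / torch.distributed.launch
    dist.init_process_group(backend="nccl")
    torch.cuda.set_device(local_rank)
    return DDP(model.to(local_rank), device_ids=[local_rank])
```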

Highlighted Details

  • Implements a GPT-2-like transformer for pre-training.
  • Includes scripts for both pre-training (language modeling) and fine-tuning (classification).
  • Supports distributed training for pre-training and fine-tuning.
  • Offers fine-tuning architectures with classification heads and adapters (see the sketch after this list).
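A rough sketch of the two fine-tuning ingredients named above follows. The class names (Adapter, ClassificationHead) and sizes are hypothetical; the repository's actual modules may differ.

```python
# Illustrative adapter and classification-head modules for fine-tuning (PyTorch).
import torch
import torch.nn as nn

class Adapter(nn.Module):
    """Bottleneck MLP with a residual connection, inserted inside each transformer block."""
    def __init__(self, dim=410, bottleneck=32):
        super().__init__()
        self.down = nn.Linear(dim, bottleneck)
        self.up = nn.Linear(bottleneck, dim)

    def forward(self, x):
        return x + self.up(torch.relu(self.down(x)))

class ClassificationHead(nn.Module):
    """Linear head applied to the hidden state at a designated classification position."""
    def __init__(self, dim=410, num_classes=2):
        super().__init__()
        self.proj = nn.Linear(dim, num_classes)

    def forward(self, hidden_states):
        # hidden_states: (seq_len, batch, dim); classify from the last position
        return self.proj(hidden_states[-1])

# When fine-tuning with adapters, the pre-trained weights are typically frozen and only
# adapter and head parameters are updated (the model/naming below is illustrative):
# for name, p in model.named_parameters():
#     p.requires_grad = ("adapter" in name) or ("head" in name)
```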

Maintenance & Community

  • Tutorial presented by Sebastian Ruder, Matthew Peters, Swabha Swayamdipta, and Thomas Wolf.
  • No explicit community links (Discord/Slack) or roadmap are provided in the README.

Licensing & Compatibility

  • The README does not state a license for this repository. Hugging Face's main libraries typically use Apache 2.0, but this tutorial repo's license remains unstated.

Limitations & Caveats

The code is designed for educational purposes and does not aim for state-of-the-art performance; its pre-training perplexity is higher than that of comparable models. The tutorial dates from 2019, and the NLP transfer learning landscape has evolved significantly since then.

Health Check

  • Last Commit: 6 years ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 0 stars in the last 30 days

Explore Similar Projects

Starred by Ying Sheng (coauthor of SGLang), Chip Huyen (author of "AI Engineering" and "Designing Machine Learning Systems"), and 10 more.

adapters by adapter-hub

Unified library for parameter-efficient transfer learning in NLP

  • 3k stars, top 0.2% on SourcePulse
  • Created 5 years ago; updated 1 month ago