P-tuning-v2 by THUDM

A prompt tuning strategy with performance comparable to fine-tuning

created 3 years ago
2,052 stars

Top 22.1% on sourcepulse

Project Summary

P-tuning v2 offers a parameter-efficient prompt tuning strategy that achieves performance comparable to full fine-tuning across model scales and tasks, and is particularly beneficial for smaller models and hard sequence tagging tasks. It targets NLP researchers and practitioners who want to adapt large language models without extensive computational resources.

How It Works

P-tuning v2 implements deep prompt tuning by applying continuous prompts to every layer's input in a pre-trained transformer. This approach enhances the capacity of continuous prompts, effectively bridging the performance gap with traditional fine-tuning, especially in scenarios with limited data or complex tasks.
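For intuition, here is a minimal PyTorch sketch of the idea. It is an illustrative toy, not the repository's implementation (which applies the prefixes to pre-trained BERT/RoBERTa backbones): each layer owns its own trainable prefix key/value vectors that are prepended to the keys and values of self-attention, while the backbone weights stay frozen.

    import torch
    import torch.nn as nn

    class PrefixSelfAttention(nn.Module):
        """Self-attention whose keys/values are extended by a learnable,
        layer-specific prefix (conceptual deep prompt tuning sketch)."""
        def __init__(self, d_model, n_heads, prefix_len):
            super().__init__()
            # batch_first requires PyTorch >= 1.9; this is only a conceptual sketch.
            self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
            # Continuous prompt for this layer: one set of prefix keys and values.
            self.prefix_k = nn.Parameter(torch.randn(prefix_len, d_model) * 0.02)
            self.prefix_v = nn.Parameter(torch.randn(prefix_len, d_model) * 0.02)

        def forward(self, x):                          # x: (batch, seq_len, d_model)
            bsz = x.size(0)
            k = torch.cat([self.prefix_k.expand(bsz, -1, -1), x], dim=1)
            v = torch.cat([self.prefix_v.expand(bsz, -1, -1), x], dim=1)
            out, _ = self.attn(x, k, v)                # queries come only from real tokens
            return out

    # During training, only the prefix parameters receive gradients;
    # the pre-trained backbone weights are kept frozen.

Because only the per-layer prefixes are updated, the number of trainable parameters remains a small fraction of the full model while still influencing every layer's attention.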

Quick Start & Requirements

  • Install: Create a conda environment (conda create -n pt2 python=3.8.5), activate it (conda activate pt2), install PyTorch (conda install -n pt2 pytorch==1.7.1 torchvision==0.8.2 torchaudio==0.7.2 cudatoolkit=11.0 -c pytorch), and then install dependencies (pip install -r requirements.txt).
  • Prerequisites: Python 3.8.5, PyTorch 1.7.1, CUDA 11.0, and NVIDIA GPUs (tested on RTX 3090).
  • Data: Datasets for SuperGLUE and SQuAD are available via Hugging Face Datasets (see the loading sketch after this list). Sequence tagging datasets require manual download and extraction.
  • Training: Run training scripts from the run_script directory (e.g., bash run_script/run_rte_roberta.sh).
  • Resources: Reproducing the paper's results requires an NVIDIA DGX-A100 server or multiple RTX 3090 GPUs; results are sensitive to hyperparameters, so some search may be needed.
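For the data step, the SuperGLUE and SQuAD splits can be pulled directly with the Hugging Face datasets library. The sketch below uses the standard Hub dataset identifiers; these are assumptions for illustration, not identifiers taken from the repository's scripts.

    from datasets import load_dataset

    # SuperGLUE RTE task (DatasetDict with train/validation/test splits)
    rte = load_dataset("super_glue", "rte")
    # SQuAD v1.1 (train/validation splits)
    squad = load_dataset("squad")

    print(rte["train"][0])    # premise / hypothesis / label fields
    print(squad["train"][0])  # context / question / answers fields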

Highlighted Details

  • Comparable performance to fine-tuning across scales and tasks.
  • Deep prompt tuning applied to every layer input.
  • Reimplementation results provided for BERT-large and RoBERTa-large on various NLP benchmarks.
  • Code organization by @rainatam.

Maintenance & Community

The project is associated with THUDM and includes a citation for the ACL 2022 paper "P-Tuning v2: Prompt Tuning Can Be Comparable to Fine-tuning Universally Across Scales and Tasks." No specific community channels (Discord/Slack) are mentioned.

Licensing & Compatibility

The README does not explicitly state a license, so licensing should be verified before commercial use or integration into closed-source projects.

Limitations & Caveats

Reproducing the exact paper results may be difficult because of sensitivity to the environment and package versions, so hyperparameter searches may be necessary. The README also notes that the SuperGLUE experimental setup differs between P-tuning v1 and v2: in v2 the backbone model parameters are kept frozen.

Health Check

  • Last commit: 1 year ago
  • Responsiveness: 1 day
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 22 stars in the last 90 days
