t-few by r-three

Code for parameter-efficient fine-tuning research paper

created 3 years ago
456 stars

Top 67.3% on sourcepulse

Project Summary

This repository provides the official code for the T-Few paper, focusing on parameter-efficient fine-tuning (PEFT) for few-shot learning tasks. It aims to offer a more effective and cost-efficient alternative to in-context learning, achieving state-of-the-art results on benchmarks like RAFT. The target audience includes researchers and practitioners in NLP and machine learning who need to adapt large language models to new tasks with limited data.

How It Works

T-Few implements parameter-efficient fine-tuning techniques, specifically focusing on methods that modify only a small subset of model parameters. This approach contrasts with full fine-tuning and in-context learning, offering a balance between performance and computational cost. The method is designed to be more stable and achieve better generalization than in-context learning, particularly in low-data regimes.
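The core method in the T-Few paper is (IA)^3, which freezes the base model and learns only elementwise rescaling vectors for the attention keys, values, and feed-forward activations. A minimal NumPy sketch of the attention part (simplified single-head attention; the function and argument names here are illustrative, not the repo's API):

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def ia3_attention(q, k, v, l_k, l_v):
    """Attention with (IA)^3-style rescaling.

    l_k and l_v are the only trained parameters: learned vectors that
    scale the (frozen) keys and values elementwise.
    """
    k = k * l_k                                  # rescale keys
    v = v * l_v                                  # rescale values
    scores = q @ k.T / np.sqrt(q.shape[-1])      # scaled dot-product
    return softmax(scores) @ v
```

Initializing `l_k` and `l_v` to ones recovers the frozen model's original attention exactly, which is why fine-tuning only these vectors is both cheap and stable.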

Quick Start & Requirements

  • Install: Create a conda environment with Python 3.7 (conda create -n tfew python==3.7), activate it (conda activate tfew), and install dependencies (pip install -r requirements.txt -f https://download.pytorch.org/whl/cu113/torch_stable.html).
  • Prerequisites: CUDA 11.3, Python 3.7, PyTorch. For SAID, run python src/intrinsic_said_setup.py develop.
  • Execution: Run experiments using CUDA_VISIBLE_DEVICES=<gpu_id> python -m src.pl_train -c <config_file1>.json+<config_file2>.json -k <key>=<value> exp_name=<experiment_name>.
  • Resources: Recommended GPU memory: 40GB for T0-3B, 80GB for the full-size T0 (11B).
  • Docs: Configuration and execution details are provided within the README.
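The `-c` flag above joins several JSON config files with `+`, with later files taking precedence. A minimal sketch of how such shallow merging typically works (the helper name `load_configs` is hypothetical; the repo's actual loader may handle nested keys differently):

```python
import json

def load_configs(spec: str) -> dict:
    """Merge '+'-joined JSON config files; later files override earlier keys.

    Illustrative sketch of the '-c a.json+b.json' convention,
    not the repository's actual implementation.
    """
    merged = {}
    for path in spec.split("+"):
        with open(path) as f:
            merged.update(json.load(f))  # shallow merge: last file wins
    return merged
```

Any `-k key=value` override would then be applied on top of the merged dictionary, which is why it always wins over values from the config files.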

Highlighted Details

  • Outperforms in-context learning with GPT-3.
  • Achieves state-of-the-art on the RAFT benchmark.
  • Supports combining multiple configuration files for flexible experiment setup.
  • Includes scripts for running arrays of experiments and generating results tables.

Maintenance & Community

The repository is maintained by r-three, the research group behind the paper. No community channels (Discord, Slack) or active development signals are mentioned in the README.

Licensing & Compatibility

The repository does not explicitly state a license. The included citations are for academic use; commercial use would require clarifying licensing terms with the authors.

Limitations & Caveats

The setup requires a specific older version of Python (3.7) and CUDA (11.3), which may pose compatibility challenges with newer hardware and software stacks. The README does not detail ongoing maintenance or community support.

Health Check

  • Last commit: 1 year ago
  • Responsiveness: 1 week
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 7 stars in the last 90 days

Explore Similar Projects

Starred by Chip Huyen (author of AI Engineering and Designing Machine Learning Systems), Jeff Hammerbacher (cofounder of Cloudera), and 10 more.

open-r1 by huggingface

  • SDK for reproducing DeepSeek-R1
  • Top 0.2%, 25k stars
  • created 6 months ago, updated 3 days ago