LM-BFF by princeton-nlp

Research paper on few-shot fine-tuning of language models

created 4 years ago
728 stars

Top 48.5% on sourcepulse

Project Summary

LM-BFF ("Better Few-shot Fine-tuning of language models") provides a framework for improving few-shot fine-tuning of pre-trained language models. It targets researchers and practitioners working with limited labeled data, offering techniques to enhance model performance on downstream NLP tasks. The core benefit is stronger performance when only a handful of labeled examples per class are available.

How It Works

LM-BFF combines prompt-based fine-tuning, which recasts classification as filling in a [MASK] token whose predicted label words map back to class labels, with an automated pipeline for generating prompts and a refined strategy for selecting demonstrations to include in the model's context. This guides the language model more effectively when only a few labeled examples are available, improving generalization over standard head-based fine-tuning.
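To make this concrete, here is a minimal sketch of the prompt-based prediction step. The template ("It was [MASK].") and label words (great/terrible) are illustrative choices for SST-2-style sentiment, not the repo's tuned ones, and the repo fine-tunes the model rather than using it zero-shot as here:

    # Minimal sketch of prompt-based classification with a masked LM.
    # Template and label words are illustrative assumptions; LM-BFF
    # learns better ones via fine-tuning and automatic search.
    import torch
    from transformers import AutoTokenizer, AutoModelForMaskedLM

    tokenizer = AutoTokenizer.from_pretrained("roberta-large")
    model = AutoModelForMaskedLM.from_pretrained("roberta-large")

    # SST-2-style template: "<sentence> It was [MASK]."
    text = "A fun and heartfelt movie. It was <mask>."
    label_words = {"positive": " great", "negative": " terrible"}

    inputs = tokenizer(text, return_tensors="pt")
    mask_pos = (inputs.input_ids == tokenizer.mask_token_id).nonzero()[0, 1]

    with torch.no_grad():
        logits = model(**inputs).logits[0, mask_pos]

    # Score each class by the logit of its label word at the mask position.
    scores = {
        label: logits[tokenizer.encode(word, add_special_tokens=False)[0]].item()
        for label, word in label_words.items()
    }
    print(max(scores, key=scores.get))  # e.g. "positive"

The "prompt-demo" variant additionally concatenates formatted, labeled training examples before the input, so the model can condition on them in-context.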

Quick Start & Requirements

  • Install dependencies: pip install -r requirements.txt
  • Data preparation: Download datasets to ./data/original and run python tools/generate_k_shot_data.py.
  • Run example: python run.py --task_name SST-2 --data_dir data/k-shot/SST-2/16-42 --model_name_or_path roberta-large --few_shot_type prompt-demo ... (here "16-42" denotes K=16 examples per class with seed 42; the full pipeline is sketched after this list).
  • Key dependencies: transformers (version 3.4.0 recommended) and PyTorch.
  • See official quick-start and detailed examples in the README.
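Assembled into one script, the quick-start steps look roughly like the following. This is a hedged sketch that only restates the commands above; the elided run.py flags (template, label-word mapping, training hyperparameters) are documented in the README and left out here:

    # Assumes the original datasets have already been downloaded
    # to ./data/original.
    pip install -r requirements.txt

    # Generate the few-shot splits under data/k-shot/<TASK>/<K>-<seed>.
    python tools/generate_k_shot_data.py

    # Prompt-based fine-tuning with demonstrations on SST-2;
    # "16-42" means K=16 examples per class with data-split seed 42.
    python run.py \
        --task_name SST-2 \
        --data_dir data/k-shot/SST-2/16-42 \
        --model_name_or_path roberta-large \
        --few_shot_type prompt-demo \
        ...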

Highlighted Details

  • Supports prompt-based fine-tuning, prompt-based fine-tuning with demonstrations, and standard fine-tuning.
  • Includes automatic generation and selection of prompt templates and label-word mappings.
  • Offers demonstration filtering using Sentence-BERT embeddings (sketched after this list).
  • Supports ensemble methods for improved robustness.
  • Enables zero-shot and GPT-3-style in-context learning experiments.
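As an example of the filtering idea, here is a hedged sketch using the sentence-transformers library; the checkpoint name, candidate pool, and top-50% cutoff are illustrative assumptions, not the repo's exact settings:

    # Hedged sketch of Sentence-BERT demonstration filtering: restrict
    # demonstration candidates to the training examples most similar to
    # the current input. Checkpoint and cutoff are assumptions.
    from sentence_transformers import SentenceTransformer, util

    encoder = SentenceTransformer("roberta-large-nli-stsb-mean-tokens")

    query = "A fun and heartfelt movie."
    train_pool = [
        "An utterly charming film.",
        "The plot is a complete mess.",
        "Great performances all around.",
        "I walked out halfway through.",
    ]

    q_emb = encoder.encode(query, convert_to_tensor=True)
    pool_emb = encoder.encode(train_pool, convert_to_tensor=True)
    sims = util.cos_sim(q_emb, pool_emb)[0]

    # Sample demonstrations only from the top half by cosine similarity.
    k = max(1, len(train_pool) // 2)
    candidates = [train_pool[i] for i in sims.topk(k).indices.tolist()]
    print(candidates)

Restricting demonstrations to semantically similar examples keeps the limited context window filled with the most informative neighbors.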

Maintenance & Community

  • The project is associated with Princeton NLP.
  • For questions or bug reports, contact tianyug@cs.princeton.edu or open an issue on GitHub.

Licensing & Compatibility

  • The README does not state a license. The accompanying citation is for an ACL 2021 paper, which suggests the code is intended for research use; compatibility with commercial or closed-source use is not specified.

Limitations & Caveats

  • The README notes that different package versions (e.g., pytorch, transformers) may lead to results deviating from the paper, though trends should persist.
  • Automatic template generation is described as an "extremely long process."
Health Check

  • Last commit: 2 years ago
  • Responsiveness: 1 week
  • Pull requests (last 30 days): 0
  • Issues (last 30 days): 1

Star History

  • 1 star in the last 90 days

Explore Similar Projects

Starred by Patrick von Platen (Core Contributor to Hugging Face Transformers and Diffusers), Travis Fischer (Founder of Agentic), and 5 more.

setfit by huggingface

Few-shot learning framework for Sentence Transformers

Top 0.3% on sourcepulse
3k stars
created 3 years ago
updated 3 months ago

Starred by Aravind Srinivas (Cofounder of Perplexity), Jiayi Pan (Author of SWE-Gym; AI Researcher at UC Berkeley), and 8 more.

gpt-3 by openai

Research paper on large language model few-shot learning

Top 0.0% on sourcepulse
16k stars
created 5 years ago
updated 4 years ago