LM-BFF by princeton-nlp

Research paper on few-shot fine-tuning of language models

Created 4 years ago
728 stars

Top 47.4% on SourcePulse

Project Summary

LM-BFF (Better Few-shot Fine-tuning of language models) is a framework for improving few-shot fine-tuning of pre-trained language models. It targets researchers and practitioners working with limited labeled data, offering techniques that enhance model performance on downstream NLP tasks. The core benefit is stronger few-shot learning through structured fine-tuning strategies.

How It Works

LM-BFF combines prompt-based fine-tuning with a novel pipeline for automating prompt generation and a refined strategy for incorporating demonstrations into the model's context. This approach aims to guide the language model more effectively with limited examples, leveraging the power of prompts and relevant demonstrations to improve generalization.
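
As a minimal sketch of that input construction, assuming an SST-2-style template ("It was [MASK].") with label words "great"/"terrible"; the build_input helper and exact formatting here are illustrative, not the repo's actual code:

    # Sketch: assemble a prompt-with-demonstrations input for masked-LM
    # classification. Template and label words are illustrative (SST-2-style);
    # LM-BFF configures these per task.
    TEMPLATE = "{sent} It was [MASK] ."
    LABEL_WORDS = {"positive": "great", "negative": "terrible"}

    def build_input(query, demonstrations):
        # The query keeps its [MASK]; the model's probabilities for the label
        # words at that position decide the class.
        parts = [TEMPLATE.format(sent=query)]
        for sent, label in demonstrations:
            # Each demonstration is the same template with [MASK] replaced by
            # the gold label word, giving the model an in-context example.
            parts.append(TEMPLATE.format(sent=sent).replace("[MASK]", LABEL_WORDS[label]))
        return " ".join(parts)

    demos = [("A riveting, beautifully made film.", "positive"),
             ("Dull and overlong.", "negative")]
    print(build_input("An utterly charming story.", demos))
    # -> "An utterly charming story. It was [MASK] . A riveting, beautifully
    #     made film. It was great . Dull and overlong. It was terrible ."

Classification then reduces to comparing the masked language model's probabilities for "great" versus "terrible" at the query's [MASK] position.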

Quick Start & Requirements

  • Install dependencies: pip install -r requirements.txt
  • Data preparation: Download datasets to ./data/original and run python tools/generate_k_shot_data.py.
  • Run example: python run.py --task_name SST-2 --data_dir data/k-shot/SST-2/16-42 --model_name_or_path roberta-large --few_shot_type prompt-demo ...
  • Key dependencies: transformers (version 3.4.0 recommended) and PyTorch.
  • See official quick-start and detailed examples in the README.

Highlighted Details

  • Supports prompt-based fine-tuning, prompt-based fine-tuning with demonstrations, and standard fine-tuning.
  • Includes mechanisms for automatic prompt and label word mapping generation and selection.
  • Offers demonstration filtering using Sentence-BERT embeddings (see the sketch after this list).
  • Supports ensemble methods for improved robustness.
  • Enables zero-shot and GPT-3 style in-context learning experiments.
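
A minimal sketch of the demonstration-filtering idea, assuming the sentence-transformers package is installed; the model name, keep_ratio, and filter_demonstrations helper are illustrative assumptions, not LM-BFF's actual implementation:

    # Sketch: keep only the candidate demonstrations most semantically
    # similar to the query, scored with Sentence-BERT embeddings.
    from sentence_transformers import SentenceTransformer, util

    sbert = SentenceTransformer("all-MiniLM-L6-v2")  # model choice is an assumption

    def filter_demonstrations(query, candidates, keep_ratio=0.5):
        emb_q = sbert.encode(query, convert_to_tensor=True)
        emb_c = sbert.encode(candidates, convert_to_tensor=True)
        scores = util.cos_sim(emb_q, emb_c)[0]         # cosine similarity per candidate
        k = max(1, int(len(candidates) * keep_ratio))  # keep_ratio is illustrative
        top = scores.topk(k).indices.tolist()
        return [candidates[i] for i in top]

    print(filter_demonstrations(
        "An utterly charming story.",
        ["A riveting, beautifully made film.",
         "The weather report for Tuesday.",
         "Dull and overlong."],
    ))

Filtering like this keeps the demonstrations that are most relevant to the current input, rather than sampling them at random from the few labeled examples available.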

Maintenance & Community

  • The project is associated with Princeton NLP.
  • For questions or bug reports, contact tianyug@cs.princeton.edu or open an issue on GitHub.

Licensing & Compatibility

  • The README does not explicitly state a license. The accompanying citation points to an ACL 2021 paper, suggesting the code is intended for academic use; compatibility with commercial or closed-source use is not specified.

Limitations & Caveats

  • The README notes that different package versions (e.g., PyTorch, transformers) may lead to results deviating from the paper, though the overall trends should persist.
  • Automatic template generation is described as an "extremely long process."

Health Check

  • Last commit: 3 years ago
  • Responsiveness: Inactive
  • Pull requests (30d): 0
  • Issues (30d): 0
  • Star history: 0 stars in the last 30 days

Starred by Edward Sun (Research Scientist at Meta Superintelligence Lab), Shizhe Diao (Author of LMFlow; Research Scientist at NVIDIA), and 2 more.

Explore Similar Projects

ama_prompting by HazyResearch

  • 547 stars · Language model prompting strategy research paper
  • Created 3 years ago · Updated 2 years ago
  • Starred by Eric Zhu (Coauthor of AutoGen; Research Scientist at Microsoft Research) and Chip Huyen (Author of "AI Engineering", "Designing Machine Learning Systems").

PromptWizard by microsoft

  • 4k stars · Agent-driven framework for task-aware prompt optimization
  • Created 1 year ago · Updated 1 month ago