LM-BFF by princeton-nlp

Research paper on few-shot fine-tuning of language models

created 4 years ago
728 stars

Top 48.5% on sourcepulse

Project Summary

LM-BFF ("Better Few-shot Fine-tuning of language models") provides a framework for improving few-shot fine-tuning of pre-trained language models. It targets researchers and practitioners working with limited labeled data, offering techniques to enhance model performance on downstream NLP tasks. The core benefit is stronger performance when only a handful of labeled examples per class are available.

How It Works

LM-BFF combines prompt-based fine-tuning, which recasts classification as filling in a [MASK] token whose predicted label words map back to class labels, with an automated pipeline for generating prompts and a refined strategy for selecting demonstrations to include in the model's context. This guides the language model more effectively when only a few labeled examples are available, improving generalization over standard head-based fine-tuning.
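To make this concrete, here is a minimal sketch of the prompt-based prediction step. The template ("It was [MASK].") and label words (great/terrible) are illustrative choices for SST-2-style sentiment, not the repo's tuned ones, and the repo fine-tunes the model rather than using it zero-shot as here:

    # Minimal sketch of prompt-based classification with a masked LM.
    # Template and label words are illustrative assumptions; LM-BFF
    # learns better ones via fine-tuning and automatic search.
    import torch
    from transformers import AutoTokenizer, AutoModelForMaskedLM

    tokenizer = AutoTokenizer.from_pretrained("roberta-large")
    model = AutoModelForMaskedLM.from_pretrained("roberta-large")

    # SST-2-style template: "<sentence> It was [MASK]."
    text = "A fun and heartfelt movie. It was <mask>."
    label_words = {"positive": " great", "negative": " terrible"}

    inputs = tokenizer(text, return_tensors="pt")
    mask_pos = (inputs.input_ids == tokenizer.mask_token_id).nonzero()[0, 1]

    with torch.no_grad():
        logits = model(**inputs).logits[0, mask_pos]

    # Score each class by the logit of its label word at the mask position.
    scores = {
        label: logits[tokenizer.encode(word, add_special_tokens=False)[0]].item()
        for label, word in label_words.items()
    }
    print(max(scores, key=scores.get))  # e.g. "positive"

The "prompt-demo" variant additionally concatenates formatted, labeled training examples before the input, so the model can condition on them in-context.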

Quick Start & Requirements

  • Install dependencies: pip install -r requirements.txt
  • Data preparation: Download datasets to ./data/original and run python tools/generate_k_shot_data.py.
  • Run example: python run.py --task_name SST-2 --data_dir data/k-shot/SST-2/16-42 --model_name_or_path roberta-large --few_shot_type prompt-demo ... (here "16-42" denotes K=16 examples per class with seed 42; the full pipeline is sketched after this list).
  • Key dependencies: transformers (version 3.4.0 recommended) and PyTorch.
  • See official quick-start and detailed examples in the README.
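Assembled into one script, the quick-start steps look roughly like the following. This is a hedged sketch that only restates the commands above; the elided run.py flags (template, label-word mapping, training hyperparameters) are documented in the README and left out here:

    # Assumes the original datasets have already been downloaded
    # to ./data/original.
    pip install -r requirements.txt

    # Generate the few-shot splits under data/k-shot/<TASK>/<K>-<seed>.
    python tools/generate_k_shot_data.py

    # Prompt-based fine-tuning with demonstrations on SST-2;
    # "16-42" means K=16 examples per class with data-split seed 42.
    python run.py \
        --task_name SST-2 \
        --data_dir data/k-shot/SST-2/16-42 \
        --model_name_or_path roberta-large \
        --few_shot_type prompt-demo \
        ...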

Highlighted Details

  • Supports prompt-based fine-tuning, prompt-based fine-tuning with demonstrations, and standard fine-tuning.
  • Includes automatic generation and selection of prompt templates and label-word mappings.
  • Offers demonstration filtering using Sentence-BERT embeddings (sketched after this list).
  • Supports ensemble methods for improved robustness.
  • Enables zero-shot and GPT-3-style in-context learning experiments.
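As an example of the filtering idea, here is a hedged sketch using the sentence-transformers library; the checkpoint name, candidate pool, and top-50% cutoff are illustrative assumptions, not the repo's exact settings:

    # Hedged sketch of Sentence-BERT demonstration filtering: restrict
    # demonstration candidates to the training examples most similar to
    # the current input. Checkpoint and cutoff are assumptions.
    from sentence_transformers import SentenceTransformer, util

    encoder = SentenceTransformer("roberta-large-nli-stsb-mean-tokens")

    query = "A fun and heartfelt movie."
    train_pool = [
        "An utterly charming film.",
        "The plot is a complete mess.",
        "Great performances all around.",
        "I walked out halfway through.",
    ]

    q_emb = encoder.encode(query, convert_to_tensor=True)
    pool_emb = encoder.encode(train_pool, convert_to_tensor=True)
    sims = util.cos_sim(q_emb, pool_emb)[0]

    # Sample demonstrations only from the top half by cosine similarity.
    k = max(1, len(train_pool) // 2)
    candidates = [train_pool[i] for i in sims.topk(k).indices.tolist()]
    print(candidates)

Restricting demonstrations to semantically similar examples keeps the limited context window filled with the most informative neighbors.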

Maintenance & Community

  • The project is associated with Princeton NLP.
  • For questions or bug reports, contact tianyug@cs.princeton.edu or open an issue on GitHub.

Licensing & Compatibility

  • The README does not state a license. The accompanying citation is for an ACL 2021 paper, which suggests the code is intended for research use; compatibility with commercial or closed-source use is not specified.

Limitations & Caveats

  • The README notes that different package versions (e.g., pytorch, transformers) may lead to results deviating from the paper, though trends should persist.
  • Automatic template generation is described as an "extremely long process."
Health Check

  • Last commit: 2 years ago
  • Responsiveness: 1 week
  • Pull requests (last 30 days): 0
  • Issues (last 30 days): 1

Star History

  • 1 star in the last 90 days

Explore Similar Projects

Starred by Patrick von Platen (Core Contributor to Hugging Face Transformers and Diffusers), Travis Fischer (Founder of Agentic), and 5 more.

setfit by huggingface

Few-shot learning framework for Sentence Transformers

Top 0.3% on sourcepulse
3k stars
created 3 years ago
updated 3 months ago

Starred by Aravind Srinivas (Cofounder of Perplexity), Jiayi Pan (Author of SWE-Gym; AI Researcher at UC Berkeley), and 8 more.

gpt-3 by openai

Research paper on large language model few-shot learning

Top 0.0% on sourcepulse
16k stars
created 5 years ago
updated 4 years ago