few-shot-learning by tonyzhaozh

Codebase for few-shot in-context learning with language models, in the style of GPT-3

created 4 years ago
352 stars

Top 80.3% on sourcepulse

Project Summary

This repository provides a codebase for performing few-shot "in-context" learning with large language models, mirroring the approach of the GPT-3 paper. It enables users to leverage models like GPT-3 (via OpenAI API), GPT-2, and others from HuggingFace Transformers by embedding a few training examples within a natural language prompt to guide predictions.
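The in-context approach boils down to concatenating labeled examples into a single text prompt. A minimal sketch, assuming an illustrative sentiment task (the template and label names here are hypothetical, not the repo's exact prompt format):

```python
# Build a few-shot "in-context" prompt by concatenating labeled examples
# ("shots") followed by the unlabeled test input. Template is illustrative.
def build_prompt(train_examples, test_input):
    """Return a prompt string ending where the model should predict a label."""
    parts = []
    for text, label in train_examples:
        parts.append(f"Input: {text}\nLabel: {label}\n")
    parts.append(f"Input: {test_input}\nLabel:")
    return "\n".join(parts)

shots = [("the movie was great", "positive"),
         ("a total waste of time", "negative")]
prompt = build_prompt(shots, "surprisingly fun")
```

The model is then asked to continue the prompt, and its completion (or its probability over the label tokens) is taken as the prediction.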

How It Works

The core mechanism involves constructing natural language prompts that include a small number of labeled examples (shots) for a given task. The language model then generates predictions based on these in-context examples without any gradient updates. The codebase abstracts model interaction through a common API, allowing for easy switching between different language models. It also supports contextual calibration, a technique to improve few-shot performance by adjusting model outputs.
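The calibration idea can be sketched as follows: query the model with a content-free input (e.g., "N/A") to estimate its label bias, then rescale the label probabilities so that the content-free input would score uniformly. The numbers below are hypothetical, chosen only to illustrate the rescaling:

```python
# Sketch of contextual calibration: divide out the bias estimated from a
# content-free input, then renormalize the label probabilities.
import numpy as np

def calibrate(label_probs, content_free_probs):
    """Rescale label probabilities by the inverse of the content-free bias."""
    w = 1.0 / np.asarray(content_free_probs)  # diag(p_cf)^-1
    q = w * np.asarray(label_probs)
    return q / q.sum()                        # renormalize to a distribution

# Hypothetical numbers: the raw model favors label 0 even on "N/A".
p_cf = [0.7, 0.3]    # P(label | content-free prompt)
p_raw = [0.6, 0.4]   # P(label | actual test prompt)
p_cal = calibrate(p_raw, p_cf)
```

Here the raw prediction would pick label 0, but after dividing out the bias the calibrated distribution favors label 1.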

Quick Start & Requirements

  • Installation: Create a conda environment (conda create -n fewshot python=3.6, source activate fewshot) and install dependencies (pip install -r requirements.txt).
  • Prerequisites: PyTorch, HuggingFace Transformers. A single GPU is recommended for local model execution (e.g., GPT-2); running without a GPU is possible but slow. For GPT-3, an openai_key.txt file containing your API key is needed.
  • Replication: Example commands are provided for classification, extraction, and LAMA tasks using gpt2-xl. See Replicating Our Results for details.

Highlighted Details

  • Supports few-shot learning via in-context examples for various NLP tasks.
  • Integrates with OpenAI's GPT-3 and HuggingFace Transformers models.
  • Implements contextual calibration to enhance performance.
  • Codebase includes data loaders, model interaction utilities, and run scripts.
  • Outputs are pickled for fast post-hoc analysis and evaluation.
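Because results are saved as pickles, analysis can be done after the fact without re-running the model. A minimal loading sketch; the path and dictionary keys are hypothetical, not the repo's actual output schema:

```python
# Load a pickled results file for post-hoc analysis.
import pickle

def load_results(path):
    """Deserialize a saved results object from disk."""
    with open(path, "rb") as f:
        return pickle.load(f)
```

A loaded object can then be inspected or aggregated across runs in an interactive session.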

Maintenance & Community

Developed by Tony Z. Zhao and Eric Wallace. Contributions via pull requests and issues are welcome. Contact emails are provided for inquiries.

Licensing & Compatibility

The repository does not explicitly state a license in the README. Users should verify licensing for commercial use or integration with closed-source projects.

Limitations & Caveats

The README notes that after code refactoring, the training sets may differ from those used in the original paper's results table, potentially leading to slight variations in replicated results.

Health Check

  • Last commit: 1 year ago
  • Responsiveness: 1 week
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 4 stars in the last 90 days
