few-shot-learning by tonyzhaozh

Codebase for few-shot in-context learning with language models, in the style of GPT-3

created 4 years ago
352 stars

Top 80.3% on sourcepulse

Project Summary

This repository provides a codebase for performing few-shot "in-context" learning with large language models, mirroring the approach of the GPT-3 paper. It enables users to leverage models like GPT-3 (via OpenAI API), GPT-2, and others from HuggingFace Transformers by embedding a few training examples within a natural language prompt to guide predictions.
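The in-context approach boils down to concatenating labeled examples into a single text prompt. A minimal sketch, assuming an illustrative sentiment task (the template and label names here are hypothetical, not the repo's exact prompt format):

```python
# Build a few-shot "in-context" prompt by concatenating labeled examples
# ("shots") followed by the unlabeled test input. Template is illustrative.
def build_prompt(train_examples, test_input):
    """Return a prompt string ending where the model should predict a label."""
    parts = []
    for text, label in train_examples:
        parts.append(f"Input: {text}\nLabel: {label}\n")
    parts.append(f"Input: {test_input}\nLabel:")
    return "\n".join(parts)

shots = [("the movie was great", "positive"),
         ("a total waste of time", "negative")]
prompt = build_prompt(shots, "surprisingly fun")
```

The model is then asked to continue the prompt, and its completion (or its probability over the label tokens) is taken as the prediction.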

How It Works

The core mechanism involves constructing natural language prompts that include a small number of labeled examples (shots) for a given task. The language model then generates predictions based on these in-context examples without any gradient updates. The codebase abstracts model interaction through a common API, allowing for easy switching between different language models. It also supports contextual calibration, a technique to improve few-shot performance by adjusting model outputs.
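The calibration idea can be sketched as follows: query the model with a content-free input (e.g., "N/A") to estimate its label bias, then rescale the label probabilities so that the content-free input would score uniformly. The numbers below are hypothetical, chosen only to illustrate the rescaling:

```python
# Sketch of contextual calibration: divide out the bias estimated from a
# content-free input, then renormalize the label probabilities.
import numpy as np

def calibrate(label_probs, content_free_probs):
    """Rescale label probabilities by the inverse of the content-free bias."""
    w = 1.0 / np.asarray(content_free_probs)  # diag(p_cf)^-1
    q = w * np.asarray(label_probs)
    return q / q.sum()                        # renormalize to a distribution

# Hypothetical numbers: the raw model favors label 0 even on "N/A".
p_cf = [0.7, 0.3]    # P(label | content-free prompt)
p_raw = [0.6, 0.4]   # P(label | actual test prompt)
p_cal = calibrate(p_raw, p_cf)
```

Here the raw prediction would pick label 0, but after dividing out the bias the calibrated distribution favors label 1.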

Quick Start & Requirements

  • Installation: Create a conda environment (conda create -n fewshot python=3.6, source activate fewshot) and install dependencies (pip install -r requirements.txt).
  • Prerequisites: PyTorch, HuggingFace Transformers. A single GPU is recommended for local model execution (e.g., GPT-2); running without a GPU is possible but slow. For GPT-3, an openai_key.txt file containing your API key is needed.
  • Replication: Example commands are provided for classification, extraction, and LAMA tasks using gpt2-xl. See Replicating Our Results for details.

Highlighted Details

  • Supports few-shot learning via in-context examples for various NLP tasks.
  • Integrates with OpenAI's GPT-3 and HuggingFace Transformers models.
  • Implements contextual calibration to enhance performance.
  • Codebase includes data loaders, model interaction utilities, and run scripts.
  • Outputs are pickled for fast post-hoc analysis and evaluation.
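Because results are saved as pickles, analysis can be done after the fact without re-running the model. A minimal loading sketch; the path and dictionary keys are hypothetical, not the repo's actual output schema:

```python
# Load a pickled results file for post-hoc analysis.
import pickle

def load_results(path):
    """Deserialize a saved results object from disk."""
    with open(path, "rb") as f:
        return pickle.load(f)
```

A loaded object can then be inspected or aggregated across runs in an interactive session.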

Maintenance & Community

Developed by Tony Z. Zhao and Eric Wallace. Contributions via pull requests and issues are welcome. Contact emails are provided for inquiries.

Licensing & Compatibility

The repository does not explicitly state a license in the README. Users should verify licensing for commercial use or integration with closed-source projects.

Limitations & Caveats

The README notes that after code refactoring, the training sets may differ from those used in the original paper's results table, potentially leading to slight variations in replicated results.

Health Check

  • Last commit: 1 year ago
  • Responsiveness: 1 week
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 4 stars in the last 90 days
