Research paper on few-shot learning with large language models
This repository accompanies GPT-3, a 175-billion-parameter autoregressive language model designed for few-shot learning. The model can perform a wide range of NLP tasks, including translation, question answering, and reasoning, from text prompts alone, with little or no task-specific fine-tuning.
How It Works
GPT-3 leverages its massive scale and extensive pre-training on a diverse web corpus to achieve strong performance in the few-shot setting. Instead of gradient updates or fine-tuning, tasks are specified purely through text interaction: the prompt contains a natural-language instruction and, optionally, a handful of demonstrations, and the model completes the pattern (see the sketch below). This approach aims to narrow the gap between current NLP systems and the flexibility of human learning.
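As an illustration, here is a minimal sketch of how such a prompt can be assembled as plain text. The translation pairs echo examples from the paper's figures, while the build_prompt helper and its exact formatting are ours, not code from the repository.

```python
# Minimal sketch: few-shot "in-context learning" is just prompt construction.
# No weights are updated; the task is conveyed entirely by the prompt text.

def build_prompt(instruction, examples, query):
    """Assemble a few-shot prompt: instruction, solved examples, then the query."""
    lines = [instruction, ""]
    for english, french in examples:
        lines += [f"English: {english}", f"French: {french}", ""]
    lines += [f"English: {query}", "French:"]  # the model completes from here
    return "\n".join(lines)

prompt = build_prompt(
    "Translate English to French.",
    [("sea otter", "loutre de mer"), ("cheese", "fromage")],
    "peppermint",
)
print(prompt)
```

With zero demonstrations the same format becomes a zero-shot prompt; adding one or a few solved examples yields the one- and few-shot settings the paper compares.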
Highlighted Details
Maintenance & Community
This project is associated with OpenAI; its primary contribution is the research paper "Language Models are Few-Shot Learners." The repository was last updated roughly four years ago and is now inactive.
Licensing & Compatibility
The README does not specify a license, so compatibility with commercial use or closed-source linking is unclear.
Limitations & Caveats
The README notes that GPT-3 still struggles on some datasets and faces methodological issues related to its web-scale training data, such as benchmark contamination (evaluation text leaking into the training corpus); that data can also include offensive content.
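Benchmark contamination of this kind is commonly probed by checking n-gram overlap between training and evaluation text; the paper's own analysis used 13-gram overlap. Below is a rough, illustrative sketch of that idea, where the tiny n and placeholder strings exist only so the demo produces output, and all names are ours.

```python
# Sketch of an n-gram overlap check for benchmark contamination.
# The paper's analysis used 13-gram overlap; n=3 here only so the tiny
# demo strings actually produce n-grams. All text is placeholder data.

def ngrams(text, n):
    """Return the set of lowercase word n-grams in a text."""
    words = text.lower().split()
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}

train_text = "the quick brown fox jumps over the lazy dog"
test_example = "a quick brown fox appears"

overlap = ngrams(test_example, n=3) & ngrams(train_text, n=3)
contaminated = bool(overlap)  # flag the example if any n-gram is shared
print(contaminated, overlap)  # True {('quick', 'brown', 'fox')}
```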