gpt-2 by openai

Code for research paper "Language Models are Unsupervised Multitask Learners"

created 6 years ago
23,986 stars

Top 1.7% on sourcepulse

Project Summary

This repository provides the code and models for OpenAI's GPT-2 language model, as described in their "Language Models are Unsupervised Multitask Learners" paper. It serves as a starting point for researchers and engineers to experiment with GPT-2's capabilities, particularly for exploring its unsupervised multitask learning potential.

How It Works

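GPT-2 is a transformer-based language model that generates text autoregressively: given the tokens so far, it predicts a probability distribution over the next token, picks one, appends it, and repeats. Because it is trained on a large and varied text corpus with no task labels, it can attempt a wide range of tasks without task-specific training, which the paper presents as evidence for the power of large-scale unsupervised learning.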
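The next-token loop described above can be illustrated with a deliberately tiny stand-in. This is a minimal sketch, not the repository's code: `VOCAB`, `TOY_LOGITS`, `next_token_probs`, and `generate` are hypothetical names invented for illustration, and the toy "model" looks only at the previous token, whereas GPT-2 conditions on the whole context with a transformer.

```python
import math
import random

# Hypothetical five-word vocabulary and hand-written scores (logits).
# A real model computes these scores with a transformer over the full context.
VOCAB = ["the", "cat", "sat", "down", "."]
TOY_LOGITS = {
    "the":  [0.0, 2.0, 0.0, 0.0, 0.0],
    "cat":  [0.0, 0.0, 2.0, 0.0, 0.0],
    "sat":  [0.0, 0.0, 0.0, 2.0, 0.0],
    "down": [0.0, 0.0, 0.0, 0.0, 2.0],
    ".":    [2.0, 0.0, 0.0, 0.0, 0.0],
}


def softmax(logits):
    """Turn unnormalized scores into probabilities that sum to 1."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def next_token_probs(context):
    """Toy stand-in for the model: depends only on the last token."""
    return softmax(TOY_LOGITS[context[-1]])


def generate(prompt, max_new_tokens, rng, greedy=False):
    """Autoregressive loop: predict, pick a token, append, repeat."""
    tokens = list(prompt)
    for _ in range(max_new_tokens):
        probs = next_token_probs(tokens)
        if greedy:
            # Always take the most likely token.
            idx = max(range(len(probs)), key=probs.__getitem__)
        else:
            # Sample in proportion to the predicted probabilities.
            idx = rng.choices(range(len(VOCAB)), weights=probs, k=1)[0]
        tokens.append(VOCAB[idx])
        if tokens[-1] == ".":
            break  # treat "." as an end-of-text marker in this toy
    return tokens


if __name__ == "__main__":
    print(" ".join(generate(["the"], 5, random.Random(0), greedy=True)))
```

Sampling (the default here) gives varied output from the same prompt, while greedy selection is deterministic; either way the loop structure is the same.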

Quick Start & Requirements

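  • Install: clone the repository and run pip3 install -r requirements.txt; the repository's DEVELOPERS.md covers the full setup, including downloading model weights.
  • Prerequisites: Python 3 and TensorFlow 1.x. The code is TensorFlow-based and does not include a PyTorch implementation.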
  • Models: Pre-trained models in four sizes are available for download: small (124M parameters), medium (355M), large (774M), and XL (1558M).
  • Documentation: Model Card

Highlighted Details

  • Code and models from the seminal "Language Models are Unsupervised Multitask Learners" paper.
  • Staged release approach detailed in accompanying blog posts.
  • Correction of previously reported parameter counts for model sizes.
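The parameter-count correction in the last item can be sanity-checked with simple arithmetic. The sketch below assumes the commonly reported configurations for the two smallest models (vocabulary 50257, context length 1024, embedding width 768 with 12 layers, and width 1024 with 24 layers); those numbers are not stated in this summary, and `gpt2_param_count` is a hypothetical helper, not a function from the repository. It also assumes the output layer reuses the token embedding matrix, so that matrix is counted once.

```python
def gpt2_param_count(n_embd, n_layer, n_vocab=50257, n_ctx=1024):
    """Count weights and biases of a GPT-2-style decoder (assumed layout)."""
    token_embedding = n_vocab * n_embd
    position_embedding = n_ctx * n_embd
    layer_norm = 2 * n_embd  # gain and bias
    # Attention: fused query/key/value projection plus output projection.
    attention = (n_embd * 3 * n_embd + 3 * n_embd) + (n_embd * n_embd + n_embd)
    # Feed-forward: expand to 4x width, then project back.
    mlp = (n_embd * 4 * n_embd + 4 * n_embd) + (4 * n_embd * n_embd + n_embd)
    per_block = 2 * layer_norm + attention + mlp
    final_layer_norm = layer_norm
    return (token_embedding + position_embedding
            + n_layer * per_block + final_layer_norm)


if __name__ == "__main__":
    print(gpt2_param_count(768, 12))   # 124439808, about 124M
    print(gpt2_param_count(1024, 24))  # 354823168, about 355M
```

Under these assumptions the two smallest models come out at roughly 124M and 355M parameters.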

Maintenance & Community

  • Status: Archived; no further updates are expected.
  • Community: Open to collaboration on research and applications, especially concerning malicious use, defenses, and bias mitigation.

Licensing & Compatibility

  • License: Modified MIT.
  • Compatibility: Permissive for commercial use and integration into closed-source projects.

Limitations & Caveats

The models are provided as-is, with no updates planned. GPT-2's robustness and worst-case behaviors are not fully understood, and it may reproduce biases and factual inaccuracies present in its training data. Generated text should be clearly labeled as synthetic, because model output can be subtly incoherent or inaccurate.

Health Check

  • Last commit: 11 months ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 2
  • Issues (30d): 1
  • Star History: 688 stars in the last 90 days

Explore Similar Projects

Starred by Elie Bursztein (Cybersecurity Lead at Google DeepMind), Lysandre Debut (Chief Open-Source Officer at Hugging Face), and 5 more.

gpt-neo by EleutherAI

0.0%
8k
GPT-2/3-style model implementation using mesh-tensorflow
created 5 years ago
updated 3 years ago