gpt-2-tensorflow2.0 by akanyaani

GPT-2 implementation for sequence generation

Created 6 years ago
263 stars

Top 96.9% on SourcePulse

Project Summary

This repository provides an implementation of OpenAI's GPT-2 model for pre-training and sequence generation using TensorFlow 2.0. It is designed for researchers and developers interested in replicating or extending GPT-2's capabilities within the TensorFlow ecosystem.

How It Works

The project implements the GPT-2 architecture, including the transformer decoder blocks, attention mechanisms, and positional encodings. It supports pre-training on custom datasets and generating text sequences based on provided context. The implementation leverages TensorFlow 2.0's eager execution and Keras API for a more Pythonic development experience.
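
The repository builds these pieces itself on top of TensorFlow 2.x; the sketch below is only a minimal illustration of the same decoder-block pattern, not the project's code. It leans on recent Keras built-ins (tf.keras.layers.MultiHeadAttention with use_causal_mask needs TF 2.10+), and the layer sizes are arbitrary placeholders.

```python
import tensorflow as tf


class DecoderBlock(tf.keras.layers.Layer):
    """Pre-layer-norm GPT-2-style block: masked self-attention + feed-forward."""

    def __init__(self, d_model, num_heads):
        super().__init__()
        self.ln1 = tf.keras.layers.LayerNormalization(epsilon=1e-5)
        self.attn = tf.keras.layers.MultiHeadAttention(
            num_heads=num_heads, key_dim=d_model // num_heads)
        self.ln2 = tf.keras.layers.LayerNormalization(epsilon=1e-5)
        self.ffn = tf.keras.Sequential([
            tf.keras.layers.Dense(4 * d_model, activation="gelu"),
            tf.keras.layers.Dense(d_model),
        ])

    def call(self, x, training=False):
        h = self.ln1(x)
        # The causal mask keeps each position from attending to future tokens.
        x = x + self.attn(h, h, use_causal_mask=True, training=training)
        return x + self.ffn(self.ln2(x))


class TinyGPT2(tf.keras.Model):
    """Token + positional embeddings, a stack of decoder blocks, and an LM head."""

    def __init__(self, vocab_size=8000, max_len=128, d_model=256,
                 num_layers=2, num_heads=8):
        super().__init__()
        self.tok_emb = tf.keras.layers.Embedding(vocab_size, d_model)
        self.pos_emb = tf.keras.layers.Embedding(max_len, d_model)
        self.blocks = [DecoderBlock(d_model, num_heads) for _ in range(num_layers)]
        self.ln_f = tf.keras.layers.LayerNormalization(epsilon=1e-5)
        self.lm_head = tf.keras.layers.Dense(vocab_size)  # next-token logits

    def call(self, tokens, training=False):
        positions = tf.range(tf.shape(tokens)[1])
        x = self.tok_emb(tokens) + self.pos_emb(positions)
        for block in self.blocks:
            x = block(x, training=training)
        return self.lm_head(self.ln_f(x))


model = TinyGPT2()
logits = model(tf.random.uniform((2, 16), maxval=8000, dtype=tf.int32))
print(logits.shape)  # (2, 16, 8000)
```

GPT-2 moves layer normalization to the input of each sub-block (pre-LN), which is why the residual additions above wrap the normalized activations.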

Quick Start & Requirements

  • Install: pip install -r requirements.txt
  • Prerequisites: Python >= 3.6, TensorFlow-GPU == 2.3.0, NumPy, setuptools, ftfy, tqdm, Click, sentencepiece.
  • Setup: Clone the repository and install the dependencies. Pre-training requires a data preprocessing (tokenization) step, sketched after this list.
  • Links: OpenAI GPT-2 Paper, OpenWebText
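
The dependency list includes sentencepiece, which suggests the preprocessing step tokenizes the raw corpus into subword IDs before training. The snippet below is a generic sketch of that kind of step, not the repository's own script; the file names and vocabulary size are placeholders. The repository exposes its own command-line options for preprocessing and training (see Highlighted Details below).

```python
import sentencepiece as spm

# Train a BPE vocabulary on a raw text corpus (one sentence per line).
# "corpus.txt", "bpe", and the vocab size are placeholders, not the repo's defaults.
spm.SentencePieceTrainer.Train(
    "--input=corpus.txt --model_prefix=bpe --vocab_size=8000 --model_type=bpe"
)

# Encode text into the integer IDs a training pipeline batches and feeds to the model.
sp = spm.SentencePieceProcessor()
sp.Load("bpe.model")
ids = sp.EncodeAsIds("GPT-2 is trained to predict the next token.")
print(ids)
print(sp.DecodeIds(ids))
```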

Highlighted Details

  • Supports distributed training across multiple GPUs (see the training sketch after this list).
  • Includes a sequence_generator.ipynb notebook for text generation.
  • Offers command-line arguments for configuring pre-training and training parameters.
  • Provides TensorBoard logging for monitoring training progress.
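
As a rough illustration of the multi-GPU and TensorBoard points above (not the project's training script), TensorFlow 2 typically wires them together with tf.distribute.MirroredStrategy and the Keras TensorBoard callback; the model and data below are stand-ins.

```python
import tensorflow as tf

# Replicate the model across all visible GPUs; gradients are averaged each step.
strategy = tf.distribute.MirroredStrategy()
print("Replicas:", strategy.num_replicas_in_sync)

with strategy.scope():
    # Stand-in model; a real run would build the GPT-2 network instead.
    model = tf.keras.Sequential([
        tf.keras.layers.Embedding(8000, 128),
        tf.keras.layers.GlobalAveragePooling1D(),
        tf.keras.layers.Dense(8000),
    ])
    model.compile(
        optimizer=tf.keras.optimizers.Adam(1e-4),
        loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    )

# Dummy data; a real run would stream the preprocessed, tokenized corpus.
x = tf.random.uniform((256, 64), maxval=8000, dtype=tf.int32)
y = tf.random.uniform((256,), maxval=8000, dtype=tf.int32)

# The TensorBoard callback writes loss curves that `tensorboard --logdir logs` can display.
model.fit(x, y, batch_size=32, epochs=1,
          callbacks=[tf.keras.callbacks.TensorBoard(log_dir="logs")])
```

MirroredStrategy falls back to a single replica when no GPU is visible, so the sketch also runs on CPU.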

Maintenance & Community

  • Author: Abhay Kumar (akanyaani@gmail.com)
  • Contributions via issues and pull requests are welcome.

Licensing & Compatibility

  • License: MIT
  • Compatible with commercial use and closed-source linking.

Limitations & Caveats

The project lists "Parallel Preprocessing" and a "Fine-Tuning wrapper" as future tasks, indicating these features are not yet implemented. The TensorFlow version is pinned to 2.3.0, which may limit compatibility with newer TensorFlow releases.

Health Check

  • Last Commit: 2 years ago
  • Responsiveness: 1+ week
  • Pull Requests (30d): 0
  • Issues (30d): 0

Star History

  • 1 star in the last 30 days

Explore Similar Projects

Starred by George Hotz (Author of tinygrad; Founder of the tiny corp, comma.ai), Casper Hansen (Author of AutoAWQ), and 1 more.

GPT2 by ConnorJL

1k stars
GPT2 training implementation, supporting TPUs and GPUs
Created 6 years ago
Updated 2 years ago
Starred by Lukas Biewald (Cofounder of Weights & Biases), Patrick von Platen (Author of Hugging Face Diffusers; Research Engineer at Mistral), and 2 more.

DialoGPT by microsoft

2k stars
Response generation model via large-scale pretraining
Created 6 years ago
Updated 2 years ago