aitextgen by minimaxir

Python tool for text-based AI training and generation

Created 5 years ago · 1,844 stars · Top 24.0% on sourcepulse

Project Summary

This Python package provides a robust tool for text-based AI training and generation using OpenAI's GPT-2 and EleutherAI's GPT Neo/GPT-3 architectures. It is aimed at researchers and developers who want to fine-tune existing models or train custom language models, with faster generation and lower memory usage than its predecessor, gpt-2-simple.

How It Works

Built on PyTorch, Hugging Face Transformers, and PyTorch Lightning, aitextgen trains on CPUs and on single or multiple GPUs, with TPU support planned. It can fine-tune OpenAI's GPT-2 models (124M to 774M parameters) and EleutherAI's GPT Neo models (125M and 350M parameters), or train a model from scratch with a custom tokenizer and configuration. Its dataset handling includes caching, compression, and merging for efficient data management.
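
For example, loading a pretrained model and generating from it takes a few lines (a minimal sketch based on the library's documented API; the model name, prompt, and generation parameters below are illustrative):

    from aitextgen import aitextgen

    # Load a pretrained model by its Hugging Face model name;
    # omitting `model` falls back to the default 124M GPT-2.
    ai = aitextgen(model="EleutherAI/gpt-neo-125M")

    # Generate three samples from a prompt.
    ai.generate(n=3, prompt="The meaning of life is", max_length=64)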

Quick Start & Requirements

  • Install via pip: pip3 install aitextgen
  • Basic generation: from aitextgen import aitextgen; ai = aitextgen(); ai.generate()
  • Command-line generation: aitextgen generate
  • Training requires a dataset file (e.g., input.txt) and optionally a custom tokenizer; see the sketch after this list.
  • Documentation and Colab notebooks are available for detailed guidance.
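
An end-to-end run that trains a small model from scratch looks roughly like this (a sketch adapted from the project's documented demo; the file name, block size, and step counts are illustrative):

    from aitextgen import aitextgen
    from aitextgen.TokenDataset import TokenDataset
    from aitextgen.tokenizers import train_tokenizer
    from aitextgen.utils import GPT2ConfigCPU

    file_name = "input.txt"

    # Train a custom tokenizer on the input text; this writes
    # aitextgen.tokenizer.json to the working directory.
    train_tokenizer(file_name)
    tokenizer_file = "aitextgen.tokenizer.json"

    # A small GPT-2 architecture suitable for CPU training.
    config = GPT2ConfigCPU()
    ai = aitextgen(tokenizer_file=tokenizer_file, config=config)

    # Encode and cache the dataset.
    data = TokenDataset(file_name, tokenizer_file=tokenizer_file, block_size=64)

    # Train from scratch; batch size and step count are illustrative.
    ai.train(data, batch_size=8, num_steps=5000)

    ai.generate(n=5, prompt="ROMEO:")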

Highlighted Details

  • Faster generation and better memory efficiency than gpt-2-simple.
  • Retains compatibility with Hugging Face Transformers, so trained models can be used across the broader NLP ecosystem.
  • Supports training on CPUs and multiple GPUs, with progress bars and logging during training.
  • Advanced dataset management for efficient encoding, caching, compression, and merging; see the sketch after this list.
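
For instance, encoded datasets are cached to compressed files, and several of them can be combined into one training set (a sketch assuming the TokenDataset and merge_datasets helpers described in the project's dataset documentation; file names and the equalize parameter are illustrative):

    from aitextgen.TokenDataset import TokenDataset, merge_datasets

    # Encoding a text file produces a compressed cache that can be
    # reloaded on later runs instead of re-encoding.
    data_a = TokenDataset("corpus_a.txt")
    data_b = TokenDataset("corpus_b.txt")

    # Merge the datasets; equalize=True (assumption) samples them so
    # each contributes equally regardless of size.
    merged = merge_datasets([data_a, data_b], equalize=True)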

Maintenance & Community

  • Maintained by Max Woolf (@minimaxir).
  • Project funding is supported via Patreon and GitHub Sponsors.
  • Upcoming features include native support for schema-based generation and a potential SaaS offering.

Licensing & Compatibility

  • MIT License.
  • Compatible with commercial use and closed-source linking.

Limitations & Caveats

  • TPU training is not yet working: the loss decreases during TPU runs, but miscellaneous issues currently block full support.
  • The current release (v0.5.X) is considered beta, with documentation and use cases still being fleshed out.

Health Check

  • Last commit: 2 years ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star history: 7 stars in the last 90 days
