aitextgen by minimaxir

Python tool for text-based AI training and generation

Created 5 years ago · 1,844 stars · Top 24.0% on sourcepulse

Project Summary

This Python package provides a robust tool for text-based AI training and generation using OpenAI's GPT-2 and EleutherAI's GPT Neo/GPT-3 architectures. It is aimed at researchers and developers who want to fine-tune existing models or train custom language models, with faster generation and lower memory usage than its predecessor, gpt-2-simple.

How It Works

Built on PyTorch, Hugging Face Transformers, and PyTorch Lightning, aitextgen trains on CPUs and on single or multiple GPUs, with TPU support planned. It can fine-tune OpenAI's GPT-2 models (124M to 774M parameters) and EleutherAI's GPT Neo models (125M and 350M parameters), or train a model from scratch with a custom tokenizer and configuration. Its dataset handling includes caching, compression, and merging for efficient data management.
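
For example, loading a pretrained model and generating from it takes a few lines (a minimal sketch based on the library's documented API; the model name, prompt, and generation parameters below are illustrative):

    from aitextgen import aitextgen

    # Load a pretrained model by its Hugging Face model name;
    # omitting `model` falls back to the default 124M GPT-2.
    ai = aitextgen(model="EleutherAI/gpt-neo-125M")

    # Generate three samples from a prompt.
    ai.generate(n=3, prompt="The meaning of life is", max_length=64)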

Quick Start & Requirements

  • Install via pip: pip3 install aitextgen
  • Basic generation: from aitextgen import aitextgen; ai = aitextgen(); ai.generate()
  • Command-line generation: aitextgen generate
  • Training requires a dataset file (e.g., input.txt) and optionally a custom tokenizer; see the sketch after this list.
  • Documentation and Colab notebooks are available for detailed guidance.
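
An end-to-end run that trains a small model from scratch looks roughly like this (a sketch adapted from the project's documented demo; the file name, block size, and step counts are illustrative):

    from aitextgen import aitextgen
    from aitextgen.TokenDataset import TokenDataset
    from aitextgen.tokenizers import train_tokenizer
    from aitextgen.utils import GPT2ConfigCPU

    file_name = "input.txt"

    # Train a custom tokenizer on the input text; this writes
    # aitextgen.tokenizer.json to the working directory.
    train_tokenizer(file_name)
    tokenizer_file = "aitextgen.tokenizer.json"

    # A small GPT-2 architecture suitable for CPU training.
    config = GPT2ConfigCPU()
    ai = aitextgen(tokenizer_file=tokenizer_file, config=config)

    # Encode and cache the dataset.
    data = TokenDataset(file_name, tokenizer_file=tokenizer_file, block_size=64)

    # Train from scratch; batch size and step count are illustrative.
    ai.train(data, batch_size=8, num_steps=5000)

    ai.generate(n=5, prompt="ROMEO:")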

Highlighted Details

  • Faster generation and better memory efficiency than gpt-2-simple.
  • Retains compatibility with Hugging Face Transformers, so trained models can be used across the broader NLP ecosystem.
  • Supports training on CPUs and multiple GPUs, with progress bars and logging during training.
  • Advanced dataset management for efficient encoding, caching, compression, and merging; see the sketch after this list.
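
For instance, encoded datasets are cached to compressed files, and several of them can be combined into one training set (a sketch assuming the TokenDataset and merge_datasets helpers described in the project's dataset documentation; file names and the equalize parameter are illustrative):

    from aitextgen.TokenDataset import TokenDataset, merge_datasets

    # Encoding a text file produces a compressed cache that can be
    # reloaded on later runs instead of re-encoding.
    data_a = TokenDataset("corpus_a.txt")
    data_b = TokenDataset("corpus_b.txt")

    # Merge the datasets; equalize=True (assumption) samples them so
    # each contributes equally regardless of size.
    merged = merge_datasets([data_a, data_b], equalize=True)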

Maintenance & Community

  • Maintained by Max Woolf (@minimaxir).
  • Project funding is supported via Patreon and GitHub Sponsors.
  • Upcoming features include native support for schema-based generation and a potential SaaS offering.

Licensing & Compatibility

  • MIT License.
  • Compatible with commercial use and closed-source linking.

Limitations & Caveats

  • TPU training is not yet working: the loss decreases during TPU runs, but miscellaneous issues currently block full support.
  • The current release (v0.5.X) is considered beta, with documentation and use cases still being fleshed out.

Health Check

  • Last commit: 2 years ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star history: 7 stars in the last 90 days
