A recipe for pre-training LLMs
This repository provides the architecture, training scripts, and utilities for the 1.5-Pints and 0.12-Pint language models, designed to be comparable to models like OpenELM and Phi. It targets researchers and developers interested in replicating, experimenting with, and advancing open-source LLM pre-training, offering a recipe for achieving competitive performance in significantly reduced training times.
How It Works
The project emphasizes a "quality data" approach to pre-training, enabling rapid development of capable LLMs. It leverages PyTorch Lightning for distributed training and includes scripts for dataset preparation, model pre-training, fine-tuning, and evaluation. The architecture configurations are managed within lit_gpt/config.py, allowing users to select different model sizes and parameters.
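As a minimal sketch of what selecting a configuration could look like, assuming the lit-gpt-style Config.from_name API that this codebase builds on; the config name shown is illustrative, and the actual registered names are defined in lit_gpt/config.py:

```bash
# Sketch only: print a few fields of a registered model configuration.
# "1.5-Pints-2k" is a hypothetical name; check lit_gpt/config.py for the real ones.
python -c "from lit_gpt.config import Config; cfg = Config.from_name('1.5-Pints-2k'); print(cfg.n_layer, cfg.n_head, cfg.n_embd)"
```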
Quick Start & Requirements
Setup requires git-lfs; Python 3.10 is recommended. Install the dependencies (pip install -r requirements.txt, pip install flash-attn --no-build-isolation, pip install -r pretrain/requirements.txt), download and prepare the datasets, and then train using fabric run.
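Collected into one sequence, the setup steps look roughly like the following; the training entry point and its arguments are placeholders, since the exact script path and flags are repo-specific:

```bash
# Environment setup (Python 3.10 recommended; git-lfs must be installed beforehand).
pip install -r requirements.txt
pip install flash-attn --no-build-isolation
pip install -r pretrain/requirements.txt

# Download and prepare the datasets with the repo's preparation scripts, then launch
# distributed pre-training via Lightning Fabric. The script path below is a placeholder.
fabric run pretrain/main.py
```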
Highlighted Details
Maintenance & Community
The project is developed by Pints.AI. Community support and discussion are available via their Discord server.
Licensing & Compatibility
The repository does not explicitly state a license in the README. Compatibility for commercial use or closed-source linking is not specified.
Limitations & Caveats
ARM64/aarch64 processors are not supported due to an xformers incompatibility. Python 3.12 is noted to break functionality, and Python 3.11 has not been tested. The installation process requires careful management of CUDA versions within conda environments.