1.5-Pints by Pints-AI

LLM recipe for pre-training models

created 1 year ago
320 stars

Top 86.0% on sourcepulse

Project Summary

This repository provides the architecture, training scripts, and utilities for the 1.5-Pints and 0.12-Pint language models, designed to be comparable to models like OpenELM and Phi. It targets researchers and developers interested in replicating, experimenting with, and advancing open-source LLM pre-training, offering a recipe for achieving competitive performance in significantly reduced training times.

How It Works

The project emphasizes a "quality data" approach to pre-training, enabling rapid development of capable LLMs. It leverages PyTorch Lightning for distributed training and includes scripts for dataset preparation, model pre-training, fine-tuning, and evaluation. The architecture configurations are managed within lit_gpt/config.py, allowing users to select different model sizes and parameters.
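The registry pattern described above can be sketched in miniature. The class shape below mimics the lit_gpt config style, but the field values and the "0.12-Pint"/"1.5-Pints" hyperparameters shown are illustrative placeholders, not the real entries in lit_gpt/config.py:

```python
from dataclasses import dataclass

# Hypothetical sketch of a named-config registry in the style of
# lit_gpt/config.py. Values below are illustrative, not the real
# 1.5-Pints hyperparameters.
@dataclass
class Config:
    name: str
    block_size: int
    n_layer: int
    n_head: int
    n_embd: int

    @classmethod
    def from_name(cls, name: str) -> "Config":
        # Resolve a model size by its registered name.
        for cfg in configs:
            if cfg.name == name:
                return cfg
        raise ValueError(f"unknown model config: {name}")

# Registry of available model sizes (illustrative entries).
configs = [
    Config(name="0.12-Pint", block_size=2048, n_layer=12, n_head=12, n_embd=768),
    Config(name="1.5-Pints", block_size=2048, n_layer=24, n_head=32, n_embd=2048),
]

cfg = Config.from_name("1.5-Pints")
print(cfg.name, cfg.n_layer)
```

Selecting a different model size is then a one-line change to the name passed into the training script.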

Quick Start & Requirements

  • Platform: Ubuntu 22.04 LTS or Debian 12 (x86-64 only; ARM64 is not supported).
  • Prerequisites: Miniconda3 for environment management, CUDA Toolkit 12.1.1 (installed within the conda environment), git-lfs. Python 3.10 is recommended.
  • Setup: Clone the repo, create and activate a conda environment, install dependencies (pip install -r requirements.txt, pip install flash-attn --no-build-isolation, pip install -r pretrain/requirements.txt), download and prepare datasets, and then train using fabric run.
  • Links: Discord: https://discord.com/invite/RSHk22Z29j, Paper: https://arxiv.org/abs/2408.03506.
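Before starting the setup above, it can help to verify the documented prerequisites are on the PATH. The `check_tool` helper below is a hypothetical convenience, not a script shipped with the repository:

```shell
#!/usr/bin/env sh
# Hypothetical pre-flight check for the documented prerequisites
# (conda from Miniconda3, git-lfs). It only reports presence; it installs nothing.
check_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "present"
    else
        echo "absent"
    fi
}

for tool in conda git-lfs git; do
    printf '%s: %s\n' "$tool" "$(check_tool "$tool")"
done
```

If any line reports `absent`, install that tool before creating the conda environment.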

Highlighted Details

  • Pre-training achieved in 9 days, aiming for parity with established models.
  • Detailed scripts for pre-training, fine-tuning, and Direct Preference Optimization (DPO).
  • Utilities for converting trained models to Hugging Face format (PyTorch and Safetensors).
  • Includes a testing suite for code validation.
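Since the recipe includes Direct Preference Optimization, the quantity it optimizes is worth restating. Below is a minimal pure-Python sketch of the standard per-example DPO loss; the repository's actual implementation operates on batched model log-probabilities in PyTorch, and `beta=0.1` is an illustrative default:

```python
import math

def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Per-example DPO loss: -log sigmoid(beta * (policy margin - reference margin)).

    Each argument is the summed log-probability of the chosen or rejected
    response under the policy or the frozen reference model.
    """
    logits = beta * ((policy_chosen_logp - policy_rejected_logp)
                     - (ref_chosen_logp - ref_rejected_logp))
    # -log(sigmoid(x)) = log(1 + e^(-x)), computed stably for either sign of x.
    if logits >= 0:
        return math.log1p(math.exp(-logits))
    return -logits + math.log1p(math.exp(logits))

# With no preference signal the loss is log(2) ≈ 0.693.
print(dpo_loss(0.0, 0.0, 0.0, 0.0))
```

The loss falls below log(2) exactly when the policy prefers the chosen response more strongly than the reference model does.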

Maintenance & Community

The project is developed by Pints.AI. Community support and discussion are available via their Discord server.

Licensing & Compatibility

The repository does not explicitly state a license in the README. Compatibility for commercial use or closed-source linking is not specified.

Limitations & Caveats

ARM64/aarch64 processors are not supported due to xformers incompatibility. Python 3.12 is noted to break functionality, and Python 3.11 has not been tested. The installation process requires careful management of CUDA versions within conda environments.
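Given these version constraints, a fail-fast interpreter check at startup can save a broken install. The guard below is a hypothetical helper encoding the documented constraints, not code from the repository:

```python
import sys

# Hypothetical startup guard reflecting the documented constraints:
# Python 3.10 recommended, 3.11 untested, 3.12 known to break.
def check_python(version=None):
    major, minor = version if version is not None else sys.version_info[:2]
    if (major, minor) == (3, 10):
        return "supported"
    if (major, minor) == (3, 11):
        return "untested"
    raise RuntimeError(f"Python {major}.{minor} is unsupported; use 3.10")
```

Calling `check_python()` with no argument checks the running interpreter.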

Health Check

  • Last commit: 3 months ago
  • Responsiveness: 1 week
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 11 stars in the last 90 days

Explore Similar Projects

Starred by Ross Taylor (Cofounder of General Reasoning; Creator of Papers with Code), Daniel Han (Cofounder of Unsloth), and 4 more.

open-instruct by allenai

Training codebase for instruction-following language models

  • Top 0.2%, 3k stars
  • created 2 years ago, updated 14 hours ago
Starred by Chip Huyen (Author of AI Engineering, Designing Machine Learning Systems), Jeff Hammerbacher (Cofounder of Cloudera), and 10 more.

open-r1 by huggingface

SDK for reproducing DeepSeek-R1

  • Top 0.2%, 25k stars
  • created 6 months ago, updated 3 days ago