create-llm by theaniketgiri

Scaffolding LLM training projects

Created 5 months ago
305 stars

Top 87.9% on SourcePulse

Project Summary

Summary

This project addresses the complexity of building and training custom Large Language Models (LLMs) by providing a CLI tool that rapidly scaffolds production-ready PyTorch training projects. It targets engineers and researchers who want an accelerated path to LLM development, offering a streamlined, create-next-app-style experience for custom model creation.

How It Works

The tool scaffolds projects using PyTorch, offering four right-sized templates (NANO, TINY, SMALL, BASE) optimized for use cases ranging from learning exercises to research-grade models. It bundles a complete toolkit: data preprocessing pipelines, multiple tokenizer training options (BPE, WordPiece, Unigram), robust training loops with checkpoint management, evaluation metrics, text generation utilities, and deployment scripts. Smart defaults configure training parameters automatically, while an optional plugin system integrates with tools like WandB and HuggingFace.
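As a rough illustration of what a checkpointed training loop like the one the templates bundle can look like, here is a minimal PyTorch sketch that saves model and optimizer state at a fixed step interval. The function names, checkpoint directory layout, and the assumption that the model returns a scalar loss are illustrative, not the tool's actual generated code.

  import os
  import torch

  def save_checkpoint(model, optimizer, step, ckpt_dir="checkpoints"):
      # Persist model and optimizer state so an interrupted run can resume.
      os.makedirs(ckpt_dir, exist_ok=True)
      torch.save(
          {"model": model.state_dict(),
           "optimizer": optimizer.state_dict(),
           "step": step},
          os.path.join(ckpt_dir, f"step_{step}.pt"),
      )

  def train(model, optimizer, data_loader, device="cpu", save_every=1000):
      model.to(device).train()
      step = 0
      for inputs, targets in data_loader:
          optimizer.zero_grad()
          # Illustrative assumption: the model computes and returns its own loss.
          loss = model(inputs.to(device), targets.to(device))
          loss.backward()
          optimizer.step()
          step += 1
          if step % save_every == 0:
              save_checkpoint(model, optimizer, step)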

Quick Start & Requirements

  • Primary Install/Run: npx @theaniketgiri/create-llm <project-name> (recommended).
  • Prerequisites:
    • CLI: Node.js 18.0.0+, npm 8.0.0+.
    • Training: Python 3.8+, PyTorch 2.0.0+.
    • Hardware: Minimum 4GB RAM (NANO/TINY), 12GB VRAM recommended (SMALL), 40GB+ VRAM for BASE (a quick environment check is sketched after this list).
    • Docker: Docker 20.10+, NVIDIA Docker for GPU support.
  • Links: GitHub, npm.
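Before picking a template, it can help to confirm the training-side prerequisites and available VRAM. The check below is an illustrative snippet, not part of create-llm itself.

  import torch

  print("PyTorch:", torch.__version__)               # prerequisites call for 2.0.0+
  print("CUDA available:", torch.cuda.is_available())
  if torch.cuda.is_available():
      props = torch.cuda.get_device_properties(0)
      print(f"GPU: {props.name}, {props.total_memory / 1e9:.1f} GB VRAM")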

Highlighted Details

  • Template Variety: Four distinct templates (NANO, TINY, SMALL, BASE) cater to specific needs, ranging from 1M to 1B parameters, with corresponding hardware and time estimates (a rough parameter-count sketch follows this list).
  • Comprehensive Feature Set: Out-of-the-box support for data preparation, tokenizer training, checkpointing, TensorBoard, live dashboards, interactive chat, and deployment.
  • Intelligent Defaults & Interactivity: Features smart configuration, auto-detection of parameters, error diagnostics, and interactive prompts for a guided setup experience.
  • Docker First: Strong emphasis on Docker for consistent environments, eliminating local Node.js/Python dependencies and simplifying GPU utilization.
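To give a feel for how the 1M to 1B parameter range maps onto model shape, here is a common rule-of-thumb parameter count for GPT-style decoders. The example configurations are illustrative guesses, not the settings actually shipped in the NANO or BASE templates.

  def approx_gpt_params(n_layer, d_model, vocab_size):
      # Each transformer block contributes roughly 12 * d_model^2 parameters
      # (attention ~4*d^2, MLP ~8*d^2); add the token-embedding matrix.
      return 12 * n_layer * d_model ** 2 + vocab_size * d_model

  # Illustrative configs (not the actual create-llm template settings)
  print(f"NANO-ish: ~{approx_gpt_params(4, 96, 8_000) / 1e6:.1f}M parameters")
  print(f"BASE-ish: ~{approx_gpt_params(16, 2048, 50_000) / 1e6:.0f}M parameters")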

Maintenance & Community

The project is maintained by Aniket Giri. Contributions are welcomed, with specific areas for improvement outlined. Community interaction is primarily through GitHub issues.

Licensing & Compatibility

Released under the MIT License, permitting broad use, modification, and distribution, including for commercial purposes and integration into closed-source projects.

Limitations & Caveats

Training the larger models (SMALL, BASE) requires substantial GPU VRAM (12GB+ and 40GB+, respectively). The effectiveness of the smaller templates depends on sufficient data quantity and quality. While common issues are addressed by the tooling, complex LLM training runs may still hit challenges the scaffold's defaults do not anticipate.

Health Check

  • Last Commit: 2 months ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 7 stars in the last 30 days

Explore Similar Projects

Starred by Andrej Karpathy (Founder of Eureka Labs; formerly at Tesla, OpenAI; author of CS 231n), Stefan van der Walt (core contributor to the scientific Python ecosystem), and 12 more.

litgpt by Lightning-AI

Top 0.1% on SourcePulse
13k stars
LLM SDK for pretraining, finetuning, and deploying 20+ high-performance LLMs
Created 2 years ago
Updated 3 days ago