picotron by huggingface

Minimalist distributed training framework for educational use

Created 1 year ago

1,939 stars

Top 22.4% on SourcePulse

8 Experts Love This Project

karpathy

Andrej Karpathy

Founder of Eureka Labs; Formerly at Tesla, OpenAI; Author of CS 231n

transitive-bullshit

Founder of Agentic

srush

Research Scientist at Cursor; Professor at Cornell Tech

pgarbacki

Cofounder of Fireworks AI

and 4 more!

Project Summary

Picotron is a minimalist, hackable distributed training framework designed for educational purposes, enabling users to learn and experiment with 4D parallelism (Data, Tensor, Pipeline, Context) for large language models. It offers a simplified codebase, making complex distributed training concepts accessible to researchers and students.

How It Works

Picotron implements 4D parallelism by splitting model training along four dimensions: data, tensor, pipeline, and context. This lets large models and datasets be distributed across multiple GPUs, which makes it possible to train models that would be too large for a single device. The framework prioritizes code readability and simplicity, with core components such as train.py and the parallelism strategies each kept under 300 lines.
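To make the idea of a four-dimensional split concrete, here is a minimal sketch of how a flat GPU rank could be mapped onto a (data, pipeline, context, tensor) grid. It is an illustration only, not Picotron's code: the names `GridCoords`, `rank_to_coords`, and `tensor_parallel_group` are hypothetical, and the assumed ordering (tensor index varies fastest, data index slowest) is a common convention rather than something this page states.

```python
from typing import List, NamedTuple


class GridCoords(NamedTuple):
    """Position of one process in a hypothetical 4D parallel grid."""
    dp: int  # data-parallel index
    pp: int  # pipeline-parallel index (pipeline stage)
    cp: int  # context-parallel index
    tp: int  # tensor-parallel index


def rank_to_coords(rank: int, dp_size: int, pp_size: int,
                   cp_size: int, tp_size: int) -> GridCoords:
    """Map a flat rank to grid coordinates.

    Assumed layout (not taken from Picotron): tp varies fastest,
    then cp, then pp, and dp varies slowest.
    """
    world_size = dp_size * pp_size * cp_size * tp_size
    if not 0 <= rank < world_size:
        raise ValueError(f"rank {rank} is outside world size {world_size}")
    tp = rank % tp_size
    cp = (rank // tp_size) % cp_size
    pp = (rank // (tp_size * cp_size)) % pp_size
    dp = rank // (tp_size * cp_size * pp_size)
    return GridCoords(dp=dp, pp=pp, cp=cp, tp=tp)


def tensor_parallel_group(rank: int, dp_size: int, pp_size: int,
                          cp_size: int, tp_size: int) -> List[int]:
    """Ranks that share dp, pp, and cp with `rank` and differ only in tp."""
    coords = rank_to_coords(rank, dp_size, pp_size, cp_size, tp_size)
    base = rank - coords.tp  # first rank of this tensor-parallel group
    return [base + i for i in range(tp_size)]


if __name__ == "__main__":
    # 8 processes split as dp=2, pp=2, cp=1, tp=2
    for r in range(8):
        print(r, rank_to_coords(r, 2, 2, 1, 2))
```

Under this assumed layout, with 8 GPUs split as dp=2, pp=2, cp=1, tp=2, rank 5 sits at data-parallel index 1, pipeline stage 0, and tensor shard 1, and it shares its tensor-parallel group with rank 4.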

Quick Start & Requirements

Health Check

Last Commit: 4 months ago

Responsiveness: 1 week

Pull Requests (30d): 0

Issues (30d): 0

Star History

27 stars in the last 30 days

Explore Similar Projects

LLM-RLHF-Tuning by Joyce94

LLM tuning via RLHF (SFT+RM+PPO+DPO) with LoRA

Created 2 years ago

Updated 2 years ago

Starred by Shizhe Diao (Author of LMFlow; Research Scientist at NVIDIA), Eugene Yan (AI Scientist at AWS), and 1 more.

LLaMA-Cult-and-More by shm007g

LLM resource list for models, datasets, training, and evaluation

Created 2 years ago

Updated 2 years ago

Starred by Johannes Hagemann (Cofounder of Prime Intellect), Gabriel Almeida (Cofounder of Langflow), and 2 more.

lumos by allenai

Agent for unified data, modular design, and open-source LLMs

Created 2 years ago

Updated 1 year ago

lightron by lwj2015

A lightweight, educational LLM distributed training framework

Created 3 weeks ago

Updated 2 weeks ago

Starred by Jeff Hammerbacher (Cofounder of Cloudera), Philipp Schmid (DevRel at Google DeepMind), and 2 more.

Megatron-LLM by epfLLM

Distributed trainer for LLMs

Created 2 years ago

Updated 1 year ago

TinyLLaVA_Factory by TinyLLaVA

Framework for small-scale large multimodal models research

Created 1 year ago

Updated 8 months ago

tiny-llm-zh by wdndev

Chinese LLM for learning large language models

Created 1 year ago

Updated 1 year ago

step_into_llm by mindspore-lab

Online course for large language model (LLM) techniques using MindSpore

Created 2 years ago

Updated 2 weeks ago

Starred by Chuan Li (Chief Scientific Officer at Lambda).

NeMo-Framework-Launcher by NVIDIA

Cloud-native tool for launching NeMo framework training jobs

Created 3 years ago

Updated 8 months ago

Starred by Jiaming Song (Chief Scientist at Luma AI), Chip Huyen (Author of "AI Engineering", "Designing Machine Learning Systems"), and 6 more.

LLaMA-Adapter by OpenGVLab

Efficient fine-tuning for instruction-following LLaMA models

Created 2 years ago

Updated 1 year ago

Starred by Andrej Karpathy (Founder of Eureka Labs; Formerly at Tesla, OpenAI; Author of CS 231n), George Hotz (Author of tinygrad; Founder of the tiny corp, comma.ai), and 21 more.

TinyLlama by jzhang38

Tiny pretraining project for a 1.1B Llama model

Created 2 years ago

Updated 1 year ago

Starred by Chip Huyen (Author of "AI Engineering", "Designing Machine Learning Systems").

minimind by jingyaogong

Minimal LLM training from scratch, under 3 USD and in 2 hours

Created 1 year ago

Updated 4 days ago
