lightron  by lwj2015

A lightweight, educational LLM distributed training framework

Created 3 weeks ago


514 stars

Top 60.9% on SourcePulse

Project Summary

Lightron is a lightweight, educational, and modern distributed training framework for Large Language Models (LLMs). It aims to bridge the gap between minimal research implementations and production-ready features, offering a clean and efficient platform for LLM development and study. The framework benefits researchers and students by providing access to advanced techniques in a streamlined package.

How It Works

Lightron employs a modern architectural design incorporating RMSNorm, SwiGLU activation, and Rotary Position Embeddings (RoPE). For efficiency, it leverages PyTorch's native scaled_dot_product_attention, which can dispatch to FlashAttention-2-style fused kernels when the hardware and inputs allow. The framework provides first-class support for PyTorch Fully Sharded Data Parallel (FSDP), enabling robust distributed training. The core codebase is kept under 1000 lines and uses type-hinted, dataclass-based configuration for clarity and maintainability.
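The RMSNorm component mentioned above can be sketched in a few lines of pure Python. This is an illustrative sketch of the technique, not Lightron's actual implementation:

```python
import math

def rms_norm(x, gamma, eps=1e-6):
    # RMSNorm: scale each vector by the reciprocal of its root-mean-square.
    # Unlike LayerNorm, there is no mean subtraction and no bias term,
    # which makes it cheaper while performing comparably in practice.
    rms = math.sqrt(sum(v * v for v in x) / len(x) + eps)
    return [g * v / rms for g, v in zip(gamma, x)]

out = rms_norm([3.0, 4.0], [1.0, 1.0])
# RMS of [3, 4] is sqrt((9 + 16) / 2) ≈ 3.5355, so each element
# is divided by roughly that value before the gamma scaling.
```

In a real model, `gamma` is a learned per-dimension scale and the operation runs on tensors rather than lists.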

Quick Start & Requirements

Installation involves cloning the repository, navigating into the directory, and installing dependencies via pip install -r requirements.txt. The example command launches distributed training on 4 GPUs: torchrun --nproc_per_node=4 examples/train_llama.py. Specific hardware requirements (such as CUDA versions) are not documented, but the example implies a multi-GPU setup is intended. The official GitHub repository serves as the primary resource: https://github.com/lwj2015/lightron.
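The quick-start steps above, collected into one script. This assumes a machine with 4 GPUs and a PyTorch install that provides torchrun:

```shell
# Clone the repository and enter it
git clone https://github.com/lwj2015/lightron.git
cd lightron

# Install dependencies
pip install -r requirements.txt

# Launch distributed training across 4 GPUs on this node
torchrun --nproc_per_node=4 examples/train_llama.py
```

Adjust --nproc_per_node to match the number of GPUs available on your machine.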

Highlighted Details

  • Modern LLM architecture components: RMSNorm, SwiGLU, Rotary Embeddings (RoPE).
  • Efficiency through native PyTorch scaled_dot_product_attention (can dispatch to FlashAttention-2-style kernels).
  • First-class support for PyTorch FSDP for distributed training.
  • Compatibility with Llama-3 architectures.
  • Clean, type-hinted, dataclass-based configuration with a core codebase under 1000 lines.
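A hypothetical sketch of what a type-hinted, dataclass-based configuration can look like; the field names and defaults below are illustrative (loosely Llama-3-flavored), not Lightron's actual API:

```python
from dataclasses import dataclass

@dataclass
class ModelConfig:
    # Every field is typed, so typos and wrong-type values surface
    # immediately instead of silently misconfiguring a training run.
    dim: int = 4096
    n_layers: int = 32
    n_heads: int = 32
    vocab_size: int = 128256      # Llama-3 tokenizer vocabulary size
    rope_theta: float = 500000.0  # RoPE base frequency
    norm_eps: float = 1e-5        # RMSNorm epsilon

# Override only what differs from the defaults, e.g. for a small test model:
cfg = ModelConfig(dim=512, n_layers=8)
```

Compared with nested dicts or YAML-only configs, dataclasses give editors and type checkers full visibility into every option.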

Maintenance & Community

No specific details regarding maintainers, community channels (like Discord or Slack), or project roadmap were provided in the README snippet.

Licensing & Compatibility

The license type and any compatibility notes for commercial use or closed-source linking are not specified in the provided README content.

Limitations & Caveats

The README snippet does not detail any specific limitations, known bugs, alpha status, or unsupported platforms. The focus appears to be on core functionality for research and study.

Health Check

  • Last Commit: 2 weeks ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 1
  • Issues (30d): 0
  • Star History: 738 stars in the last 26 days

Explore Similar Projects

Starred by Andrej Karpathy (Founder of Eureka Labs; formerly at Tesla, OpenAI; author of CS 231n), Travis Fischer (Founder of Agentic), and 6 more.

picotron by huggingface

0.4%
2k
Minimalist distributed training framework for educational use
Created 1 year ago
Updated 4 months ago
Starred by Chip Huyen (Author of "AI Engineering" and "Designing Machine Learning Systems"), David Cournapeau (Author of scikit-learn), and 1 more.

TorchLeet by Exorust

0.6%
2k
PyTorch interview practice platform
Created 1 year ago
Updated 5 months ago
Starred by Théophile Gervet (Cofounder of Genesis AI), Jason Knight (Director of AI Compilers at NVIDIA; Cofounder of OctoML), and 7 more.

lingua by facebookresearch

0.0%
5k
LLM research codebase for training and inference
Created 1 year ago
Updated 5 months ago
Starred by Jeff Hammerbacher (Cofounder of Cloudera), Stas Bekman (Author of "Machine Learning Engineering Open Book"; Research Engineer at Snowflake), and 25 more.

gpt-neox by EleutherAI

0.1%
7k
Framework for training large-scale autoregressive language models
Created 5 years ago
Updated 1 month ago
Starred by Andrej Karpathy (Founder of Eureka Labs; formerly at Tesla, OpenAI; author of CS 231n), Stefan van der Walt (Core contributor to the scientific Python ecosystem), and 12 more.

litgpt by Lightning-AI

0.1%
13k
LLM SDK for pretraining, finetuning, and deploying 20+ high-performance LLMs
Created 2 years ago
Updated 3 days ago