ArcticTraining by snowflakedb

LLM post-training acceleration framework

Created 10 months ago
254 stars

Top 99.0% on SourcePulse

Summary

ArcticTraining is an open-source framework for simplifying and accelerating LLM post-training. It targets two common gaps in existing tooling: weak support for rapid prototyping and the absence of native synthetic data generation. Through modular trainer designs, streamlined code, and integrated pipelines for creating and cleaning synthetic data, it helps users improve LLM capabilities in areas such as code generation and complex reasoning.

How It Works

The framework emphasizes modularity and customization: trainers are built as modular components with deliberately compact code paths, enabling rapid iteration. A key differentiator is the integrated pipeline for native synthetic data generation and cleaning. Users extend ArcticTraining by subclassing Trainer or SFTTrainer to swap in custom loss functions or training methodologies, as sketched below. This flexible design targets efficiency gains in tasks like code generation and complex reasoning.
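
As a concrete illustration, here is a minimal sketch of a custom trainer. It assumes the documented pattern of subclassing SFTTrainer, registering a name, and overriding a loss(self, batch) hook; exact attribute names and signatures may differ between versions, so treat this as illustrative rather than authoritative:

    import torch.nn.functional as F
    from arctic_training import SFTTrainer

    class LabelSmoothedSFTTrainer(SFTTrainer):
        # The registered name is what a YAML recipe would reference
        # (assumption, following the documented trainer-registration pattern).
        name = "label-smoothed-sft"

        def loss(self, batch):
            # Standard causal-LM shift: predict token t+1 from tokens <= t.
            outputs = self.model(**batch, use_cache=False)
            logits = outputs.logits[..., :-1, :].contiguous()
            labels = batch["labels"][..., 1:].contiguous()
            # Replace the default cross-entropy with a label-smoothed variant.
            return F.cross_entropy(
                logits.view(-1, logits.size(-1)),
                labels.view(-1),
                ignore_index=-100,
                label_smoothing=0.1,
            )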

Quick Start & Requirements

Installation: pip install arctic-training. Training is driven by a YAML recipe file passed to the arctic_training CLI, which leverages DeepSpeed for distributed training. Customization ranges from editing the YAML recipe to developing new trainers via subclassing; a minimal recipe is sketched below. Further details are in the project's blog and documentation.
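
A minimal SFT recipe, based on the example format in the project's README (field names may have evolved, so verify against the current docs):

    # sft-recipe.yaml
    type: sft
    micro_batch_size: 2
    model:
      name_or_path: meta-llama/Meta-Llama-3.1-8B-Instruct
    data:
      sources:
        - HuggingFaceH4/ultrachat_200k
    checkpoint:
      - type: huggingface
        save_end_of_training: true
        output_dir: ./fine-tuned-model

Training then launches with a single CLI call, with DeepSpeed handling the distributed setup underneath:

    arctic_training sft-recipe.yaml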

Highlighted Details

  • Arctic Long Sequence Training (ALST): Enables scalable training for LLMs handling multi-million token sequences.
  • Arctic Inference & Speculative Decoding: Integrates with vLLM for accelerated inference, featuring fast speculative decoding.
  • Arctic-Embed: Tools for simple, scalable training of embedding models.
  • SwiftKV: Accelerates enterprise LLM workloads via knowledge-preserving compute reduction.
  • DeepSpeed Integration: Seamlessly leverages DeepSpeed for robust distributed training.

Maintenance & Community

GPU CI for the project is funded by Modal. The README does not link to community channels (e.g., Discord or Slack) or mention a public roadmap.

Licensing & Compatibility

The provided README text does not state the project's open-source license, so compatibility with commercial use or closed-source linking cannot be assessed from it alone.

Limitations & Caveats

The README does not call out explicit limitations, unsupported platforms, or known bugs; the documentation focuses on features and extension points rather than constraints.

Health Check

  • Last Commit: 4 days ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 13
  • Issues (30d): 2
  • Star History: 20 stars in the last 30 days
