burn by tracel-ai

Deep learning framework prioritizing flexibility, efficiency, and portability

created 3 years ago
12,401 stars

Top 4.1% on sourcepulse

View on GitHub
Project Summary

Burn is a next-generation deep learning framework built in Rust, prioritizing flexibility, efficiency, and portability. It targets researchers and engineers seeking high performance across diverse hardware without the typical friction of Python-based frameworks, offering automatic optimizations and a robust backend system.

How It Works

Burn leverages Rust's ownership system to provide thread-safe building blocks and applies automatic kernel fusion to optimize computations at runtime, generating custom kernels that rival handcrafted GPU implementations. An asynchronous execution model improves responsiveness and reduces framework overhead, while intelligent memory management, including memory pooling and tracking of in-place mutations, further boosts efficiency.
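
The practical upshot is that custom operations can be written as ordinary tensor expressions and still end up in a single generated kernel. Below is a minimal sketch of such an expression, a GELU-style activation, written against Burn's public tensor API; the method names (div_scalar, erf, add_scalar, mul) are assumptions drawn from that API rather than code quoted from this page, and a fusion-enabled backend would be free to merge the chain into one kernel.

```rust
use burn::prelude::*;
use core::f64::consts::SQRT_2;

// GELU written as a chain of element-wise tensor ops. On a fusion-capable
// backend, Burn can compile this chain into a single custom kernel instead
// of launching one memory-bound kernel per operation.
fn gelu<B: Backend, const D: usize>(x: Tensor<B, D>) -> Tensor<B, D> {
    let cdf = x.clone().div_scalar(SQRT_2).erf().add_scalar(1.0).div_scalar(2.0);
    x.mul(cdf)
}
```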

Quick Start & Requirements

  • Install via cargo add burn; a minimal usage sketch follows this list.
  • Requires Rust toolchain.
  • For GPU acceleration, specific backend dependencies (CUDA, Metal, Vulkan, WGPU) are needed.
  • See The Burn Book for detailed guidance.
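
The sketch below mirrors the kind of minimal tensor program shown in The Burn Book; it assumes the crate was added with the wgpu feature (e.g. cargo add burn --features wgpu) so the Wgpu backend type alias is available.

```rust
use burn::backend::Wgpu;
use burn::tensor::Tensor;

// Select a backend; swap this alias to target a different backend later.
type B = Wgpu;

fn main() {
    // Default device for the chosen backend (the default adapter for WGPU).
    let device = Default::default();

    // Two 2x2 tensors: one from explicit values, one filled with ones.
    let a = Tensor::<B, 2>::from_data([[2.0, 3.0], [4.0, 5.0]], &device);
    let b = Tensor::<B, 2>::ones_like(&a);

    // Element-wise addition executes on the selected backend.
    println!("{}", a + b);
}
```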

Highlighted Details

  • Automatic Kernel Fusion: Dynamically generates optimized kernels (e.g., WGSL for WGPU) for custom operations.
  • Backend Agnostic Design: Supports multiple backends (CUDA, ROCm, Metal, Vulkan, WGPU, NdArray, LibTorch, Candle) via a generic Backend trait.
  • Decorator Pattern: Backends can be enhanced with features like Autodiff and Fusion through composition (see the sketch after this list).
  • no_std Support: Core components are compatible with bare-metal embedded environments (currently via NdArray backend).
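
As a concrete illustration of the decorator pattern mentioned above, the sketch below wraps a base backend in Autodiff to obtain gradient tracking; the type aliases and the choice of Wgpu as the base backend are illustrative assumptions, not prescribed by the project.

```rust
use burn::backend::{Autodiff, Wgpu};
use burn::tensor::Tensor;

// Decorator-style composition: the same base backend, with and without
// gradient tracking layered on top.
type Inference = Wgpu;
type Training = Autodiff<Inference>;

fn main() {
    let device = Default::default();

    // `require_grad` and `backward` are usable because the backend is
    // wrapped in `Autodiff`.
    let x = Tensor::<Training, 1>::from_floats([1.0, 2.0, 3.0], &device).require_grad();
    let y = (x.clone() * x.clone()).sum();
    let grads = y.backward();

    // Gradient of y = sum(x^2) with respect to x, i.e. 2x.
    println!("{:?}", x.grad(&grads));
}
```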

Maintenance & Community

Licensing & Compatibility

  • Dual-licensed under MIT and Apache 2.0.
  • Compatible with commercial and closed-source applications.

Limitations & Caveats

  • The project is in active development, with potential for breaking changes.
  • ONNX support is limited to a subset of operators.
  • WGPU backends may require increasing the Rust recursion limit (#![recursion_limit = "256"]); a placement sketch follows this list.
  • Loading model records from versions older than 0.14.0 requires the record-backward-compat feature flag.
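
For reference, the recursion-limit attribute mentioned above is a crate-level inner attribute; a minimal placement sketch (at the top of main.rs or lib.rs) follows.

```rust
// Inner attribute at the crate root; 256 mirrors the value suggested above,
// though the limit a given project needs may differ.
#![recursion_limit = "256"]

fn main() {
    // ... build and run the WGPU-backed model here ...
}
```
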
Health Check

  • Last commit: 19 hours ago
  • Responsiveness: 1 day
  • Pull Requests (30d): 95
  • Issues (30d): 36

Star History

1,571 stars in the last 90 days

Explore Similar Projects

Starred by Andrej Karpathy (Founder of Eureka Labs; Formerly at Tesla, OpenAI; Author of CS 231n), Georgios Konstantopoulos (CTO, General Partner at Paradigm), and 7 more.

ThunderKittens by HazyResearch

Top 0.6% · 3k stars
CUDA kernel framework for fast deep learning primitives
created 1 year ago · updated 3 days ago
Starred by Andrej Karpathy (Founder of Eureka Labs; Formerly at Tesla, OpenAI; Author of CS 231n), Jiayi Pan (Author of SWE-Gym; AI Researcher at UC Berkeley), and 16 more.

flash-attention by Dao-AILab

Top 0.7% · 19k stars
Fast, memory-efficient attention implementation
created 3 years ago · updated 18 hours ago
Starred by George Hotz (Author of tinygrad; Founder of the tiny corp, comma.ai), Anton Bukov (Cofounder of 1inch Network), and 16 more.

tinygrad by tinygrad

Top 0.1% · 30k stars
Minimalist deep learning framework for education and exploration
created 4 years ago · updated 18 hours ago