applied-ai by meta-pytorch

Applied AI experiments and examples for PyTorch

Created 2 years ago

312 stars

Top 86.5% on SourcePulse

2 Experts Love This Project

Wing Lian

Founder of Axolotl AI

Stas Bekman

Author of "Machine Learning Engineering Open Book"; Research Engineer at Snowflake

Project Summary

This repository collects applied AI experiments and examples built primarily on PyTorch. It targets researchers and engineers who want optimized kernels and practical implementations of techniques for efficient model training and inference.

How It Works

The core of the repository is a set of custom Triton and CUDA kernels designed to accelerate specific operations. These include Mixture-of-Experts (MoE) GEMM for Mixtral inference, fused Softmax, and fused RMSNorm. Each improves performance by optimizing memory access patterns and fusing operations, so that intermediate results do not have to be written out and read back between steps. The emphasis is on inference acceleration, with efficiency gains intended for both training and inference workloads.
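To make the idea of fusing concrete, here is a minimal pure-Python sketch of softmax computed two ways: a reference version that makes separate passes for the maximum and the sum, and a single-pass ("online") version that keeps a running maximum and a rescaled running sum. This is an illustration of the general technique only, not the repository's Triton kernel; the function names `softmax_three_pass` and `softmax_online` are hypothetical, and inputs are assumed to be finite floats.

```python
import math

def softmax_three_pass(row):
    """Reference: separate passes for the max, the sum of exponentials, and the division."""
    row_max = max(row)
    exps = [math.exp(x - row_max) for x in row]
    total = sum(exps)
    return [e / total for e in exps]

def softmax_online(row):
    """Fused variant: running max and rescaled running sum gathered in one pass."""
    running_max = float("-inf")
    running_sum = 0.0
    for x in row:
        new_max = max(running_max, x)
        # Rescale the accumulated sum to the new max before adding the new term.
        running_sum = running_sum * math.exp(running_max - new_max) + math.exp(x - new_max)
        running_max = new_max
    # Final normalization using the statistics collected above.
    return [math.exp(x - running_max) / running_sum for x in row]
```

On a GPU, the benefit of the second form is that the statistics are gathered in one read of the row rather than two, which is the kind of memory-traffic saving a fused kernel aims for.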

Quick Start & Requirements

Installation: Typically involves cloning the repository and installing dependencies via pip.
Prerequisites: PyTorch, Triton, CUDA (for CUDA kernels), and potentially specific model dependencies (e.g., for Llama).
Resources: Requires a GPU for running the optimized kernels and experiments.
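Before running any of the kernels, it can help to confirm that the prerequisites above are present. The following is a minimal sketch, assuming the standard import names `torch` and `triton`; the helper names `check_prerequisites` and `cuda_available` are hypothetical and are not part of the repository.

```python
import importlib.util

def check_prerequisites(packages=("torch", "triton")):
    """Return a dict mapping each package name to whether it can be imported."""
    return {name: importlib.util.find_spec(name) is not None for name in packages}

def cuda_available():
    """Report whether PyTorch sees a CUDA device; False if PyTorch is not installed."""
    if importlib.util.find_spec("torch") is None:
        return False
    import torch
    return torch.cuda.is_available()

if __name__ == "__main__":
    for name, found in check_prerequisites().items():
        print(f"{name}: {'found' if found else 'missing'}")
    print(f"CUDA device visible: {cuda_available()}")
```

The check only inspects the local environment; it does not verify version compatibility between PyTorch, Triton, and the CUDA toolkit.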

Highlighted Details

Triton kernels for MoE (Mixtral) GEMM, fused Softmax, and fused RMSNorm.
CUDA programming reading group materials and recordings.
Recipes for fine-tuning and inference of Llama models.
Contributions to NeurIPS LLM Efficiency Challenge.
Published papers on PyTorch 2, Triton kernel acceleration, and FSDP.
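Of the kernels listed above, fused RMSNorm is the simplest to state. As a point of reference, here is a minimal pure-Python sketch of the computation that such a kernel would perform for one row. It uses the common RMSNorm definition (scale by the reciprocal root mean square, then by a learned weight); the function name `rmsnorm` and the default `eps` value are assumptions for illustration, not taken from the repository.

```python
import math

def rmsnorm(x, weight, eps=1e-6):
    """Normalize x by its root mean square, then apply a per-element weight."""
    mean_sq = sum(v * v for v in x) / len(x)
    inv_rms = 1.0 / math.sqrt(mean_sq + eps)
    # A fused kernel would do the reduction and the scaling without
    # writing the intermediate normalized values back to memory.
    return [v * inv_rms * w for v, w in zip(x, weight)]
```

A reference like this is mainly useful for checking a kernel's output numerically on small inputs.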

Maintenance & Community

The project is associated with pytorch-labs.
Links to Discord and lecture materials for the CUDA reading group are provided.

Licensing & Compatibility

License: BSD 3-Clause.
Compatibility: Permissive license suitable for commercial use and integration into closed-source projects.

Limitations & Caveats

Some kernels are explicitly noted as inference-only: they implement the forward pass but not the backward pass, so they cannot be used for training. The repository contains experimental code that is still under development, and users should expect possible instability.

Health Check

Last Commit

4 months ago

Responsiveness

1 week

Star History

2 stars in the last 30 days