This repository provides a suite of production-tested AI infrastructure tools from DeepSeek AI, aimed at accelerating AGI development. It targets researchers and engineers working with large-scale AI models, offering components for efficient inference, distributed training, and data handling.
How It Works
The project releases components incrementally, focusing on performance and efficiency for large models. Key technologies include FlashMLA for optimized MLA decoding on Hopper GPUs, DeepEP for efficient MoE communication, DeepGEMM for FP8 GEMM operations, DualPipe for pipeline parallelism, EPLB for expert load balancing, and 3FS (Fire-Flyer File System) for high-throughput data access. These tools are designed to work together, enabling computation-communication overlap and efficient resource utilization.
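The computation-communication overlap mentioned above can be illustrated with a minimal, purely conceptual sketch: launch the (network-bound) communication step in the background and do local computation while it is in flight, instead of waiting serially. The function names (`simulate_all_to_all`, `local_compute`) are illustrative stand-ins, not the real DeepEP or DualPipe APIs.

```python
# Conceptual sketch of computation-communication overlap, the pattern that
# DeepEP and DualPipe exploit at scale. All names here are hypothetical.
from concurrent.futures import ThreadPoolExecutor
import time

def simulate_all_to_all(tokens):
    """Stand-in for an MoE all-to-all dispatch (network-bound)."""
    time.sleep(0.05)                  # pretend network latency
    return [t * 2 for t in tokens]

def local_compute(tokens):
    """Stand-in for a local GEMM on the same micro-batch (compute-bound)."""
    return [t + 1 for t in tokens]

def overlapped_step(batch):
    # Kick off communication asynchronously, then overlap it with local
    # computation rather than running the two phases back to back.
    with ThreadPoolExecutor(max_workers=1) as pool:
        comm_future = pool.submit(simulate_all_to_all, batch)
        local = local_compute(batch)   # runs while the dispatch is in flight
        remote = comm_future.result()  # join once both are done
    return local, remote

local, remote = overlapped_step([1, 2, 3])
print(local)   # [2, 3, 4]
print(remote)  # [2, 4, 6]
```

In the real systems the same idea is applied at the level of CUDA streams and pipeline stages, so communication for one micro-batch hides behind computation for another.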
Quick Start & Requirements
- Installation and usage details are provided in each component's GitHub repository (linked from the README).
- Requires NVIDIA GPUs (Hopper architecture recommended for full performance), CUDA, and specific Python versions depending on the component.
- Setup complexity varies by component, with some offering minimal dependencies.
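Before trying any component, a quick pre-flight check of the toolchain can save time. This is a hypothetical starting point, not an authoritative requirements list; the exact CUDA and Python versions differ per repository, so consult each README.

```shell
# Hypothetical pre-flight check for the common prerequisites.
# Exact version requirements vary per component -- check each repo's README.
python3 --version
if command -v nvcc >/dev/null 2>&1; then
    nvcc --version | tail -n 2          # CUDA toolkit version
else
    echo "nvcc not found: install the CUDA toolkit"
fi
if command -v nvidia-smi >/dev/null 2>&1; then
    nvidia-smi --query-gpu=name --format=csv,noheader   # installed GPUs
else
    echo "nvidia-smi not found: no NVIDIA driver detected"
fi
```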
Highlighted Details
- FlashMLA achieves 3000 GB/s memory-bound and 580 TFLOPS compute-bound (BF16) on H800 GPUs.
- DeepGEMM delivers 1350+ FP8 TFLOPS on Hopper GPUs with only ~300 lines of core logic.
- 3FS demonstrates 6.6 TiB/s aggregate read throughput on a 180-node cluster and 40+ GiB/s per client for KVCache lookups.
- The DeepSeek-V3/R1 inference system achieves 73.7k/14.8k input/output tokens per second per H800 node.
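As a rough sanity check on the 3FS headline number, the aggregate figure can be converted to a per-node average. This is back-of-envelope arithmetic assuming throughput is spread evenly across the cluster, which real workloads will not be.

```python
# Back-of-envelope check: 6.6 TiB/s aggregate read throughput over a
# 180-node cluster, assuming an even spread across nodes.
aggregate_tib_s = 6.6
nodes = 180

per_node_gib_s = aggregate_tib_s * 1024 / nodes  # 1 TiB = 1024 GiB
print(f"~{per_node_gib_s:.1f} GiB/s per storage node")  # → ~37.5 GiB/s
```

That per-node average sits in the same ballpark as the 40+ GiB/s per-client KVCache figure, which makes the two claims mutually plausible.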
Maintenance & Community
- Developed by a small team at DeepSeek AI.
- The project is part of a daily open-sourcing initiative, indicating active development and a commitment to transparency.
- Links to individual GitHub repositories are provided for each component.
Licensing & Compatibility
- The README does not explicitly state a license for the open-infra-index repository itself or for the individual components; each linked GitHub repository must be checked for licensing details and commercial-use compatibility.
Limitations & Caveats
- The project presents itself as a series of "humble building blocks" making "small-but-sincere progress," suggesting some components may be early-stage or optimized for narrow use cases.
- Full performance claims are tied to specific hardware (Hopper GPUs, H800) and configurations.
- Licensing information is not consolidated, requiring users to check each component's repository.