nndeploy by nndeploy

Multi-platform inference deployment framework

Created 2 years ago
1,192 stars

Top 32.7% on SourcePulse

View on GitHub
Project Summary

This framework simplifies high-performance, multi-platform inference deployment for AI models. It targets developers needing to deploy models across diverse hardware and operating systems, offering a unified codebase and efficient execution.

How It Works

The core approach models deployment as a Directed Acyclic Graph (DAG), where preprocessing, inference, and postprocessing are distinct nodes. This DAG structure, with its "graph-in-graph" capability, allows for modularity and efficient composition of complex multi-model pipelines. The framework emphasizes performance through various parallel execution modes (serial, pipeline, task, and combined) and resource management via thread and memory pools.
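
To make this concrete, the following is a minimal, self-contained C++ sketch of the DAG idea under assumed names: Node, Graph, and Tensor are hypothetical types chosen for illustration, not nndeploy's actual API, and a linear graph stands in for a general DAG. The key point is that a Graph is itself a Node, so pipelines can be nested ("graph-in-graph") and composed.

// All names below are hypothetical and for illustration only;
// they are NOT nndeploy's actual API.
#include <iostream>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// A stand-in tensor: just a named float buffer.
struct Tensor {
  std::string name;
  std::vector<float> data;
};

// A node maps an input tensor to an output tensor.
struct Node {
  virtual ~Node() = default;
  virtual Tensor run(const Tensor& in) = 0;
};

struct Preprocess : Node {
  Tensor run(const Tensor& in) override {
    Tensor out{"preprocessed", in.data};
    for (float& v : out.data) v /= 255.0f;  // e.g. normalize pixel values
    return out;
  }
};

struct Infer : Node {
  Tensor run(const Tensor& in) override {
    // A real inference node would call a backend (TensorRT, ONNX Runtime, ...).
    return Tensor{"logits", in.data};
  }
};

struct Postprocess : Node {
  Tensor run(const Tensor& in) override {
    return Tensor{"result", {in.data.empty() ? 0.0f : in.data.front()}};
  }
};

// A Graph is itself a Node, so graphs can be nested ("graph-in-graph")
// and multi-model pipelines composed from smaller ones.
struct Graph : Node {
  void add(std::unique_ptr<Node> node) { nodes_.push_back(std::move(node)); }
  Tensor run(const Tensor& in) override {
    Tensor t = in;
    for (auto& node : nodes_) t = node->run(t);  // serial pass over a linear DAG
    return t;
  }
 private:
  std::vector<std::unique_ptr<Node>> nodes_;
};

int main() {
  // Inner graph: one complete preprocess -> infer -> postprocess pipeline.
  auto detector = std::make_unique<Graph>();
  detector->add(std::make_unique<Preprocess>());
  detector->add(std::make_unique<Infer>());
  detector->add(std::make_unique<Postprocess>());

  // Outer graph: the inner pipeline is just another node.
  Graph app;
  app.add(std::move(detector));

  Tensor image{"image", {128.0f, 64.0f, 32.0f}};
  Tensor result = app.run(image);
  std::cout << result.name << ": " << result.data.front() << "\n";
  return 0;
}

In the real framework, the inference node would wrap one of the supported backends and the graph would be scheduled under one of the parallel modes listed above; this sketch runs everything serially for clarity.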

Quick Start & Requirements

  • Installation: Compilation from source is the primary method.
  • Prerequisites: Specific build requirements depend on the target platform and inference backend. Linux is the most comprehensively supported OS.
  • Resources: Compilation time and resource usage vary with the target platform and the inference backends enabled.
  • Documentation: Project documentation is linked from the README (the link is labeled "文档", i.e., "Documentation").

Highlighted Details

  • Supports a wide range of inference backends, including TensorRT, OpenVINO, ONNX Runtime, MNN, TNN, ncnn, Core ML, AscendCL, and RKNN.
  • Enables zero-copy handoff between preprocessing and inference for improved end-to-end performance (see the sketch after this list).
  • Offers flexible parallel execution strategies for optimizing throughput and latency.
  • Facilitates rapid demo construction with support for various input/output formats.
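
As a rough illustration of the zero-copy point above, the sketch below contrasts a copying preprocessing step with one that writes directly into the buffer the inference backend will read. DeviceInputTensor, preprocess_with_copy, and preprocess_zero_copy are hypothetical names invented for this example, not nndeploy's API; the idea is simply that preprocessing fills the inference input in place, removing the intermediate copy.

// Hypothetical example; not nndeploy's API. It only illustrates the idea of
// eliminating the intermediate copy between preprocessing and inference.
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <iostream>
#include <vector>

// Pretend this buffer is owned by the inference backend as its input tensor.
struct DeviceInputTensor {
  std::vector<float> buffer;
  float* data() { return buffer.data(); }
};

// Copying path: preprocess into a temporary buffer, then copy it over.
void preprocess_with_copy(const std::vector<std::uint8_t>& image,
                          DeviceInputTensor& input) {
  std::vector<float> tmp(image.size());
  for (std::size_t i = 0; i < image.size(); ++i) tmp[i] = image[i] / 255.0f;
  std::copy(tmp.begin(), tmp.end(), input.data());  // extra pass over the data
}

// Zero-copy path: normalize straight into the inference input buffer.
void preprocess_zero_copy(const std::vector<std::uint8_t>& image,
                          DeviceInputTensor& input) {
  float* dst = input.data();
  for (std::size_t i = 0; i < image.size(); ++i) dst[i] = image[i] / 255.0f;
}

int main() {
  std::vector<std::uint8_t> image = {0, 128, 255};
  DeviceInputTensor input{std::vector<float>(image.size())};
  preprocess_zero_copy(image, input);
  std::cout << "first input value: " << input.buffer.front() << "\n";  // prints 0
  return 0;
}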

Maintenance & Community

The project is under active development and welcomes contributions. A WeChat group is available for community discussion.

Licensing & Compatibility

The project's licensing is not explicitly stated in the README, which may pose a compatibility concern for commercial or closed-source integration.

Limitations & Caveats

The framework is described as being in its development stage. Some features, such as the memory pool and high-performance operators, are still under development, and support for certain models (e.g., Stable Diffusion, Qwen, SAM) is noted as "in progress."

Health Check

  • Last Commit: 14 hours ago
  • Responsiveness: 1 day
  • Pull Requests (30d): 1
  • Issues (30d): 11
  • Star History: 62 stars in the last 30 days

Explore Similar Projects

Starred by Yaowei Zheng (Author of LLaMA-Factory), Yineng Zhang (Inference Lead at SGLang; Research Scientist at Together AI), and 1 more.

VeOmni by ByteDance-Seed
Top 3.4% on SourcePulse
1k stars
Framework for scaling multimodal model training across accelerators
Created 5 months ago
Updated 3 weeks ago
Starred by Chip Huyen (Author of "AI Engineering", "Designing Machine Learning Systems"), Luis Capelo (Cofounder of Lightning AI), and 3 more.

LitServe by Lightning-AI
Top 0.3% on SourcePulse
4k stars
AI inference pipeline framework
Created 1 year ago
Updated 1 day ago
Starred by Andrej Karpathy (Founder of Eureka Labs; Formerly at Tesla, OpenAI; Author of CS 231n), Lewis Tunstall (Research Engineer at Hugging Face), and 13 more.

torchtitan by pytorch
Top 0.7% on SourcePulse
4k stars
PyTorch platform for generative AI model training research
Created 1 year ago
Updated 21 hours ago