nndeploy by nndeploy

Multi-platform inference deployment framework

Created 2 years ago
1,743 stars

Top 24.1% on SourcePulse

View on GitHub
Project Summary

This framework simplifies high-performance, multi-platform inference deployment for AI models. It targets developers needing to deploy models across diverse hardware and operating systems, offering a unified codebase and efficient execution.

How It Works

The core approach models deployment as a Directed Acyclic Graph (DAG) in which preprocessing, inference, and postprocessing are distinct nodes. Because a graph can itself act as a node ("graph-in-graph"), complex multi-model pipelines can be composed modularly. The framework pursues performance through several execution modes (serial, pipeline parallel, task parallel, and combinations of these) and manages resources via thread and memory pools.
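The DAG-of-nodes idea can be sketched in a few lines of Python. This is an illustrative sketch only, with hypothetical names (`Node`, `Graph`); nndeploy's actual API is different and not shown here.

```python
# Minimal sketch of a DAG pipeline (hypothetical names, NOT nndeploy's API).
class Node:
    """A unit of work: preprocessing, inference, or postprocessing."""
    def __init__(self, name, fn):
        self.name = name
        self.fn = fn

    def run(self, data):
        return self.fn(data)


class Graph(Node):
    """A graph is itself a node, so graphs can nest ("graph-in-graph")."""
    def __init__(self, name, nodes):
        super().__init__(name, None)
        self.nodes = nodes  # topologically ordered for this serial sketch

    def run(self, data):
        for node in self.nodes:  # serial execution mode
            data = node.run(data)
        return data


# A classic deploy pipeline: preprocess -> infer -> postprocess.
preprocess = Node("preprocess", lambda x: [v / 255.0 for v in x])
infer = Node("infer", lambda x: [v * 2 for v in x])  # stand-in for a backend call
postprocess = Node("postprocess", lambda x: max(x))

pipeline = Graph("detector", [preprocess, infer, postprocess])

# The whole pipeline can be embedded as one node of a larger graph.
outer = Graph("app", [pipeline])
print(outer.run([0, 127.5, 255]))  # -> 2.0
```

In the real framework, swapping the serial loop for pipeline- or task-parallel scheduling changes throughput/latency trade-offs without changing the graph definition.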

Quick Start & Requirements

  • Installation: Compilation from source is the primary method.
  • Prerequisites: Specific build requirements depend on the target platform and inference backend. Linux is the most comprehensively supported OS.
  • Resources: Compilation time and resource usage will vary.
  • Documentation: linked from the README (the link is labeled "文档", i.e., "Docs").

Highlighted Details

  • Supports a wide range of inference backends including TensorRT, OpenVINO, ONNXRuntime, MNN, TNN, ncnn, Core ML, AscendCL, and RKNN.
  • Enables zero-copy operations between preprocessing and inference for improved end-to-end performance.
  • Offers flexible parallel execution strategies for optimizing throughput and latency.
  • Facilitates rapid demo construction with support for various input/output formats.
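The zero-copy point above can be illustrated with a small NumPy sketch (illustrative only, not nndeploy code): preprocessing writes directly into the buffer the inference stage reads, so no copy occurs between the two stages.

```python
import numpy as np

# Buffer owned by the (hypothetical) inference stage.
infer_input = np.empty((2, 3), dtype=np.float32)

def preprocess_into(dst, raw):
    """Normalize in place into the destination buffer instead of
    allocating a new array -- the "zero-copy" handoff."""
    np.divide(raw, 255.0, out=dst)
    return dst

raw = np.array([[0, 127, 255], [64, 128, 192]], dtype=np.uint8)
out = preprocess_into(infer_input, raw)

# True: the preprocessing output *is* the inference input buffer.
print(np.shares_memory(out, infer_input))
```

The same idea applied at the framework level (shared tensors between pipeline stages) is what removes the per-frame copy from the end-to-end path.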

Maintenance & Community

The project is actively developing and welcomes contributions. A WeChat group is available for community discussion.

Licensing & Compatibility

The project's licensing is not explicitly stated in the README, which may pose a compatibility concern for commercial or closed-source integration.

Limitations & Caveats

The framework is described as being in its development stage. Some features, such as the memory pool and high-performance operators, are still under development, and support for certain models (e.g., Stable Diffusion, Qwen, SAM) is noted as "in progress."

Health Check

  • Last Commit: 2 days ago
  • Responsiveness: 1 day
  • Pull Requests (30d): 0
  • Issues (30d): 1
  • Star History: 58 stars in the last 30 days

Explore Similar Projects

  • VeOmni by ByteDance-Seed — Framework for scaling multimodal model training across accelerators. Created 11 months ago; updated 18 hours ago; 2k stars (top 0.3%). Starred by Yaowei Zheng (author of LLaMA-Factory), Yineng Zhang (Inference Lead at SGLang; Research Scientist at Together AI), and 1 more.

  • LitServe by Lightning-AI — AI inference pipeline framework. Created 2 years ago; updated 21 hours ago; 4k stars (top 0.1%). Starred by Chip Huyen (author of "AI Engineering" and "Designing Machine Learning Systems"), Luis Capelo (cofounder of Lightning AI), and 3 more.

  • torchtitan by pytorch — PyTorch platform for generative AI model training research. Created 2 years ago; updated 19 hours ago; 5k stars (top 0.2%). Starred by Andrej Karpathy (founder of Eureka Labs; formerly at Tesla, OpenAI; author of CS 231n), Woosuk Kwon (coauthor of vLLM), and 15 more.