nndeploy by nndeploy

Multi-platform inference deployment framework

Created 2 years ago
1,782 stars

Top 23.6% on SourcePulse

Project Summary

This framework simplifies high-performance, multi-platform inference deployment for AI models. It targets developers needing to deploy models across diverse hardware and operating systems, offering a unified codebase and efficient execution.

How It Works

The core approach models deployment as a Directed Acyclic Graph (DAG), where preprocessing, inference, and postprocessing are distinct nodes. This DAG structure, with its "graph-in-graph" capability, allows for modularity and efficient composition of complex multi-model pipelines. The framework emphasizes performance through various parallel execution modes (serial, pipeline, task, and combined) and resource management via thread and memory pools.
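The DAG model described above can be sketched in a few lines. This is an illustrative toy, not nndeploy's actual API: the `Graph` class, node names, and the trivial "inference" function are invented for the example, and it implements only the serial execution mode.

```python
class Graph:
    """Toy DAG runner: nodes execute in insertion (topological) order."""

    def __init__(self):
        self.nodes = []  # list of (name, fn, dependency_names)

    def node(self, name, fn, deps=()):
        self.nodes.append((name, fn, deps))

    def run(self, **feeds):
        cache = dict(feeds)  # node name -> produced value
        for name, fn, deps in self.nodes:
            cache[name] = fn(*[cache[d] for d in deps])
        return cache

# Preprocessing, inference, and postprocessing as three distinct nodes.
g = Graph()
g.node("pre",   lambda img: [p / 255.0 for p in img], deps=("image",))
g.node("infer", lambda x: sum(x),                     deps=("pre",))
g.node("post",  lambda y: round(y, 2),                deps=("infer",))

out = g.run(image=[64, 128, 192])
print(out["post"])  # -> 1.51

# "Graph-in-graph": a whole graph can be wrapped as one node of a larger graph.
outer = Graph()
outer.node("pipeline", lambda img: g.run(image=img)["post"], deps=("image",))
```

The same node list could instead be dispatched to worker threads (task parallelism) or run stage-by-stage over a stream of inputs (pipeline parallelism), which is the flexibility the framework's execution modes provide.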

Quick Start & Requirements

  • Installation: Compilation from source is the primary method.
  • Prerequisites: Specific build requirements depend on the target platform and inference backend. Linux is the most comprehensively supported OS.
  • Resources: Compilation time and footprint vary with the target platform and the inference backends enabled.
  • Documentation: see the project's documentation (the README links it as "文档", i.e. "documentation").

Highlighted Details

  • Supports a wide range of inference backends including TensorRT, OpenVINO, ONNX Runtime, MNN, TNN, ncnn, Core ML, AscendCL, and RKNN.
  • Enables zero-copy operations between preprocessing and inference for improved end-to-end performance.
  • Offers flexible parallel execution strategies for optimizing throughput and latency.
  • Facilitates rapid demo construction with support for various input/output formats.
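The zero-copy point above can be illustrated with a shared buffer: preprocessing writes its result into memory that the inference stage then reads in place, so no intermediate copy is made between the two nodes. This is a schematic sketch under invented assumptions (the buffer size, the halving "preprocessing", and the summing "inference" are all placeholders), not nndeploy's implementation.

```python
buf = bytearray(4)        # shared storage for the preprocessed tensor
view = memoryview(buf)    # zero-copy window onto the same memory

def preprocess(raw):
    # Write results directly into the shared buffer instead of
    # building a new output object that would later be copied.
    for i, b in enumerate(raw):
        view[i] = b // 2

def infer():
    # Read the same memory the preprocessor wrote; no copy in between.
    return sum(view)

preprocess(bytes([10, 20, 30, 40]))
print(infer())  # -> 50
```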

Maintenance & Community

The project is actively developing and welcomes contributions. A WeChat group is available for community discussion.

Licensing & Compatibility

The project's licensing is not explicitly stated in the README, which may pose a compatibility concern for commercial or closed-source integration.

Limitations & Caveats

The framework is described as being in its development stage. Some features, such as the memory pool and high-performance operators, are still under development. Support for certain models (e.g., Stable Diffusion, QWen, SAM) is also noted as "in progress."

Health Check

  • Last Commit: 3 days ago
  • Responsiveness: 1 day
  • Pull Requests (30d): 2
  • Issues (30d): 0
  • Star History: 28 stars in the last 30 days

Explore Similar Projects

Starred by Chip Huyen (author of "AI Engineering" and "Designing Machine Learning Systems"), Luis Capelo (cofounder of Lightning AI), and 3 more.

LitServe by Lightning-AI

AI inference pipeline framework
4k stars (top 0.1% on SourcePulse)
Created 2 years ago · Updated 2 days ago
Starred by Andrej Karpathy (founder of Eureka Labs; formerly at Tesla, OpenAI; author of CS 231n), Woosuk Kwon (coauthor of vLLM), and 15 more.

torchtitan by pytorch

PyTorch platform for generative AI model training research
5k stars (top 0.3% on SourcePulse)
Created 2 years ago · Updated 22 hours ago