nndeploy by nndeploy

Multi-platform inference deployment framework

Created 2 years ago

1,681 stars

Top 25.0% on SourcePulse

Project Summary

This framework simplifies high-performance, multi-platform inference deployment for AI models. It targets developers needing to deploy models across diverse hardware and operating systems, offering a unified codebase and efficient execution.

How It Works

The core approach models deployment as a Directed Acyclic Graph (DAG), where preprocessing, inference, and postprocessing are distinct nodes. Because a graph can itself be embedded as a node in another graph ("graph-in-graph"), complex multi-model pipelines can be composed from modular parts. The framework emphasizes performance through several execution modes (serial, pipeline-parallel, task-parallel, and combined) and manages resources via thread and memory pools.
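The DAG idea can be illustrated with a minimal Python sketch. This is not nndeploy's API: the `Node` and `Graph` classes, the `inputs` convention, and the reserved edge name `"input"` are all hypothetical, chosen only to show how a graph that is also a node allows nesting, and how serial mode amounts to running nodes in topological order.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


class Node:
    """A processing step with named upstream dependencies (hypothetical API)."""

    def __init__(self, name, func, inputs=()):
        self.name = name
        self.func = func
        self.inputs = tuple(inputs)

    def run(self, *args):
        return self.func(*args)


class Graph(Node):
    """A DAG of nodes that is itself a Node, so graphs can be nested."""

    def __init__(self, name, inputs=()):
        super().__init__(name, None, inputs)
        self.nodes = {}
        self.output = None

    def add(self, node, is_output=False):
        self.nodes[node.name] = node
        if is_output:
            self.output = node.name
        return node

    def run(self, *args):
        # "input" is the reserved name for the graph's single external input.
        values = {"input": args[0]}
        deps = {n.name: set(n.inputs) for n in self.nodes.values()}
        # Serial mode: execute one node at a time in topological order.
        for name in TopologicalSorter(deps).static_order():
            if name in self.nodes:
                node = self.nodes[name]
                values[name] = node.run(*(values[i] for i in node.inputs))
        return values[self.output]


def build_detector():
    """Preprocess -> infer -> postprocess, with toy stand-in functions."""
    g = Graph("detector", inputs=["input"])
    g.add(Node("pre", lambda x: [v / 255.0 for v in x], ["input"]))
    g.add(Node("infer", lambda x: [v * 2.0 for v in x], ["pre"]))
    g.add(Node("post", lambda x: max(x), ["infer"]), is_output=True)
    return g


# Graph-in-graph: the whole detector is a single node of the outer graph.
top = Graph("app")
top.add(build_detector())
top.add(Node("report", lambda s: f"score={s:.2f}", ["detector"]), is_output=True)

result = top.run([0, 51, 255])
print(result)  # score=2.00
```

Treating the graph as a subclass of the node type is one simple way to get nesting; a parallel executor could reuse the same dependency map and dispatch nodes whose inputs are ready to a thread pool instead of running them in order.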

Quick Start & Requirements

Installation: Compilation from source is the primary method.
Prerequisites: Specific build requirements depend on the target platform and inference backend. Linux is the most comprehensively supported OS.
Resources: Compilation time and resource usage will vary.
Documentation: Docs

Highlighted Details

Supports a wide range of inference backends, including TensorRT, OpenVINO, ONNX Runtime, MNN, TNN, ncnn, Core ML, AscendCL, and RKNN.
Enables zero-copy operations between preprocessing and inference for improved end-to-end performance.
Offers flexible parallel execution strategies for optimizing throughput and latency.
Facilitates rapid demo construction with support for various input/output formats.
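As an illustration of the pipeline execution mode mentioned above, the following Python sketch runs each stage in its own thread and connects stages with bounded queues, so different frames can occupy different stages at the same time. It is a generic pattern under stated assumptions, not nndeploy's implementation: `stage`, `run_pipeline`, the `capacity` parameter, and the toy stage functions are hypothetical names.

```python
import queue
import threading

_STOP = object()  # sentinel that tells a stage to shut down


def stage(func, inbox, outbox):
    """Apply func to every item from inbox and forward results to outbox."""
    while True:
        item = inbox.get()
        if item is _STOP:
            outbox.put(_STOP)  # propagate shutdown downstream
            return
        outbox.put(func(item))


def run_pipeline(funcs, items, capacity=2):
    """Run funcs as pipeline stages, one thread per stage, preserving order."""
    queues = [queue.Queue(maxsize=capacity) for _ in range(len(funcs) + 1)]
    workers = [
        threading.Thread(target=stage, args=(f, queues[i], queues[i + 1]))
        for i, f in enumerate(funcs)
    ]

    def feed():
        # Feeding from a separate thread avoids blocking the caller when
        # the bounded queues are full.
        for item in items:
            queues[0].put(item)
        queues[0].put(_STOP)

    feeder = threading.Thread(target=feed)
    for t in [feeder, *workers]:
        t.start()

    results = []
    while True:
        out = queues[-1].get()
        if out is _STOP:
            break
        results.append(out)

    for t in [feeder, *workers]:
        t.join()
    return results


# Toy stand-ins for preprocessing, inference, and postprocessing.
def preprocess(frame):
    return [v / 255.0 for v in frame]


def infer(tensor):
    return sum(tensor)


def postprocess(score):
    return round(score, 3)


frames = [[0, 255], [255, 255], [51, 0]]
outputs = run_pipeline([preprocess, infer, postprocess], frames)
print(outputs)  # [1.0, 2.0, 0.2]
```

Items are handed between stages by reference, so no data is copied at the queue boundary. In CPython the global interpreter lock limits the speedup for pure-Python stage functions; the sketch is meant to show the structure of pipeline parallelism rather than to measure it.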

Maintenance & Community

The project is under active development and welcomes contributions. A WeChat group is available for community discussion.

Licensing & Compatibility

The project's licensing is not explicitly stated in the README, which may pose a compatibility concern for commercial or closed-source integration.

Limitations & Caveats

The framework is described as still being in development. Some features, such as the memory pool and high-performance operators, are unfinished, and support for certain models (e.g., Stable Diffusion, Qwen, SAM) is noted as "in progress."

Health Check

Last Commit: 2 weeks ago
Responsiveness: 1 day

Star History

60 stars in the last 30 days