OnnxSlim by inisis

ONNX model optimization toolkit

Created 1 year ago

463 stars

Top 65.3% on SourcePulse

Project Summary

OnnxSlim is a toolkit designed to optimize ONNX models by reducing operator count, aiming to improve inference speed while preserving accuracy. It targets developers and researchers seeking to enhance the performance of their ONNX models for deployment across various platforms.

How It Works

OnnxSlim applies graph simplification to ONNX models. Its core approach is to identify and eliminate redundant or equivalent operators within the model graph. The result is a smaller graph with fewer operators, which is intended to reduce inference time while leaving the model's outputs, and therefore its accuracy, unchanged.
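To make the idea of removing redundant or equivalent operators concrete, here is a minimal toy sketch in plain Python. It is not OnnxSlim's implementation: the `Node` type and the `simplify` function are hypothetical names invented for this illustration, the graph is a simple list rather than a real ONNX model, and it assumes nodes are listed in topological order, carry no attributes, and are deterministic.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    """Toy operator (hypothetical type): op name, input tensor names, one output name."""
    op: str
    inputs: tuple
    output: str


def simplify(nodes, graph_outputs):
    """Drop Identity nodes and merge duplicate operators.

    Assumes `nodes` is topologically sorted and that two nodes with the same
    op and the same inputs always compute the same value.
    """
    rename = {}  # removed tensor name -> surviving tensor name
    seen = {}    # (op, resolved inputs) -> output of the first matching node
    kept = []
    for node in nodes:
        # Point inputs at surviving tensors before inspecting the node.
        inputs = tuple(rename.get(name, name) for name in node.inputs)
        if node.op == "Identity":
            # Redundant operator: consumers can read the input directly.
            rename[node.output] = inputs[0]
            continue
        key = (node.op, inputs)
        if key in seen:
            # Equivalent operator: reuse the earlier node's output.
            rename[node.output] = seen[key]
            continue
        seen[key] = node.output
        kept.append(Node(node.op, inputs, node.output))
    # A real tool would preserve graph output names; this toy simply remaps them.
    outputs = [rename.get(name, name) for name in graph_outputs]
    return kept, outputs


graph = [
    Node("Relu", ("x",), "a"),
    Node("Identity", ("a",), "b"),  # redundant pass-through
    Node("Relu", ("x",), "c"),      # duplicate of the first Relu
    Node("Add", ("b", "c"), "y"),
]
slim_nodes, slim_outputs = simplify(graph, ["y"])
print(len(graph), "->", len(slim_nodes))  # prints "4 -> 2"
```

The four-node graph collapses to a single `Relu` feeding both inputs of `Add`. Real ONNX graphs add complications this sketch ignores, such as operator attributes, subgraphs, and initializers.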

Quick Start & Requirements

Installation:
- Prebuilt: pip install onnxslim
- From Source: pip install git+https://github.com/inisis/OnnxSlim@main
Usage:
- CLI: onnxslim <your_onnx_model> <slimmed_onnx_model>
- Python API: Load model using onnx.load, apply onnxslim.slim, and save using onnx.save.
Prerequisites: Python and the ONNX library (onnx). The README mentions no specific versions or hardware requirements (such as GPU/CUDA).
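The Python API steps listed under Usage can be sketched as a small helper. This assumes `onnxslim.slim` accepts a loaded model and returns the slimmed model, as the description above implies; the helper name `slim_file` and the file names in the usage note are hypothetical.

```python
def slim_file(src_path, dst_path):
    """Load an ONNX model, slim it, and save the result to dst_path."""
    # Imported inside the function so the helper can be defined even when
    # the onnx and onnxslim packages are not installed yet.
    import onnx
    import onnxslim

    model = onnx.load(src_path)      # read the original model from disk
    slimmed = onnxslim.slim(model)   # assumed to return the optimized model
    onnx.save(slimmed, dst_path)     # write the slimmed model to disk
    return slimmed
```

With both packages installed, a call such as `slim_file("model.onnx", "slimmed_model.onnx")` would correspond to the CLI form shown above. Comparing outputs of the original and slimmed models on sample inputs is a reasonable check before deployment.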

Highlighted Details

Achieved 1st place in AICAS 2024 and 2025 LLM inference optimization challenges.
Reached 1 million downloads as of January 2025.
Key integrations include NVIDIA TensorRT-Model-Optimizer, HuggingFace Optimum, ultralytics, and transformers.js.
Reported a 5% performance increase when merged into MNN-LLM.

Maintenance & Community

Contact channels include Discord (https://discord.gg/nRw2Fd3VUS) and QQ Group (873569894).
Recent activity centers on integration into larger projects, which may indicate a shift from standalone maintenance to integrated development.

Licensing & Compatibility

The project's license is not specified in the README. This lack of clarity presents a significant barrier for adoption, particularly in commercial contexts.

Limitations & Caveats

The project's primary development focus appears to have shifted towards its integration into larger frameworks like TensorRT-Model-Optimizer and HuggingFace Optimum. Standalone maintenance status is unclear.
The absence of a specified license creates ambiguity regarding usage rights and potential restrictions, especially for commercial use.
Detailed benchmarks or specific accuracy retention figures beyond challenge results are not provided.

Health Check

Last Commit

2 days ago

Responsiveness

Inactive

Star History

19 stars in the last 30 days