OnnxSlim by inisis

ONNX model optimization toolkit

Created 1 year ago
437 stars

Top 68.2% on SourcePulse

Project Summary

OnnxSlim is a toolkit designed to optimize ONNX models by reducing operator count, aiming to improve inference speed while preserving accuracy. It targets developers and researchers seeking to enhance the performance of their ONNX models for deployment across various platforms.

How It Works

OnnxSlim applies graph simplification to ONNX models. Its core approach is to identify and eliminate redundant or equivalent operators in the model graph. The resulting graph is smaller and is intended to run inference faster while preserving the model's predictive accuracy.
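The idea of eliminating redundant or equivalent operators can be illustrated with a toy graph in plain Python. This is only a sketch of the general technique, not OnnxSlim's actual implementation: the `Node` class and `simplify` function are hypothetical names, and it assumes nodes are listed in topological order and that every operator is deterministic.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    """Toy stand-in for a graph operator (hypothetical, not an ONNX type)."""
    op: str
    inputs: tuple
    output: str


def simplify(nodes, graph_outputs):
    """Drop Identity nodes and duplicate nodes from a topologically sorted list.

    Assumes every op is deterministic, so two nodes with the same op and
    the same inputs always produce the same value.
    """
    rename = {}  # removed tensor name -> surviving tensor name
    seen = {}    # (op, inputs) -> output name of the first such node
    kept = []
    for node in nodes:
        # Redirect inputs that pointed at tensors removed earlier.
        inputs = tuple(rename.get(name, name) for name in node.inputs)
        # Never remove a node whose output name is a graph output.
        removable = node.output not in graph_outputs
        if node.op == "Identity" and removable:
            rename[node.output] = inputs[0]
            continue
        key = (node.op, inputs)
        if key in seen and removable:
            rename[node.output] = seen[key]
            continue
        seen.setdefault(key, node.output)
        kept.append(Node(node.op, inputs, node.output))
    return kept


# Example graph: one pass-through Identity and one duplicated Relu.
example = [
    Node("Relu", ("x",), "a"),
    Node("Identity", ("a",), "b"),
    Node("Relu", ("x",), "c"),
    Node("Add", ("b", "c"), "y"),
]
slimmed = simplify(example, graph_outputs={"y"})
print(len(example), "nodes before,", len(slimmed), "nodes after")
```

In this example the Identity node and the second Relu are removed, and the Add node is rewired to read the surviving Relu output twice. A real optimizer also has to handle attributes, subgraphs, and shape information, which this sketch ignores.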

Quick Start & Requirements

  • Installation:
    • Prebuilt: pip install onnxslim
    • From Source: pip install git+https://github.com/inisis/OnnxSlim@main
  • Usage:
    • CLI: onnxslim <your_onnx_model> <slimmed_onnx_model>
    • Python API: Load model using onnx.load, apply onnxslim.slim, and save using onnx.save.
  • Prerequisites: Python and ONNX library. No specific versions or hardware requirements (like GPU/CUDA) are mentioned.
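The Python API flow listed above (load, slim, save) might look like the following minimal sketch. It assumes the `onnx` and `onnxslim` packages are installed; the helper name `slim_file` and the file paths in the usage comment are hypothetical and chosen only for illustration.

```python
def slim_file(src_path: str, dst_path: str) -> None:
    """Load an ONNX model, slim it, and save the result (illustrative helper)."""
    # Imported inside the function so the helper can be defined even when
    # the packages are not installed; calling it still requires both.
    import onnx
    import onnxslim

    model = onnx.load(src_path)      # read the original model
    slimmed = onnxslim.slim(model)   # apply OnnxSlim's graph simplification
    onnx.save(slimmed, dst_path)     # write the slimmed model


# Usage (hypothetical paths):
# slim_file("model.onnx", "slimmed_model.onnx")
```

Comparing the outputs of the original and slimmed models on a few sample inputs is a reasonable check before deploying the slimmed file.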

Highlighted Details

  • Achieved 1st place in AICAS 2024 and 2025 LLM inference optimization challenges.
  • Reached 1 million downloads as of January 2025.
  • Key integrations include NVIDIA TensorRT-Model-Optimizer, HuggingFace Optimum, ultralytics, and transformers.js.
  • Reported a 5% performance increase when merged into MNN-LLM.

Maintenance & Community

  • Contact channels include Discord (https://discord.gg/nRw2Fd3VUS) and QQ Group (873569894).
  • Significant recent activity involves merging into major projects, suggesting a transition from standalone maintenance to integrated development.

Licensing & Compatibility

  • The project's license is not specified in the README. This lack of clarity presents a significant barrier for adoption, particularly in commercial contexts.

Limitations & Caveats

  • The project's primary development focus appears to have shifted towards its integration into larger frameworks like TensorRT-Model-Optimizer and HuggingFace Optimum. Standalone maintenance status is unclear.
  • The absence of a specified license creates ambiguity regarding usage rights and potential restrictions, especially for commercial use.
  • Detailed benchmarks or specific accuracy retention figures beyond challenge results are not provided.

Health Check

  • Last Commit: 5 days ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 22
  • Issues (30d): 7
  • Star History: 127 stars in the last 30 days
