ONNX-MLIR provides a compiler infrastructure to transform ONNX graphs into optimized code for various targets, including CPUs and specialized accelerators. It targets researchers and developers needing to deploy ONNX models efficiently with minimal runtime dependencies, offering flexibility in output formats like LLVM IR, object files, or shared libraries.
How It Works
The project builds on the MLIR (Multi-Level Intermediate Representation) compiler framework, part of the LLVM project, and uses LLVM for final code generation. It defines an ONNX dialect within MLIR to represent ONNX graphs directly, which allows for staged lowering: ONNX graphs are first imported into the ONNX dialect, then progressively lowered through intermediate MLIR dialects (e.g., affine, SCF) down to LLVM IR, which is compiled to native code. This approach enables sophisticated compiler optimizations and target-specific code generation.
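As a rough illustration of that staging, the sketch below drives the onnx-mlir command-line compiler to stop at successive emit points. It assumes the onnx-mlir binary is on PATH and a model.onnx file is present; the --Emit* flags are documented project options, though output file names and exact stage contents vary by version.

```python
import subprocess

# Hypothetical input model; any valid ONNX file works.
MODEL = "model.onnx"

# Each --Emit* flag stops the lowering pipeline at a different stage.
STAGES = {
    "--EmitONNXIR": "ONNX dialect, after import and ONNX-level rewrites",
    "--EmitMLIR": "lower-level MLIR dialects (e.g., affine/SCF)",
    "--EmitLLVMIR": "LLVM-dialect IR / LLVM IR, ready for native codegen",
}

for flag, stage in STAGES.items():
    print(f"{flag}: {stage}")
    subprocess.run(["onnx-mlir", flag, MODEL], check=True)
```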
Quick Start & Requirements
- Install/Run: Prebuilt Docker images are the recommended approach; a compile-and-run sketch follows this list. Building directly requires Python >= 3.8, GCC >= 6.4, Protobuf >= 4.21.12, CMake >= 3.13.4, Make >= 4.2.1 or Ninja >= 1.10.2, and optionally a Java JDK >= 11 for the JNI output path.
- Dependencies: Relies on specific commits of the LLVM project.
- Resources: Building from source is time- and memory-intensive, largely because the pinned LLVM/MLIR toolchain must be built first; dependency management adds further complexity.
- Docs: https://onnx.ai/onnx-mlir/
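A minimal compile-and-run sketch under stated assumptions: the onnx-mlir binary is on PATH, and the PyRuntime module built alongside onnx-mlir is importable. The class name OMExecutionSession matches recent builds of the project's Python driver, but verify against your own build (older versions exposed ExecutionSession); the input shape here is hypothetical.

```python
import subprocess

import numpy as np
from PyRuntime import OMExecutionSession  # name per recent onnx-mlir builds

# Compile model.onnx into a standalone shared library (model.so on Linux).
subprocess.run(["onnx-mlir", "-O3", "--EmitLib", "model.onnx"], check=True)

# Load the compiled library and run inference with plain NumPy arrays.
session = OMExecutionSession("model.so")
dummy = np.zeros((1, 3, 224, 224), dtype=np.float32)  # hypothetical input shape
outputs = session.run([dummy])
print([o.shape for o in outputs])
```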
Highlighted Details
- Lowers ONNX graphs to ONNX-dialect MLIR, lower-level MLIR, LLVM IR, object files, shared libraries, or JNI jars via the --EmitONNXIR, --EmitMLIR, --EmitLLVMIR, --EmitObj, --EmitLib, and --EmitJNI flags (see the artifact sketch after this list).
- Offers multiple optimization levels (-O0 to -O3).
- Includes an ONNX dialect for direct integration into other MLIR projects.
- Used by the IBM Z Deep Learning Compiler (zDLC) to target IBM Telum processors.
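To illustrate how the emit flags and optimization levels compose, here is a hedged sketch that requests each deployable artifact in turn; it assumes onnx-mlir is on PATH and a model.onnx file exists, and the exact output file names may differ by platform and version.

```python
import subprocess

# Any optimization level -O0 .. -O3 composes with any emit flag.
ARTIFACTS = {
    "--EmitObj": "object file (link into your own binary)",
    "--EmitLib": "shared library callable from C/C++/Python",
    "--EmitJNI": "jar wrapping the model for Java callers",
}

for flag, artifact in ARTIFACTS.items():
    subprocess.run(["onnx-mlir", "-O3", flag, "model.onnx"], check=True)
    print(f"{flag} -> {artifact}")
```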
Maintenance & Community
- Active community via a Slack channel (#onnx-mlir-discussion) and GitHub Issues.
- Weekly informal meetings for discussions.
- Contributions require a Developer Certificate of Origin (DCO) sign-off.
- Contributors are expected to follow the project's Code of Conduct.
Licensing & Compatibility
- The project appears to be licensed under Apache 2.0, though the README does not state a single license for the whole project explicitly, and individual components or dependencies may carry different licenses. Apache 2.0 is generally compatible with commercial use.
Limitations & Caveats
- Dependency on specific LLVM commits can complicate upgrades.
- ONNX operator coverage varies by target; per-operator support is documented separately for generic CPUs and the IBM Telum integrated AI accelerator (NNPA).
- Setup can be tricky without Docker due to complex dependencies.