llm-export by wangzhaode

CLI tool to export LLMs to ONNX and MNN

Created 2 years ago
308 stars

Top 87.1% on SourcePulse

Project Summary

This tool addresses the need to convert large language models (LLMs) into ONNX and MNN formats for efficient deployment. It targets developers and researchers working with LLMs who require cross-platform compatibility and optimized inference, offering a streamlined process for model conversion and testing.

How It Works

The project leverages Hugging Face model repositories and provides a command-line interface for conversion. It supports dynamic shape optimization, constant folding, and ONNX model optimization via OnnxSlim, claiming up to 5% performance improvement. The tool also facilitates the export of LoRA weights and includes integrated testing capabilities for conversational and multimodal models.

Quick Start & Requirements

  • Install: pip install llmexport or pip install git+https://github.com/wangzhaode/llm-export@master
  • Prerequisites: Python, Git. Specific LLM model weights are required for conversion.
  • Usage: Download models via git clone from Hugging Face or ModelScope. Convert using llmexport --path <model_path> --export onnx or llmexport --path <model_path> --export mnn. Test with llmexport --path <model_path> --test "<query>".
  • Docs: https://github.com/wangzhaode/llm-export
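The steps above can be combined into a short end-to-end script. A minimal sketch; the Qwen2-1.5B-Instruct repository is an illustrative choice, so substitute whichever Hugging Face or ModelScope model you need:

```shell
# Install the exporter (assumes Python and Git are available)
pip install llmexport

# Download model weights (illustrative model; any supported repo works)
git clone https://huggingface.co/Qwen/Qwen2-1.5B-Instruct

# Export to ONNX or MNN
llmexport --path Qwen2-1.5B-Instruct --export onnx
llmexport --path Qwen2-1.5B-Instruct --export mnn

# Smoke-test the exported model with a sample query
llmexport --path Qwen2-1.5B-Instruct --test "Hello"
```

Note that model downloads can be tens of gigabytes, so cloning from ModelScope may be faster in some regions.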

Highlighted Details

  • Supports ONNX and MNN export formats.
  • Includes ONNX model optimization with OnnxSlim.
  • Facilitates LoRA weight merging and splitting.
  • Supports AWQ and GPTQ quantized model weights.
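llm-export applies OnnxSlim internally during export; for reference, the same optimization can also be run standalone on an already-exported graph. A sketch, assuming the `onnxslim` package and an exported `model.onnx` in the working directory:

```shell
pip install onnxslim
# Constant folding and graph simplification; writes the slimmed model
onnxslim model.onnx model_slim.onnx
```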

Maintenance & Community

The repository is actively maintained by wangzhaode. Community engagement details such as Discord/Slack channels are not explicitly mentioned in the README.

Licensing & Compatibility

The README does not specify a license. This absence creates legal uncertainty for commercial or closed-source use, since no redistribution or modification rights are explicitly granted.

Limitations & Caveats

The project does not explicitly state its license, which could impede commercial adoption. While the README lists many supported models, compatibility with every LLM architecture or quantization method is not guaranteed, and untested architectures may fail to convert.

Health Check

  • Last Commit: 2 weeks ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 6 stars in the last 30 days

Explore Similar Projects

  • lm.rs by samuel-vitorino: Minimal LLM inference in Rust. 1k stars; created 1 year ago, updated 10 months ago. Starred by Sasha Rush (Research Scientist at Cursor; Professor at Cornell Tech) and Clément Renault (Cofounder of Meilisearch).