aimet-model-zoo by quic

Model zoo for quantized neural network performance analysis

Created 4 years ago
334 stars

Top 82.1% on SourcePulse

Project Summary

This repository provides a comprehensive model zoo for the AI Model Efficiency Toolkit (AIMET), showcasing popular neural network models quantized using AIMET's techniques. It targets researchers and engineers working on optimizing models for edge devices, offering a benchmark of floating-point versus quantized performance and providing scripts for quantization.

How It Works

The zoo demonstrates model quantization using AIMET, a toolkit supporting state-of-the-art techniques like Post-Training Quantization (PTQ) and Quantization-Aware Training (QAT). For each model, it provides links to original FP32 checkpoints and quantized versions, along with evaluation scripts that perform quantization and report accuracy metrics. This allows users to directly compare performance and leverage AIMET's capabilities.
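As a rough illustration of the post-training quantization idea mentioned above, the following is a minimal sketch in plain Python. It does not use AIMET's API; the function names (`calibrate_range`, `compute_encoding`, `fake_quantize`) and the sample values are hypothetical and chosen only for this example. It assumes a simple min/max calibration with asymmetric uniform quantization, which is one common PTQ scheme and not necessarily what any given script in the zoo does.

```python
from typing import List, Tuple


def calibrate_range(samples: List[float]) -> Tuple[float, float]:
    """Observe min/max over calibration data; the range is widened to include zero."""
    return min(0.0, min(samples)), max(0.0, max(samples))


def compute_encoding(lo: float, hi: float, bits: int = 8) -> Tuple[float, int]:
    """Derive scale and zero-point for asymmetric uniform quantization."""
    levels = 2 ** bits - 1
    scale = (hi - lo) / levels if hi > lo else 1.0
    zero_point = round(-lo / scale)
    return scale, zero_point


def fake_quantize(x: float, scale: float, zero_point: int, bits: int = 8) -> float:
    """Map a float to the integer grid, clamp it, then map it back to float."""
    q = round(x / scale) + zero_point
    q = max(0, min(2 ** bits - 1, q))
    return (q - zero_point) * scale


# Hypothetical calibration values standing in for observed activations.
calibration = [-1.0, -0.25, 0.0, 0.5, 3.0]
lo, hi = calibrate_range(calibration)
scale, zp = compute_encoding(lo, hi, bits=8)
dequantized = [fake_quantize(v, scale, zp) for v in calibration]
max_error = max(abs(a - b) for a, b in zip(calibration, dequantized))
print(f"scale={scale:.6f} zero_point={zp} max_error={max_error:.6f}")
```

The quantize-then-dequantize round trip is what makes this a simulation: values stay in floating point but are restricted to the grid an integer representation could express, so the accuracy impact can be measured without integer kernels.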

Quick Start & Requirements

  • Install AIMET and its dependencies following the official Installation instructions.
  • Install the AIMET model zoo Python package(s).
  • Run the evaluation scripts for specific models, following the .md files in the TensorFlow or PyTorch subfolders for detailed procedures.

Highlighted Details

  • Extensive coverage across PyTorch and TensorFlow frameworks.
  • Models span various tasks: Image Classification, Object Detection, Pose Estimation, Super Resolution, Semantic Segmentation, Video Understanding, Speech Recognition, and NLP/NLU.
  • Quantization levels include W8A8 (8-bit weights, 8-bit activations) and W4A8 (4-bit weights, 8-bit activations), with some models using mixed precision.
  • Performance comparisons (accuracy, mAP, mIOU, WER, GLUE score, etc.) are provided for FP32 and quantized models.
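To make the difference between 8-bit and 4-bit weights concrete, here is a small sketch that quantizes the same set of random weights at both bit-widths and compares the rounding error. It is written in plain Python under stated assumptions: symmetric per-tensor quantization, Gaussian-distributed weights, and hypothetical helper names (`quantize_symmetric`, `mean_abs_error`). It is not drawn from the repository's scripts and says nothing about the accuracy of any specific model in the zoo.

```python
import random
from typing import List


def quantize_symmetric(values: List[float], bits: int) -> List[float]:
    """Simulate signed symmetric quantization at the given bit-width."""
    qmax = 2 ** (bits - 1) - 1  # 127 for 8 bits, 7 for 4 bits
    max_abs = max(abs(v) for v in values)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    out = []
    for v in values:
        q = max(-qmax, min(qmax, round(v / scale)))
        out.append(q * scale)
    return out


def mean_abs_error(a: List[float], b: List[float]) -> float:
    """Average absolute difference between two equally sized lists."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


# Hypothetical weights: 1000 samples from a standard normal distribution.
rng = random.Random(0)
weights = [rng.gauss(0.0, 1.0) for _ in range(1000)]

err_w8 = mean_abs_error(weights, quantize_symmetric(weights, 8))
err_w4 = mean_abs_error(weights, quantize_symmetric(weights, 4))
print(f"W8 mean abs error: {err_w8:.5f}")
print(f"W4 mean abs error: {err_w4:.5f}")
```

With only 15 representable levels instead of 255, the 4-bit grid is much coarser, which is why W4A8 configurations typically depend more on techniques such as QAT or mixed precision to recover accuracy.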

Maintenance & Community

  • Maintained by Qualcomm Innovation Center, Inc.
  • Links to specific documentation and installation guides are provided within the README.

Licensing & Compatibility

  • License terms are given in the repository's LICENSE file; the README does not name a specific license type.

Limitations & Caveats

  • "TBD" (To Be Determined) entries in the results tables indicate that support or specific results are not yet available for those configurations.

Health Check

  • Last commit: 1 year ago
  • Responsiveness: Inactive
  • Pull requests (30d): 0
  • Issues (30d): 0
  • Star history: 4 stars in the last 30 days
