Awesome-Quantization-Papers by Zhen-Dong

Paper list for neural network quantization research

Created 3 years ago
717 stars

Top 48.0% on SourcePulse

Project Summary

This repository serves as a curated and actively updated catalog of research papers on neural network quantization, focusing on techniques for efficient deep learning inference. It targets researchers, engineers, and practitioners in AI and machine learning who need to stay abreast of the latest advancements in model compression and optimization. The primary benefit is a structured overview of quantization methods, categorized by model architecture and application, facilitating targeted research and development.

How It Works

The repository organizes papers by conference (e.g., ICLR, NeurIPS, CVPR) and model type (e.g., Transformers, CNNs, Diffusion Models, Vision Transformers). Each entry includes a link to the paper and often keywords indicating the quantization approach (e.g., PTQ for Post-Training Quantization, Extreme for binary/ternary quantization, MP for mixed-precision). This structured approach allows users to quickly identify relevant research and understand the landscape of quantization techniques.
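To make the PTQ label concrete, the sketch below shows uniform symmetric post-training quantization of a weight tensor to int8. This is a minimal illustration only; the function names are illustrative, and the actual methods in the listed papers (e.g., GPTQ, SmoothQuant) add calibration data, error compensation, or activation smoothing on top of this basic scheme.

```python
import numpy as np

def quantize_ptq(weights, num_bits=8):
    """Uniform symmetric PTQ: map floats to signed integers with one scale.

    Illustrative sketch only -- not taken from any specific paper in the list.
    """
    qmax = 2 ** (num_bits - 1) - 1               # e.g. 127 for int8
    scale = np.abs(weights).max() / qmax          # per-tensor scale factor
    q = np.clip(np.round(weights / scale), -qmax - 1, qmax).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    """Recover approximate float weights from the quantized integers."""
    return q.astype(np.float32) * scale

# Round-trip a small weight tensor and inspect the quantization error.
w = np.array([0.5, -1.0, 0.25, 0.0], dtype=np.float32)
q, s = quantize_ptq(w)
w_hat = dequantize(q, s)
```

"Non-uniform" methods replace the single linear scale with learned or clustered quantization levels, "Extreme" restricts values to 1-2 bits, and "MP" (mixed-precision) assigns different bit-widths per layer or channel.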

Quick Start & Requirements

This is a curated list of papers; no installation or execution is required. The content is accessible via the GitHub repository.

Highlighted Details

  • Comprehensive coverage of recent AI conferences and arXiv preprints.
  • Categorization by model architecture (LLMs, ViTs, CNNs, Diffusion Models) and task (Image Classification, Object Detection, Super Resolution).
  • Keywords and labels (PTQ, Non-uniform, Extreme, MP) to quickly identify quantization methods.
  • Active updates, with recent additions from ICLR-25, ECCV-24, NeurIPS-24, ICML-24, and CVPR-24.

Maintenance & Community

The repository is actively maintained and welcomes contributions to expand its scope. It acknowledges collaborators and encourages community engagement through starring and sharing.

Licensing & Compatibility

No license is confirmed in this summary; curated paper lists of this kind are typically released under permissive terms (e.g., MIT, Apache 2.0) allowing broad use and contribution. The linked papers remain subject to their respective publication licenses.

Limitations & Caveats

This repository is a bibliography and does not provide code implementations or benchmarks for the papers listed. Users must refer to individual papers for implementation details and performance validation.

Health Check

  • Last Commit: 5 months ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 1
  • Star History: 31 stars in the last 30 days

Explore Similar Projects

Starred by Yineng Zhang (Inference Lead at SGLang; Research Scientist at Together AI), Zack Li (Cofounder of Nexa AI), and 4 more.

smoothquant by mit-han-lab

0.3% · 2k stars
Post-training quantization (SmoothQuant) for large language models
Created 2 years ago · Updated 1 year ago
Starred by Yaowei Zheng (Author of LLaMA-Factory), Chip Huyen (Author of "AI Engineering", "Designing Machine Learning Systems"), and 6 more.

gptq by IST-DASLab

0.1% · 2k stars
Code for GPTQ: Accurate Post-training Quantization of Generative Pretrained Transformers
Created 2 years ago · Updated 1 year ago
Starred by Junyang Lin (Core Maintainer at Alibaba Qwen), Chip Huyen (Author of "AI Engineering", "Designing Machine Learning Systems"), and 3 more.

neural-compressor by intel

0.2% · 2k stars
Python library for model compression (quantization, pruning, distillation, NAS)
Created 5 years ago · Updated 16 hours ago