Curated list for model quantization research
This repository is a curated, comprehensive list of papers, documentation, and code on model quantization, aimed at machine learning researchers and practitioners. It consolidates work on reducing the numerical precision of neural network weights and activations to improve efficiency, particularly for large models.
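For context, the core idea the listed papers build on is mapping floating-point tensors to low-bit integers and back. The sketch below is a minimal illustration of uniform affine (asymmetric) quantization, assuming nothing about any specific paper or library in the list; the function names and the 8-bit setting are illustrative only.

```python
import numpy as np

def quantize_uniform(x, num_bits=8):
    """Map a float array to num_bits unsigned integers (asymmetric uniform scheme)."""
    qmin, qmax = 0, 2 ** num_bits - 1
    scale = max((x.max() - x.min()) / (qmax - qmin), 1e-8)  # avoid divide-by-zero
    zero_point = int(round(qmin - x.min() / scale))
    q = np.clip(np.round(x / scale + zero_point), qmin, qmax).astype(np.uint8)
    return q, scale, zero_point

def dequantize(q, scale, zero_point):
    """Recover an approximate float array from the quantized representation."""
    return scale * (q.astype(np.float32) - zero_point)

# Toy "layer weights": the round-trip error is bounded by roughly scale / 2 per element.
weights = np.random.randn(4, 4).astype(np.float32)
q, scale, zp = quantize_uniform(weights)
print("max abs error:", np.abs(weights - dequantize(q, scale, zp)).max())
```

Techniques catalogued in the list (binarization, post-training quantization, quantization-aware training, LLM-specific methods) refine this basic float-to-integer mapping in various ways.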
How It Works
The repository is structured as an "awesome list," categorizing resources by year, topic (e.g., binarization, LLM quantization), and specific benchmarks. It links seminal papers, recent research, and associated code repositories, supporting both a broad overview of the field and deep dives into individual subareas of model quantization.
Quick Start & Requirements
This is a curated list, not a runnable software project. No installation or execution commands are applicable.
Maintenance & Community
The project is community-driven, with an open invitation for pull requests that add missing works. It is maintained by the "Efficient-ML" organization.
Licensing & Compatibility
The repository itself is licensed under the MIT License, allowing for broad reuse. Individual papers and code repositories linked within the list will have their own respective licenses.
Limitations & Caveats
As a curated list, the repository provides no executable code or direct functionality. Its value depends entirely on the completeness and accuracy of community contributions.