ML model compression resource list
Top 59.2% on SourcePulse
This repository curates resources for machine learning model compression and acceleration, targeting researchers and engineers seeking to reduce model size, improve inference speed, and lower computational costs. It provides a comprehensive collection of papers, tools, and tutorials covering techniques like quantization, pruning, and distillation.
How It Works
The collection categorizes research and tools by compression technique: quantization (low-bit precision), pruning (removing weights or neurons), distillation (transferring knowledge to smaller models), and low-rank approximation. It highlights papers and libraries implementing these methods, with particular attention to efficiency on mobile and edge devices and to recent advances in LLM compression.
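To make two of these techniques concrete, here is a minimal NumPy sketch of magnitude pruning and symmetric int8 quantization, the simplest forms of the methods the list covers. The function names and the 90% sparsity target are illustrative choices, not drawn from any specific library in the list.

```python
import numpy as np

def magnitude_prune(weights: np.ndarray, sparsity: float) -> np.ndarray:
    """Zero out the smallest-magnitude weights so that roughly
    `sparsity` fraction of entries become zero."""
    k = int(weights.size * sparsity)
    if k == 0:
        return weights.copy()
    # k-th smallest absolute value becomes the pruning threshold
    threshold = np.partition(np.abs(weights).ravel(), k - 1)[k - 1]
    pruned = weights.copy()
    pruned[np.abs(pruned) <= threshold] = 0.0
    return pruned

def quantize_int8(weights: np.ndarray):
    """Symmetric per-tensor quantization to int8: scale by the
    max absolute value so the range maps onto [-127, 127]."""
    scale = np.max(np.abs(weights)) / 127.0
    q = np.round(weights / scale).astype(np.int8)
    return q, scale

rng = np.random.default_rng(0)
w = rng.normal(size=(64, 64)).astype(np.float32)

pruned = magnitude_prune(w, sparsity=0.9)
q, scale = quantize_int8(w)
reconstructed = q.astype(np.float32) * scale

print(f"sparsity after pruning: {np.mean(pruned == 0):.2f}")
print(f"max quantization error: {np.max(np.abs(w - reconstructed)):.4f}")
```

Real toolchains layer much more on top of this (structured sparsity, per-channel scales, quantization-aware training), but the core arithmetic is as above.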
Quick Start & Requirements
Highlighted Details
Maintenance & Community
Licensing & Compatibility
Limitations & Caveats
This is a curated list of research and tools, not a runnable software package. Users must independently evaluate and integrate the referenced papers and libraries. Some advanced techniques may require specific hardware (e.g., GPUs) or significant computational resources for implementation.
Last updated 1 year ago; the repository is marked inactive.