sparsify by neuralmagic

ML model optimization for faster inference via sparsification

created 4 years ago
326 stars

Top 84.8% on sourcepulse

Project Summary

Sparsify is an ML model optimization product designed to accelerate inference through pruning, quantization, and distillation. It targets ML engineers and researchers seeking to improve model performance without significant accuracy loss, offering both a web application and a CLI/API for managing and running optimization experiments.

How It Works

Sparsify applies state-of-the-art optimization techniques via three experiment types: One-Shot (post-training pruning), Sparse-Transfer (leveraging pre-sparsified models), and Training-Aware (sparsification during training). These methods aim to achieve significant speedups (3-12x) with minimal accuracy degradation. The system integrates with Sparsify Cloud for hyperparameter tuning and result comparison, and the CLI/API for local execution and workflow integration.
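The pruning step at the heart of these experiment types can be sketched conceptually as magnitude pruning: zero out the smallest-magnitude weights until a target sparsity is reached. The following is a minimal pure-Python illustration of that idea, not Sparsify's actual implementation (the real One-Shot pathway operates on ONNX graphs, not Python lists):

```python
def magnitude_prune(weights, sparsity):
    """Zero out the smallest-magnitude fraction of weights.

    Conceptual sketch of one-shot magnitude pruning; illustrative only.
    """
    if not 0.0 <= sparsity <= 1.0:
        raise ValueError("sparsity must be in [0, 1]")
    k = int(len(weights) * sparsity)  # number of weights to zero out
    # Indices of the k smallest |w| values
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    dropped = set(order[:k])
    return [0.0 if i in dropped else w for i, w in enumerate(weights)]

pruned = magnitude_prune([0.9, -0.05, 0.4, 0.01, -0.7, 0.2], 0.5)
print(pruned)  # [0.9, 0.0, 0.4, 0.0, -0.7, 0.0]
```

Sparse-Transfer and Training-Aware recover accuracy that a one-shot pass like this would lose, by starting from pre-sparsified weights or pruning gradually during training.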

Quick Start & Requirements

  • Install: pip install sparsify-nightly
  • Prerequisites: Python 3.8–3.10; ONNX 1.5.0–1.12.0 with opset 11+; a manylinux-compliant system. A GPU with CUDA and cuDNN is required (16GB+ VRAM recommended). Linux only; Windows and macOS are not supported. A Neural Magic account is needed for API key authorization.
  • Resources: Minimum 128GB RAM, 4 CPU cores. Large models may require more RAM.
  • Docs: Quickstart Guide
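The prerequisites above can be captured as a simple pre-flight check. The helper below (`meets_prereqs` is a hypothetical name, not part of the Sparsify API) just encodes the constraints listed in this section:

```python
def meets_prereqs(python_version, onnx_opset, vram_gb, os_name):
    """Return a list of unmet prerequisites; empty means good to go.

    Hypothetical helper mirroring the requirements above; not a
    Sparsify function.
    """
    problems = []
    if not ((3, 8) <= python_version[:2] <= (3, 10)):
        problems.append("Python 3.8-3.10 required")
    if onnx_opset < 11:
        problems.append("ONNX opset 11+ required")
    if vram_gb < 16:
        problems.append("16GB+ GPU VRAM recommended")
    if os_name != "Linux":
        problems.append("Linux required (Windows/macOS unsupported)")
    return problems

print(meets_prereqs((3, 9, 7), 13, 24, "Linux"))  # → []
```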

Highlighted Details

  • Offers 3x-5x speedup with One-Shot, 5x-10x with Sparse-Transfer, and 6x-12x with Training-Aware experiments.
  • Supports CV and NLP use cases, with a current focus on LLMs.
  • Integrates with DeepSparse for optimized CPU inference.
  • Models must be in ONNX format for One-Shot, and PyTorch for Sparse-Transfer/Training-Aware.
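Alongside pruning, the quantization mentioned in the summary maps float weights to 8-bit integers. A minimal sketch of symmetric per-tensor int8 quantization, purely illustrative and not Sparsify's implementation:

```python
def quantize_int8(values):
    """Symmetric per-tensor int8 quantization: scale by max|v| / 127."""
    scale = max(abs(v) for v in values) / 127 or 1.0  # avoid div-by-zero scale
    q = [max(-128, min(127, round(v / scale))) for v in values]
    return q, scale

def dequantize(q, scale):
    """Map int8 values back to approximate floats."""
    return [x * scale for x in q]

q, s = quantize_int8([0.5, -1.0, 0.25])
print(q)  # [64, -127, 32]
```

Runtimes like DeepSparse exploit both the zeros from pruning and the narrow integer types from quantization to speed up CPU inference.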

Maintenance & Community

  • The project is currently in Alpha, with development paused for a new LLM-focused project. Non-LLM pathways (CV, NLP) will not receive further bug fixes or feature development.
  • Community support is available via Neural Magic Slack Channel and GitHub Issues.

Licensing & Compatibility

  • Licensed under the Apache License Version 2.0.
  • Compatible with commercial use.

Limitations & Caveats

  • Sparsify is in Alpha and not production-ready; APIs and UIs are subject to change.
  • Development focus has shifted to LLMs, with existing CV/NLP pathways no longer actively supported.
  • Requires specific hardware (GPU, CUDA) and OS (Linux), limiting broader adoption.
Health Check

  • Last commit: 2 months ago
  • Responsiveness: 1 week
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 2 stars in the last 90 days

Explore Similar Projects

Starred by Omar Sanseviero (DevRel at Google DeepMind), Chip Huyen (Author of AI Engineering, Designing Machine Learning Systems), and 3 more.

Medusa by FasterDecoding

0.2% · 3k stars
Framework for accelerating LLM generation using multiple decoding heads
created 1 year ago
updated 1 year ago
Starred by Chip Huyen (Author of AI Engineering, Designing Machine Learning Systems), Wei-Lin Chiang (Cofounder of LMArena), and 1 more.

deepsparse by neuralmagic

0% · 3k stars
CPU inference runtime for sparse deep learning models
created 4 years ago
updated 2 months ago
Starred by Chip Huyen (Author of AI Engineering, Designing Machine Learning Systems), Sebastian Raschka (Author of Build a Large Language Model From Scratch), and 2 more.

SimpleTuner by bghira

0.6% · 2k stars
Fine-tuning kit for diffusion models
created 2 years ago
updated 3 days ago