optimum by huggingface

Hardware optimization tools for Transformers, Diffusers, etc.

Created 4 years ago
3,086 stars

Top 15.5% on SourcePulse

Project Summary

Hugging Face Optimum provides a unified toolkit for accelerating the inference and training of Hugging Face Transformers, Diffusers, TIMM, and Sentence Transformers models across diverse hardware backends. It targets researchers and engineers seeking to maximize model performance on specific hardware without sacrificing ease of use.

How It Works

Optimum acts as an extension layer, abstracting hardware-specific optimizations. It supports exporting models to various formats like ONNX, ExecuTorch, TensorFlow Lite, and OpenVINO, enabling efficient execution via optimized runtimes. For training, it offers wrappers around the Hugging Face Trainer to leverage specialized hardware accelerators.
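
As an illustration of the export path, the sketch below converts a Transformers checkpoint to ONNX and runs it through ONNX Runtime. This is a minimal sketch, not the project's canonical example: it assumes the optimum[onnxruntime] extra is installed and uses a public sentiment-analysis checkpoint purely for illustration.

    from transformers import AutoTokenizer
    from optimum.onnxruntime import ORTModelForSequenceClassification

    model_id = "distilbert-base-uncased-finetuned-sst-2-english"  # illustrative checkpoint

    # export=True converts the PyTorch weights to ONNX on the fly and
    # loads the result into an ONNX Runtime inference session.
    model = ORTModelForSequenceClassification.from_pretrained(model_id, export=True)
    tokenizer = AutoTokenizer.from_pretrained(model_id)

    inputs = tokenizer("Optimum makes deployment easier.", return_tensors="pt")
    print(model(**inputs).logits)

    # Save the exported model so later loads skip the conversion step.
    model.save_pretrained("onnx_model/")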

Quick Start & Requirements

  • Install the core package: python -m pip install optimum
  • Install accelerator-specific extras: python -m pip install "optimum[accelerator_type]" (e.g., optimum[onnxruntime], optimum[executorch], optimum[neural-compressor], optimum[openvino], optimum[amd], optimum[neuronx], optimum[habana]). Quote the brackets so shells such as zsh do not expand them. A minimal usage check follows this list.
  • NVIDIA TensorRT-LLM requires Docker: docker run -it --gpus all --ipc host huggingface/optimum-nvidia.
  • Full documentation: https://huggingface.co/docs/optimum/index
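
As a quick sanity check after installing an extra, an exported model can be dropped into a standard transformers pipeline. A minimal sketch, assuming optimum[onnxruntime] is installed (the checkpoint name is illustrative):

    from transformers import AutoTokenizer, pipeline
    from optimum.onnxruntime import ORTModelForSequenceClassification

    model_id = "distilbert-base-uncased-finetuned-sst-2-english"
    model = ORTModelForSequenceClassification.from_pretrained(model_id, export=True)
    tokenizer = AutoTokenizer.from_pretrained(model_id)

    # ORT models act as drop-in replacements inside transformers pipelines.
    classifier = pipeline("text-classification", model=model, tokenizer=tokenizer)
    print(classifier("Installation looks good."))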

Highlighted Details

  • Supports ONNX, ExecuTorch, TensorFlow Lite, OpenVINO, Intel Neural Compressor, NVIDIA TensorRT-LLM, AMD Instinct, AWS Trainium/Inferentia, and Habana Gaudi.
  • Enables programmatic and CLI-based model export and optimization.
  • Provides wrappers for accelerated training on specialized hardware.
  • Integrates with PyTorch's native edge solution (ExecuTorch) and quantization tools such as Quanto; see the quantization sketch after this list.
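
As a concrete example of the quantization tooling, the ONNX Runtime backend ships a quantizer. The sketch below applies dynamic int8 quantization and assumes optimum[onnxruntime] is installed; the avx512_vnni preset targets recent Intel CPUs, and other AutoQuantizationConfig presets cover other targets.

    from optimum.onnxruntime import ORTModelForSequenceClassification, ORTQuantizer
    from optimum.onnxruntime.configuration import AutoQuantizationConfig

    model_id = "distilbert-base-uncased-finetuned-sst-2-english"  # illustrative checkpoint

    # Export to ONNX first; quantization operates on the exported graph.
    model = ORTModelForSequenceClassification.from_pretrained(model_id, export=True)
    quantizer = ORTQuantizer.from_pretrained(model)

    # Dynamic int8 quantization; the preset picks operator settings for AVX512-VNNI.
    qconfig = AutoQuantizationConfig.avx512_vnni(is_static=False, per_channel=False)
    quantizer.quantize(save_dir="onnx_quantized/", quantization_config=qconfig)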

Maintenance & Community

  • Developed by Hugging Face.
  • Community support channels and documentation are available via Hugging Face.

Licensing & Compatibility

  • Primarily Apache 2.0 licensed, consistent with the rest of the Hugging Face ecosystem.
  • Compatible with commercial and closed-source applications.

Limitations & Caveats

The README indicates that specific accelerator integrations may require separate installation steps or Docker images; users should consult the documentation for each backend's detailed prerequisites. Some integrations are at earlier stages of development than others.

Health Check

  • Last Commit: 3 days ago
  • Responsiveness: 1 week
  • Pull Requests (30d): 22
  • Issues (30d): 15

Star History

66 stars in the last 30 days

Explore Similar Projects

Starred by Patrick von Platen (Author of Hugging Face Diffusers; Research Engineer at Mistral), Lewis Tunstall (Research Engineer at Hugging Face), and 4 more.

fastformers by microsoft

0% · 707 stars
NLU optimization recipes for transformer models
Created 5 years ago · Updated 6 months ago
Starred by Luis Capelo (Cofounder of Lightning AI), Patrick von Platen (Author of Hugging Face Diffusers; Research Engineer at Mistral), and 4 more.

ktransformers by kvcache-ai

0.3% · 15k stars
Framework for LLM inference optimization experimentation
Created 1 year ago · Updated 2 days ago
Starred by Clement Delangue (Cofounder of Hugging Face), Chip Huyen (Author of "AI Engineering", "Designing Machine Learning Systems"), and 20 more.

accelerate by huggingface

0.3% · 9k stars
PyTorch training helper for distributed execution
Created 4 years ago · Updated 1 day ago