optimum by huggingface

Hardware optimization tools for Transformers, Diffusers, and other Hugging Face libraries.

created 4 years ago
3,001 stars

Top 16.3% on sourcepulse

Project Summary

Hugging Face Optimum provides a unified toolkit for accelerating the inference and training of Hugging Face Transformers, Diffusers, TIMM, and Sentence Transformers models across diverse hardware backends. It targets researchers and engineers seeking to maximize model performance on specific hardware without sacrificing ease of use.

How It Works

Optimum acts as an extension layer, abstracting hardware-specific optimizations. It supports exporting models to various formats like ONNX, ExecuTorch, TensorFlow Lite, and OpenVINO, enabling efficient execution via optimized runtimes. For training, it offers wrappers around the Hugging Face Trainer to leverage specialized hardware accelerators.
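
As a concrete illustration, here is a minimal sketch of programmatic ONNX export, assuming the optimum[onnxruntime] extra is installed; the model checkpoint name is illustrative, not prescribed by the project:

    # Sketch: export a Transformers checkpoint to ONNX via Optimum
    # (assumes `pip install optimum[onnxruntime]`; model name is illustrative)
    from optimum.onnxruntime import ORTModelForSequenceClassification

    # export=True converts the PyTorch weights to ONNX on the fly
    model = ORTModelForSequenceClassification.from_pretrained(
        "distilbert-base-uncased-finetuned-sst-2-english", export=True
    )
    model.save_pretrained("onnx_model/")  # writes the ONNX graph plus configs

The same export is also exposed on the command line, e.g. optimum-cli export onnx --model <model_id> onnx_model/.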

Quick Start & Requirements

  • Install the core package: python -m pip install optimum
  • Install accelerator-specific extras: python -m pip install optimum[accelerator_type] (e.g., optimum[onnxruntime], optimum[executorch], optimum[neural-compressor], optimum[openvino], optimum[amd], optimum[neuronx], optimum[habana]); a minimal inference sketch using the onnxruntime extra follows this list.
  • NVIDIA TensorRT-LLM requires Docker: docker run -it --gpus all --ipc host huggingface/optimum-nvidia.
  • Full documentation: https://huggingface.co/docs/optimum/index
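
Once installed, a minimal inference sketch with the onnxruntime extra; the model ID and input text are illustrative, and the ORTModel classes act as drop-in replacements inside transformers pipelines:

    # Sketch: ONNX Runtime inference through the transformers pipeline API
    from transformers import AutoTokenizer, pipeline
    from optimum.onnxruntime import ORTModelForSequenceClassification

    model_id = "distilbert-base-uncased-finetuned-sst-2-english"  # illustrative
    model = ORTModelForSequenceClassification.from_pretrained(model_id, export=True)
    tokenizer = AutoTokenizer.from_pretrained(model_id)

    classifier = pipeline("text-classification", model=model, tokenizer=tokenizer)
    print(classifier("Optimum makes hardware-accelerated inference easy!"))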

Highlighted Details

  • Supports ONNX, ExecuTorch, TensorFlow Lite, OpenVINO, Intel Neural Compressor, NVIDIA TensorRT-LLM, AMD Instinct, AWS Trainium/Inferentia, and Habana Gaudi.
  • Enables programmatic and CLI-based model export and optimization (a quantization sketch follows this list).
  • Provides wrappers for accelerated training on specialized hardware.
  • Integrates with PyTorch's native edge solution (ExecuTorch) and quantization tools like Quanto.
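
As one optimization example, a hedged sketch of post-training dynamic quantization with the ONNX Runtime backend; the paths and the avx512_vnni config choice are illustrative assumptions, not the only options:

    # Sketch: dynamic INT8 quantization of an already-exported ONNX model
    from optimum.onnxruntime import ORTQuantizer
    from optimum.onnxruntime.configuration import AutoQuantizationConfig

    quantizer = ORTQuantizer.from_pretrained("onnx_model/")  # exported earlier
    qconfig = AutoQuantizationConfig.avx512_vnni(is_static=False, per_channel=False)
    quantizer.quantize(save_dir="onnx_model_quantized/", quantization_config=qconfig)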

Maintenance & Community

  • Developed by Hugging Face.
  • Community support channels and documentation are available via Hugging Face.

Licensing & Compatibility

  • Primarily Apache 2.0 licensed, consistent with the Hugging Face ecosystem.
  • Compatible with commercial and closed-source applications.

Limitations & Caveats

The README indicates that specific accelerator integrations might require separate installation steps or Docker images, and users should consult the documentation for detailed prerequisites for each backend. Some integrations may be in earlier stages of development.

Health Check

  • Last commit: 1 day ago
  • Responsiveness: 1 week
  • Pull Requests (30d): 20
  • Issues (30d): 19

Star History

  • 146 stars in the last 90 days

Starred by Tobi Lutke (Cofounder of Shopify), Chip Huyen (Author of AI Engineering, Designing Machine Learning Systems), and 7 more.

Explore Similar Projects

ctransformers by marella (2k stars, top 0.1%)

Python bindings for fast Transformer model inference
created 2 years ago, updated 1 year ago
Starred by Jeff Hammerbacher (Cofounder of Cloudera), Stas Bekman (Author of Machine Learning Engineering Open Book; Research Engineer at Snowflake), and 6 more.

gpt-neox by EleutherAI (7k stars, top 0.1%)

Framework for training large-scale autoregressive language models
created 4 years ago, updated 1 week ago
Starred by Chip Huyen (Author of AI Engineering, Designing Machine Learning Systems), Eugene Yan (AI Scientist at AWS), and 10 more.

accelerate by huggingface (9k stars, top 0.2%)

PyTorch training helper for distributed execution
created 4 years ago, updated 2 days ago