intel/intel-extension-for-transformers: Transformer toolkit for GenAI/LLM acceleration on Intel platforms
Top 20.8% on SourcePulse
This toolkit accelerates Transformer-based models, particularly Large Language Models (LLMs), across Intel hardware (Gaudi2, CPUs, GPUs). It targets developers and researchers seeking to optimize LLM performance through advanced compression techniques and provides a customizable chatbot framework, NeuralChat.
How It Works
The extension integrates with Hugging Face Transformers, leveraging Intel® Neural Compressor for model compression. It employs advanced software optimizations and custom runtimes, including techniques from published research like "Fast Distilbert on CPUs" and "QuaLA-MiniLM," to achieve efficient inference and fine-tuning. It also offers a C/C++ inference engine with weight-only quantization kernels for Intel CPUs and GPUs.
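To make the weight-only quantization idea concrete, here is a minimal pure-Python sketch of the underlying technique: weights are stored as int8 with a per-row scale and dequantized on the fly during the matrix-vector product. This is an illustration of the general approach, not the toolkit's actual int4/int8 CPU/GPU kernels, which are implemented in optimized C/C++.

```python
def quantize_row(row):
    """Symmetric int8 quantization: scale = max|w| / 127."""
    scale = max(abs(w) for w in row) / 127 or 1.0
    q = [round(w / scale) for w in row]  # ints in [-127, 127]
    return q, scale

def dequant_matvec(q_rows, scales, x):
    """y = W @ x, dequantizing each int8 row with its per-row scale."""
    return [scale * sum(qw * xv for qw, xv in zip(q, x))
            for q, scale in zip(q_rows, scales)]

# Quantize a tiny 2x3 weight matrix, then run a matvec against it.
W = [[0.5, -1.0, 0.25], [2.0, 0.0, -0.5]]
x = [1.0, 2.0, 3.0]
q_rows, scales = zip(*(quantize_row(row) for row in W))
y = dequant_matvec(q_rows, scales, x)

# Reference result in full precision, for comparison.
exact = [sum(w * xv for w, xv in zip(row, x)) for row in W]
```

The quantized result `y` tracks the full-precision `exact` to within the int8 rounding error; real kernels additionally pack weights, block the scales, and fuse the dequantization into vectorized matmul loops.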
Quick Start & Requirements
pip install intel-extension-for-transformers

Hardware-specific dependencies are listed in requirements_cpu.txt, requirements_hpu.txt, and requirements_xpu.txt.

Highlighted Details
Maintenance & Community
Licensing & Compatibility
Limitations & Caveats
Last activity: about 1 year ago; the project is listed as inactive.
Related organizations: tunib-ai, ELS-RD, vllm-project, intel, huggingface, NVIDIA, openvinotoolkit