fastembed by qdrant

Fast embedding SDK for text and images

Created 2 years ago

2,733 stars

Top 17.0% on SourcePulse

6 Experts Love This Project

Omar Sanseviero

DevRel at Google DeepMind

and 2 more!

Project Summary

FastEmbed is a lightweight, fast Python library for generating text, image, and multimodal embeddings using state-of-the-art models. It targets developers and researchers needing efficient embedding generation for applications like retrieval-augmented generation (RAG), semantic search, and recommendation systems, offering a faster and more memory-efficient alternative to larger libraries.

How It Works

FastEmbed runs models on ONNX Runtime rather than PyTorch, which the project credits for faster inference. It supports several embedding types: dense text, sparse text (SPLADE++), late interaction (ColBERT), image, and multimodal. Models can be switched by name, custom models can be registered, and inference can run on either CPU or GPU.
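To make the embedding types above concrete, the sketch below shows how each kind is commonly scored against a query: cosine similarity for dense vectors, a dot product over shared token indices for sparse vectors, and MaxSim for late-interaction (ColBERT-style) token matrices. These are standard scoring formulas written in plain Python for illustration; the function names are hypothetical and none of this is FastEmbed's internal code.

```python
import math


def dot(a, b):
    """Dot product of two equal-length sequences of numbers."""
    return sum(x * y for x, y in zip(a, b))


def dense_score(query, doc):
    """Cosine similarity between two dense vectors (0.0 if either is all zeros)."""
    norm = math.sqrt(dot(query, query)) * math.sqrt(dot(doc, doc))
    return dot(query, doc) / norm if norm else 0.0


def sparse_score(query, doc):
    """Dot product of two sparse vectors stored as {token_index: weight} dicts.

    Only indices present in both vectors contribute to the score.
    """
    return sum(weight * doc[index] for index, weight in query.items() if index in doc)


def maxsim_score(query_tokens, doc_tokens):
    """Late-interaction (MaxSim) score over per-token vectors.

    For each query token vector, take its best dot-product match among the
    document token vectors, then sum those maxima.
    """
    return sum(max(dot(q, d) for d in doc_tokens) for q in query_tokens)
```

Dense scoring compares one vector per text, sparse scoring only touches overlapping token indices, and MaxSim keeps one vector per token, which is why late-interaction embeddings are larger to store but can match on individual terms.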

Quick Start & Requirements

Install: pip install fastembed for CPU, or pip install fastembed-gpu for GPU support.
Prerequisites: Python 3.7+ and ONNX Runtime. GPU support requires compatible hardware and drivers; the GPU examples reference CUDA 12.x.
Docs: https://qdrant.github.io/fastembed/
Supported Models: https://qdrant.github.io/fastembed/examples/Supported_Models
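
A minimal quick-start sketch to follow the install step above. It assumes the TextEmbedding class and its embed method as described in the project's documentation, and the BAAI/bge-small-en-v1.5 model name, which comes from that documentation rather than from this summary. The embed_documents helper and its model parameter are hypothetical, added so the model can be swapped or stubbed. The first real call downloads model weights.

```python
def embed_documents(documents, model=None, model_name="BAAI/bge-small-en-v1.5"):
    """Return one embedding (a list of floats) per input document.

    `model` is any object with an `embed(documents)` method that yields one
    vector per document. If omitted, a fastembed TextEmbedding is created.
    """
    if model is None:
        # Imported lazily so this helper can be defined without fastembed installed.
        from fastembed import TextEmbedding

        model = TextEmbedding(model_name=model_name)
    # embed() yields vectors one at a time; convert each to a plain list of floats.
    return [[float(value) for value in vector] for vector in model.embed(documents)]


# Example usage (requires `pip install fastembed` and network access on first run):
#   vectors = embed_documents(["FastEmbed is a Python library.", "Qdrant maintains it."])
#   print(len(vectors), len(vectors[0]))
```

Returning plain lists keeps the result easy to serialize or pass to a vector database client; callers who want NumPy arrays can consume the model's embed output directly instead.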

Highlighted Details

Supports dense, sparse, late interaction, image, and multimodal embeddings.
Claims better accuracy than OpenAI Ada-002, and faster, lighter operation than Transformers and Sentence-Transformers.
ONNX Runtime backend for speed and reduced dependencies.
GPU acceleration via fastembed-gpu package.
Integrates with the Qdrant client through the qdrant-client[fastembed] extra.
Allows adding custom models via add_custom_model.
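
The GPU item above can be sketched as follows. This assumes fastembed-gpu is installed and that TextEmbedding accepts a providers argument listing ONNX Runtime execution providers, as the project's GPU documentation describes. The pick_providers and build_model names are hypothetical helpers, and falling back to the CPU provider when CUDA is unavailable is a choice made in this sketch, not documented library behavior.

```python
def pick_providers(available):
    """Choose ONNX Runtime execution providers from those available.

    Prefers CUDA when present and always keeps the CPU provider as a fallback.
    """
    if "CUDAExecutionProvider" in available:
        return ["CUDAExecutionProvider", "CPUExecutionProvider"]
    return ["CPUExecutionProvider"]


def build_model(model_name="BAAI/bge-small-en-v1.5"):
    """Create a TextEmbedding model on GPU if CUDA is available, else on CPU."""
    # Imported lazily so pick_providers can be used without these packages installed.
    import onnxruntime
    from fastembed import TextEmbedding

    providers = pick_providers(onnxruntime.get_available_providers())
    return TextEmbedding(model_name=model_name, providers=providers)
```

Separating provider selection from model construction keeps the decision logic easy to check on a machine without a GPU.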

Maintenance & Community

Supported and maintained by Qdrant.
Active development indicated by frequent updates and model additions.

Licensing & Compatibility

Apache 2.0 License.
Permissive, allowing commercial use and integration with closed-source applications.

Limitations & Caveats

The library targets ONNX-compatible models, so models in other formats may need conversion to ONNX first. Embedding quality and speed vary from model to model.

Health Check

Last Commit

1 month ago

Responsiveness

1 week

Star History

76 stars in the last 30 days