mlx-embeddings by Blaizzy

Local multimodal embedding generation for Mac

Created 1 year ago
282 stars

Top 92.7% on SourcePulse

Project Summary

MLX-Embeddings provides a Python package for running vision and language embedding models locally on macOS, leveraging Apple's MLX framework for efficient on-device processing. It targets developers and researchers needing to generate embeddings for text and images without relying on cloud services, offering a streamlined solution for local machine learning tasks on Apple Silicon hardware.

How It Works

The package utilizes Apple's MLX framework to enable local execution of various embedding models. It supports a range of text embedding architectures, including XLM-RoBERTa, BERT, ModernBERT, and Qwen3, alongside vision models like SigLIP and multimodal retrieval models such as ColPali/ColQwen. The core approach focuses on providing efficient APIs for single-item and batch processing, facilitating tasks like text similarity comparison and image-text matching directly on Mac hardware.
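
Text similarity comparison on top of an embedding model usually comes down to a cosine similarity between vectors. The sketch below is a minimal, library-independent illustration in plain Python. It does not call mlx-embeddings; the toy vectors and the helper names `cosine_similarity` and `rank_by_similarity` are hypothetical stand-ins for real model output and are not part of the package.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def rank_by_similarity(query, candidates):
    """Return (index, score) pairs sorted from most to least similar."""
    scores = [(i, cosine_similarity(query, c)) for i, c in enumerate(candidates)]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Toy 3-dimensional "embeddings", made up for illustration only.
query = [1.0, 0.0, 1.0]
candidates = [
    [0.0, 1.0, 0.0],  # orthogonal to the query
    [2.0, 0.0, 2.0],  # same direction as the query
    [1.0, 1.0, 0.0],  # partially aligned with the query
]
ranking = rank_by_similarity(query, candidates)
```

Because cosine similarity ignores vector length, the second candidate ranks first even though its magnitude differs from the query's. With a real model, the same ranking step would run over the embeddings it returns.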

Quick Start & Requirements

  • Installation: pip install mlx-embeddings
  • Prerequisites: Requires macOS with Apple Silicon for MLX compatibility. No other specific hardware (like CUDA) or software versions are explicitly detailed beyond standard Python environments.
  • Resources: Models run locally, so memory and compute use scale with model size and input volume.
  • Links: GitHub repository (implied by Blaizzy/mlx-embeddings), official documentation (not directly linked).
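
Since the package targets macOS on Apple Silicon, a script may want to check the platform before importing it. The following is a small sketch using only Python's standard `platform` module; the function name `is_apple_silicon_mac` is hypothetical and not part of mlx-embeddings.

```python
import platform


def is_apple_silicon_mac(system=None, machine=None):
    """Return True when the OS is macOS and the CPU architecture is arm64.

    The arguments default to the values reported for the running interpreter;
    they can be overridden to exercise the check on any machine.
    """
    system = platform.system() if system is None else system
    machine = platform.machine() if machine is None else machine
    return system == "Darwin" and machine == "arm64"


supported = is_apple_silicon_mac()
print("Apple Silicon macOS detected" if supported else "Not an Apple Silicon Mac")
```

The check reflects the architecture the Python interpreter itself runs as, so an x86_64 build of Python running under Rosetta may report a non-arm64 machine even on Apple Silicon hardware.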

Highlighted Details

  • Comprehensive support for diverse text embedding models (XLM-RoBERTa, BERT, ModernBERT, Qwen3).
  • Integration of vision and multimodal models (SigLIP, ColPali/ColQwen) for image-text tasks.
  • Robust batch processing capabilities for efficient handling of multiple inputs.
  • Built-in utilities for calculating and visualizing similarity scores between embeddings.
  • Examples demonstrate Masked Language Modeling and sequence classification tasks.
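
Batch processing of variable-length text generally means padding inputs to a common length and keeping an attention mask so that padding does not affect the result. The sketch below shows that idea in plain Python, with masked mean pooling as one common way to turn per-token vectors into a single embedding. This is an assumption for illustration: the summary does not say which pooling strategy the package uses, and `pad_batch`, `masked_mean_pool`, and all token ids and vectors here are hypothetical.

```python
def pad_batch(token_id_lists, pad_id=0):
    """Right-pad token id lists to a common length and build attention masks."""
    max_len = max(len(ids) for ids in token_id_lists)
    padded, masks = [], []
    for ids in token_id_lists:
        n_pad = max_len - len(ids)
        padded.append(list(ids) + [pad_id] * n_pad)
        masks.append([1] * len(ids) + [0] * n_pad)
    return padded, masks


def masked_mean_pool(token_vectors, mask):
    """Average the token vectors whose mask entry is 1, ignoring padding."""
    kept = [vec for vec, m in zip(token_vectors, mask) if m == 1]
    if not kept:
        raise ValueError("mask selects no tokens")
    dim = len(kept[0])
    return [sum(vec[d] for vec in kept) / len(kept) for d in range(dim)]


# Made-up token ids for two inputs of different lengths.
batch = [[101, 7, 8, 102], [101, 9, 102]]
padded, masks = pad_batch(batch)

# Made-up per-token vectors for the second, padded input (4 positions x 2 dims).
# The last row sits at a padding position and must not influence the average.
token_vectors = [[1.0, 0.0], [3.0, 2.0], [2.0, 4.0], [9.0, 9.0]]
pooled = masked_mean_pool(token_vectors, masks[1])
```

Without the mask, the vector at the padding position would pull the average away from the real tokens, which is why batched embedding pipelines typically carry the mask through to the pooling step.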

Maintenance & Community

Contributions are welcomed, with guidelines available on the GitHub repository. For inquiries or issues, users are directed to open GitHub issues. No specific community channels (e.g., Discord, Slack) or details on core maintainers/sponsorships are provided in the README.

Licensing & Compatibility

The project is licensed under the GNU General Public License v3.0 (GPL-3.0). This is a strong copyleft license: derivative works, and software that links to this library, must also be released under GPL-3.0 terms when distributed. Commercial use is permitted, but the copyleft requirement restricts use in proprietary, closed-source products.

Limitations & Caveats

The package is explicitly designed for macOS and Apple Silicon hardware due to its reliance on MLX. Support for additional model architectures is ongoing, so the current list of supported models may change. The README does not state an alpha or beta status, but active development suggests that APIs and behavior may still change.

Health Check

  • Last Commit: 2 weeks ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 4
  • Issues (30d): 1
  • Star History: 19 stars in the last 30 days
