open-metric-learning by OML-Team

PyTorch framework for training models producing high-quality embeddings

Created 3 years ago

983 stars

Top 37.6% on SourcePulse

Project Summary

This repository provides Open Metric Learning (OML), a PyTorch-based framework for training and validating models that produce high-quality embeddings for retrieval tasks. It targets researchers and engineers working with large-scale datasets where traditional classification methods fall short, offering a structured approach to metric learning with pre-trained models and practical pipelines.
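To make "validating embeddings for retrieval" concrete, here is a minimal, framework-free sketch of a top-k retrieval score (often reported as CMC@k or recall@k). It does not use OML's API; the function name cmc_at_k and the toy 2-D embeddings are assumptions made for illustration only.

```python
import math


def cmc_at_k(query_embs, query_labels, gallery_embs, gallery_labels, k=1):
    """Fraction of queries that have at least one same-label item
    among their k nearest gallery items (Euclidean distance)."""
    hits = 0
    for q_emb, q_label in zip(query_embs, query_labels):
        # Rank gallery indices from nearest to farthest for this query.
        ranked = sorted(
            range(len(gallery_embs)),
            key=lambda i: math.dist(q_emb, gallery_embs[i]),
        )
        top_k_labels = [gallery_labels[i] for i in ranked[:k]]
        hits += int(q_label in top_k_labels)
    return hits / len(query_embs)


# Toy data (hypothetical): 2-D "embeddings" for two product classes.
gallery_embs = [(0.0, 0.0), (0.1, 0.0), (1.0, 1.0), (1.1, 1.0)]
gallery_labels = ["shoe", "shoe", "bag", "bag"]
query_embs = [(0.05, 0.1), (0.9, 1.0), (0.6, 0.6)]
query_labels = ["shoe", "bag", "shoe"]

score_at_1 = cmc_at_k(query_embs, query_labels, gallery_embs, gallery_labels, k=1)
score_at_3 = cmc_at_k(query_embs, query_labels, gallery_embs, gallery_labels, k=3)
```

In this toy setup the third query sits closer to the "bag" cluster, so it counts as a miss at k=1 and only becomes a hit once k is large enough to reach a "shoe" item. A real validation run would use model-produced embeddings and typically a vectorized distance computation.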

How It Works

OML is built around end-to-end pipelines and practical use cases, hiding much of the complexity of metric learning. It uses PyTorch Lightning for training, including distributed data parallel (DDP) runs. Specialized samplers (e.g., CategoryBalanceSampler) and miners (e.g., HardTripletsMiner) build informative training batches, with the aim of reaching state-of-the-art results through simple heuristics rather than more complex mathematical machinery.
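The two ideas named above, balanced batch sampling and hard triplet mining, can be sketched in plain Python. This is not OML's implementation of CategoryBalanceSampler or HardTripletsMiner; the function names, parameters, and toy data below are hypothetical, and the sketch only illustrates the general technique.

```python
import math
import random
from collections import defaultdict


def sample_balanced_batch(labels, n_labels, n_instances, rng):
    """Pick n_labels classes, then n_instances item indices from each,
    so every anchor in the batch has positives and negatives."""
    by_label = defaultdict(list)
    for idx, label in enumerate(labels):
        by_label[label].append(idx)
    # Classes with too few items cannot supply enough instances.
    eligible = [l for l, idxs in by_label.items() if len(idxs) >= n_instances]
    chosen = rng.sample(eligible, n_labels)
    batch = []
    for label in chosen:
        batch.extend(rng.sample(by_label[label], n_instances))
    return batch


def mine_hard_triplets(embeddings, labels):
    """For each anchor, take the farthest same-label item (hardest positive)
    and the closest other-label item (hardest negative)."""
    triplets = []
    for a, (emb_a, label_a) in enumerate(zip(embeddings, labels)):
        positives = [i for i, l in enumerate(labels) if l == label_a and i != a]
        negatives = [i for i, l in enumerate(labels) if l != label_a]
        if not positives or not negatives:
            continue  # this anchor cannot form a triplet
        hardest_pos = max(positives, key=lambda i: math.dist(emb_a, embeddings[i]))
        hardest_neg = min(negatives, key=lambda i: math.dist(emb_a, embeddings[i]))
        triplets.append((a, hardest_pos, hardest_neg))
    return triplets


# Toy usage (hypothetical data): label 3 has a single item and is never sampled.
dataset_labels = [0, 0, 0, 1, 1, 1, 2, 2, 2, 3]
batch = sample_balanced_batch(dataset_labels, n_labels=2, n_instances=2,
                              rng=random.Random(0))

toy_embs = [(0.0, 0.0), (1.0, 0.0), (3.0, 0.0), (4.0, 0.0)]
toy_labels = ["a", "a", "b", "b"]
triplets = mine_hard_triplets(toy_embs, toy_labels)
```

Each triplet is an (anchor, positive, negative) tuple of indices into the batch. In a training loop these indices would select embeddings for a triplet loss; a production miner would compute the pairwise distances as a single matrix operation rather than in Python loops.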

Quick Start & Requirements

Install via pip: pip install -U open-metric-learning (with optional extras like [nlp], [audio], [pipelines]). Docker images are also available (omlteam/oml:gpu, omlteam/oml:cpu).
Supports Python 3.10-3.12.
Official documentation: https://open-metric-learning.readthedocs.io/en/latest/
Tutorials: available in English.

Highlighted Details

Pipelines: Config-driven training and validation for images, texts, and audio.
Zoo: Pre-trained models for several modalities (e.g., ViT, ECAPA-TDNN), accessed in a way similar to torchvision's model zoo.
Integration: Works with PyTorch Lightning and can be used with pure PyTorch.
State-of-the-Art Performance: Reports results competitive with the state of the art on benchmarks such as Stanford Online Products (SOP) and DeepFashion, using custom samplers and miners.

Maintenance & Community

The project is actively maintained by the OML-Team, with contributions from university researchers and industry professionals. Community engagement channels are available via GitHub.

Licensing & Compatibility

The repository does not explicitly state a license in the README. Users should verify licensing for commercial use or integration into closed-source projects.

Limitations & Caveats

OML does not provide a dedicated ONNX export, but models can be exported with PyTorch's own tooling (e.g., torch.onnx.export). The framework targets PyTorch only; using it with other deep learning frameworks would require custom wrappers.

Health Check

Last Commit: 1 month ago
Responsiveness: Inactive

Star History: 5 stars in the last 30 days