open-metric-learning  by OML-Team

PyTorch framework for training models producing high-quality embeddings

created 3 years ago
968 stars

Top 38.9% on sourcepulse

Project Summary

This repository provides Open Metric Learning (OML), a PyTorch-based framework for training and validating models that produce high-quality embeddings for retrieval tasks. It targets researchers and engineers working with large-scale datasets where traditional classification methods fall short, offering a structured approach to metric learning with pre-trained models and practical pipelines.

How It Works

OML focuses on end-to-end pipelines and practical use cases, hiding much of the complexity of metric learning behind them. It uses PyTorch Lightning for training, including distributed data parallel (DDP) runs. The framework provides specialized samplers (e.g., CategoryBalanceSampler) and miners (e.g., HardTripletsMiner) to construct effective training batches, with the aim of reaching state-of-the-art results through simple heuristics rather than mathematically complex methods.
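To make the sampler and miner roles concrete, here is a minimal standard-library sketch of the two ideas: drawing a batch with a fixed number of items per label, then picking the hardest positive and hardest negative for each anchor. The function names `balanced_batch` and `hard_triplets` are hypothetical and are not OML's API; the library's own CategoryBalanceSampler and HardTripletsMiner work on tensors and may differ in detail.

```python
import math
import random
from collections import defaultdict


def balanced_batch(labels, n_labels, n_instances, rng):
    """Pick n_labels labels and n_instances item indices for each of them."""
    by_label = defaultdict(list)
    for idx, label in enumerate(labels):
        by_label[label].append(idx)
    # Labels with too few items cannot supply enough positives, so skip them.
    eligible = [lab for lab, idxs in by_label.items() if len(idxs) >= n_instances]
    batch = []
    for label in rng.sample(eligible, n_labels):
        batch.extend(rng.sample(by_label[label], n_instances))
    return batch


def hard_triplets(embeddings, labels):
    """For each anchor, return (anchor, hardest positive, hardest negative) indices."""
    triplets = []
    for a, (emb_a, lab_a) in enumerate(zip(embeddings, labels)):
        pos = [i for i, lab in enumerate(labels) if lab == lab_a and i != a]
        neg = [i for i, lab in enumerate(labels) if lab != lab_a]
        if not pos or not neg:
            continue  # no valid triplet for this anchor
        # Hardest positive: the farthest item that shares the anchor's label.
        hardest_pos = max(pos, key=lambda i: math.dist(emb_a, embeddings[i]))
        # Hardest negative: the closest item with a different label.
        hardest_neg = min(neg, key=lambda i: math.dist(emb_a, embeddings[i]))
        triplets.append((a, hardest_pos, hardest_neg))
    return triplets
```

Balanced sampling matters because hard mining only works when every anchor in the batch has at least one positive and one negative to choose from.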

Quick Start & Requirements

  • Install via pip: pip install -U open-metric-learning (with optional extras like [nlp], [audio], [pipelines]). Docker images are also available (omlteam/oml:gpu, omlteam/oml:cpu).
  • Supports Python 3.10-3.12.
  • Official documentation: https://open-metric-learning.readthedocs.io/en/latest/
  • Tutorials: available in English

Highlighted Details

  • Pipelines: Config-driven training and validation for images, texts, and audio.
  • Zoo: Access to pre-trained models for various modalities (e.g., ViT, ECAPA-TDNN) similar to torchvision.
  • Integration: Works with PyTorch Lightning and can be used with pure PyTorch.
  • State-of-the-Art Performance: Achieves competitive results on benchmarks like SOP and DeepFashion using custom samplers and miners.
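Retrieval benchmarks of this kind are usually scored by ranking a gallery for each query and checking whether a correct item appears near the top. The sketch below shows one such metric, often called CMC@k or recall@k, using only the standard library. The function name `cmc_at_k` and the choice of Euclidean distance are assumptions made for illustration; this is not OML's validation code, and the metrics it actually reports are not stated here.

```python
import math


def cmc_at_k(query_embs, query_labels, gallery_embs, gallery_labels, k):
    """Fraction of queries whose k nearest gallery items include a same-label item."""
    hits = 0
    for q_emb, q_lab in zip(query_embs, query_labels):
        # Rank gallery indices from nearest to farthest for this query.
        ranked = sorted(
            range(len(gallery_embs)),
            key=lambda i: math.dist(q_emb, gallery_embs[i]),
        )
        if any(gallery_labels[i] == q_lab for i in ranked[:k]):
            hits += 1
    return hits / len(query_embs)
```

For example, with two queries where only one has a correct item as its nearest neighbour, `cmc_at_k(..., k=1)` returns 0.5.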

Maintenance & Community

The project is actively maintained by the OML-Team, with contributions from university researchers and industry professionals. Community engagement channels are available via GitHub.

Licensing & Compatibility

The repository does not explicitly state a license in the README. Users should verify licensing for commercial use or integration into closed-source projects.

Limitations & Caveats

ONNX export is not directly supported but can be achieved using PyTorch capabilities. The framework's primary focus is on PyTorch, and integration with other deep learning frameworks would require custom wrappers.

Health Check

  • Last commit: 3 months ago
  • Responsiveness: 1 day
  • Pull Requests (30d): 0
  • Issues (30d): 0

Star History

26 stars in the last 90 days
