quaterion by qdrant

Framework for fine-tuning similarity learning models

Created 4 years ago
657 stars

Top 50.9% on SourcePulse

Project Summary

Quaterion is a Python framework for fine-tuning similarity learning models for tasks such as semantic search, recommendations, and anomaly detection. It targets users who need to adapt pre-trained models to custom tasks with limited data or computational resources, without the slow and costly process of retraining the full model.

How It Works

Quaterion uses a caching mechanism to enable training with large batch sizes on modest hardware, which speeds up fine-tuning. Its specially designed head layers let pre-trained models specialize on small, custom datasets while mitigating the risk of catastrophic forgetting. The framework is built on PyTorch Lightning and inherits its scalability, cost-efficiency, and reliability for complex training pipelines.
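The caching idea can be illustrated with a minimal, standard-library sketch. It is not Quaterion's actual API: `FrozenEncoder`, `EmbeddingCache`, and `run_epochs` are hypothetical names, and the "encoder" is a deterministic stand-in for an expensive pre-trained model. The assumption is that the encoder stays frozen during fine-tuning, so its output for a given input never changes and can be computed once and reused in every epoch.

```python
import random
from typing import Dict, List


class FrozenEncoder:
    """Hypothetical stand-in for an expensive, frozen pre-trained encoder."""

    def __init__(self, dim: int = 4):
        self.dim = dim
        self.calls = 0  # counts how often the expensive forward pass runs

    def encode(self, text: str) -> List[float]:
        self.calls += 1
        rng = random.Random(text)  # deterministic pseudo-embedding per input
        return [rng.uniform(-1.0, 1.0) for _ in range(self.dim)]


class EmbeddingCache:
    """Stores encoder outputs so each input is encoded at most once."""

    def __init__(self, encoder: FrozenEncoder):
        self.encoder = encoder
        self._store: Dict[str, List[float]] = {}

    def get(self, text: str) -> List[float]:
        if text not in self._store:
            self._store[text] = self.encoder.encode(text)
        return self._store[text]


def run_epochs(texts: List[str], cache: EmbeddingCache, epochs: int) -> int:
    """Iterate over the data repeatedly; return how many embeddings were served."""
    served = 0
    for _ in range(epochs):
        for text in texts:
            _ = cache.get(text)  # a trainable head would consume this vector
            served += 1
    return served


encoder = FrozenEncoder()
cache = EmbeddingCache(encoder)
served = run_epochs(["red shoes", "blue shirt", "green hat"], cache, epochs=100)
print(served, encoder.calls)  # 300 3
```

In this sketch, 300 embeddings are served over 100 epochs while the encoder runs only 3 times, which is why many epochs and large batches become affordable when only the lightweight head is trained.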

Quick Start & Requirements

  • Training: pip install quaterion
  • Inference Service: pip install quaterion-models
  • Prerequisites: PyTorch, PyTorch Lightning. GPU recommended for training.
  • Docs: https://github.com/qdrant-tech/quaterion
  • Tutorials: Fine-tuning NLP models, Fine-tuning CV models.

Highlighted Details

  • Enables training thousands of epochs with large batch sizes on laptop GPUs via built-in caching.
  • Compatible with small datasets, allowing effective fine-tuning with custom head layers.
  • Highly customizable architecture for sophisticated training pipelines.
  • Built on PyTorch Lightning for scalability and reliability.
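One way a head layer can adapt a pre-trained model without discarding what it already knows is to start as an identity mapping and learn only a correction on top. The sketch below illustrates that general idea in plain Python. It is an assumption made for illustration, not Quaterion's actual head implementation, and `ResidualHead` is a hypothetical name.

```python
from typing import List


class ResidualHead:
    """Hypothetical head computing x + W x, with W initialised to zero."""

    def __init__(self, dim: int):
        # Zero weights make the head an identity map before any training step.
        self.weight: List[List[float]] = [[0.0] * dim for _ in range(dim)]

    def forward(self, x: List[float]) -> List[float]:
        # Residual form: the pre-trained embedding passes through unchanged,
        # and the learned linear term is added on top of it.
        return [
            xi + sum(w * xj for w, xj in zip(row, x))
            for xi, row in zip(x, self.weight)
        ]


pretrained_embedding = [1.0, -2.0, 0.5]  # made-up encoder output
head = ResidualHead(dim=3)

# Before training, the head returns the pre-trained embedding as-is.
untrained = head.forward(pretrained_embedding)
print(untrained == pretrained_embedding)  # True

# A small weight update (set by hand here) shifts the output only slightly.
head.weight[0][1] = 0.1
adjusted = head.forward(pretrained_embedding)
print(adjusted)
```

Because the untrained head reproduces the original embedding, fine-tuning on a small dataset starts from the pre-trained model's behaviour and moves away from it only as far as the data requires.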

Maintenance & Community

  • Community support via Discord.
  • Development activity: the last commit was 5 months ago, with no pull requests or issues in the past 30 days.
  • Contact: info@qdrant.tech

Licensing & Compatibility

  • Licensed under Apache License, Version 2.0.
  • Permissive license suitable for commercial use and integration into closed-source projects.

Limitations & Caveats

The quaterion-models package is distributed separately to keep inference dependencies minimal, so users must install quaterion for training and quaterion-models for the inference service. Hardware requirements for optimal training performance are not detailed beyond the recommendation to use a GPU.

Health Check

  • Last Commit: 5 months ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 0 stars in the last 30 days
