maestro by roboflow

CLI/SDK for fine-tuning multimodal models

created 1 year ago
2,603 stars

Top 18.4% on sourcepulse

Project Summary

Maestro is a Python library for fine-tuning multimodal models, specifically vision-language models (VLMs) such as Florence-2, PaliGemma 2, and Qwen2.5-VL. It gives researchers and developers a unified interface for configuration, data handling, and training-loop setup, and it encapsulates best practices for reproducibility and efficiency.

How It Works

Maestro uses a modular architecture in which each supported model has its own core training module. This allows per-model optimization strategies such as LoRA and QLoRA, along with graph freezing, to keep hardware requirements manageable. The library standardizes data handling on a JSONL format and exposes a single CLI and Python API that hide most of the setup, so the fine-tuning workflow stays consistent across models.
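As a rough illustration of the JSONL convention mentioned above, the sketch below writes and reads a small annotation file with one JSON object per line. The field names (image, prefix, suffix) and the file name annotations.jsonl are assumptions made for illustration; this summary does not specify the schema, so check the project's documentation before relying on them.

```python
import json
import tempfile
from pathlib import Path

# Hypothetical records: the field names are assumptions, not confirmed by this summary.
records = [
    {"image": "img_001.jpg", "prefix": "What is shown?", "suffix": "a red car"},
    {"image": "img_002.jpg", "prefix": "What is shown?", "suffix": "two dogs"},
]


def write_jsonl(path, rows):
    """Write one JSON object per line."""
    with open(path, "w", encoding="utf-8") as f:
        for row in rows:
            f.write(json.dumps(row) + "\n")


def read_jsonl(path):
    """Read a JSONL file back into a list of dicts, skipping blank lines."""
    with open(path, "r", encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]


with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "annotations.jsonl"  # hypothetical file name
    write_jsonl(path, records)
    loaded = read_jsonl(path)

print(len(loaded))
```

Because every line is an independent JSON object, a file like this can be streamed, appended to, or split without parsing the whole dataset at once.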

Quick Start & Requirements

  • Install: pip install "maestro[paligemma_2]" (the extra in brackets pulls in the model-specific dependencies).
  • Prerequisites: Python environment, model-specific dependencies.
  • Usage: CLI (maestro paligemma_2 train --dataset "dataset/location" ...) or Python API (from maestro.trainer.models.paligemma_2.core import train).
  • Resources: Colab notebooks are available for quick experimentation.
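The Python API route listed above might look roughly like the sketch below. Only the import path comes from this summary; the assumption that train accepts a single configuration dict, and every key in that dict, are illustrative guesses that should be checked against the project's documentation. The import is deferred into a function so the snippet loads even where maestro is not installed.

```python
def build_config(dataset_dir):
    """Collect training options in one dict.

    All keys and values here are assumptions for illustration.
    """
    return {
        "dataset": dataset_dir,
        "epochs": 10,
        "batch_size": 4,
        "optimization_strategy": "qlora",
    }


def run_training(config):
    """Call the PaliGemma 2 trainer; the import is lazy on purpose."""
    # Import path taken from the usage line above.
    from maestro.trainer.models.paligemma_2.core import train

    # Assumption: train takes a single configuration dict.
    train(config)


config = build_config("dataset/location")
print(config)
# run_training(config)  # uncomment once maestro[paligemma_2] is installed
```

Keeping the configuration in a plain dict makes it easy to log or serialize alongside the run, which supports the reproducibility goal the summary describes.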

Highlighted Details

  • Supports fine-tuning of Florence-2, PaliGemma 2, and Qwen2.5-VL.
  • Integrates LoRA, QLoRA, and graph freezing for efficient training.
  • Provides a unified CLI and Python API for simplified workflow.
  • Uses a consistent JSONL format for data handling.
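To give a sense of why LoRA reduces training cost, the sketch below compares the number of trainable parameters for a full update of one weight matrix with a rank-r LoRA update of the same matrix. It is generic arithmetic, not Maestro code, and the matrix size and rank are illustrative values rather than anything taken from the project.

```python
def full_params(d, k):
    """Trainable parameters when a d x k weight matrix is updated directly."""
    return d * k


def lora_params(d, k, r):
    """Trainable parameters for a rank-r LoRA update: B is d x r, A is r x k."""
    return r * (d + k)


# Illustrative sizes (assumptions): a 4096 x 4096 projection and rank 8.
d, k, r = 4096, 4096, 8
full = full_params(d, k)
lora = lora_params(d, k, r)

print(full, lora, lora / full)  # prints: 16777216 65536 0.00390625
```

With these numbers the adapter trains about 0.4% of the parameters of the original matrix. QLoRA applies the same low-rank idea while keeping the frozen base weights in a quantized form, which lowers memory use further.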

Maintenance & Community

The project is actively maintained by Roboflow. Community discussion and contributions are welcome via GitHub Discussions, and a Discord server is available for support and conversation.

Licensing & Compatibility

The README does not explicitly state a license. Commercial use or closed-source linking would require clarification of the license terms.

Limitations & Caveats

Some fine-tuning recipes are marked as experimental (e.g., Florence-2 object detection, Qwen2.5-VL object detection). The README recommends creating dedicated Python environments for each model due to potential dependency conflicts.

Health Check

  • Last commit: 5 days ago
  • Responsiveness: 1 week
  • Pull requests (30d): 6
  • Issues (30d): 1
  • Star history: 60 stars in the last 90 days
