Olympus by yuanze-lin

Research paper and code for universal task routing in computer vision

Created 1 year ago

427 stars

Top 69.3% on SourcePulse

Project Summary

Olympus provides a universal task router for computer vision: a single model that dispatches diverse tasks such as image generation, 3D model creation, and video synthesis. It is aimed at researchers and developers of multimodal AI systems who need a unified way to orchestrate complex visual workflows.

How It Works

Olympus acts as a router: it interprets complex, multi-step prompts and directs each step to an appropriate specialized model. It uses a chain-of-action approach that breaks a user request into sequential sub-tasks, so a single input can trigger a cascade of specialized visual operations.
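The chain-of-action idea can be illustrated with a minimal sketch. It assumes a hypothetical tagged output format (`<task>prompt</task>`) and stand-in handlers; the tag names, `parse_chain`, `run_chain`, and `HANDLERS` are illustrative assumptions and are not taken from the Olympus codebase.

```python
import re
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical tag format: <task_name>instruction</task_name>.
# This is an illustrative convention, not necessarily what Olympus emits.
TAG_PATTERN = re.compile(r"<(?P<task>\w+)>(?P<prompt>.*?)</(?P=task)>", re.DOTALL)

Handler = Callable[[str, Optional[str]], str]


def parse_chain(router_output: str) -> List[Tuple[str, str]]:
    """Split a routed response into an ordered list of (task, prompt) steps."""
    return [
        (m.group("task"), m.group("prompt").strip())
        for m in TAG_PATTERN.finditer(router_output)
    ]


def run_chain(
    router_output: str,
    handlers: Dict[str, Handler],
    initial: Optional[str] = None,
) -> Optional[str]:
    """Run each sub-task in order, passing every result on to the next step."""
    artifact = initial
    for task, prompt in parse_chain(router_output):
        if task not in handlers:
            raise KeyError(f"no handler registered for task {task!r}")
        artifact = handlers[task](prompt, artifact)
    return artifact


# Stand-in handlers; real ones would call specialized models.
def fake_image_gen(prompt: str, previous: Optional[str]) -> str:
    return f"image({prompt})"


def fake_image_to_3d(prompt: str, previous: Optional[str]) -> str:
    return f"mesh_from[{previous}]"


HANDLERS: Dict[str, Handler] = {
    "image_gen": fake_image_gen,
    "image_to_3d": fake_image_to_3d,
}

if __name__ == "__main__":
    routed = "<image_gen>a red chair</image_gen><image_to_3d>convert it</image_to_3d>"
    print(run_chain(routed, HANDLERS))  # mesh_from[image(a red chair)]
```

Keeping the handlers in a plain dictionary means a new task can be added by registering one more function, without touching the parsing or dispatch logic.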

Quick Start & Requirements

Install: Clone the repository, create a conda environment (conda create -n olympus python==3.10 -y), activate it (conda activate olympus), and install dependencies (pip install -r requirements.txt).
Prerequisites: Python 3.10, Conda.
Models & Data: Download Olympus model (python download_olympus.py), fine-tuning data (python download_olympus_dataset.py), and Mipha-3B model (python download_mipha_3b.py).
Resources: The model and dataset files listed above must be downloaded before use.
Docs: Evaluation.md
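The three download steps can be chained in a small helper, sketched below. The script names come from the Quick Start text; the helper names (`download_commands`, `run_downloads`), the dry-run default, and the assumption that the scripts sit in the current working directory are illustrative and not part of the repository. The order simply follows the listing above; whether it matters is not stated.

```python
import subprocess
import sys
from typing import List

# Script names are taken from the Quick Start text; they are assumed to sit
# in the root of the cloned repository (the current working directory).
DOWNLOAD_SCRIPTS = [
    "download_olympus.py",          # Olympus model
    "download_olympus_dataset.py",  # fine-tuning data
    "download_mipha_3b.py",         # Mipha-3B model
]


def download_commands(python: str = sys.executable) -> List[List[str]]:
    """Build one command per download script, in the order listed above."""
    return [[python, script] for script in DOWNLOAD_SCRIPTS]


def run_downloads(dry_run: bool = True) -> List[List[str]]:
    """Print the commands, or execute them and stop at the first failure."""
    commands = download_commands()
    for cmd in commands:
        if dry_run:
            print("would run:", " ".join(cmd))
        else:
            # check=True raises CalledProcessError if a script exits non-zero.
            subprocess.run(cmd, check=True)
    return commands


if __name__ == "__main__":
    run_downloads(dry_run=True)
```

Run it from inside the activated `olympus` environment so that `sys.executable` points at the Python 3.10 interpreter with the dependencies installed; pass `dry_run=False` to actually start the downloads.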

Highlighted Details

CVPR 2025 Highlight paper.
Supports 20 distinct computer vision tasks.
Built upon Mipha and LLaVA projects.
Includes official code, datasets, and models.

Maintenance & Community

Code for training and inference, datasets, and models have all been released.

Licensing & Compatibility

License not explicitly stated in the README.
Compatibility for commercial use or closed-source linking is not specified.

Limitations & Caveats

Because no license is specified, commercial adoption carries legal uncertainty. The README also gives no hardware requirements; it describes only the Python environment setup.

Health Check

Last Commit: 9 months ago
Responsiveness: 1 day

Star History: 0 stars in the last 30 days