RB-Modulation by google

PyTorch code for training-free diffusion model personalization

created 11 months ago
395 stars

Top 74.1% on sourcepulse

Project Summary

This repository provides the official PyTorch implementation of RB-Modulation, a training-free method for personalizing diffusion models. It supports two tasks: stylizing images from a reference image and a text prompt, and composing content from a reference image. Both aim to preserve sample diversity and prompt alignment. The project targets researchers and practitioners in generative AI.

How It Works

RB-Modulation leverages stochastic optimal control principles to achieve personalization without requiring model fine-tuning. It acts as a plug-and-play module, modulating diffusion model outputs based on reference images to control style and content composition. This approach aims to maintain sample diversity and adherence to user prompts.
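The control idea can be illustrated with a deliberately tiny sketch. It is not the repository's implementation: it assumes the controller minimizes a terminal cost that compares a statistic of the predicted clean sample with the same statistic of the reference, and it replaces the pretrained model with a hypothetical `toy_denoiser` and the style descriptor with a plain mean. All names and constants below are illustrative.

```python
import random


def mean(values):
    return sum(values) / len(values)


def toy_denoiser(x_t):
    """Hypothetical stand-in for a pretrained denoiser: predicts a clean sample."""
    return [0.9 * v for v in x_t]


def control_step(x0_hat, ref_stat, step_size=2.0, inner_steps=5):
    """Gradient descent on the toy terminal cost (mean(x0) - ref_stat) ** 2.

    No model weights are touched; only the predicted sample is nudged.
    """
    x = list(x0_hat)
    n = len(x)
    for _ in range(inner_steps):
        # d/dx_i of (mean(x) - ref_stat) ** 2 is identical for every coordinate.
        grad = 2.0 * (mean(x) - ref_stat) / n
        x = [v - step_size * grad for v in x]
    return x


def sample(ref_stat, guided, dim=8, steps=10, seed=0):
    """Toy reverse process: denoise, optionally steer, then re-noise."""
    rng = random.Random(seed)
    x = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    for t in range(steps, 0, -1):
        x0_hat = toy_denoiser(x)
        if guided:
            x0_hat = control_step(x0_hat, ref_stat)
        sigma = 0.5 * (t - 1) / steps  # injected noise vanishes at the last step
        x = [v + sigma * rng.gauss(0.0, 1.0) for v in x0_hat]
    return x


reference = [3.0] * 8  # toy "reference image"
ref_stat = mean(reference)

plain = sample(ref_stat, guided=False)
steered = sample(ref_stat, guided=True)

gap_plain = abs(mean(plain) - ref_stat)
gap_steered = abs(mean(steered) - ref_stat)
print(f"gap without control: {gap_plain:.3f}")
print(f"gap with control:    {gap_steered:.3f}")
```

In this toy the control shifts every coordinate by the same amount, so the spread within one sample is unchanged while its mean moves toward the reference. The real method operates on image latents with learned features, which this sketch does not attempt to reproduce.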

Quick Start & Requirements

  • Installation: Requires cloning the repository, downloading pretrained StableCascade models, installing dependencies via requirements.txt, and installing LangSAM components.
  • Prerequisites: Python 3.9, PyTorch, CUDA (implied by StableCascade), LangSAM, GroundingDINO, Segment Anything.
  • Usage: Run via jupyter notebook rb-modulation.ipynb or launch a Gradio interface with python app.py after cloning the Hugging Face demo space.
  • Resources: Requires downloading large pretrained model checkpoints.
  • Links: Hugging Face Demo, Paper
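Before starting the notebook, it can help to confirm that the listed prerequisites are importable. The following is a minimal sketch using only the standard library; the import names (`torch`, `lang_sam`, `groundingdino`, `segment_anything`) are assumptions about how these packages are exposed and may differ from what the repository's requirements.txt actually installs.

```python
import sys
from importlib.util import find_spec

# Assumed import names for the prerequisites listed above (not confirmed by the README).
ASSUMED_MODULES = {
    "PyTorch": "torch",
    "LangSAM": "lang_sam",
    "GroundingDINO": "groundingdino",
    "Segment Anything": "segment_anything",
}


def check_environment(modules=None, expected_python=(3, 9)):
    """Report whether the Python version matches and each module can be located."""
    if modules is None:
        modules = ASSUMED_MODULES
    report = {"python_matches": sys.version_info[:2] == expected_python}
    for label, name in modules.items():
        try:
            report[label] = find_spec(name) is not None
        except (ImportError, ValueError):
            # find_spec can raise for broken or partially initialised packages.
            report[label] = False
    return report


for item, ok in check_environment().items():
    print(f"{item}: {'ok' if ok else 'missing or mismatched'}")
```

A `missing` entry only means the assumed import name was not found; the package may still be installed under a different name.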

Highlighted Details

  • Training-free personalization for diffusion models.
  • Supports stylization and content composition.
  • Maintains sample diversity and prompt alignment.
  • Includes a Hugging Face demo and Jupyter notebook for usability.

Maintenance & Community

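The project is associated with Google. Its README records the code release and paper publication. The health check on this page lists the last commit as 4 months ago and rates responsiveness as inactive, so ongoing development should not be assumed.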

Licensing & Compatibility

The repository is not an officially supported Google product. The README does not explicitly state licensing terms. The association with Google may suggest a permissive license, but this is unconfirmed, and compatibility with commercial use is not specified.

Limitations & Caveats

The README does not explicitly state licensing terms, which may impact commercial adoption. The setup involves multiple external dependencies and model downloads, potentially increasing complexity.

Health Check

  • Last commit: 4 months ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 19 stars in the last 90 days
