RB-Modulation by google

PyTorch code for training-free diffusion model personalization

Created 1 year ago

403 stars

Top 72.1% on SourcePulse

Project Summary

This repository provides the official PyTorch implementation of RB-Modulation, a training-free method for personalizing diffusion models. It lets users stylize images using a reference image and a text prompt, or compose reference content, while preserving sample diversity and prompt alignment. It targets researchers and practitioners in generative AI.

How It Works

RB-Modulation leverages stochastic optimal control principles to achieve personalization without requiring model fine-tuning. It acts as a plug-and-play module, modulating diffusion model outputs based on reference images to control style and content composition. This approach aims to maintain sample diversity and adherence to user prompts.
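The general idea of steering a sample toward a reference without retraining can be illustrated with a toy controller. The sketch below is not the repository's implementation: the linear feature extractor `A`, the quadratic cost, and the names `features`, `feature_gap`, and `modulate` are all assumptions made for illustration. It nudges a predicted sample toward a reference's features, while a proximity penalty (`eta`) keeps the result close to the model's original prediction.

```python
from typing import List

Vector = List[float]

# Hypothetical linear "style feature extractor" Psi(x) = A x.
# A real system would use a learned image encoder instead.
A = [[0.5, 0.5, 0.0],
     [0.0, 0.5, 0.5]]


def features(x: Vector) -> Vector:
    """Apply the toy feature extractor to a sample."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]


def feature_gap(x: Vector, ref_feat: Vector) -> float:
    """Squared distance between the sample's features and the reference features."""
    return sum((f - r) ** 2 for f, r in zip(features(x), ref_feat))


def modulate(x_hat: Vector, ref_feat: Vector,
             eta: float = 0.1, lr: float = 0.1, steps: int = 50) -> Vector:
    """Gradient descent on ||A x - ref_feat||^2 + eta * ||x - x_hat||^2, starting at x_hat."""
    x = list(x_hat)
    for _ in range(steps):
        resid = [f - r for f, r in zip(features(x), ref_feat)]
        grad = []
        for j in range(len(x)):
            g_feat = 2.0 * sum(A[i][j] * resid[i] for i in range(len(A)))
            g_prox = 2.0 * eta * (x[j] - x_hat[j])  # stay near the model's prediction
            grad.append(g_feat + g_prox)
        x = [xj - lr * gj for xj, gj in zip(x, grad)]
    return x


if __name__ == "__main__":
    x_hat = [0.0, 0.0, 0.0]                 # stand-in for the model's predicted sample
    ref_feat = features([1.0, 2.0, 3.0])    # stand-in for reference image features
    x_mod = modulate(x_hat, ref_feat)
    print(feature_gap(x_hat, ref_feat), feature_gap(x_mod, ref_feat))
```

A larger `eta` keeps the modulated sample closer to the original prediction, which is one simple way to trade reference fidelity against the base model's output.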

Quick Start & Requirements

Installation: Requires cloning the repository, downloading pretrained StableCascade models, installing dependencies via requirements.txt, and installing LangSAM components.
Prerequisites: Python 3.9, PyTorch, CUDA (implied by StableCascade), LangSAM, GroundingDINO, Segment Anything.
Usage: Run via jupyter notebook rb-modulation.ipynb or launch a Gradio interface with python app.py after cloning the Hugging Face demo space.
Resources: Requires downloading large pretrained model checkpoints (StableCascade).
Links: Hugging Face Demo, Paper
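
Before installing, it can help to confirm that the environment matches the prerequisites listed above. The following is a minimal preflight sketch, not part of the repository: the helper names are hypothetical, and the import names in `ASSUMED_MODULES` are guesses at how the listed packages are imported, so check them against requirements.txt.

```python
import importlib.util
import sys

# Assumed import names for the listed prerequisites (verify against requirements.txt).
ASSUMED_MODULES = ["torch", "groundingdino", "segment_anything", "lang_sam"]


def check_python(required=(3, 9), current=None):
    """Return True if the (major, minor) interpreter version equals `required`."""
    current = tuple(sys.version_info[:2]) if current is None else tuple(current)
    return current == tuple(required)


def missing_modules(names):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def cuda_available():
    """Return True only if PyTorch is installed and reports a usable CUDA device."""
    if importlib.util.find_spec("torch") is None:
        return False
    import torch
    return torch.cuda.is_available()


if __name__ == "__main__":
    print("Python 3.9:", check_python())
    print("Missing modules:", missing_modules(ASSUMED_MODULES))
    print("CUDA available:", cuda_available())
```

If any module is reported missing, install the dependencies from requirements.txt and the LangSAM components as described in the repository README.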

Highlighted Details

Training-free personalization for diffusion models.
Supports stylization and content composition.
Maintains sample diversity and prompt alignment.
Includes a Hugging Face demo and Jupyter notebook for usability.

Maintenance & Community

The project is associated with Google. Its main milestones are the code release and the paper publication. The last commit was 11 months ago, and the repository currently shows no recent activity.

Licensing & Compatibility

The repository is not an officially supported Google product. The README does not state a license, and the association with Google does not by itself establish one. Suitability for commercial use is not specified.

Limitations & Caveats

The README does not state licensing terms, which may hinder commercial adoption. Setup depends on several external components (LangSAM, GroundingDINO, Segment Anything) and large pretrained model downloads, which adds complexity.

Health Check

Last Commit: 11 months ago
Responsiveness: Inactive

Star History

0 stars in the last 30 days