onediff by siliconflow

Acceleration library for diffusion models

Created 3 years ago
1,938 stars

Top 22.7% on SourcePulse

View on GitHub
Project Summary

OneDiff is an acceleration library that speeds up diffusion models for users of ComfyUI, Hugging Face Diffusers, and other popular interfaces. It offers PyTorch code compilation and optimized GPU kernels, aiming to provide significant performance gains with minimal code changes.

How It Works

OneDiff leverages PyTorch module compilation, specifically through its OneFlow backend or the optional Nexfort compiler. This process compiles PyTorch code into optimized kernels, reducing overhead from dynamic Python execution and enabling faster inference. The compilation can be done offline and the results loaded for online serving, supporting dynamic input shapes without recompilation penalties.
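
Concretely, the offline-compile-then-reload workflow looks roughly like the sketch below. This is a minimal illustration, assuming the onediffx helpers compile_pipe, save_pipe, and load_pipe and the SDXL base model behave as in the project's Diffusers examples:

    import torch
    from diffusers import StableDiffusionXLPipeline
    from onediffx import compile_pipe, save_pipe, load_pipe  # assumed onediffx helpers

    pipe = StableDiffusionXLPipeline.from_pretrained(
        "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16
    ).to("cuda")

    pipe = compile_pipe(pipe)            # compile the pipeline's PyTorch modules
    pipe("warm-up prompt")               # the first run triggers the actual compilation
    save_pipe(pipe, dir="cached_pipe")   # persist the compiled artifacts offline

    # Later, in a serving process: rebuild the pipeline, then load the cached
    # compilation instead of paying the compile cost again.
    pipe = load_pipe(pipe, dir="cached_pipe")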

Quick Start & Requirements

  • Installation: python3 -m pip install --pre onediff (or from source for plugins).
  • Prerequisites: PyTorch, Hugging Face Diffusers, and a compiler backend (OneFlow or Nexfort). Requires an NVIDIA GPU (e.g., RTX 3090, RTX 4090, A100, A800, A10). OneFlow supports CUDA 11.8, 12.1, and 12.2.
  • Setup: install PyTorch and diffusers, then a compiler backend, and finally OneDiff itself; a minimal usage sketch follows this list.
  • Documentation: available in the project repository.
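
As a quick-start illustration, here is a minimal sketch of accelerating a Diffusers pipeline, assuming the OneFlow backend is installed and onediff.infer_compiler.oneflow_compile works as in the project's examples:

    import torch
    from diffusers import StableDiffusionPipeline
    from onediff.infer_compiler import oneflow_compile  # assumed per the project's Diffusers examples

    pipe = StableDiffusionPipeline.from_pretrained(
        "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
    ).to("cuda")

    # Compile the UNet, the main compute hot spot, into optimized kernels.
    pipe.unet = oneflow_compile(pipe.unet)

    # The first call triggers compilation; later calls reuse the compiled graph.
    image = pipe("an astronaut riding a horse on the moon").images[0]
    image.save("astronaut.png")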

Highlighted Details

  • Up to 1.7x speedup reported for Kolors, DiT, SD3, PixArt, and Latte models.
  • Supports acceleration for SD 1.5-XL, SDXL Turbo, LCM, LoRA, ControlNet, SVD, and InstantID.
  • Integrates with ComfyUI, Hugging Face Diffusers, and Stable Diffusion web UI.
  • Offers features like dynamic image size support and fast LoRA switching.

Maintenance & Community

  • Active development with recent updates for DiT and Kolors acceleration.
  • Community support via Discord and GitHub Issues.

Licensing & Compatibility

  • The repository does not explicitly state a license in the README.

Limitations & Caveats

  • Windows support is limited to WSL.
  • Compatibility with Ascend GPUs is in progress.
  • Some features, such as SDXL DeepCache, are in alpha status.

Health Check

  • Last Commit: 4 months ago
  • Responsiveness: 1 day
  • Pull Requests (30d): 0
  • Issues (30d): 0

Star History

  • 18 stars in the last 30 days

Explore Similar Projects

Starred by Luca Soldaini (Research Scientist at Ai2), Edward Sun (Research Scientist at Meta Superintelligence Lab), and 4 more.

parallelformers by tunib-ai

0%
790
Toolkit for easy model parallelization
Created 4 years ago
Updated 2 years ago
Starred by Patrick von Platen (Author of Hugging Face Diffusers; Research Engineer at Mistral) and Jiaming Song (Chief Scientist at Luma AI).

tomesd by dbolya

0.3%
1k
Speed-up tool for Stable Diffusion
Created 2 years ago
Updated 1 year ago
Starred by Chaoyu Yang (Founder of Bento), Chip Huyen (Author of "AI Engineering", "Designing Machine Learning Systems"), and 3 more.

nunchaku by nunchaku-tech

1.9%
3k
High-performance 4-bit diffusion model inference engine
Created 10 months ago
Updated 2 days ago
Starred by Alex Yu (Research Scientist at OpenAI; Former Cofounder of Luma AI) and Cody Yu (Coauthor of vLLM; MTS at OpenAI).

xDiT by xdit-project

0.7%
2k
Inference engine for parallel Diffusion Transformer (DiT) deployment
Created 1 year ago
Updated 1 day ago
Starred by Nat Friedman (Former CEO of GitHub), Chip Huyen (Author of "AI Engineering", "Designing Machine Learning Systems"), and 15 more.

FasterTransformer by NVIDIA

0.1%
6k
Optimized transformer library for inference
Created 4 years ago
Updated 1 year ago