DyPE by guyyariv

Diffusion models for ultra-high-resolution image synthesis

Created 2 months ago
325 stars

Top 83.9% on SourcePulse

Project Summary

DyPE (Dynamic Position Extrapolation) enables pre-trained diffusion transformers to generate ultra-high-resolution images far beyond their training scale. It dynamically adjusts positional encodings during denoising to match evolving frequency content, achieving faithful 4K × 4K results without retraining or extra sampling cost. It targets users who need to generate images above a model's standard resolutions without the cost of retraining.

How It Works

DyPE adjusts the positional encodings of a pre-trained diffusion transformer during the denoising process instead of keeping them fixed for the whole sampling run. Because the frequency content of the image evolves as denoising proceeds, the encodings are adjusted to match it, which lets the model extrapolate to resolutions far beyond its training scale. No retraining or additional sampling steps are required, so high-resolution generation stays efficient.
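The idea of a step-dependent positional encoding can be illustrated with a small toy sketch. It is not the repository's implementation: it assumes rotary (RoPE-style) frequencies, uses the well-known NTK-aware base rescaling as the extrapolation rule, and invents a simple power-law schedule in which the scaling is strongest at the noisy start of sampling (t = 1) and vanishes at the end (t = 0). The function names, the schedule, its direction, and all default values are hypothetical.

```python
import math


def rope_frequencies(dim, base=10000.0):
    """Standard RoPE inverse frequencies for an even head dimension."""
    return [base ** (-2.0 * i / dim) for i in range(dim // 2)]


def ntk_scaled_base(base, scale, dim):
    """NTK-aware rescaling of the RoPE base: base * scale^(dim / (dim - 2))."""
    return base * scale ** (dim / (dim - 2))


def dynamic_scale(t, max_scale, exponent=2.0):
    """Hypothetical schedule (an assumption, not taken from the README).

    t = 1.0 is pure noise and gets the full extrapolation factor;
    t = 0.0 is the clean image and gets no scaling at all.
    """
    return 1.0 + (max_scale - 1.0) * t ** exponent


def frequencies_at_step(t, dim=64, base=10000.0, max_scale=4.0):
    """RoPE frequencies to use at denoising time t under the toy schedule."""
    scale = dynamic_scale(t, max_scale)
    return rope_frequencies(dim, ntk_scaled_base(base, scale, dim))


# The lowest frequency is stretched most early on and relaxes toward
# its original value as t approaches 0.
for t in (1.0, 0.5, 0.0):
    print(t, frequencies_at_step(t)[-1])
```

Under this schedule the highest frequency is never changed, while the lowest one is divided by the current scale factor, so the encodings return to their trained values by the final step.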

Quick Start & Requirements

  • Installation: Create a conda environment (conda create -n dype python=3.10, conda activate dype) and install dependencies (pip install -r requirements.txt).
  • Prerequisites: Python 3.10.
  • Usage: python run_dype.py --prompt "Your text prompt here". Key arguments include --height, --width, --steps, --seed, --method (yarn, ntk, or base), and --no_dype.
  • Links: Project Page: https://noamissachar.github.io/DyPE/, arXiv Paper: https://arxiv.org/abs/2510.20766.
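As a rough illustration of how the listed arguments fit together, the sketch below assembles a run_dype.py command line from Python. The helper build_dype_command is hypothetical, and it assumes that --height, --width, --steps, --seed, and --method each take a single value and that --no_dype is a plain on/off switch; the README summary names the flags but not their exact syntax or defaults. The 4096 values are only an example consistent with the 4K × 4K target.

```python
import shlex

# Method names listed in the summary above.
METHODS = ("yarn", "ntk", "base")


def build_dype_command(prompt, height=None, width=None, steps=None,
                       seed=None, method=None, no_dype=False):
    """Hypothetical helper: build an argument list for run_dype.py.

    Assumes each option takes one value and --no_dype is a boolean switch.
    """
    if method is not None and method not in METHODS:
        raise ValueError(f"method must be one of {METHODS}")
    cmd = ["python", "run_dype.py", "--prompt", prompt]
    options = (("--height", height), ("--width", width),
               ("--steps", steps), ("--seed", seed), ("--method", method))
    for flag, value in options:
        if value is not None:  # omitted options fall back to the script's defaults
            cmd += [flag, str(value)]
    if no_dype:
        cmd.append("--no_dype")
    return cmd


cmd = build_dype_command("A mountain lake at sunrise",
                         height=4096, width=4096, seed=0, method="yarn")
print(shlex.join(cmd))  # shell-quoted form, ready to paste into a terminal
```

The returned list can be passed directly to subprocess.run inside the activated dype environment; passing no_dype=True would add the --no_dype switch, for example to compare against a run without DyPE.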

Highlighted Details

  • Enables generation of ultra-high-resolution images (e.g., 4K × 4K) from pre-trained diffusion transformers.
  • Achieves high-resolution output without retraining or extra sampling cost.
  • Dynamically adjusts positional encodings during denoising to match evolving frequency content.
  • Supports different position encoding methods: yarn, ntk, and base.

Maintenance & Community

No specific details on contributors, community channels, or roadmap are provided in the README.

Licensing & Compatibility

The work is patent pending. For commercial use or licensing inquiries, users must contact the authors. This suggests that standard open-source licensing terms do not apply and that commercial use requires explicit permission from the authors.

Limitations & Caveats

The primary caveat is the patent-pending status: commercial users must contact the authors first, which could block adoption in commercial projects. The README does not list other limitations, such as unsupported platforms or known bugs.

Health Check

  • Last Commit: 1 month ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 2
  • Star History: 18 stars in the last 30 days
