SeedVR by ByteDance-Seed

Diffusion models for advanced video restoration

Created 5 months ago
766 stars

Top 45.5% on SourcePulse

Project Summary

SeedVR and SeedVR2 are diffusion transformer models for video restoration (VR). SeedVR performs arbitrary-resolution restoration without relying on a pre-trained diffusion prior, while SeedVR2 achieves one-step video restoration through diffusion adversarial post-training. Both target researchers and practitioners who want better video quality and faster inference, particularly on real-world and AI-generated content, and both are positioned as alternatives to conventional and patch-based diffusion methods.

How It Works

SeedVR is a diffusion transformer designed for generic video restoration. It handles arbitrary resolutions by building on recent video generation techniques, which avoids both the constraints and biases of pre-trained diffusion priors and the slow inference of the patch-based sampling used by existing diffusion models. SeedVR2 builds on this with a one-step restoration process obtained through diffusion adversarial post-training. Its key additions are an adaptive window attention mechanism that adjusts dynamically to the output resolution for consistent results, and a feature matching loss that stabilizes adversarial training. Together these improve efficiency and temporal consistency for high-resolution video.
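To illustrate why a resolution-dependent attention window can help, the Python sketch below compares a fixed window with one derived from the feature-map size. It is not SeedVR2's implementation. The sizing rule (ceil of size over windows per side), the function names, and the default of 4 windows per side are all assumptions made for illustration.

```python
import math

def adaptive_window(height, width, windows_per_side=4):
    """Hypothetical sizing rule: derive the window from the feature-map size
    so the number of windows per axis stays at most `windows_per_side`."""
    win_h = math.ceil(height / windows_per_side)
    win_w = math.ceil(width / windows_per_side)
    return win_h, win_w

def partition(height, width, win_h, win_w):
    """Split a height x width grid into (top, left, bottom, right) boxes.
    Boxes on the bottom and right edges are clipped to the grid."""
    boxes = []
    for top in range(0, height, win_h):
        for left in range(0, width, win_w):
            boxes.append((top, left,
                          min(top + win_h, height),
                          min(left + win_w, width)))
    return boxes

if __name__ == "__main__":
    for h, w in [(45, 80), (90, 160)]:
        fixed = len(partition(h, w, 16, 16))              # predefined 16x16 window
        adaptive = len(partition(h, w, *adaptive_window(h, w)))
        print(f"{h}x{w}: fixed={fixed} windows, adaptive={adaptive} windows")
```

Under these assumed values, a 45x80 and a 90x160 feature map both split into 16 windows with the adaptive rule, while a fixed 16x16 window produces 15 and 60 windows respectively. In other words, a fixed window makes the partition layout change as the resolution changes.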

Quick Start & Requirements

Setup involves cloning the repository (https://github.com/bytedance-seed/SeedVR.git), creating a conda environment with Python 3.10, and installing dependencies: first pip install -r requirements.txt, then flash_attn==2.5.9.post1 and apex (pre-built wheels are provided for specific CUDA/PyTorch versions). Pretrained checkpoints are available on Hugging Face (e.g., ByteDance-Seed/SeedVR2-3B). Inference requires significant GPU resources: one H100-80G can process videos up to 100x720x1280, and four H100-80G GPUs support 1080p and 2K resolutions using sequence parallelism.
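The GPU figures above can be turned into a rough planning heuristic. The Python sketch below assumes that 100x720x1280 means frames x height x width, that memory use grows linearly with that volume, and that sequence parallelism splits the load evenly across GPUs. None of these assumptions is confirmed by the source, and estimate_gpus is a hypothetical helper, not part of the SeedVR codebase.

```python
# Reference point from the text: one H100-80G handles about 100 x 720 x 1280.
# Interpreting this as frames x height x width is an assumption.
SINGLE_GPU_BUDGET = 100 * 720 * 1280

def estimate_gpus(frames, height, width, budget=SINGLE_GPU_BUDGET):
    """Rough GPU count, assuming linear scaling with video volume and an
    even split under sequence parallelism (both are assumptions)."""
    volume = frames * height * width
    # Integer ceiling division avoids floating-point rounding.
    return max(1, -(-volume // budget))

if __name__ == "__main__":
    # 2K is taken here as 2560x1440, which is also an assumption.
    for name, (h, w) in {"720p": (720, 1280),
                         "1080p": (1080, 1920),
                         "2K": (1440, 2560)}.items():
        print(name, estimate_gpus(100, h, w))
```

For 100-frame clips this heuristic gives 1 GPU at 720p, 3 at 1080p, and 4 at 2560x1440, which fits within the four-GPU configuration described above. Treat it as a starting point only, since actual memory use depends on the model variant and runtime settings.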

Highlighted Details

  • Presents SeedVR as the "largest-ever diffusion transformer model towards generic video restoration."
  • SeedVR2 offers "one-step video restoration" with performance comparable or superior to existing methods.
  • Both projects have been recognized, with SeedVR highlighted at CVPR 2025.
  • The codebase includes scripts for environment setup, checkpoint downloading, and inference.

Maintenance & Community

The repository was created on June 11, 2025, and acknowledges support from the open community. No specific community channels (e.g., Discord, Slack) or roadmap links are provided in the README.

Licensing & Compatibility

SeedVR and SeedVR2 are released under the Apache 2.0 license, permitting commercial use and modification.

Limitations & Caveats

The provided models are prototypes, and their performance may not perfectly match the published papers. The methods exhibit limited robustness against heavy degradations and large motions, sharing failure cases with existing approaches. Furthermore, their strong generation capabilities can lead to over-generation of details and occasional over-sharpening on lightly degraded videos, particularly at lower resolutions (e.g., 480p).

Health Check
Last Commit: 5 months ago
Responsiveness: Inactive
Pull Requests (30d): 0
Issues (30d): 8
Star History: 76 stars in the last 30 days
