Helios by PKU-YuanGroup

Breakthrough in real-time long video generation

Created 1 week ago

1,080 stars

Top 35.1% on SourcePulse

Project Summary

Helios is a 14B-parameter model for real-time, long-form video generation that is reported to outperform smaller models in both efficiency and quality. It targets researchers and developers who need high-quality, minute-scale video synthesis at high frame rates.

How It Works

Helios generates video autoregressively in 33-frame chunks, achieving high temporal coherence for minute-long videos without conventional anti-drifting strategies (e.g., self-forcing, keyframe sampling). It also avoids standard acceleration techniques such as KV-caching and quantization, yet delivers 19.5 FPS on a single H100 GPU. The design prioritizes end-to-end inference efficiency and low memory use, which allows larger training batches and lets multiple models fit within limited VRAM.
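To make the chunk size and throughput figures concrete, the following is a minimal back-of-the-envelope sketch, not code from the Helios repository. It assumes non-overlapping 33-frame chunks and a steady 19.5 FPS, which may not match how the model actually schedules or overlaps chunks. The function names `plan_chunks` and `generation_time_s` are hypothetical.

```python
CHUNK_FRAMES = 33      # chunk size stated in the summary
THROUGHPUT_FPS = 19.5  # reported throughput on a single H100


def plan_chunks(total_frames, chunk_frames=CHUNK_FRAMES):
    """Return (start, end) frame ranges, end exclusive, that cover the video.

    Assumption: chunks do not overlap; the real model may differ.
    """
    if total_frames < 0 or chunk_frames <= 0:
        raise ValueError("total_frames must be >= 0 and chunk_frames > 0")
    return [
        (start, min(start + chunk_frames, total_frames))
        for start in range(0, total_frames, chunk_frames)
    ]


def generation_time_s(total_frames, fps=THROUGHPUT_FPS):
    """Wall-clock seconds to generate total_frames at a steady throughput."""
    return total_frames / fps


if __name__ == "__main__":
    # Latency of one 33-frame chunk at the reported throughput.
    print(f"one chunk: {generation_time_s(CHUNK_FRAMES):.2f} s")
    # A hypothetical 1,000-frame video, chosen only for illustration.
    chunks = plan_chunks(1000)
    print(f"{len(chunks)} chunks, {generation_time_s(1000):.1f} s total")
```

Under these assumptions, one chunk takes roughly 1.7 seconds, and 60 seconds of wall-clock time yields about 1,170 frames. Whether that counts as real time depends on the playback frame rate, which the summary does not state.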

Quick Start & Requirements

Installation requires cloning the repo, setting up a Python 3.11.2 conda environment, and installing a PyTorch build that matches one of the supported CUDA versions (11.8, 12.6, or 12.8). The remaining dependencies are installed by running bash install.sh. High-performance inference is demonstrated on a single NVIDIA H100 GPU. Integrations with Diffusers, vLLM-Omni, and SGLang-Diffusion require installing those libraries from their respective GitHub repositories.
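The version requirements above can be checked before running the installer. The following is a small pre-flight sketch and is not part of the Helios repository. The helper `check_environment` is hypothetical, and it assumes that any Python 3.11.x patch release is acceptable, although the summary names 3.11.2 specifically.

```python
import sys

SUPPORTED_PYTHON = (3, 11)                 # summary names 3.11.2; patch level is not enforced here
SUPPORTED_CUDA = {"11.8", "12.6", "12.8"}  # CUDA versions listed in the summary


def check_environment(python_version, cuda_version):
    """Return a list of human-readable problems; empty means the check passed."""
    problems = []
    if tuple(python_version[:2]) != SUPPORTED_PYTHON:
        found = ".".join(str(part) for part in python_version[:2])
        problems.append(f"Python 3.11 expected, found {found}")
    if cuda_version not in SUPPORTED_CUDA:
        problems.append(f"CUDA 11.8, 12.6, or 12.8 expected, found {cuda_version}")
    return problems


if __name__ == "__main__":
    try:
        import torch
        cuda = torch.version.cuda  # None on CPU-only PyTorch builds
    except ImportError:
        cuda = None                # PyTorch is not installed yet
    for problem in check_environment(sys.version_info, cuda):
        print("warning:", problem)
```

The check is kept as a pure function so it can be exercised without a GPU or a PyTorch install.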

Highlighted Details

  • Achieves 19.5 FPS on a single H100 GPU for minute-scale, high-quality video generation.
  • Generates coherent long videos without common anti-drifting techniques.
  • Offers high inference speed without standard acceleration methods such as KV-caching or quantization.
  • Supports Text-to-Video, Image-to-Video, Video-to-Video, and interactive generation.
  • Provides optimized integrations with Diffusers, vLLM-Omni, and SGLang-Diffusion.
  • Three model variants (Base, Mid, Distilled) offer quality/efficiency trade-offs.

Maintenance & Community

The project benefits from integration efforts by Ascend, HuggingFace (Diffusers), vLLM-Omni, and SGLang-Diffusion. Contact is available via email at shyuan-cs@hotmail.com. No specific community channels like Discord or Slack are listed.

Licensing & Compatibility

Helios is released under the Apache 2.0 license, permitting commercial use and integration into closed-source projects.

Limitations & Caveats

The Helios-Mid model is noted as an intermediate checkpoint that "may not meet expected quality." Image-to-Video and Video-to-Video functionalities might be slightly less performant than Text-to-Video. Performance claims are contingent on specific high-end hardware like the H100.

Health Check

  • Last commit: 1 day ago
  • Responsiveness: Inactive
  • Pull requests (30d): 0
  • Issues (30d): 26
  • Star history: 1,129 stars in the last 11 days