Helios by PKU-YuanGroup

Breakthrough in real-time long video generation

Created 1 month ago
1,740 stars

Top 24.1% on SourcePulse

Project Summary

Helios is a 14B-parameter model for real-time, long-form video generation, which the project presents as more efficient and higher in quality than smaller models. It targets researchers and developers who need high-quality, minute-scale video synthesis at high frame rates.

How It Works

Helios generates video autoregressively in 33-frame chunks and maintains temporal coherence across minute-long videos without conventional anti-drifting strategies (e.g., self-forcing, keyframe sampling). It also does not rely on standard acceleration techniques such as KV-caching or quantization, yet reaches a reported 19.5 FPS on a single H100 GPU. The design prioritizes end-to-end inference efficiency and reduced memory use, which allows larger training batches and lets multiple models fit within limited VRAM.
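The chunk size and throughput figures above can be turned into a rough budget for a single clip. The following is a minimal sketch, not the project's code: the function name `chunk_plan` and the 24 FPS playback rate are assumptions made for illustration, and the calculation ignores any overlap between chunks or special handling of the first chunk that the real model may apply.

```python
import math

CHUNK_FRAMES = 33       # chunk size stated in the summary
THROUGHPUT_FPS = 19.5   # reported generation speed on a single H100


def chunk_plan(duration_s: float, playback_fps: float = 24.0) -> dict:
    """Estimate chunk count and wall-clock time for one clip.

    playback_fps is an assumed value for illustration; it is not
    taken from the project.
    """
    total_frames = math.ceil(duration_s * playback_fps)
    # Generation proceeds chunk by chunk, so round up to whole chunks.
    num_chunks = math.ceil(total_frames / CHUNK_FRAMES)
    generated_frames = num_chunks * CHUNK_FRAMES
    wall_clock_s = generated_frames / THROUGHPUT_FPS
    return {
        "total_frames": total_frames,
        "num_chunks": num_chunks,
        "generated_frames": generated_frames,
        "wall_clock_s": wall_clock_s,
    }


if __name__ == "__main__":
    plan = chunk_plan(60.0)
    print(plan)
```

Under these assumptions, a 60-second clip needs 44 chunks. Whether generation keeps pace with playback depends on the ratio of 19.5 FPS to the chosen playback rate, so the assumed 24 FPS should be replaced with the actual output frame rate of the model variant in use.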

Quick Start & Requirements

Installation involves cloning the repository, creating a conda environment with Python 3.11.2, and installing a PyTorch build for one of the supported CUDA versions (11.8, 12.6, or 12.8). The remaining dependencies are installed by running bash install.sh. High-performance inference is demonstrated on a single NVIDIA H100 GPU. The integrations with Diffusers, vLLM-Omni, and SGLang-Diffusion must each be installed from their respective GitHub repositories.
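Because only three CUDA versions are listed, it can help to validate the chosen version before installing PyTorch. The sketch below is a hypothetical helper, not part of the Helios repository: `torch_index_url` is an illustrative name, and the mapping assumes the standard PyTorch wheel index layout on download.pytorch.org. The project's own instructions may pin different commands.

```python
# CUDA versions listed in the summary, mapped to PyTorch wheel tags.
SUPPORTED_CUDA = {"11.8": "cu118", "12.6": "cu126", "12.8": "cu128"}


def torch_index_url(cuda_version: str) -> str:
    """Return the PyTorch wheel index URL for a supported CUDA version.

    Hypothetical helper for illustration; raises ValueError for any
    version outside the three listed in the summary.
    """
    tag = SUPPORTED_CUDA.get(cuda_version)
    if tag is None:
        supported = ", ".join(sorted(SUPPORTED_CUDA))
        raise ValueError(
            f"CUDA {cuda_version} is not listed as supported; expected one of: {supported}"
        )
    return f"https://download.pytorch.org/whl/{tag}"


if __name__ == "__main__":
    # The URL can be passed to pip's --index-url option when installing torch.
    print(torch_index_url("12.6"))
```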

Highlighted Details

  • Achieves 19.5 FPS on a single H100 GPU for minute-scale, high-quality video generation.
  • Generates coherent long videos without common anti-drifting techniques.
  • Offers high inference speed without standard acceleration methods.
  • Supports Text-to-Video, Image-to-Video, Video-to-Video, and interactive generation.
  • Provides optimized integrations with Diffusers, vLLM-Omni, and SGLang-Diffusion.
  • Ships three model variants (Base, Mid, Distilled) with different quality/efficiency trade-offs.

Maintenance & Community

The project benefits from integration efforts by Ascend, HuggingFace (Diffusers), vLLM-Omni, and SGLang-Diffusion. Contact is available via email at shyuan-cs@hotmail.com. No specific community channels like Discord or Slack are listed.

Licensing & Compatibility

Helios is released under the Apache 2.0 license, permitting commercial use and integration into closed-source projects.

Limitations & Caveats

The Helios-Mid model is noted as an intermediate checkpoint that "may not meet expected quality." Image-to-Video and Video-to-Video functionalities might be slightly less performant than Text-to-Video. Performance claims are contingent on specific high-end hardware like the H100.

Health Check

  • Last commit: 1 week ago
  • Responsiveness: Inactive
  • Pull requests (30d): 0
  • Issues (30d): 22
  • Star history: 199 stars in the last 30 days
