EgoX by DAVIAN-Robotics

Egocentric video generation from exocentric input

Created 3 months ago
644 stars

Top 51.7% on SourcePulse

Project Summary

EgoX is a novel framework for generating egocentric (first-person) videos from a single exocentric (third-person) video input. It tackles realistic viewpoint transformation while preserving temporal consistency and scene structure. Aimed at researchers in egocentric video synthesis, EgoX provides a tool for creating immersive first-person perspectives by combining external observations with egocentric priors.

How It Works

The framework builds on large-scale video diffusion models trained on the Ego-Exo4D dataset. EgoX employs a unified conditioning strategy that integrates spatial and channel information within the latent representations to achieve realistic viewpoint transformation. A key advantage is its lightweight adaptation mechanism: LoRA-based fine-tuning significantly reduces the computational cost of customizing the base model.
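To make the adaptation mechanism concrete, here is a minimal sketch of LoRA applied to a single linear layer. This is an illustrative toy, not the EgoX implementation: the class name, rank, and scaling follow common LoRA conventions, and only the low-rank factors are trainable while the pretrained weight stays frozen.

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """A frozen linear layer plus a trainable low-rank update (LoRA sketch)."""

    def __init__(self, in_features, out_features, rank=4, alpha=16.0):
        super().__init__()
        # Pretrained projection: frozen, so fine-tuning never touches it.
        self.base = nn.Linear(in_features, out_features)
        self.base.weight.requires_grad_(False)
        self.base.bias.requires_grad_(False)
        # Low-rank factors A (rank x in) and B (out x rank); B starts at zero
        # so the adapted layer initially behaves exactly like the base layer.
        self.lora_a = nn.Parameter(torch.randn(rank, in_features) * 0.01)
        self.lora_b = nn.Parameter(torch.zeros(out_features, rank))
        self.scale = alpha / rank

    def forward(self, x):
        # Base output plus the scaled low-rank correction x @ A^T @ B^T.
        return self.base(x) + self.scale * (x @ self.lora_a.T @ self.lora_b.T)

layer = LoRALinear(64, 64, rank=4)
trainable = sum(p.numel() for p in layer.parameters() if p.requires_grad)
print(trainable)  # 4*64 + 64*4 = 512 trainable params vs 64*64 + 64 frozen
```

The appeal for a 14B-parameter video diffusion backbone is the same as in this toy: the trainable parameter count scales with the rank, not with the full weight matrices, which is what keeps customization lightweight.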

Quick Start & Requirements

  • Installation: Requires Python 3.10, CUDA 12.1+, and a compatible PyTorch build. Setup involves creating a conda environment, installing PyTorch, and pip-installing the remaining dependencies.
  • Hardware: Substantial GPU VRAM is required: ≥ 80 GB for inference, ≥ 140 GB for training.
  • Model Weights: Pretrained Wan2.1-I2V-14B and EgoX LoRA weights must be downloaded from Hugging Face/Google Drive.
  • Inference: Quick testing uses example data via shell scripts (scripts/infer_itw.sh, scripts/infer_ego4d.sh). Inference on custom data requires a specific directory structure and metadata preparation.
  • Links: Teaser Video: `https://github.com/user
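The setup steps above can be sketched as a shell session. The environment name, PyTorch index URL, and requirements-file path are assumptions for illustration; only the Python/CUDA versions and the two inference script names come from the summary above.

```shell
# Create an isolated environment (Python 3.10, per the stated requirement).
conda create -n egox python=3.10 -y
conda activate egox

# Install PyTorch built against CUDA 12.1 (wheel index URL is an assumption).
pip install torch torchvision --index-url https://download.pytorch.org/whl/cu121

# Install the remaining dependencies (requirements.txt path is an assumption).
pip install -r requirements.txt

# Run the example-data inference scripts named above
# (assumes model weights have already been downloaded).
bash scripts/infer_itw.sh      # in-the-wild example
bash scripts/infer_ego4d.sh    # Ego-Exo4D example
```

Note that the inference scripts presuppose a GPU with at least 80 GB of VRAM, per the hardware requirement listed above.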
Health Check
Last Commit

5 days ago

Responsiveness

Inactive

Pull Requests (30d)
1
Issues (30d)
4
Star History
48 stars in the last 30 days
