MotionStreamer by zju3dv

Streaming motion generation from text

Created 11 months ago
254 stars

Top 99.0% on SourcePulse

Project Summary

MotionStreamer addresses the challenge of real-time, streaming motion generation by introducing a diffusion-based autoregressive model operating within a causal latent space. Aimed at researchers and engineers in computer vision and animation, this project enables efficient generation of complex human motion sequences from text descriptions, building upon a novel 272-dimensional motion representation.

How It Works

The core innovation is a diffusion-based autoregressive model that operates in a causal latent space, generating motion latents sequentially so that each step conditions only on past outputs. It builds on a specialized 272-dimensional motion representation and requires training intermediate components, including a causal Temporal AutoEncoder (TAE) that maps motions into the causal latent space. The approach processes data in a custom streaming format derived from existing datasets such as BABEL, enabling continuous motion synthesis.
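The autoregressive-diffusion loop described above can be sketched in a few lines. This is a toy illustration, not the project's code: the latent dimension, step count, and the `denoise` function are all stand-ins for a learned, text-conditioned diffusion network, but the causal structure (each latent is denoised conditioned only on previously generated latents) mirrors what makes streaming generation possible.

```python
import numpy as np

LATENT_DIM = 8      # stand-in for the model's causal latent dimension (hypothetical)
DENOISE_STEPS = 10  # iterative refinement steps per latent (hypothetical)

def denoise(noisy, past):
    """Toy denoiser: pulls a noisy latent toward the mean of past latents.
    A real model would run a learned diffusion network conditioned on text."""
    target = past.mean(axis=0) if len(past) else np.zeros(LATENT_DIM)
    x = noisy
    for _ in range(DENOISE_STEPS):
        x = x + 0.3 * (target - x)  # one refinement step toward the target
    return x

def stream_motion(num_latents, seed=0):
    """Generate motion latents autoregressively: each new latent starts as
    noise and is denoised conditioned only on PAST latents, so frames can
    be decoded and streamed as soon as each latent is produced."""
    rng = np.random.default_rng(seed)
    latents = []
    for _ in range(num_latents):
        noise = rng.standard_normal(LATENT_DIM)
        latents.append(denoise(noise, np.asarray(latents)))
    return np.stack(latents)

seq = stream_motion(5)
print(seq.shape)  # (5, 8)
```

The key property illustrated here is causality: because no future latent is ever consulted, generation can run indefinitely and emit motion incrementally rather than producing a fixed-length sequence all at once.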

Quick Start & Requirements

Installation requires creating a Conda environment from environment.yaml and activating it; Python and Conda are the only prerequisites. Data preparation is extensive: processed 272-dim motion representations for the HumanML3D and BABEL datasets are downloaded via huggingface-cli download, and this data is for academic use only. Training is multi-stage: evaluators, Causal TAEs, and text-to-motion models are trained before MotionStreamer itself, and multiple GPUs are required. Links to datasets and checkpoints are available on Hugging Face.
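A minimal sketch of the setup flow described above. The repository URL follows from the zju3dv GitHub organization; the Conda environment name and the Hugging Face dataset id are placeholders to be replaced with the values given in the project README.

```shell
# Clone the repository and enter it
git clone https://github.com/zju3dv/MotionStreamer.git
cd MotionStreamer

# Create and activate the Conda environment from the provided environment.yaml
conda env create -f environment.yaml
conda activate motionstreamer   # environment name is an assumption

# Download the processed 272-dim motion data (academic use only);
# substitute the dataset repo id listed in the README.
huggingface-cli download <hf-dataset-id> --repo-type dataset --local-dir ./data
```

Each training stage (evaluator, Causal TAE, text-to-motion model, MotionStreamer) is then launched with its own script per the README, with GPU counts set to match the multi-GPU requirement.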

Highlighted Details

  • Accepted to ICCV 2025.
  • Utilizes a novel 272-dimensional motion representation.
  • Features a diffusion-based autoregressive model for streaming motion generation.
  • Provides pre-trained checkpoints and demo inference scripts.

Maintenance & Community

The project appears to be a research output with an extensive list of academic authors. No specific community channels (e.g., Discord, Slack) or explicit roadmap links are provided in the README.

Licensing & Compatibility

The processed datasets (HumanML3D, BABEL) are explicitly stated as "solely for academic purposes." The README also directs users to read the AMASS License, suggesting potential restrictions on data usage. The software license for the code itself is not specified.

Limitations & Caveats

The README lists "complete code for MotionStreamer" as a TODO, indicating potential incompleteness. Processed data is restricted to academic use, limiting commercial applications. Setup and training are complex, demanding multiple GPUs and detailed data preparation. A clear software license for the code is not provided.

Health Check
Last Commit

4 months ago

Responsiveness

Inactive

Pull Requests (30d)
0
Issues (30d)
2
Star History
9 stars in the last 30 days
