MotionStreamer (zju3dv): Streaming motion generation from text
MotionStreamer targets real-time, streaming motion generation with a diffusion-based autoregressive model that operates in a causal latent space. It is aimed at researchers and engineers in computer vision and animation, and generates complex human motion sequences from text descriptions using a novel 272-dimensional motion representation.
How It Works
The core of the method is a diffusion-based autoregressive model that operates in a causal latent space, so motion is generated sequentially. It relies on a 272-dimensional motion representation and on intermediate components that must be trained first, such as a Causal Temporal AutoEncoder (Causal TAE). Training data is converted into a custom streaming format derived from existing datasets such as BABEL, which supports continuous motion synthesis.
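To make the streaming idea concrete, here is a toy sketch of an autoregressive loop in which each step sees only earlier latents and emits one 272-dimensional frame. It is not the MotionStreamer code: sample_next_latent, decode_latent, stream_motion, LATENT_DIM, and the seeded-noise update are all hypothetical stand-ins for the diffusion head, the causal decoder, and the text encoder.

```python
import random
from typing import Iterator, List

MOTION_DIM = 272   # per-frame motion feature size stated in the summary
LATENT_DIM = 16    # hypothetical latent size, chosen only for this sketch


def sample_next_latent(history: List[List[float]], text_seed: int) -> List[float]:
    """Stand-in for the diffusion head: draws the next latent from past latents only."""
    rng = random.Random(text_seed * 1_000_003 + len(history))
    prev = history[-1] if history else [0.0] * LATENT_DIM
    # Placeholder update (not a real denoiser): decay the previous latent, add seeded noise.
    return [0.9 * p + rng.gauss(0.0, 0.1) for p in prev]


def decode_latent(latent: List[float]) -> List[float]:
    """Stand-in for a causal decoder: maps one latent to one 272-dim frame."""
    return [latent[i % LATENT_DIM] for i in range(MOTION_DIM)]


def stream_motion(text: str, num_steps: int) -> Iterator[List[float]]:
    """Yield frames one at a time; each step depends only on earlier latents."""
    text_seed = sum(ord(c) for c in text)  # toy text conditioning, not a real encoder
    history: List[List[float]] = []
    for _ in range(num_steps):
        latent = sample_next_latent(history, text_seed)
        history.append(latent)
        yield decode_latent(latent)


if __name__ == "__main__":
    for step, frame in enumerate(stream_motion("a person walks forward", 4)):
        print(step, len(frame))
```

Because every step depends only on the past, a consumer can start using frames before generation finishes, and extending a sequence never changes the frames already produced.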
Quick Start & Requirements
Installation requires Python and Conda: create a Conda environment from environment.yaml and activate it. Data preparation is extensive and involves downloading the processed 272-dimensional motion representations of the HumanML3D and BABEL datasets with huggingface-cli download; this data is for academic use only. Training is multi-stage and needs multiple GPUs, covering the evaluators, the Causal TAEs, the text-to-motion models, and MotionStreamer itself. Links to the datasets and checkpoints are available on Hugging Face.
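The setup steps above can be sketched as a small dry-run helper that only assembles the commands. The dataset repository ID and local directory are placeholders, since the summary does not give them, and the --repo-type dataset flag assumes the data is hosted as a Hugging Face dataset repository. The activation step is left out because the environment name defined in environment.yaml is not stated here.

```python
import shlex
from typing import List


def build_setup_commands(dataset_repo: str, local_dir: str) -> List[List[str]]:
    """Assemble (but do not run) the environment and data-download commands."""
    return [
        # Create the Conda environment from the file named in the summary.
        ["conda", "env", "create", "-f", "environment.yaml"],
        # Download processed motion data; dataset_repo is a placeholder ID.
        ["huggingface-cli", "download", dataset_repo,
         "--repo-type", "dataset", "--local-dir", local_dir],
    ]


def render(commands: List[List[str]]) -> List[str]:
    """Quote each argument list into a shell-safe command line."""
    return [shlex.join(cmd) for cmd in commands]


if __name__ == "__main__":
    # "<org>/<dataset>" must be replaced with the real ID from the project README.
    for line in render(build_setup_commands("<org>/<dataset>", "./data")):
        print(line)
```

Printing the commands first makes it easy to check the repository ID and target directory before starting a large download.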
Maintenance & Community
The project appears to be a research output with an extensive list of academic authors. No specific community channels (e.g., Discord, Slack) or explicit roadmap links are provided in the README.
Licensing & Compatibility
The processed datasets (HumanML3D, BABEL) are explicitly stated as "solely for academic purposes." The README also directs users to read the AMASS License, suggesting potential restrictions on data usage. The software license for the code itself is not specified.
Limitations & Caveats
The README lists "complete code for MotionStreamer" as a TODO, so the released code may be incomplete. The processed data is restricted to academic use, which limits commercial applications. Setup and training are complex, requiring multiple GPUs and detailed data preparation. No software license is given for the code.