PyTorch code for a text-to-motion generation research paper
T2M-GPT provides a PyTorch implementation for generating human motion from textual descriptions, based on the CVPR 2023 paper "T2M-GPT: Generating Human Motion from Textual Descriptions with Discrete Representations." It targets researchers and developers in animation, robotics, and AI who need to synthesize realistic human movements from natural language prompts. The project offers a novel approach to motion generation by leveraging discrete representations, aiming for improved quality and controllability.
How It Works
The core of T2M-GPT involves a two-stage process. First, a VQ-VAE (Vector Quantized Variational Autoencoder) learns to discretize human motion sequences into a set of learned codes. Second, a GPT (Generative Pre-trained Transformer) model is trained on these discrete motion codes, conditioned on textual descriptions, to generate new motion sequences. This discrete representation approach allows the GPT to model motion as a sequence of tokens, similar to language, enabling more effective generation and potentially better handling of long-range dependencies.
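The sketch below is a minimal, illustrative rendering of this two-stage pipeline in plain PyTorch, assuming a recent PyTorch release. The module names, layer choices, and dimensions (MotionVQVAE, TextConditionedGPT, a 263-dimensional motion feature, a 512-entry codebook) are assumptions for exposition only; the repository's actual encoder, quantizer, text conditioning, and transformer differ in detail.

```python
# Illustrative two-stage sketch (not the repository's code): a VQ-VAE that
# discretizes motion into codebook indices, and a causal transformer that
# predicts those indices conditioned on a text embedding.
import torch
import torch.nn as nn
import torch.nn.functional as F


class MotionVQVAE(nn.Module):
    """Stage 1 (illustrative): map motion frames to discrete codebook indices."""

    def __init__(self, motion_dim=263, hidden_dim=512, codebook_size=512):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(motion_dim, hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, hidden_dim),
        )
        self.codebook = nn.Embedding(codebook_size, hidden_dim)
        self.decoder = nn.Sequential(
            nn.Linear(hidden_dim, hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, motion_dim),
        )

    def encode(self, motion):                  # motion: (B, T, motion_dim)
        z = self.encoder(motion)               # (B, T, hidden_dim)
        # Nearest codebook entry per frame -> discrete motion token.
        dist = (z.unsqueeze(-2) - self.codebook.weight).pow(2).sum(-1)
        return dist.argmin(dim=-1)             # (B, T) integer codes

    def decode(self, codes):                   # codes: (B, T)
        return self.decoder(self.codebook(codes))  # reconstructed motion


class TextConditionedGPT(nn.Module):
    """Stage 2 (illustrative): predict the next motion token given text."""

    def __init__(self, codebook_size=512, hidden_dim=512, text_dim=512, n_layers=4):
        super().__init__()
        self.token_emb = nn.Embedding(codebook_size, hidden_dim)
        self.text_proj = nn.Linear(text_dim, hidden_dim)
        layer = nn.TransformerEncoderLayer(hidden_dim, nhead=8, batch_first=True)
        self.transformer = nn.TransformerEncoder(layer, num_layers=n_layers)
        self.head = nn.Linear(hidden_dim, codebook_size)

    def forward(self, text_emb, codes):        # text_emb: (B, text_dim), codes: (B, T)
        # The projected text embedding acts as a conditioning prefix token.
        x = torch.cat([self.text_proj(text_emb).unsqueeze(1),
                       self.token_emb(codes)], dim=1)            # (B, T+1, hidden)
        seq_len = x.size(1)
        causal = torch.triu(torch.full((seq_len, seq_len), float("-inf")), diagonal=1)
        h = self.transformer(x, mask=causal)
        # Position i predicts motion token i (shifted by the text prefix),
        # so the final position is dropped.
        return self.head(h[:, :-1])            # (B, T, codebook_size) logits


# Toy usage: tokenize a motion clip, then train the GPT with teacher forcing.
vqvae, gpt = MotionVQVAE(), TextConditionedGPT()
motion = torch.randn(2, 64, 263)               # batch of 64-frame motion features
text_emb = torch.randn(2, 512)                 # e.g. features from a text encoder
codes = vqvae.encode(motion)                   # (2, 64) discrete motion tokens
logits = gpt(text_emb, codes)                  # next-token logits over the codebook
loss = F.cross_entropy(logits.reshape(-1, 512), codes.reshape(-1))
```

At inference time the idea is the reverse of the training loop above: the transformer samples motion tokens autoregressively from a text embedding, and the VQ-VAE decoder maps the resulting token sequence back to a motion sequence.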
Quick Start & Requirements
Set up the environment with conda env create -f environment.yml and activate it with conda activate T2M-GPT. Additional dependencies include osmesa, shapely, pyrender, and trimesh.
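As a quick post-install check, a short script like the one below (illustrative, not shipped with the repository) can confirm the interpreter, the PyTorch build, and the importable rendering packages. Note that osmesa is a system-level OpenGL backend rather than a pip module; pyrender typically selects it via the PYOPENGL_PLATFORM environment variable.

```python
# Illustrative sanity check after installation; not part of the repository.
import importlib.util
import sys

import torch

print(f"Python {sys.version.split()[0]}, PyTorch {torch.__version__}")
print(f"CUDA available: {torch.cuda.is_available()}")

# pip-installable rendering helpers; osmesa itself is provided by the system
# and is usually enabled for pyrender with PYOPENGL_PLATFORM=osmesa.
for pkg in ("shapely", "pyrender", "trimesh"):
    found = importlib.util.find_spec(pkg) is not None
    print(f"{pkg}: {'installed' if found else 'missing'}")
```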
Highlighted Details
Maintenance & Community
The project was released in February 2023. There is a Hugging Face Space demo available. Links to community channels are not explicitly provided in the README.
Licensing & Compatibility
The repository does not explicitly state a license. The code is presented as a PyTorch implementation for research purposes. Compatibility with commercial or closed-source applications is not specified.
Limitations & Caveats
The implementation requires a specific older version of PyTorch (1.8.1) and Python 3.8, which may pose compatibility challenges with newer environments. The evaluation process is noted to be time-consuming due to the need to generate multiple motions per text prompt.