MS-G3D by kenziyuliu

PyTorch code for the skeleton-based action recognition paper "Disentangling and Unifying Graph Convolutions for Skeleton-Based Action Recognition" (CVPR 2020)

Created 5 years ago
453 stars

Top 66.5% on SourcePulse

Project Summary

This repository provides a PyTorch implementation of MS-G3D, a graph convolutional network for skeleton-based action recognition. It targets researchers and practitioners in computer vision and human-computer interaction, offering a model that disentangles multi-scale graph convolutions and unifies spatial-temporal modeling to improve recognition performance.

How It Works

The project rests on two ideas. First, it disentangles multi-scale graph convolutions: rather than building larger neighborhoods from powers of the adjacency matrix, which lets features from nearby joints dominate every scale, each scale aggregates only the joints at a fixed shortest-path distance, removing redundant dependencies between scales. Second, it unifies spatial and temporal modeling with the G3D operator, which runs graph convolutions over cross-spacetime edges inside a sliding spatial-temporal window, so relationships between body joints and their evolution over time are learned jointly rather than by alternating spatial and temporal layers. Together these make up the MS-G3D feature extractor behind the reported state-of-the-art results.
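The disentangled neighborhoods can be illustrated with a minimal sketch (a toy example of the idea, not the repository's code): scale k keeps only joints at shortest-path distance exactly k, instead of the cumulative neighborhoods produced by powers of the adjacency matrix.

```python
import numpy as np

def k_adjacency(A, k, with_self=False):
    """Binary adjacency of joints at shortest-path distance exactly k.

    A is an (N, N) binary adjacency matrix of the skeleton graph.
    Masking by exact distance (distance <= k minus distance <= k-1)
    is the "disentangled" multi-scale aggregation described above.
    """
    I = np.eye(A.shape[0], dtype=A.dtype)
    if k == 0:
        return I
    within_k = (np.linalg.matrix_power(A + I, k) > 0).astype(A.dtype)
    within_k_minus_1 = (np.linalg.matrix_power(A + I, k - 1) > 0).astype(A.dtype)
    Ak = within_k - within_k_minus_1
    if with_self:
        Ak += I
    return Ak

# Toy 4-joint chain 0-1-2-3 (a stand-in for a limb of the skeleton graph)
A = np.zeros((4, 4))
for u, v in [(0, 1), (1, 2), (2, 3)]:
    A[u, v] = A[v, u] = 1
A2 = k_adjacency(A, 2)  # joints exactly two hops apart: (0,2) and (1,3)
```

Each `A_k` then drives its own aggregation branch, so distant joints contribute on equal footing with nearby ones.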

Quick Start & Requirements

  • Install: pip install -r requirements.txt (after cloning)
  • Prerequisites: Python >= 3.6, PyTorch >= 1.2.0, NVIDIA Apex, PyYAML, tqdm, tensorboardX.
  • Data: Requires downloading the NTU RGB+D 60/120 and Kinetics 400 datasets, totaling roughly 38-77 GB after preprocessing; data generation can take several hours.
  • Resources: Mixed precision training (--half) is recommended for GPUs with ~11GB memory.
  • Links: PDF, Demo

Highlighted Details

  • CVPR 2020 Oral paper.
  • Supports joint-bone two-stream fusion for enhanced accuracy.
  • Offers pretrained models for NTU RGB+D 60/120 and Kinetics 400.
  • Mixed precision training with NVIDIA Apex is supported for memory efficiency.
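Two-stream fusion of this kind is typically done at the score level: each stream's per-class scores are summed, optionally with a weight, before taking the argmax. A stdlib-only sketch of the idea (the function name and `alpha` weight are illustrative, not the repository's API):

```python
def fuse_scores(joint_scores, bone_scores, alpha=1.0):
    """Weighted score-level fusion of joint and bone streams.

    joint_scores / bone_scores: lists of per-class score lists, one per
    sample. Returns the predicted class index for each sample.
    """
    preds = []
    for j, b in zip(joint_scores, bone_scores):
        fused = [js + alpha * bs for js, bs in zip(j, b)]
        preds.append(max(range(len(fused)), key=fused.__getitem__))
    return preds

# One sample where the bone stream overrules the joint stream,
# one where both streams agree.
preds = fuse_scores([[0.6, 0.4], [0.9, 0.1]],
                    [[0.1, 0.9], [0.8, 0.2]])
```

Because fusion happens after inference, the two streams can be trained and evaluated independently.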

Maintenance & Community

  • Based on 2s-AGCN and ST-GCN projects.
  • Contact: kenziyuliu AT outlook.com

Licensing & Compatibility

  • The README does not explicitly state a license. However, the project is based on other repositories which may have their own licenses. Users should verify licensing for commercial use.

Limitations & Caveats

Memory usage can be sensitive to PyTorch/CUDA versions and GPU setups, potentially leading to OOM errors. Using Apex O2 mode may require single-GPU training due to nn.DataParallel incompatibility. The best fusion results might not always come from the top-performing individual stream models.
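The Apex O2 caveat can be handled with a simple guard before wrapping the model. This is a hedged sketch, not the repository's code: `wrap_model` and its device-id handling are hypothetical.

```python
import torch.nn as nn

def wrap_model(model, opt_level, device_ids):
    # Apex O2 casts model weights to FP16, which nn.DataParallel's
    # replicate step does not handle; fall back to a single GPU then.
    if opt_level == "O2" and len(device_ids) > 1:
        device_ids = device_ids[:1]  # hypothetical guard: force single GPU
    if len(device_ids) > 1:
        model = nn.DataParallel(model, device_ids=device_ids)
    return model
```

With O1 (or a single visible GPU) the model is wrapped as usual; with O2 and multiple GPUs it stays a plain module.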

Health Check
Last Commit

1 year ago

Responsiveness

Inactive

Pull Requests (30d)
0
Issues (30d)
0
Star History
1 star in the last 30 days

