Video understanding codebase using PyTorch
X-Temporal is a PyTorch-based codebase for video understanding tasks, aimed at researchers and engineers. It simplifies implementing and evaluating video classification models, and its high-performance, modular design supports rapid experimentation with new research ideas.
How It Works
X-Temporal supports multiple input formats, including raw videos, RGB frames, and optical flow frames, and is designed to handle both single-label and multi-label datasets. Its modular architecture allows for easy integration and comparison of popular video understanding frameworks like SlowFast, R(2+1)D, R3D, TSN, and TSM.
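The single-label versus multi-label distinction mostly shows up in how targets are encoded. The following is a minimal sketch in plain Python, not X-Temporal's actual data pipeline: the function name `encode_target` and its parameters are hypothetical and chosen only for illustration. It assumes the common convention that a single-label sample keeps its class index (as expected by a cross-entropy loss), while a multi-label sample becomes a multi-hot vector (as expected by a per-class binary loss such as `torch.nn.BCEWithLogitsLoss`).

```python
from typing import List, Sequence, Union


def encode_target(
    labels: Union[int, Sequence[int]],
    num_classes: int,
    multi_label: bool,
) -> Union[int, List[float]]:
    """Hypothetical helper: turn raw annotation(s) into a training target.

    Single-label: return the class index unchanged.
    Multi-label: return a multi-hot list of length num_classes.
    """
    if not multi_label:
        # Single-label datasets carry exactly one class index per clip.
        if not isinstance(labels, int):
            raise TypeError("single-label target must be one integer class index")
        if not 0 <= labels < num_classes:
            raise ValueError(f"class index {labels} out of range")
        return labels

    # Multi-label datasets may attach several classes to the same clip.
    target = [0.0] * num_classes
    for idx in labels:
        if not 0 <= idx < num_classes:
            raise ValueError(f"class index {idx} out of range")
        target[idx] = 1.0
    return target


single = encode_target(3, num_classes=5, multi_label=False)
multi = encode_target([0, 3], num_classes=5, multi_label=True)
print(single)  # 3
print(multi)   # [1.0, 0.0, 0.0, 1.0, 0.0]
```

Keeping the two encodings behind one function lets the rest of a training loop stay the same and switch only the loss, which is one plausible way a modular codebase could serve both dataset types.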
Quick Start & Requirements
./easy_setup.sh
Highlighted Details
Maintenance & Community
Maintained by Hao Shao, ManYuan Zhang, and Yu Liu. The project was released in August 2020.
Licensing & Compatibility
Released under the MIT license, permitting commercial use and integration with closed-source projects.
Limitations & Caveats
The codebase was last updated in August 2020, and its compatibility with the latest PyTorch versions or newer SOTA models is not guaranteed.