MindVideo  by jqin4749

Research paper for video reconstruction from brain activity

created 2 years ago
382 stars

Top 75.9% on sourcepulse

Project Summary

MinD-Video is a framework for reconstructing high-quality videos from fMRI brain activity data, targeting researchers and engineers in neuroscience and AI. It enables the visualization of visual experiences directly from brain recordings, advancing the understanding of cognitive processes.

How It Works

MinD-Video employs a multi-stage approach: masked brain modeling to learn spatiotemporal patterns from fMRI data, multimodal contrastive learning with spatiotemporal attention for robust feature extraction, and co-training with an augmented Stable Diffusion model that uses temporal inflation. Together, these stages enable high-quality video generation at arbitrary frame rates, steered by adversarial guidance.
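As a rough illustration of the masked-brain-modeling stage (a toy sketch, not the repository's actual implementation), the snippet below masks a random fraction of fMRI "patches" and scores reconstruction only on the masked positions, as in masked autoencoding. All names, shapes, and the mask ratio here are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def mask_signal(x, mask_ratio=0.75):
    """Zero out a random fraction of fMRI patches; return masked input and mask."""
    n = x.shape[0]
    n_mask = int(n * mask_ratio)
    idx = rng.permutation(n)[:n_mask]
    mask = np.zeros(n, dtype=bool)
    mask[idx] = True
    x_masked = x.copy()
    x_masked[mask] = 0.0
    return x_masked, mask

def masked_mse(pred, target, mask):
    """Reconstruction loss computed only on the masked positions."""
    return float(np.mean((pred[mask] - target[mask]) ** 2))

x = rng.standard_normal(16)      # toy sequence of fMRI patches
x_masked, mask = mask_signal(x)
pred = np.zeros_like(x)          # stand-in for an encoder-decoder's prediction
loss = masked_mse(pred, x, mask)
```

In the real pipeline, the encoder trained this way supplies the brain features that later condition the diffusion model.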

Quick Start & Requirements

  • Install via conda env create -f env.yaml and conda activate mind-video.
  • Requires downloading pre-training datasets (HCP) and target datasets (Wen 2018), along with pre-trained checkpoints.
  • Run generation: python scripts/eval_all.py --config configs/eval_all_sub1.yaml.
  • Recommended hardware: an RTX 3090 suffices for 2-second, 3 FPS, 256x256 samples; higher specs are needed for full frame rate (30 FPS) and higher resolution.
  • Links: arXiv, Website, Google Drive Samples

Highlighted Details

  • Achieves 85% accuracy in semantic classification and 0.19 SSIM, outperforming prior SOTA by 45%.
  • Demonstrates biological plausibility and interpretability, aligning with physiological processes.
  • Reconstructed videos are of high quality, capturing various objects, animals, motions, and scenes.
  • Can reconstruct videos at full frame rate (30 FPS) and higher resolutions with sufficient GPU memory.
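To make the 0.19 SSIM figure concrete, the sketch below computes a simplified global SSIM from the standard formula (no 11x11 sliding window, unlike typical implementations); the helper name and toy frames are hypothetical.

```python
import numpy as np

def ssim_global(x, y, data_range=1.0, k1=0.01, k2=0.03):
    """Global SSIM over whole arrays, for illustration only."""
    c1 = (k1 * data_range) ** 2
    c2 = (k2 * data_range) ** 2
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(), y.var()
    cov = ((x - mx) * (y - my)).mean()
    return ((2 * mx * my + c1) * (2 * cov + c2)) / (
        (mx ** 2 + my ** 2 + c1) * (vx + vy + c2)
    )

rng = np.random.default_rng(0)
frame = rng.random((32, 32))        # toy "ground-truth" frame in [0, 1]
noisy = np.clip(frame + 0.3 * rng.standard_normal((32, 32)), 0.0, 1.0)

perfect = ssim_global(frame, frame)   # identical frames score 1.0
degraded = ssim_global(frame, noisy)  # noise lowers the score
```

Reported SSIM results for video reconstruction are typically averaged per-frame with a windowed implementation (e.g. scikit-image's structural_similarity), so this global version is only a conceptual stand-in.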

Maintenance & Community

  • Accepted for Oral Presentation at NeurIPS 2023.
  • Codebase is based on Tune-A-Video.

Licensing & Compatibility

  • License not explicitly stated in the README.

Limitations & Caveats

  • Sample generation is limited by GPU memory: an RTX 3090 handles 2-second, 3 FPS, 256x256 samples, while higher resolutions and frame rates require more VRAM.
Health Check

  • Last commit: 1 year ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 4 stars in the last 90 days
