Research code for reconstructing images from human brain activity using latent diffusion models
This repository provides code for reconstructing visual experiences from human brain activity using Stable Diffusion, extending prior work with advanced decoding techniques like text prompts and GANs. It targets researchers and engineers in neuroscience, computer vision, and AI interested in brain-computer interfaces and generative models. The primary benefit is enabling high-resolution visual reconstruction from neural data.
How It Works
The project leverages latent diffusion models (Stable Diffusion v1.4 and v2.0) to generate images from features decoded out of fMRI data. It employs several decoding strategies to improve reconstruction accuracy: direct mapping from voxels to the diffusion model's latent and conditioning features, text prompt generation via BLIP, and GAN-based reconstruction driven by decoded VGG19 features. In each case, brain activity is mapped to intermediate representations within the diffusion model or to descriptive text, which then guides image generation.
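For intuition, a minimal sketch of the decoding step is shown below; it is not the authors' exact pipeline, and the array names, dimensions, and ridge penalty are illustrative placeholders. The paper's decoders are L2-regularized linear models from voxel responses to the diffusion model's features, which a scikit-learn ridge regression approximates:

```python
import numpy as np
from sklearn.linear_model import Ridge

# Toy stand-ins for real data: fMRI voxel responses per scan, and the
# flattened Stable Diffusion latent (4x64x64 for a 512x512 image) per stimulus.
rng = np.random.default_rng(0)
X_train = rng.standard_normal((800, 5000))         # trials x voxels
Z_train = rng.standard_normal((800, 4 * 64 * 64))  # trials x latent dim
X_test = rng.standard_normal((50, 5000))

decoder = Ridge(alpha=1e4)     # L2-regularized linear map; alpha is illustrative
decoder.fit(X_train, Z_train)  # learn voxels -> latent features

z_pred = decoder.predict(X_test).reshape(-1, 4, 64, 64)
# z_pred (and an analogously decoded text/CLIP conditioning vector) would then
# seed Stable Diffusion's img2img / conditioning pathway to render reconstructions.
```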
Quick Start & Requirements
Setup requires downloading the Natural Scenes Dataset (NSD) fMRI data, the Stable Diffusion checkpoints (sd-v1-4.ckpt, 512-depth-ema.ckpt), and pre-trained models (VGG_ILSVRC_19_layers, bvlc_reference_caffenet_generator_ILSVRC2012_Training).
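Once the checkpoints are downloaded, they can be loaded with the stock Stable Diffusion (ldm) utilities as a sanity check. The config and checkpoint paths below assume the standard CompVis repository layout and are not taken from this project's README:

```python
# Hedged sketch: load a downloaded Stable Diffusion checkpoint with the
# stock ldm utilities. Paths are assumptions based on the CompVis layout.
import torch
from omegaconf import OmegaConf
from ldm.util import instantiate_from_config

config = OmegaConf.load("configs/stable-diffusion/v1-inference.yaml")
model = instantiate_from_config(config.model)

state = torch.load("models/ldm/stable-diffusion-v1/sd-v1-4.ckpt", map_location="cpu")
missing, unexpected = model.load_state_dict(state["state_dict"], strict=False)
model.eval()  # reconstruction only runs inference, no training
```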
Maintenance & Community
The project is associated with the CVPR 2023 paper "High-resolution image reconstruction with latent diffusion models from human brain activity" by Yu Takagi and Shinji Nishimoto. It acknowledges several key repositories it builds upon, including Stable Diffusion, BLIP, and bdpy. Contact information is provided via email.
Licensing & Compatibility
The repository itself does not state a license. It builds upon and requires Stable Diffusion, whose weights are distributed under the CreativeML Open RAIL-M license, which imposes use-based restrictions rather than being fully permissive. Suitability for commercial use therefore depends on the licenses of the underlying models and datasets.
Limitations & Caveats
The setup process is complex, requiring large data downloads and multiple environment configurations. The README notes that updating the transformers library may break BLIP functionality, so careful environment management (such as version pinning, sketched below) is advised. The project also depends on specific versions of Stable Diffusion and its pre-trained models.
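A defensive runtime check along these lines can catch an accidental transformers upgrade before it silently degrades BLIP captioning. The pinned version below is an illustrative placeholder, not a value from the README; use whatever version the project's environment files specify:

```python
import transformers

# Guard against an accidental `transformers` upgrade that breaks BLIP.
# KNOWN_GOOD is a placeholder; substitute the version pinned by the repo.
KNOWN_GOOD = "4.15.0"
if transformers.__version__ != KNOWN_GOOD:
    raise RuntimeError(
        f"transformers=={transformers.__version__} detected; "
        f"BLIP decoding was validated against {KNOWN_GOOD}"
    )
```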
Last updated about a year ago; the project appears inactive.