handwritingBCI by fwillett

Code for brain-to-text communication via handwriting research paper

created 4 years ago
374 stars

Top 76.9% on sourcepulse

Project Summary

This repository provides code for reproducing high-performance neural decoding of attempted handwriting movements, enabling brain-to-text communication. It is intended for researchers and engineers in the BCI and neuroscience fields. The project offers a complete pipeline from raw neural data to text output, with significant performance gains achieved through language model integration.

How It Works

The pipeline decodes neural data into character sequences using a Recurrent Neural Network (RNN), specifically GRU layers, which are well-suited for sequential data. It incorporates a multi-stage language modeling approach, starting with a bigram model and progressing to a GPT-2 rescoring step. This layered approach refines the decoded output, significantly reducing character and word error rates by leveraging linguistic context.
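To make the layered rescoring idea concrete, here is an illustrative sketch (not the repository's code) of how a simple character-level bigram language model can rescore candidate sequences emitted by a neural decoder. The candidates, probabilities, and back-off value are invented for illustration; the actual pipeline uses Kaldi for the bigram stage and GPT-2 for final rescoring.

```python
# Sketch of LM rescoring over decoder hypotheses. All numbers are made up.
import math

# Hypothetical decoder output: candidate strings with decoder log-probabilities.
candidates = {
    "hello world": -2.0,   # neural decoder's score
    "hello worle": -1.8,   # slightly preferred by the decoder alone
}

# Toy character bigram model: log P(next_char | prev_char).
bigram_logp = {
    ("l", "d"): math.log(0.4),
    ("l", "e"): math.log(0.05),
}
DEFAULT_LOGP = math.log(0.1)  # back-off for unseen bigrams

def lm_score(text):
    """Sum bigram log-probabilities over adjacent character pairs."""
    return sum(bigram_logp.get((a, b), DEFAULT_LOGP)
               for a, b in zip(text, text[1:]))

def rescore(cands, lm_weight=1.0):
    """Combine decoder and LM log-scores; return the best candidate."""
    return max(cands, key=lambda t: cands[t] + lm_weight * lm_score(t))

print(rescore(candidates))  # the LM overrides the decoder's raw preference
```

Here the linguistic prior rescues the correct string "hello world" even though the decoder alone scored the misspelling higher, which is the same effect that drives the pipeline's error-rate reductions.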

Quick Start & Requirements

  • Install: Requires Python >= 3.6, TensorFlow 1.15, NumPy 1.17, SciPy 1.1.0, and scikit-learn 0.20.
  • GPU: A GPU with cuDNN is required for RNN training and inference (Steps 4-5).
  • Dependencies: Kaldi for bigram language modeling and GPT-2 model files (1558M version) for rescoring.
  • Data: Download the associated dataset and intermediate results. Note that Step 3 produces ~100 GB of files and must be run separately.
  • Documentation: Jupyter notebooks detail each step of the process.

Highlighted Details

  • Achieves character error rates as low as 0.34% and word error rates of 1.97% with GPT-2 rescoring on held-out trials.
  • Demonstrates improved generalization to unseen neural activity by augmenting training data with artificial firing-rate drifts, which matters most for the more challenging 'HeldOutBlocks' partition.
  • Offers two distinct train/test partitions ('HeldOutTrials' and 'HeldOutBlocks') to evaluate model robustness.
  • Includes HMM-based labeling of neural data as a preliminary step.
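As a sketch of what an HMM-based labeling step involves, the snippet below implements a minimal Viterbi decoder that assigns hidden states to a sequence of observations. This is not the repository's code: the two states, the observation alphabet, and every probability are invented purely to illustrate the algorithm family.

```python
# Minimal Viterbi decoding over a toy two-state HMM. All values are invented.
import math

states = ["rest", "write"]
start_logp = {"rest": math.log(0.8), "write": math.log(0.2)}
trans_logp = {
    ("rest", "rest"): math.log(0.7), ("rest", "write"): math.log(0.3),
    ("write", "rest"): math.log(0.4), ("write", "write"): math.log(0.6),
}
emit_logp = {  # log P(observed activity level | state)
    ("rest", "low"): math.log(0.9), ("rest", "high"): math.log(0.1),
    ("write", "low"): math.log(0.2), ("write", "high"): math.log(0.8),
}

def viterbi(obs):
    """Return the most likely hidden-state sequence for the observations."""
    V = [{s: start_logp[s] + emit_logp[(s, obs[0])] for s in states}]
    back = []
    for o in obs[1:]:
        col, ptr = {}, {}
        for s in states:
            prev, score = max(
                ((p, V[-1][p] + trans_logp[(p, s)]) for p in states),
                key=lambda x: x[1])
            col[s] = score + emit_logp[(s, o)]
            ptr[s] = prev
        V.append(col)
        back.append(ptr)
    last = max(V[-1], key=V[-1].get)
    path = [last]
    for ptr in reversed(back):
        path.append(ptr[path[-1]])
    return path[::-1]

print(viterbi(["low", "high", "high", "low"]))  # -> ['rest', 'write', 'write', 'rest']
```

In the actual pipeline the analogous step operates on multichannel neural firing rates rather than a toy observation alphabet, producing character-onset labels used to train the RNN.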

Maintenance & Community

The project is associated with a specific manuscript and preprint. No information on community channels or ongoing maintenance is provided in the README.

Licensing & Compatibility

The README does not explicitly state a license. Compatibility for commercial use or closed-source linking is not specified.

Limitations & Caveats

The project relies on specific, older versions of dependencies (TensorFlow 1.15), which may pose compatibility challenges with modern environments. The substantial data generation (~100 GB) and external dependencies like Kaldi and GPT-2 models add complexity to setup and reproduction.

Health Check
Last commit

2 years ago

Responsiveness

1+ week

Pull Requests (30d)
0
Issues (30d)
0
Star History
6 stars in the last 90 days

Explore Similar Projects

Starred by Andrej Karpathy (Founder of Eureka Labs; formerly at Tesla, OpenAI; author of CS 231n) and Georgios Konstantopoulos (CTO, General Partner at Paradigm).

mlx-gpt2 by pranavjad

0.5%
393 stars
Minimal GPT-2 implementation for educational purposes
created 1 year ago; updated 1 year ago