TxGNN by mims-harvard

Model for zero-shot therapeutic opportunity prediction

Created 3 years ago

259 stars

Top 97.8% on SourcePulse

Project Summary

TxGNN addresses the challenge of identifying therapeutic candidates for diseases, particularly those with few treatment options or poorly understood biology. Aimed at researchers and drug discovery professionals, it applies geometric deep learning to a comprehensive disease-therapeutic knowledge graph to make zero-shot predictions of therapeutic opportunities.

How It Works

TxGNN employs a graph neural network pre-trained on a large knowledge graph encompassing diseases and therapeutic candidates. Its core innovation lies in its zero-shot inference capability, allowing it to predict therapeutic uses for new diseases without requiring task-specific fine-tuning. The model unifies various therapeutic tasks, such as indication and contraindication prediction, within a single framework, utilizing geometric deep learning for complex relational reasoning.
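To make the idea of one scoring framework for several therapeutic tasks concrete, here is a minimal sketch. It assumes a DistMult-style decoder (a common choice for knowledge graph link prediction; this summary does not say which decoder TxGNN uses) and hand-made two-dimensional embeddings. The names distmult_score and rank_drugs, the drug labels, and all vectors are hypothetical and are not part of the TxGNN package.

```python
from typing import Dict, List, Tuple

Vector = List[float]


def distmult_score(head: Vector, relation: Vector, tail: Vector) -> float:
    """DistMult-style triple score: sum over i of head[i] * relation[i] * tail[i]."""
    return sum(h * r * t for h, r, t in zip(head, relation, tail))


def rank_drugs(
    disease_vec: Vector, relation_vec: Vector, drug_vecs: Dict[str, Vector]
) -> List[Tuple[str, float]]:
    """Score every drug against one disease for one relation, best first."""
    scored = [
        (name, distmult_score(vec, relation_vec, disease_vec))
        for name, vec in drug_vecs.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy embeddings, invented for illustration only.
drug_vecs = {
    "drug_a": [1.0, 0.0],
    "drug_b": [0.0, 1.0],
    "drug_c": [0.5, 0.5],
}

# One vector per task: the same scoring function serves both tasks,
# and only the relation vector changes.
relations = {
    "indication": [1.0, 1.0],
    "contraindication": [-1.0, 1.0],
}

# Embedding of a disease that has no known drug edges. In a real model it
# would come from the graph encoder; here it is simply made up.
new_disease = [0.9, 0.1]

indication_ranking = rank_drugs(new_disease, relations["indication"], drug_vecs)
contraindication_ranking = rank_drugs(
    new_disease, relations["contraindication"], drug_vecs
)

if __name__ == "__main__":
    print("indication:", indication_ranking)
    print("contraindication:", contraindication_ranking)
```

Because the disease embedding is the only disease-specific input, scoring an unseen disease needs no new parameters. That is the sense in which this sketch is zero-shot.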

Quick Start & Requirements

Installation starts with a Conda environment running Python 3.8, followed by a PyTorch build that matches the local CUDA version. DGL must be pinned to version 0.5.2 and installed via Conda from the dglteam channel, using the package built for that same CUDA version (the package name takes the form dgl-cuda{$CUDA_VERSION}). The TxGNN package is then installed with pip. Some evaluation splits may additionally require PyTorch Geometric (PyG). An example pre-trained model is available.

Primary Install: Conda environment setup (Python 3.8), PyTorch (with CUDA), conda install -c dglteam dgl-cuda{$CUDA_VERSION}==0.5.2, pip install TxGNN.
Prerequisites: Python 3.8, CUDA (version must match DGL build), DGL 0.5.2.
Links: medRxiv preprint: https://www.medrxiv.org/content/10.1101/2023.03.19.23287458v2, Explorer: http://txgnn.org.
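Since the version pins above are strict, a quick environment check before training can save time. The following sketch uses only the Python standard library. It assumes DGL registers its package metadata under the distribution name "dgl", which may not hold for every CUDA build, and the helper names installed_version and find_problems are hypothetical.

```python
import sys
from importlib import metadata
from typing import List, Optional, Tuple

REQUIRED_PYTHON = (3, 8)
REQUIRED_DGL = "0.5.2"


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def find_problems(
    python_version: Tuple[int, int], dgl_version: Optional[str]
) -> List[str]:
    """Compare the detected versions with the pins listed in the prerequisites."""
    problems = []
    if tuple(python_version) != REQUIRED_PYTHON:
        found = ".".join(str(part) for part in python_version)
        problems.append(f"Python 3.8 expected, found {found}")
    if dgl_version is None:
        problems.append("DGL not found under the distribution name 'dgl'")
    elif dgl_version != REQUIRED_DGL:
        problems.append(f"DGL {REQUIRED_DGL} expected, found {dgl_version}")
    return problems


if __name__ == "__main__":
    # "dgl" is an assumed distribution name; adjust it if your build differs.
    detected = find_problems(sys.version_info[:2], installed_version("dgl"))
    for problem in detected:
        print("WARNING:", problem)
    if not detected:
        print("Environment matches the documented pins.")
```

The check deliberately says nothing about CUDA, because the correct pairing depends on the local driver and the PyTorch build.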

Highlighted Details

Performs zero-shot inference on new diseases without additional parameters or fine-tuning.
Supports a unified formulation for various therapeutic tasks like indication and contraindication prediction.
Offers multiple data splitting strategies (complex_disease, disease_area, random, disease_eval, full_graph) for flexible evaluation.
Includes functionality for training graph XAI models to explain predictions.
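The choice of split matters for zero-shot evaluation: if edges are split at random, a test disease usually still has some known treatments in the training set. The sketch below contrasts a random edge split with a split that holds out whole diseases. It is a generic illustration on toy data and does not reproduce the logic of any named TxGNN split; the function names and edges are hypothetical.

```python
import random
from typing import List, Set, Tuple

Edge = Tuple[str, str, str]  # (drug, relation, disease)


def random_split(
    edges: List[Edge], test_fraction: float, seed: int
) -> Tuple[List[Edge], List[Edge]]:
    """Shuffle edges and cut; test diseases may still appear in training."""
    shuffled = list(edges)
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def disease_holdout_split(
    edges: List[Edge], test_fraction: float, seed: int
) -> Tuple[List[Edge], List[Edge], Set[str]]:
    """Move every edge of the selected diseases into the test set."""
    diseases = sorted({disease for _, _, disease in edges})
    random.Random(seed).shuffle(diseases)
    n_test = max(1, round(len(diseases) * test_fraction))
    held_out = set(diseases[:n_test])
    train = [edge for edge in edges if edge[2] not in held_out]
    test = [edge for edge in edges if edge[2] in held_out]
    return train, test, held_out


# Toy drug-disease edges, invented for illustration only.
edges: List[Edge] = [
    ("drug_a", "indication", "disease_1"),
    ("drug_b", "indication", "disease_1"),
    ("drug_a", "contraindication", "disease_2"),
    ("drug_c", "indication", "disease_2"),
    ("drug_b", "indication", "disease_3"),
    ("drug_c", "contraindication", "disease_4"),
]

train_edges, test_edges, held_out = disease_holdout_split(edges, 0.25, seed=42)

if __name__ == "__main__":
    print("held-out diseases:", sorted(held_out))
    print("train edges:", len(train_edges), "test edges:", len(test_edges))
```

With the holdout variant, no training edge touches a held-out disease, so test performance reflects predictions for diseases the model has no treatment data for.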

Maintenance & Community

No specific details regarding maintainers, community channels (like Discord/Slack), or roadmaps are provided in the README.

Licensing & Compatibility

The license type is not explicitly stated in the provided README content.

Limitations & Caveats

Installation pins DGL to version 0.5.2, which must be paired with matching CUDA and PyTorch builds; this can make setup difficult on newer systems. Although the model is designed for zero-shot prediction, its performance may vary for diseases whose biological mechanisms differ sharply from those in the pre-training graph or that are sparsely represented in it. The README does not list known bugs or state whether the project is in an alpha stage.

Health Check

Last Commit

1 year ago

Responsiveness

Inactive

Star History

3 stars in the last 30 days