BiomedGPT by taokz

Vision-language foundation model for diverse biomedical tasks

Created 2 years ago
688 stars

Top 49.4% on SourcePulse

Project Summary

BiomedGPT is a generalist vision-language foundation model designed for diverse biomedical tasks, targeting researchers and developers in the medical AI domain. It aims to provide a unified framework for tasks like visual question answering, image captioning, and text summarization within the biomedical field.

How It Works

BiomedGPT is built on the OFA (One-For-All) framework and is pre-trained on extensive biomedical datasets spanning multiple modalities and tasks. This multi-modal, multi-task pre-training lets the model learn representations that transfer across data types and tasks, so it can be applied zero-shot or few-shot to downstream applications without task-specific architectures.
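To make the "without task-specific architectures" point concrete, the sketch below shows the general idea of casting different tasks as a single instruction-to-text problem. The task names come from this summary. The prompt templates and the build_instruction helper are illustrative assumptions, not the prompts or code used in the BiomedGPT repository.

```python
from typing import Optional

# Hypothetical instruction templates: every task is reduced to "text in, text out",
# so one sequence-to-sequence model can serve all of them.
# These strings are illustrative only, not taken from the BiomedGPT code base.
TEMPLATES = {
    "vqa": "{question}",
    "captioning": "what does the image describe?",
    "summarization": 'what is the summary of the text "{text}"?',
}


def build_instruction(task: str,
                      question: Optional[str] = None,
                      text: Optional[str] = None) -> str:
    """Return the text instruction for a task (hypothetical helper)."""
    if task not in TEMPLATES:
        raise ValueError(f"unsupported task: {task}")
    # str.format ignores keyword arguments a template does not use.
    return TEMPLATES[task].format(question=question or "", text=text or "")


if __name__ == "__main__":
    print(build_instruction("vqa", question="is there a fracture?"))
    print(build_instruction("captioning"))
    print(build_instruction("summarization", text="Patient presents with cough."))
```

In a setup like this, adding a task means adding a template rather than a new model head, which is the property the unified framework is meant to provide.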

Quick Start & Requirements

  • Installation: Clone the repository, then run these commands in order: conda create --name biomedgpt python=3.7.4 (create the environment), python -m pip install pip==21.2.4 (pin pip), and pip install -r requirements.txt (install dependencies).
  • Prerequisites: Linux environment, Python 3.7.4.
  • Checkpoints: Pretrained checkpoints are available via Dropbox. Transformers-compatible weights are accessible through a Hugging Face collection.
  • Resources: A Colab notebook is provided for inference.
  • Documentation: Details on datasets and fine-tuned checkpoints are available in datasets.md and checkpoints.md respectively.
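Because the prerequisites pin an exact Python version and operating system, it can help to check them before installing dependencies. The following is a minimal sketch of such a check. The check_environment function and REQUIRED_PYTHON constant are hypothetical names introduced here and are not part of the repository.

```python
import platform
import sys

# Version stated in the prerequisites above.
REQUIRED_PYTHON = (3, 7, 4)


def check_environment(python_version=None, system=None):
    """Return a list of mismatches; an empty list means the prerequisites are met.

    Both arguments default to the running interpreter and OS, and exist
    mainly so the check can be exercised with known values.
    """
    version = tuple(python_version or sys.version_info[:3])
    system = system or platform.system()
    problems = []
    if system != "Linux":
        problems.append(f"expected Linux, found {system}")
    if version != REQUIRED_PYTHON:
        found = ".".join(str(part) for part in version)
        problems.append(f"expected Python 3.7.4, found {found}")
    return problems


if __name__ == "__main__":
    for problem in check_environment():
        print("WARNING:", problem)
```

Running the script inside the biomedgpt conda environment should print nothing if the environment matches the stated prerequisites.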

Highlighted Details

  • Generalist vision-language foundation model for biomedical tasks.
  • Pre-trained and fine-tuned on multi-modal & multi-task biomedical datasets.
  • Supports zero-shot inference for tasks like Visual Question Answering (VQA).
  • Includes scripts for pretraining, fine-tuning, and inference across VQA, image captioning, text summarization, natural language inference, and image classification.

Maintenance & Community

The project is associated with a Nature Medicine 2024 publication. Questions can be directed to the authors or raised as GitHub issues.

Licensing & Compatibility

BiomedGPT is strictly for academic research purposes. Commercial and clinical uses are prohibited for three reasons: the model inherits the non-commercial license of the OFA framework, it is not licensed for use in healthcare settings, and it lacks the security measures required for medical diagnosis.

Limitations & Caveats

The current implementation is not designed for chatbot or copilot applications; work on improving its conversational abilities is ongoing. The Hugging Face Transformers version has not been tested extensively, and its results are not guaranteed to match those of the Fairseq implementation.

Health Check

  • Last commit: 2 months ago
  • Responsiveness: 1 day
  • Pull requests (30d): 0
  • Issues (30d): 0
  • Star history: 12 stars in the last 30 days
