med-flamingo by snap-stanford

Code repo for the Med-Flamingo research paper

Created 2 years ago
434 stars

Top 68.5% on SourcePulse

Project Summary

Med-Flamingo provides the codebase for a multimodal medical few-shot learner, enabling rapid adaptation to new medical vision-language tasks with minimal data. It is designed for researchers and practitioners in medical AI and computer vision.

How It Works

Med-Flamingo builds upon the Flamingo architecture, integrating a vision encoder with a large language model (LLM) to process interleaved image and text data. This allows the model to learn from a few examples, making it efficient for specialized medical domains where large annotated datasets are scarce.
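To make "interleaved image and text data" concrete, here is a minimal Python sketch of how a few-shot prompt might be assembled. It assumes the OpenFlamingo-style convention of an <image> placeholder token and an <|endofchunk|> separator after each example. The Shot class, the build_few_shot_prompt function, and the Question/Answer template are hypothetical names chosen for illustration, not taken from the Med-Flamingo repository.

```python
from dataclasses import dataclass
from typing import List

# Assumed special tokens (OpenFlamingo-style convention; verify against the
# tokenizer actually used before relying on them).
IMAGE_TOKEN = "<image>"          # marks where an image is attended to
END_OF_CHUNK = "<|endofchunk|>"  # closes one image-text example


@dataclass
class Shot:
    """One in-context example: a question about an image and its answer."""
    question: str
    answer: str


def build_few_shot_prompt(shots: List[Shot], query: str) -> str:
    """Interleave image placeholders with text for few-shot prompting.

    Each shot contributes one image placeholder, its question and answer,
    and an end-of-chunk separator. The final query gets an image placeholder
    and an open "Answer:" for the model to complete.
    """
    parts = []
    for shot in shots:
        parts.append(
            f"{IMAGE_TOKEN}Question: {shot.question} "
            f"Answer: {shot.answer}{END_OF_CHUNK}"
        )
    parts.append(f"{IMAGE_TOKEN}Question: {query} Answer:")
    return "".join(parts)


if __name__ == "__main__":
    examples = [
        Shot("What modality is this?", "Chest X-ray."),
        Shot("Is there a fracture?", "No."),
    ]
    print(build_few_shot_prompt(examples, "What organ is shown?"))
```

In this sketch the images themselves are not part of the string; they would be passed to the model separately, in the same order as the placeholders appear in the prompt.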

Quick Start & Requirements

  • Install dependencies by running: source install.sh
  • Requires a GPU with CUDA.
  • Manually downloading and configuring the Llama-7B (v1) model is recommended. After downloading, set "tokenizer_class": "LlamaTokenizer" in tokenizer_config.json.
  • A demo script is available at scripts/demo.py.
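The tokenizer_config.json change can be made by hand, or scripted. The following Python sketch shows one way to do it, assuming the Llama-7B files have already been downloaded into a local directory. The set_tokenizer_class function and the example path are hypothetical; only the "tokenizer_class": "LlamaTokenizer" setting comes from the project's instructions.

```python
import json
from pathlib import Path


def set_tokenizer_class(model_dir: str,
                        tokenizer_class: str = "LlamaTokenizer") -> dict:
    """Set "tokenizer_class" in <model_dir>/tokenizer_config.json.

    Other keys in the file are left unchanged. Returns the updated config.
    Raises FileNotFoundError if tokenizer_config.json does not exist.
    """
    config_path = Path(model_dir) / "tokenizer_config.json"
    config = json.loads(config_path.read_text(encoding="utf-8"))
    config["tokenizer_class"] = tokenizer_class
    config_path.write_text(json.dumps(config, indent=2) + "\n",
                           encoding="utf-8")
    return config


if __name__ == "__main__":
    # Hypothetical location of the downloaded Llama-7B (v1) files.
    print(set_tokenizer_class("./llama-7b"))
```

It may be worth keeping a backup copy of the original file before overwriting it.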

Highlighted Details

  • Codebase for the Med-Flamingo paper.
  • Leverages the Flamingo architecture for few-shot learning.
  • Supports interleaved image and text processing.

Maintenance & Community

No specific community channels or maintenance details are provided in the README.

Licensing & Compatibility

The README does not specify a license. It cites the OpenFlamingo project, which is under a permissive license. Suitability for commercial use or closed-source linking is not addressed.

Limitations & Caveats

The README presents the project as a research codebase and notes "More updates to follow soon!", which suggests it may be at an early stage or subject to significant change. Specific limitations and unsupported features are not documented.

Health Check

  • Last commit: 2 years ago
  • Responsiveness: Inactive
  • Pull requests (30d): 0
  • Issues (30d): 0
  • Star history: 2 stars in the last 30 days
