prismatic-vlms  by TRI-ML

VLM codebase for training visually-conditioned language models

created 1 year ago
750 stars

Top 47.3% on sourcepulse

Project Summary

This repository provides a flexible and efficient codebase for training visually-conditioned language models (VLMs). It targets researchers and practitioners who want to experiment with or deploy VLMs, and it supports diverse visual backbones, a range of language models, and straightforward scaling to models with large parameter counts.

How It Works

Prismatic VLMs supports multiple visual backbones (CLIP, SigLIP, DINOv2, and fusions) via TIMM integration, and arbitrary AutoModelForCausalLM instances from Hugging Face Transformers. Training leverages PyTorch FSDP and Flash-Attention for efficient scaling from 1B to 34B parameters.
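To give a feel for why sharded training matters across the 1B to 34B range, the sketch below estimates per-GPU memory for training state when it is split evenly across devices, which is roughly what full sharding in FSDP aims for. The 16 bytes per parameter figure is an assumption (fp32 weights, fp32 gradients, and two fp32 Adam moments); the helper names are hypothetical and not part of the repository, and activations, buffers, and communication overhead are ignored.

```python
GIB = 1024 ** 3

# Assumption: 4 B weights + 4 B gradients + 8 B for two Adam moments, all fp32.
# The repository's actual precision and optimizer settings may differ.
BYTES_PER_PARAM_TRAINING = 16


def sharded_bytes_per_gpu(num_params, bytes_per_param, num_gpus):
    """Approximate per-GPU bytes when state is sharded evenly across GPUs."""
    if num_gpus < 1:
        raise ValueError("num_gpus must be at least 1")
    return num_params * bytes_per_param / num_gpus


def training_state_gib(num_params, num_gpus,
                       bytes_per_param=BYTES_PER_PARAM_TRAINING):
    """Per-GPU training state in GiB (excludes activations and buffers)."""
    return sharded_bytes_per_gpu(num_params, bytes_per_param, num_gpus) / GIB


if __name__ == "__main__":
    # Model sizes chosen to span the 1B-34B range mentioned above.
    for billions in (1, 7, 13, 34):
        per_gpu = training_state_gib(billions * 1e9, num_gpus=8)
        print(f"{billions:>2}B params on 8 GPUs: ~{per_gpu:.1f} GiB per GPU")
```

Under these assumptions the estimate scales linearly with parameter count and inversely with GPU count, so it is only a rough lower bound on what a real run needs.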

Quick Start & Requirements

  • Install via pip install -e . after cloning.
  • Requires Python >= 3.8, PyTorch >= 2.1, and Flash-Attention 2.
  • Official quick-start and usage examples are provided in the README.
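The version requirements above can be checked before installing. The following is a minimal sketch, not a script shipped with the project: `parse_version`, `meets_minimum`, and `check_environment` are hypothetical helper names, and the check assumes Flash-Attention is importable as `flash_attn` and exposes a `__version__` string.

```python
import re
import sys


def parse_version(text):
    """Return the leading numeric components, e.g. '2.1.0+cu121' -> (2, 1, 0)."""
    match = re.match(r"\d+(?:\.\d+)*", text)
    if match is None:
        raise ValueError(f"no numeric version found in {text!r}")
    return tuple(int(part) for part in match.group().split("."))


def meets_minimum(installed, minimum):
    """True if the installed version string is at least the minimum tuple."""
    return parse_version(installed) >= minimum


def check_environment():
    """Collect human-readable problems; an empty list means all checks passed."""
    problems = []
    if sys.version_info < (3, 8):
        problems.append("Python >= 3.8 is required")
    try:
        import torch
    except ImportError:
        problems.append("PyTorch is not installed")
    else:
        if not meets_minimum(torch.__version__, (2, 1)):
            problems.append(f"PyTorch >= 2.1 is required, found {torch.__version__}")
    try:
        import flash_attn
    except ImportError:
        problems.append("Flash-Attention is not installed")
    else:
        # Assumption: the package exposes __version__; fall back to "0" if not.
        version = getattr(flash_attn, "__version__", "0")
        if not meets_minimum(version, (2,)):
            problems.append(f"Flash-Attention 2 is required, found {version}")
    return problems


if __name__ == "__main__":
    for problem in check_environment():
        print("WARNING:", problem)
```

Run the file directly to print a warning for each unmet requirement; no output means the three checks passed.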

Highlighted Details

  • Supports a wide range of visual backbones and language models.
  • Enables efficient training of large-scale VLMs (1B-34B parameters).
  • Offers a comprehensive evaluation codebase for VLMs.
  • Provides 49 pretrained VLMs with detailed descriptions.

Maintenance & Community

The project is maintained by TRI-ML, though the last recorded commit was about one year ago. The README does not describe further community engagement channels.

Licensing & Compatibility

The code is released under the MIT License. Pretrained models inherit licenses from their base datasets and LMs (e.g., Llama Community License for Llama-2 derived models, Apache/MIT for Mistral/Phi-2 derived models). Commercial use of a given model is permitted only to the extent that its inherited licenses allow it.

Limitations & Caveats

Pretrained models may have licensing restrictions inherited from their training data and base language models. Users must ensure compliance with these underlying licenses.

Health Check

  • Last commit: 1 year ago
  • Responsiveness: 1 week
  • Pull Requests (30d): 0
  • Issues (30d): 0

Star History

91 stars in the last 90 days
