Library for language-vision AI research
LAVIS is a Python library for language-vision research and applications. It offers a unified interface to more than 10 tasks, more than 20 datasets, and more than 30 state-of-the-art models, so researchers and engineers can quickly develop, benchmark, and deploy multimodal systems for image captioning, visual question answering, multimodal feature extraction, and related tasks.
How It Works
LAVIS employs a modular design, providing a unified interface to easily access, repurpose, and extend existing modules like datasets, models, and preprocessors. It supports off-the-shelf inference with readily available pre-trained models and includes automatic download tools for numerous language-vision datasets, simplifying data preparation and model training/evaluation.
Quick Start & Requirements
Install from PyPI: pip install salesforce-lavis
Or install from source (editable): git clone https://github.com/salesforce/LAVIS.git && cd LAVIS && pip install -e .
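Once the package is installed, a first captioning run might look like the sketch below. It assumes the `load_model_and_preprocess` helper and the `blip_caption` / `base_coco` identifiers used in the upstream README; `caption_image` and the `photo.jpg` path are hypothetical names for illustration. Pre-trained weights are downloaded on first use, so the first call needs network access.

```python
def caption_image(image_path: str) -> list:
    """Return generated captions for the image at `image_path`."""
    # Imported lazily so that defining this function does not
    # require torch, Pillow, or LAVIS to be installed.
    import torch
    from PIL import Image
    from lavis.models import load_model_and_preprocess

    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

    # Load a pre-trained captioning model together with its preprocessors.
    model, vis_processors, _ = load_model_and_preprocess(
        name="blip_caption", model_type="base_coco", is_eval=True, device=device
    )

    # Preprocess the image and add a batch dimension.
    raw_image = Image.open(image_path).convert("RGB")
    image = vis_processors["eval"](raw_image).unsqueeze(0).to(device)

    # The model takes a dict of inputs and returns a list of caption strings.
    return model.generate({"image": image})


# Example call ("photo.jpg" is a placeholder path):
# captions = caption_image("photo.jpg")
```

Other tasks are assumed to follow the same shape: choose a model name and type, load it with its preprocessors, and pass a dict of inputs.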
Highlighted Details
Maintenance & Community
Licensing & Compatibility
Limitations & Caveats
Models in the library may exhibit socioeconomic biases inherited from their training data, which can lead to misclassifications or offensive outputs. Users are advised to review each model's behavior before deploying it.