LAMM by OpenGVLab

Framework for multi-modal large language model (MLLM) training and evaluation

created 2 years ago
316 stars

Top 86.7% on sourcepulse

Project Summary

LAMM provides a framework and dataset for training and evaluating Multi-modal Large Language Models (MLLMs), enabling the development of AI agents that bridge human ideas and machine execution. It targets researchers and developers seeking to build and test sophisticated multi-modal AI systems.

How It Works

LAMM focuses on language-assisted multi-modal instruction tuning, allowing models to understand and respond to complex instructions involving both text and visual inputs. The framework supports 2D and 3D tasks, facilitating the creation of agents capable of diverse applications, from image quality assessment to embodied AI in simulated environments like Minecraft.

Quick Start & Requirements

  • Install: Refer to the tutorial for basic usage.
  • Requirements: A lightweight training framework is available for V100 or RTX 3090 GPUs. LLaMA2-based finetuning is supported.
  • Resources: Checkpoints and evaluation code are available on Huggingface.
  • Links: Project Page, Demo Video (YouTube/Bilibili), Full Paper, LAMM Dataset (Huggingface/OpenDataLab).

Highlighted Details

  • Accepted by the NeurIPS 2023 Datasets and Benchmarks Track.
  • Includes comprehensive evaluation frameworks like ChEF.
  • Supports advanced MLLM research with projects like Octavius (mitigating task interference) and MP5 (embodied AI).
  • Features datasets like DepictQA for image quality assessment.

Maintenance & Community

The project is actively updated with new research preprints and framework releases, including Ch3Ef and DepictQA. Checkpoints and leaderboards are maintained on Huggingface.

Licensing & Compatibility

The project is licensed under CC BY-NC 4.0, which limits use to non-commercial purposes. Models trained with the dataset are likewise restricted to research use.

Limitations & Caveats

The CC BY-NC 4.0 license and the research-only restriction on trained models significantly limit commercial adoption and integration into proprietary systems.

Health Check

  • Last commit: 1 year ago
  • Responsiveness: 1 week
  • Pull Requests (30d): 0
  • Issues (30d): 0

Star History

  • 5 stars in the last 90 days
