Curated list of papers for multi-modal LLMs in 3D
This repository curates resources for Multi-modal Large Language Models (LLMs) applied to 3D tasks, covering understanding, reasoning, generation, and embodied agents. It serves researchers and practitioners in computer vision and AI, providing a structured overview of the rapidly evolving field of 3D-LLM integration.
How It Works
The project functions as a comprehensive, curated list of research papers, categorized by specific 3D tasks. It includes foundational models like CLIP and SAM for broader context. The organization by task and date allows users to track the latest advancements and identify key contributions in applying LLMs to 3D data.
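The list itself is plain markdown, but purely as an illustration, the hypothetical Python sketch below models that task-and-date organization (the Paper type, the sample entries, and the latest_in_task helper are all invented for this example and are not part of the repository):

```python
from dataclasses import dataclass
from datetime import date

# Hypothetical record type mirroring how entries in the list are
# organized: one category per 3D task, ordered by publication date.
@dataclass
class Paper:
    title: str
    task: str        # e.g. "3D Understanding", "3D Generation", "Embodied Agents"
    published: date
    url: str

# Invented sample entries for demonstration only.
papers = [
    Paper("Example 3D understanding paper", "3D Understanding", date(2024, 5, 1), "https://example.org/a"),
    Paper("Example 3D generation paper", "3D Generation", date(2024, 7, 15), "https://example.org/b"),
]

def latest_in_task(task: str) -> list[Paper]:
    """Return papers in one task category, newest first --
    the same view the list offers for tracking recent advances."""
    return sorted(
        (p for p in papers if p.task == task),
        key=lambda p: p.published,
        reverse=True,
    )

for paper in latest_in_task("3D Understanding"):
    print(paper.published, paper.title)
```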
Quick Start & Requirements
This repository is a curated list of research papers; there is nothing to install or execute. It serves purely as a knowledge base.
Maintenance & Community
The repository is actively maintained, with contributions welcome via pull requests. Contact information for maintainers is provided for questions. The project is inspired by the "Awesome-LLM" repository.
Licensing & Compatibility
The repository itself contains no software, so no software license applies to it. The individual papers and projects it links to are governed by their respective licenses.
Limitations & Caveats
This is a curated list and does not provide code or models directly. The "awesomeness" of listed items is subjective and determined by the maintainers.