LangSplat by minghanqin: Language-guided 3D scene synthesis and manipulation
Top 40.0% on SourcePulse
Summary
LangSplat offers the official implementation for "3D Language Gaussian Splatting," enabling semantic interaction and querying of 3D scenes via natural language. It targets researchers and practitioners in computer vision, providing a novel method for language-guided scene comprehension and manipulation with efficient rendering capabilities.
How It Works
A PyTorch-based optimizer with CUDA extensions builds 3D Gaussian Splatting models from structure-from-motion (SfM) datasets that carry language features. A key innovation is a scene-wise language autoencoder, which learns a compressed language representation for each scene and thereby sharply reduces memory demands. Tools are also provided to convert custom image datasets into an optimization-ready SfM format, complete with language features, so the method can be applied beyond the supplied datasets.
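To give a feel for why compressing the language features matters, the following back-of-the-envelope sketch estimates how much memory one feature vector per Gaussian would take before and after compression. The Gaussian count, the raw feature width, and the compressed width are illustrative assumptions chosen for the arithmetic, not values taken from the project, and the helper name feature_bytes is hypothetical.

```python
def feature_bytes(num_gaussians: int, feature_dim: int, bytes_per_value: int = 4) -> int:
    """Bytes needed to store one feature vector per Gaussian (float32 by default)."""
    return num_gaussians * feature_dim * bytes_per_value


# Illustrative assumptions only; none of these numbers come from the README summary.
NUM_GAUSSIANS = 1_000_000   # assumed scene size
RAW_DIM = 512               # assumed width of an uncompressed language embedding
LATENT_DIM = 3              # assumed width after the autoencoder's encoder

raw = feature_bytes(NUM_GAUSSIANS, RAW_DIM)
compressed = feature_bytes(NUM_GAUSSIANS, LATENT_DIM)

print(f"raw features:        {raw / 2**30:.2f} GiB")
print(f"compressed features: {compressed / 2**20:.2f} MiB")
print(f"reduction factor:    {raw / compressed:.1f}x")
```

Under these assumptions the saving scales directly with the ratio of the two feature widths, which is why a narrow latent space makes per-Gaussian language features practical to store.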
Quick Start & Requirements
Installation requires cloning the repository recursively (git clone --recursive), then creating and activating a Conda environment (conda env create --file environment.yml, followed by conda activate langsplat). Hardware requirements are a CUDA-ready GPU with Compute Capability 7.0 or higher, and 24 GB of VRAM for high-quality training. Software prerequisites are Conda, a C++ compiler, and CUDA SDK 11 (e.g., 11.8). Preprocessed datasets and pre-trained models are available via Baidu Wangpan and Google Drive. The README also links to the paper, video, and project webpage.
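The requirements above can be turned into a small pre-flight check. This is a minimal sketch, not part of the project: the function name unmet_requirements and the constant names are hypothetical, and the thresholds simply restate the figures listed in this section. The caller supplies the machine's values, so the snippet runs without a GPU.

```python
from typing import List, Tuple

# Thresholds restated from the requirements listed above.
MIN_COMPUTE_CAPABILITY = (7, 0)
RECOMMENDED_VRAM_GB = 24      # stated for high-quality training
REQUIRED_CUDA_MAJOR = 11      # e.g. 11.8


def unmet_requirements(
    compute_capability: Tuple[int, int],
    vram_gb: float,
    cuda_version: str,
) -> List[str]:
    """Return a human-readable list of requirements the machine does not meet."""
    problems = []
    # Tuples compare element by element, so (6, 1) < (7, 0) < (8, 6).
    if compute_capability < MIN_COMPUTE_CAPABILITY:
        problems.append(
            f"compute capability {compute_capability[0]}.{compute_capability[1]} is below 7.0"
        )
    if vram_gb < RECOMMENDED_VRAM_GB:
        problems.append(f"{vram_gb} GB VRAM is below the 24 GB stated for high-quality training")
    if int(cuda_version.split(".")[0]) != REQUIRED_CUDA_MAJOR:
        problems.append(f"CUDA {cuda_version} is not an 11.x release")
    return problems


print(unmet_requirements((8, 6), 24, "11.8"))
print(unmet_requirements((6, 1), 8, "12.1"))
```

On a machine with PyTorch installed, the inputs could be read from torch.cuda.get_device_capability(0), torch.cuda.get_device_properties(0).total_memory (in bytes), and torch.version.cuda, though whether less VRAM is workable at reduced quality is not something this summary settles.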
Maintenance & Community
The project is explicitly "under development," encouraging contributions via GitHub issues and pull requests. A TODO list indicates ongoing work, with some items marked "coming soon." No specific community channels (e.g., Discord, Slack) are listed.
Licensing & Compatibility
The provided README content does not specify a software license. This omission requires clarification for assessing commercial use or integration into closed-source projects.
Limitations & Caveats
The "under development" status implies potential for ongoing changes and undiscovered issues. High-quality training demands significant hardware, specifically 24 GB VRAM. Custom scene processing requires strict adherence to dataset structure and COLMAP data availability. The absence of a stated license is a notable adoption caveat.
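Because custom scenes depend on a fixed dataset structure and on COLMAP output being present, it can help to verify the folder before starting a long run. The sketch below is an assumption-laden illustration: the layout in ASSUMED_LAYOUT follows a common COLMAP convention (an images folder plus a sparse/0 reconstruction) and is not confirmed by this summary, and missing_entries is a hypothetical helper, so the required entries should be replaced with whatever the project's README actually specifies.

```python
import tempfile
from pathlib import Path
from typing import Iterable, List

# Assumed layout based on a common COLMAP convention, NOT confirmed by the source.
ASSUMED_LAYOUT = ("images", "sparse/0")


def missing_entries(scene_root, required: Iterable[str] = ASSUMED_LAYOUT) -> List[str]:
    """Return the required relative paths that do not exist under scene_root."""
    root = Path(scene_root)
    return [rel for rel in required if not (root / rel).exists()]


# Demo on a throwaway directory that has images/ but no COLMAP reconstruction.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "images").mkdir()
    demo_missing = missing_entries(tmp)

print(demo_missing)
```

Passing the required paths as a parameter keeps the check usable once the real structure is known, rather than baking an unverified layout into the function.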