zzyunzhi: Programmatic 3D scene generation from language and images
This project introduces the "Scene Language," a novel approach to representing 3D scenes using a combination of programs, natural language, and embeddings. Aimed at researchers and developers in computer vision and graphics, it enables sophisticated text- and image-conditioned 3D scene generation, offering a powerful tool for creating and manipulating complex virtual environments.
How It Works
The core innovation lies in translating high-level scene descriptions into executable programs. The system leverages large language models (the README recommends Claude 3.7 Sonnet) to interpret prompts and generate scene representations. These representations can then be rendered by multiple engines, including Mitsuba for photorealistic output and Minecraft for block-based environments, yielding a flexible pipeline from concept to 3D scene.
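The idea of a program as a scene representation can be illustrated with a minimal sketch. The `Primitive`, `leg`, and `chair` names below are hypothetical and are not the project's actual Scene Language API; they only show how a small program can encode reusable structure (four legs plus a seat) that a renderer could consume.

```python
from dataclasses import dataclass

@dataclass
class Primitive:
    """One renderable shape in the scene (hypothetical, for illustration)."""
    shape: str                          # e.g. "box", "sphere"
    position: tuple                     # (x, y, z) center
    scale: tuple = (1.0, 1.0, 1.0)      # extents along each axis

def leg(x: float, z: float) -> list:
    # A chair leg as a thin vertical box at (x, 0, z).
    return [Primitive("box", (x, 0.0, z), (0.05, 0.5, 0.05))]

def chair(origin=(0.0, 0.0, 0.0)) -> list:
    # A chair is a reusable "macro": four legs plus a seat, placed at origin.
    ox, oy, oz = origin
    parts = []
    for dx in (-0.2, 0.2):
        for dz in (-0.2, 0.2):
            parts += leg(ox + dx, oz + dz)
    parts.append(Primitive("box", (ox, oy + 0.55, oz), (0.5, 0.05, 0.5)))  # seat
    return parts

# The "scene program": two chairs, each expanding to 5 primitives.
scene = chair() + chair(origin=(1.0, 0.0, 0.0))
print(len(scene))  # 10 primitives in total
```

An LLM that emits programs like this, rather than raw geometry, can reuse and edit structure (move one chair, resize all legs) with small, local code changes.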
Quick Start & Requirements
Installation involves creating a Conda environment (python=3.12), cloning the repository, and installing the package with pip install -e . (editable mode). The Minecraft renderer additionally requires spacy and the en_core_web_md model. Users need an Anthropic API key, configured in engine/key.py, for LLM access. Links to an arXiv paper and a project page are mentioned in the README but not directly provided here. A download link for example results, including prompts and LLM responses, is available.
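The steps above can be sketched as a shell session. The environment name, repository URL, and the variable name inside engine/key.py are assumptions (the README summarized here does not give them verbatim), so check the project page before running.

```shell
# Conda environment with the required Python version
conda create -n scene-language python=3.12 -y
conda activate scene-language

# Clone and install in editable mode (URL assumed from the author handle)
git clone https://github.com/zzyunzhi/scene-language.git
cd scene-language
pip install -e .

# Extra dependencies for the Minecraft renderer
pip install spacy
python -m spacy download en_core_web_md

# Anthropic API key for LLM access; the exact variable name in
# engine/key.py may differ from this guess.
echo 'ANTHROPIC_API_KEY = "<your-key-here>"' > engine/key.py
```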
Highlighted Details
Scene assets can be exported as .ply meshes.
Maintenance & Community
The project encourages users to report issues by opening GitHub issues or contacting the developers via email. No specific community channels (e.g., Discord, Slack) or roadmap details are provided in the README.
Licensing & Compatibility
The provided README text does not specify a software license. This omission requires further investigation to determine usage rights, particularly for commercial applications or integration into closed-source projects.
Limitations & Caveats
The generation pipeline is noted to be sensitive to minor prompt variations, so users should experiment with prompt phrasing for best results. Certain tasks and renderers featured in the associated paper are marked as "coming soon," so the current codebase does not yet cover the full scope of the research.