CE3D by Fangkang515

3D scene editor for interactive manipulation via LLM-driven chat

Created 1 year ago

312 stars

Top 86.5% on SourcePulse

Project Summary

CE3D is an interactive 3D scene editing framework that uses large language models (LLMs) and a suite of over 20 visual models to let users modify 3D scenes with text prompts. It targets researchers and developers working on 3D content creation and manipulation, and offers a ChatGPT-like chat interface for scene editing.

How It Works

CE3D integrates multiple specialized AI models for tasks such as segmentation, image captioning, text-to-image generation, visual question answering, and depth estimation. An LLM orchestrates these models: it interprets the user's text command and applies the corresponding edits to the 3D scene. The framework is modular, so the set of visual models that gets loaded can be configured to fit the available hardware.
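The orchestration pattern described above can be sketched roughly as follows. This is an illustration only, not CE3D's code: the Tool class, the tool names, and the plan function are hypothetical, and simple keyword rules stand in for the LLM so the sketch runs offline.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Tool:
    """A visual model wrapped as a callable the orchestrator can invoke."""
    name: str
    description: str
    run: Callable[[str], str]


def build_registry() -> Dict[str, Tool]:
    # Hypothetical stand-ins for real visual models; each returns a tagged string.
    tools = [
        Tool("segment", "isolate an object in the scene", lambda arg: f"mask({arg})"),
        Tool("caption", "describe the current view", lambda arg: f"caption({arg})"),
        Tool("generate", "synthesize image content from text", lambda arg: f"image({arg})"),
    ]
    return {t.name: t for t in tools}


def plan(command: str) -> List[str]:
    """Stand-in for the LLM planner: map a text command to an ordered tool list.

    A real orchestrator would prompt an LLM with the tool descriptions;
    keyword rules are an assumption made to keep the example self-contained.
    """
    text = command.lower()
    steps: List[str] = []
    if "describe" in text:
        steps.append("caption")
    if "replace" in text or "remove" in text:
        steps.append("segment")
    if "replace" in text or "add" in text:
        steps.append("generate")
    return steps


def execute(command: str, registry: Dict[str, Tool]) -> List[str]:
    """Run each planned tool in order and collect its output."""
    return [registry[name].run(command) for name in plan(command)]


if __name__ == "__main__":
    registry = build_registry()
    for output in execute("replace the chair with a sofa", registry):
        print(output)
```

Keeping the models behind a registry is one way to get the modularity mentioned above: a limited-memory setup would register fewer tools, and the planner would only choose among those that are loaded.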

Quick Start & Requirements

Install: run pip install -r requirements.txt, then install the additional dependencies such as tiny-cuda-nn.
Prerequisites: Python 3.10, CUDA, and an OpenAI API key. GPU memory needs are high (e.g., 100 GB for the full version, 20 GB for the limited version).
Run: set the key with export OPENAI_API_KEY={Your_Private_Openai_Key}, then run make run-all or make run-small-instruct.
Links: Project Page, Paper, Video, Demo.
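A small preflight check along these lines could catch a missing key or an undersized GPU before launch. It is a sketch under assumptions: the helper names are hypothetical, the mapping of run-all to the full version and run-small-instruct to the limited version is inferred from the target names, and treating 20 GB as a hard minimum is a simplification of the figures quoted above.

```python
import os
from typing import Mapping, Optional

# Memory figures (in GB) taken from the prerequisites above.
FULL_GPU_GB = 100
SMALL_GPU_GB = 20


def choose_target(gpu_gb: float) -> Optional[str]:
    """Return the make target that fits the given GPU memory, or None.

    Assumption: run-all is the full version, run-small-instruct the limited one.
    """
    if gpu_gb >= FULL_GPU_GB:
        return "run-all"
    if gpu_gb >= SMALL_GPU_GB:
        return "run-small-instruct"
    return None


def preflight(gpu_gb: float, env: Optional[Mapping[str, str]] = None) -> str:
    """Validate the environment and return the shell command to run."""
    env = os.environ if env is None else env
    if not env.get("OPENAI_API_KEY"):
        raise RuntimeError("OPENAI_API_KEY is not set")
    target = choose_target(gpu_gb)
    if target is None:
        raise RuntimeError(
            f"{gpu_gb} GB of GPU memory is below the {SMALL_GPU_GB} GB minimum"
        )
    return f"make {target}"


if __name__ == "__main__":
    # 24 GB is an example value; pass the memory of your own GPU.
    print(preflight(24, {"OPENAI_API_KEY": "placeholder"}))
```

The function takes the GPU memory as an argument instead of querying the driver, which keeps the sketch free of GPU library dependencies.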

Highlighted Details

Integrates over 20 visual models for comprehensive 3D scene editing.
Offers a ChatGPT-like conversational interface for intuitive interaction.
Supports both a high-resource configuration (100 GB of GPU memory) and a limited-resource configuration (20 GB of GPU memory).
CE3D++ framework is under development, aiming to support 4D scenes and local LLMs.

Maintenance & Community

The project is associated with ECCV 2024, and its authors come from several institutions. The README marks updates and planned features with "heartbeat" emojis, which suggests active development.

Licensing & Compatibility

The repository does not explicitly state a license in the provided README. Users should verify licensing for commercial use or integration into closed-source projects.

Limitations & Caveats

Full functionality requires about 100 GB of GPU memory. A smaller version exists, but its editing capabilities are limited. The project is still under active development, with 4D scene support and local LLM integration planned, so behavior and APIs may change.

Explore Similar Projects

LL3DA by Open3DA

Cap3D by crockwell

Keye by Kwai-Keye

LLaVA-3D by ZCMax

autovfx by haoyuhsu

LayoutGPT by weixi-feng

omini-kontext by Saquib764

3D-LLM by UMass-Embodied-AGI

open-pose-editor by ZhUyU1997

ml-mgie by apple

disco-diffusion by alembics

Grounded-Segment-Anything by IDEA-Research