CE3D by Fangkang515

3D scene editor for interactive manipulation via LLM-driven chat

created 1 year ago
320 stars

Top 86.0% on sourcepulse

Project Summary

CE3D is an interactive 3D scene editing framework that combines large language models (LLMs) with more than 20 visual models so that users can modify 3D scenes through text prompts. It targets researchers and developers working on 3D content creation and manipulation, and exposes a ChatGPT-like chat interface for scene editing.

How It Works

CE3D integrates specialized AI models for tasks such as segmentation, image captioning, text-to-image generation, visual question answering, and depth estimation. An LLM orchestrates these models: it interprets the user's text command and applies the corresponding edits to the 3D scene. The framework is modular, so the set of visual models that gets loaded can be configured to fit the available hardware.
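The orchestration idea can be illustrated with a minimal Python sketch. Everything in it is hypothetical and is not CE3D's actual API: the tool functions `segment` and `recolor`, the `TOOLS` registry, and `stub_planner` (a fixed-rule stand-in for the LLM) exist only to show how a planner might turn a command into an ordered list of tool calls.

```python
from typing import Callable, Dict, List, Tuple

Scene = dict
Step = Tuple[str, str]  # (tool name, argument)


# Hypothetical stand-ins for visual models; real models would run on a GPU.
def segment(scene: Scene, arg: str) -> Scene:
    """Pretend to segment the object named `arg` and record it as the mask."""
    return {**scene, "mask": arg}


def recolor(scene: Scene, arg: str) -> Scene:
    """Pretend to recolor whatever the current mask selects."""
    target = scene.get("mask", "scene")
    return {**scene, "edits": scene.get("edits", []) + [f"{target}->{arg}"]}


# Registry the planner's output is dispatched against (assumed design).
TOOLS: Dict[str, Callable[[Scene, str], Scene]] = {
    "segment": segment,
    "recolor": recolor,
}


def stub_planner(command: str) -> List[Step]:
    """Stand-in for the LLM. Assumes commands shaped like 'make the <object> <color>'."""
    words = command.lower().split()
    obj, color = words[-2], words[-1]
    return [("segment", obj), ("recolor", color)]


def run(command: str, scene: Scene,
        planner: Callable[[str], List[Step]] = stub_planner) -> Scene:
    """Ask the planner for steps, then apply each tool to the scene in order."""
    for tool_name, arg in planner(command):
        if tool_name not in TOOLS:
            raise KeyError(f"planner requested unknown tool: {tool_name}")
        scene = TOOLS[tool_name](scene, arg)
    return scene
```

In this sketch, `run("make the chair red", {"name": "room"})` first records a mask for the chair and then appends a recolor edit. Keeping the tools in a registry mirrors the modular setup described above, since a tool can be left out when the hardware cannot hold its model.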

Quick Start & Requirements

  • Install: run pip install -r requirements.txt, then install the additional dependencies such as tiny-cuda-nn.
  • Prerequisites: Python 3.10, CUDA, and an OpenAI API key. GPU memory needs are high (e.g., 100 GB for the full version, 20 GB for a limited version).
  • Run: export OPENAI_API_KEY={Your_Private_Openai_Key}, then run either make run-all or make run-small-instruct.
  • Links: Project Page, Paper, Video, Demo.
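As a rough illustration of the two configurations, the Python sketch below picks a make target from the available GPU memory. The 100 GB and 20 GB thresholds come from the prerequisites above; pairing run-all with the full version and run-small-instruct with the limited version is an assumption based on the target names, and the helper functions `choose_make_target` and `build_command` are hypothetical.

```python
from typing import List, Mapping

# Memory tiers (in GB) taken from the prerequisites above.
FULL_GB = 100
SMALL_GB = 20


def choose_make_target(gpu_memory_gb: float) -> str:
    """Return the make target for the given GPU memory.

    Assumption: run-all is the full version and run-small-instruct
    is the limited version.
    """
    if gpu_memory_gb >= FULL_GB:
        return "run-all"
    if gpu_memory_gb >= SMALL_GB:
        return "run-small-instruct"
    raise ValueError(
        f"{gpu_memory_gb} GB is below the {SMALL_GB} GB needed for the limited version"
    )


def build_command(gpu_memory_gb: float, env: Mapping[str, str]) -> List[str]:
    """Check that the API key is set, then return the command to launch."""
    if not env.get("OPENAI_API_KEY"):
        raise RuntimeError("OPENAI_API_KEY is not set")
    return ["make", choose_make_target(gpu_memory_gb)]
```

Passing `os.environ` as `env` checks the current shell environment; the returned list can be handed to `subprocess.run` once the memory figure has been confirmed for the machine.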

Highlighted Details

  • Integrates over 20 visual models for comprehensive 3D scene editing.
  • Offers a ChatGPT-like conversational interface for intuitive interaction.
  • Supports both a high-resource configuration (100 GB of GPU memory) and a limited-resource configuration (20 GB of GPU memory).
  • CE3D++ framework is under development, aiming to support 4D scenes and local LLMs.

Maintenance & Community

The project is associated with ECCV 2024 and lists authors from several institutions. The README marks further updates and planned features with "heartbeat" emojis, which suggests active development.

Licensing & Compatibility

The repository does not explicitly state a license in the provided README. Users should verify licensing for commercial use or integration into closed-source projects.

Limitations & Caveats

Full functionality requires substantial GPU memory (100 GB). A smaller version exists, but its editing capabilities are limited. The project is still under active development, with 4D scene support and local LLM integration planned, so interfaces and behavior may change.

Health Check

  • Last commit: 1 month ago
  • Responsiveness: 1 day
  • Pull Requests (30d): 0
  • Issues (30d): 0

Star History

  • 15 stars in the last 90 days
