syncity  by paulengstler

3D world generator from text prompts, no training

created 4 months ago
621 stars

Top 54.1% on sourcepulse

Project Summary

SynCity generates complex, navigable 3D worlds from text prompts without requiring any training or optimization. It targets researchers and developers interested in procedural content generation for virtual environments, offering a novel approach to creating detailed scenes by combining pre-trained 2D and 3D generative models.

How It Works

SynCity builds a world tile by tile. For each tile, it first uses the Flux 2D generator to produce an image, conditioning on adjacent tiles so that style and layout stay coherent. It then uses the TRELLIS 3D generator to convert that image into a 3D model with accurate geometry. Adjacent 3D tiles are blended to form a complete, navigable environment. Both generators are used as pre-trained models, so no custom training is needed.
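To make the tile-by-tile idea concrete, here is a minimal Python sketch of the control flow only. It is not SynCity's code: the row-major visiting order, the function names (`tile_order`, `generated_neighbors`, `build_world`), and the string placeholder standing in for the Flux and TRELLIS steps are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple

Tile = Tuple[int, int]  # (row, column) position in the world grid


def tile_order(rows: int, cols: int) -> List[Tile]:
    """Row-major visiting order (an assumption; the project may order tiles differently)."""
    return [(r, c) for r in range(rows) for c in range(cols)]


def generated_neighbors(tile: Tile, done: Dict[Tile, str]) -> List[Tile]:
    """Return the 4-connected neighbors of `tile` that have already been generated."""
    r, c = tile
    candidates = [(r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)]
    return [n for n in candidates if n in done]


def build_world(rows: int, cols: int, prompt: str) -> Dict[Tile, str]:
    """Visit every tile once, passing already-generated neighbors as context."""
    done: Dict[Tile, str] = {}
    for tile in tile_order(rows, cols):
        context = generated_neighbors(tile, done)
        # Hypothetical placeholder: a real pipeline would call the 2D generator
        # (conditioned on `context`) and then the 3D generator here.
        done[tile] = f"{prompt} tile {tile} conditioned on {context}"
    return done


if __name__ == "__main__":
    for tile, description in build_world(2, 2, "medieval city").items():
        print(tile, "->", description)
```

The point of the sketch is that every tile after the first sees at least one previously generated neighbor, which is what lets context carry across the grid.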

Quick Start & Requirements

  • Installation: Clone the repository and run source ./setup.sh --new-env --basic --xformers --diffoctreerast --spconv --mipgaussian --kaolin --nvdiffrast. The CUDA_HOME environment variable must be set.
  • Prerequisites: Ubuntu 22.04 (or a similar Linux distribution), an NVIDIA GPU with at least 48GB of memory (tested on A40 and A6000), CUDA 11.8 or 12.4, Python 3.10, Conda, and Blender 3.6.19. A Hugging Face account and acceptance of the FLUX.1-dev terms are also required.
  • Usage: Start an inpainting server (./inpainting_server.sh --run), then run python run_pipeline.py for tile generation and python blend_gaussians.py for blending.
  • Links: TRELLIS repository (for additional setup guidance).
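
Because setup depends on several external tools, a small preflight check can catch missing pieces before running setup.sh. The following Python sketch is not part of the project. The function name `check_prerequisites`, the executable names `conda` and `blender`, and the assumption that both are discoverable on PATH are illustrative only. It does not verify GPU memory, the CUDA version, or the Blender version.

```python
import os
import shutil
import sys
from typing import Callable, List, Mapping, Optional, Sequence


def check_prerequisites(
    env: Mapping[str, str],
    python_version: Sequence[int],
    which: Callable[[str], Optional[str]],
) -> List[str]:
    """Return human-readable problems; an empty list means the checks passed."""
    problems: List[str] = []

    # The setup script expects CUDA_HOME to be defined.
    if not env.get("CUDA_HOME"):
        problems.append("CUDA_HOME is not set")

    # The listed prerequisite is Python 3.10.
    major, minor = python_version[0], python_version[1]
    if (major, minor) != (3, 10):
        problems.append(f"Python 3.10 expected, found {major}.{minor}")

    # Assumed executable names; adjust if your installation differs.
    for tool in ("conda", "blender"):
        if which(tool) is None:
            problems.append(f"{tool} not found on PATH")

    return problems


if __name__ == "__main__":
    issues = check_prerequisites(os.environ, sys.version_info, shutil.which)
    for issue in issues:
        print("warning:", issue)
    if not issues:
        print("basic prerequisites look fine")
```

Passing the environment, version, and lookup function as arguments keeps the check easy to test without touching the real system.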

Highlighted Details

  • Training-free generation of 3D worlds from text.
  • Incremental tile-by-tile scene construction.
  • Leverages Flux (2D) and TRELLIS (3D) generative models.
  • Seamless blending of generated tiles for coherent environments.

Maintenance & Community

The project is associated with the Visual Geometry Group at the University of Oxford. Further community or maintenance details are not explicitly provided in the README.

Licensing & Compatibility

The project's licensing is not specified in the README. Compatibility for commercial use or closed-source linking is not detailed.

Limitations & Caveats

The project requires substantial GPU memory (48GB or more). Setup involves multiple external dependencies and acceptance of model license terms. Results depend heavily on prompt wording, and the project provides specific guidance on writing prompts for better world generation.

Health Check

  • Last commit: 3 months ago
  • Responsiveness: 1+ week
  • Pull Requests (30d): 0
  • Issues (30d): 1

Star History

45 stars in the last 90 days
