PartCrafter by wgsxm

Structured 3D mesh generation from single images

Created 10 months ago
2,404 stars

Top 18.6% on SourcePulse

Project Summary

PartCrafter generates structured 3D meshes from a single RGB image, composing multiple parts or objects in one pass. The project, accepted at NeurIPS 2025, uses a compositional latent diffusion transformer for one-shot 3D generation. It targets researchers and developers in computer vision and 3D content creation who need detailed, multi-component assets synthesized from 2D inputs.

How It Works

The core innovation is a compositional latent diffusion transformer that jointly generates multiple object parts or entire scenes. Generating the parts together, rather than one object at a time, lets the model produce complex assemblies whose components stay coherent with each other. Latent diffusion supplies the generative capacity, and the compositional framework adds part-level control.

Quick Start & Requirements

Installation involves cloning the repository and running bash settings/setup.sh. Key dependencies include PyTorch 2.5.1+cu124 and Python 3.11. Graphics libraries such as libegl, libglu, and pyopengl may also be needed. A CUDA-enabled GPU with at least 8 GB of VRAM is recommended. Official resources include a project page, an arXiv paper, and a HuggingFace demo.
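The requirements above can be checked before running the setup script. The following is a minimal sketch, not part of the repository: the function names check_requirements and probe_environment are hypothetical, and treating the listed Python and PyTorch versions as exact pins is an assumption (the source only lists them as key dependencies).

```python
import importlib.util
import sys

# Values taken from the requirements listed above; exact pinning is an assumption.
REQUIRED_PYTHON = (3, 11)
REQUIRED_TORCH = "2.5.1+cu124"
MIN_VRAM_GB = 8.0


def check_requirements(python_version, torch_version, vram_gb):
    """Return a list of human-readable problems; an empty list means every check passed."""
    problems = []
    if tuple(python_version[:2]) != REQUIRED_PYTHON:
        problems.append(
            f"Python {python_version[0]}.{python_version[1]} found, 3.11 expected"
        )
    if torch_version is None:
        problems.append("PyTorch is not installed")
    elif str(torch_version) != REQUIRED_TORCH:
        problems.append(f"PyTorch {torch_version} found, {REQUIRED_TORCH} expected")
    if vram_gb is None:
        problems.append("no CUDA-enabled GPU detected")
    elif vram_gb < MIN_VRAM_GB:
        problems.append(
            f"{vram_gb:.1f} GB VRAM found, at least {MIN_VRAM_GB:.0f} GB recommended"
        )
    return problems


def probe_environment():
    """Collect (python_version, torch_version, vram_gb) from the current machine."""
    python_version = tuple(sys.version_info[:2])
    if importlib.util.find_spec("torch") is None:
        return python_version, None, None
    import torch  # imported lazily so the script still runs without PyTorch

    vram_gb = None
    if torch.cuda.is_available():
        # total_memory is reported in bytes for the first visible GPU
        vram_gb = torch.cuda.get_device_properties(0).total_memory / 1024**3
    return python_version, str(torch.__version__), vram_gb


if __name__ == "__main__":
    issues = check_requirements(*probe_environment())
    print("Environment looks fine" if not issues else "\n".join(issues))
```

Keeping the comparison logic separate from the probing step makes the check easy to test on a machine without a GPU.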

Highlighted Details

  • Advanced 3D Generation: Synthesizes detailed 3D objects and scenes from single images. A vision-language model (VLM) can suggest the part count, and style transfer is available for real-world photos.
  • Extensible Provider Architecture: Facilitates integration of new VLM providers for part suggestion and stylization.

Maintenance & Community

The project is actively developed, as evidenced by its NeurIPS 2025 acceptance and recent open-sourcing. Direct contact is available via email (linyuchen@stu.pku.edu.cn) or GitHub issues. No dedicated community channels or public roadmaps are immediately apparent.

Licensing & Compatibility

The repository README does not explicitly state a software license. Because usage restrictions are undefined, the licensing terms need to be confirmed before commercial use or integration into closed-source projects.

Limitations & Caveats

GPU memory usage can be reduced by lowering the number of parts or the number of tokens per part. The default settings (1024 tokens per part for objects, 2048 tokens per part for scenes) prioritize quality. Training from scratch requires downloading external model weights (e.g., TripoSG) and involves separate dataset preprocessing.
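The trade-off between parts and tokens can be made concrete with a little arithmetic. The sketch below assumes the total latent token count (parts times tokens per part) is a reasonable proxy for memory pressure; that assumption, the max_total_tokens budget, the min_tokens_per_part floor, and the function name fit_to_budget are all hypothetical and not taken from the repository. Only the two default values come from the text above.

```python
# Defaults quoted above: tokens per part for object and scene generation.
DEFAULT_TOKENS_PER_PART = {"object": 1024, "scene": 2048}


def fit_to_budget(num_parts, mode, max_total_tokens, min_tokens_per_part=256):
    """Shrink (num_parts, tokens_per_part) until their product fits a token budget.

    Tokens per part are halved first, down to a floor, because that keeps
    every requested part; parts are dropped only as a last resort.
    """
    tokens = DEFAULT_TOKENS_PER_PART[mode]
    # Step 1: halve the per-part token count while over budget and above the floor.
    while num_parts * tokens > max_total_tokens and tokens // 2 >= min_tokens_per_part:
        tokens //= 2
    # Step 2: if still over budget, reduce the number of parts (never below one).
    while num_parts > 1 and num_parts * tokens > max_total_tokens:
        num_parts -= 1
    return num_parts, tokens


if __name__ == "__main__":
    for parts, mode, budget in [(4, "object", 8192), (8, "scene", 8192), (16, "object", 2048)]:
        print(parts, mode, budget, "->", fit_to_budget(parts, mode, budget))
```

For instance, 8 parts in scene mode start at 8 × 2048 = 16384 tokens; with a budget of 8192 the sketch halves the per-part count once and returns (8, 1024).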

Health Check
Last Commit: 1 week ago
Responsiveness: Inactive
Pull Requests (30d): 1
Issues (30d): 0
Star History: 21 stars in the last 30 days
