Fugtemypt123/VIGA-release: Vision-as-Inverse-Graphics agent for programmatic visual reconstruction
Top 43.3% on SourcePulse
Summary
VIGA (Vision-as-Inverse-Graphics Agent) is an agent that generates and edits complex scenes by writing programs that reconstruct them. Aimed at researchers and power users, it takes an analysis-by-synthesis approach that refines a scene iteratively without finetuning. Its self-correcting loop generates a scene, renders it, and verifies the render against the target.
How It Works
VIGA is a self-reflective agent that alternates between Generator and Verifier roles. The Generator writes and executes scene programs using tools for planning, code execution, asset retrieval, and scene queries. The Verifier renders scenes from multiple viewpoints, identifies discrepancies, and provides feedback for revision. This write-run-compare-revise loop is self-correcting and requires no finetuning; the agent maintains an evolving contextual memory across iterations.
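The write-run-compare-revise loop described here can be illustrated with a toy sketch. This is not VIGA's code: the names `render`, `verify`, `generate`, `reconstruct`, and `Memory` are hypothetical, a "scene program" is reduced to a dict of object positions, and the Verifier is assumed to report the correct value for each mismatched object.

```python
from dataclasses import dataclass, field


@dataclass
class Memory:
    """Evolving context: each attempted program and the feedback it received."""
    history: list = field(default_factory=list)


def render(program):
    # Toy stand-in for a renderer: the "image" is a sorted tuple of
    # (object name, position) pairs taken from the scene program.
    return tuple(sorted(program.items()))


def verify(rendered, target):
    # Verifier role: compare the render with the target and report,
    # per object, the value that does not yet match.
    current = dict(rendered)
    return {name: want for name, want in target.items()
            if current.get(name) != want}


def generate(program, feedback):
    # Generator role: revise the program from feedback.
    # Toy policy (an assumption): fix one object per round.
    revised = dict(program)
    if feedback:
        name = sorted(feedback)[0]
        revised[name] = feedback[name]
    return revised


def reconstruct(target, max_rounds=10):
    memory = Memory()
    program = {}
    for _ in range(max_rounds):
        feedback = verify(render(program), target)
        memory.history.append((dict(program), dict(feedback)))
        if not feedback:  # render matches the target, stop revising
            break
        program = generate(program, feedback)
    return program, memory


if __name__ == "__main__":
    program, memory = reconstruct({"cube": 1, "lamp": 3})
    print(program, len(memory.history))
```

No weights are updated anywhere in the loop; all improvement comes from feedback accumulated in `memory`, which is the sense in which such a loop needs no finetuning.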
Quick Start & Requirements
Installation requires Conda; an NVIDIA GPU with CUDA is recommended for 3D modes.
Clone the repository: git clone https://github.com/Fugtemypt123/VIGA-release.git && cd VIGA-release
Download the SAM checkpoint: wget -P utils/third_party/sam https://dl.fbaipublicfiles.com/segment_anything/sam_vit_h_4b8939.pth
Conda environments: agent (Python 3.10), blender (Python 3.11), sam (Python 3.10), and sam3d (Python 3.11) with specified requirements.
API keys: set OPENAI_API_KEY and MESHY_API_KEY in utils/_api_keys.py.
Paths: set CONDA_BASE in utils/_path.py.
Run: conda activate agent then python runners/dynamic_scene.py --task=artist --model=gpt-5.
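The setup steps mention utils/_api_keys.py holding OPENAI_API_KEY and MESHY_API_KEY, but this summary does not show the file's expected layout. The following is a hypothetical sketch that assumes the file only needs to expose two module-level string constants; reading them from environment variables and the `missing_keys` helper are illustrative additions, not part of the repository.

```python
import os

# Hypothetical contents for utils/_api_keys.py (layout is an assumption).
# Values come from the environment so that secrets stay out of version control.
OPENAI_API_KEY = os.environ.get("OPENAI_API_KEY", "")
MESHY_API_KEY = os.environ.get("MESHY_API_KEY", "")


def missing_keys():
    """Return the names of keys that are still unset (illustrative helper)."""
    keys = {"OPENAI_API_KEY": OPENAI_API_KEY, "MESHY_API_KEY": MESHY_API_KEY}
    return [name for name, value in keys.items() if not value]


if __name__ == "__main__":
    print("Missing keys:", missing_keys())
```

If the repository expects literal strings in the file instead, the two assignments can simply be replaced with the key values.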
Maintenance & Community
The README provides no details on contributors, sponsorships, community channels, or a public roadmap.
Licensing & Compatibility
The README omits license information, preventing an assessment of compatibility for commercial use or closed-source linking.
Limitations & Caveats
Setup is complex: it requires four separate Conda environments (agent, blender, sam, sam3d), each with its own requirements. Users must provide API keys for OpenAI and Meshy. An NVIDIA GPU with CUDA is recommended for 3D tasks. The absence of a specified license is a significant adoption caveat.