allenai
Promptable 3D detection for real-world scenarios
Top 56.8% on SourcePulse
Summary
WildDet3D addresses the challenge of scaling promptable 3D object detection in diverse, real-world environments. It enables flexible, zero-shot detection using text, box, or point prompts, benefiting researchers and engineers in fields like robotics and AR/VR by providing adaptable 3D perception capabilities.
How It Works
The system pairs a SAM3 backbone for segmentation with LingBot-Depth for monocular depth estimation. It accepts three kinds of prompt: text, 2D box (geometric or exemplar), and point, so a 3D scene can be queried in whichever form is most convenient. This design enables robust zero-shot transfer across varied datasets, including outdoor driving and indoor scenes.
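To make the role of monocular depth concrete, the sketch below lifts a 2D point or box prompt into 3D camera coordinates with the standard pinhole camera model. It is an illustration only, not WildDet3D's actual pipeline: the `Intrinsics` class, the function names, and the sample camera parameters are all hypothetical, and a learned detector predicts full 3D boxes rather than a single back-projected point.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Intrinsics:
    """Pinhole camera intrinsics in pixels (hypothetical helper type)."""
    fx: float
    fy: float
    cx: float
    cy: float


def backproject(u: float, v: float, depth: float, k: Intrinsics) -> tuple[float, float, float]:
    """Map pixel (u, v) with metric depth to (x, y, z) in the camera frame."""
    x = (u - k.cx) * depth / k.fx
    y = (v - k.cy) * depth / k.fy
    return (x, y, depth)


def box_center_3d(box: tuple[float, float, float, float], depth: float,
                  k: Intrinsics) -> tuple[float, float, float]:
    """Back-project the center of a 2D box prompt given as (x1, y1, x2, y2)."""
    x1, y1, x2, y2 = box
    return backproject((x1 + x2) / 2.0, (y1 + y2) / 2.0, depth, k)


if __name__ == "__main__":
    # Assumed example camera: 640x480 image, 500 px focal length.
    cam = Intrinsics(fx=500.0, fy=500.0, cx=320.0, cy=240.0)
    print(backproject(420.0, 240.0, 2.0, cam))                 # point prompt
    print(box_center_3d((300.0, 200.0, 340.0, 280.0), 2.0, cam))  # box prompt
```

A pixel at the principal point maps straight down the optical axis, while pixels further from it map to proportionally larger lateral offsets as depth grows, which is why depth error translates directly into 3D localization error.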
Quick Start & Requirements
Installation involves cloning the repository with its submodules, creating a Conda environment (Python 3.11), and installing specific versions of PyTorch (CUDA 12.1), vis4d, and the vis4d CUDA ops, followed by the remaining dependencies. A CUDA-enabled GPU is required, and training requires 8 GPUs. Inference can run up to 3.0x faster with BF16 autocast and torch.compile, though the initial compilation may take ~17 minutes.
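Because the ~17 minute compilation is a one-time cost per process, whether torch.compile pays off depends on how many images are processed. The following is a rough break-even estimate, not a benchmark: the baseline latency of 1.0 s per image is a made-up placeholder, the 3.0x figure is the best case quoted above, and `break_even_images` is a hypothetical helper.

```python
import math


def break_even_images(compile_minutes: float, baseline_s_per_image: float,
                      speedup: float) -> int:
    """Smallest image count at which compile time is repaid by faster inference.

    Solves compile_s + n * baseline / speedup <= n * baseline for n, i.e.
    n >= compile_s * speedup / (baseline * (speedup - 1)).
    """
    if speedup <= 1.0:
        raise ValueError("speedup must exceed 1.0 for compilation to pay off")
    if baseline_s_per_image <= 0.0:
        raise ValueError("baseline latency must be positive")
    compile_s = compile_minutes * 60.0
    n = compile_s * speedup / (baseline_s_per_image * (speedup - 1.0))
    return math.ceil(n)


if __name__ == "__main__":
    # Assumed inputs: 17 min compile, 1.0 s/image uncompiled, 3.0x best-case speedup.
    print(break_even_images(17.0, 1.0, 3.0))  # 1530
```

Under these assumptions compilation only pays off after roughly 1,500 images, so for short interactive sessions or small evaluation sets it may be faster overall to use BF16 autocast alone and skip compilation.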
Highlighted Details
Maintenance & Community
The project shows recent activity, with updates in May 2026 that include new evaluation configurations and integration demos. It is a collaborative effort involving researchers from the Allen Institute for AI and the University of Washington. No specific community channels (e.g., Discord, Slack) are listed.
Licensing & Compatibility
The codebase and models are licensed under the "SAM License" and are explicitly intended for research and educational use, with adherence to Ai2's Responsible Use Guidelines. This license may restrict commercial applications.
Limitations & Caveats
The primary limitation is the restrictive "SAM License," which confines use to research and education. Certain torch.compile optimization modes are unsupported because the detection head requires dynamic shapes. Installation requires careful management of specific library versions and building CUDA extensions from source.