Synthetic dataset for holistic indoor scene understanding research
Hypersim is a large-scale, photorealistic synthetic dataset designed for comprehensive indoor scene understanding tasks. It provides detailed per-pixel ground truth labels for geometry, semantics, and materials, targeting researchers and engineers in computer vision and robotics. The dataset enables robust training and evaluation of models that require precise scene information, which is often difficult to obtain from real-world data.
How It Works
Hypersim is built from 3D scenes created by professional artists and sourced from Evermotion Archinteriors. It contains over 77,000 rendered images across 461 indoor environments. Each image is factored into three terms: diffuse reflectance, diffuse illumination, and a non-diffuse residual, which supports lighting and material analysis. The dataset also includes dense per-pixel semantic instance segmentations, camera information, and 3D bounding boxes for semantic instances, all derived from the underlying scene geometry and the V-Ray rendering pipeline.
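The factorization described above can be sketched with NumPy. This is only an illustration on synthetic arrays, assuming the three terms combine as color = diffuse reflectance × diffuse illumination + non-diffuse residual; the function names and array shapes are hypothetical and are not part of the Hypersim toolkit.

```python
import numpy as np


def reconstruct_color(diffuse_reflectance, diffuse_illumination, non_diffuse_residual):
    """Recombine the three terms into a color image (assumed composition rule)."""
    return diffuse_reflectance * diffuse_illumination + non_diffuse_residual


def residual_from_color(color, diffuse_reflectance, diffuse_illumination):
    """Recover the non-diffuse residual given the color image and the diffuse terms."""
    return color - diffuse_reflectance * diffuse_illumination


# Synthetic stand-ins for per-pixel RGB layers (height x width x 3).
rng = np.random.default_rng(0)
height, width = 4, 6
reflectance = rng.uniform(0.0, 1.0, size=(height, width, 3))
illumination = rng.uniform(0.0, 5.0, size=(height, width, 3))
residual = rng.uniform(0.0, 0.2, size=(height, width, 3))

color = reconstruct_color(reflectance, illumination, residual)
recovered = residual_from_color(color, reflectance, illumination)

# The residual recovered from the composed image should match the original one.
print(np.allclose(recovered, residual))
```

With real data, the synthetic arrays would be replaced by the corresponding per-image layers loaded from the dataset files.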
Quick Start & Requirements
Download: the download script (dataset_download_images.py) and approximately 1.9TB of storage.
Python packages: h5py, matplotlib, pandas, scikit-learn, mayavi (optional), opencv-python, pillow, joblib, scipy.
Libraries: args, Armadillo, Embree, HDF5, Octomap, OpenEXR (for High-Level Toolkit).
Configuration: edit the configuration files (_system_config.py, system_config.inc) to point to installed V-Ray and library paths.
Highlighted Details
Maintenance & Community
The project is from Apple, with contributions noted from individuals like Mike Roberts, Jason Ramapuram, and Anurag Ranjan. Further community interaction details are not explicitly provided in the README.
Licensing & Compatibility
The Hypersim Dataset is licensed under the Creative Commons Attribution-ShareAlike 3.0 Unported License (CC BY-SA 3.0), which permits commercial use subject to its attribution and share-alike terms. The license of the software toolkit is not explicitly detailed, but the README notes that the toolkit does not depend on GPL-licensed portions of its open-source dependencies.
Limitations & Caveats
The dataset requires significant storage (approximately 1.9TB) and a complex setup involving V-Ray. Some pipeline steps are platform-specific (Windows, macOS/Linux). Assets may be reused across scenes, which can affect strict data independence between splits. Obtaining the ground truth triangle meshes requires purchasing the original 3D assets.
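Given the storage requirement, it can help to check free disk space before starting a download. The following is a minimal sketch using only the Python standard library; the 1.9TB figure is interpreted here as 1.9 × 10^12 bytes (an assumption), and the function name and target path are hypothetical.

```python
import shutil

# Assumption: "1.9TB" is read as decimal terabytes (1.9e12 bytes).
REQUIRED_BYTES = int(1.9e12)


def has_enough_space(path, required_bytes=REQUIRED_BYTES):
    """Return True if the filesystem containing `path` has at least `required_bytes` free."""
    free_bytes = shutil.disk_usage(path).free
    return free_bytes >= required_bytes


# Hypothetical download directory; replace with the actual target location.
target_dir = "."
if has_enough_space(target_dir):
    print("Enough free space for the full dataset.")
else:
    print("Not enough free space for the full dataset.")
```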