Official code for the research paper on synthesizing perception data and annotations with diffusion models
DatasetDM provides official code for synthesizing high-quality perception data with annotations using diffusion models, targeting researchers and practitioners in computer vision. It enables the generation of diverse datasets for tasks like instance segmentation, semantic segmentation, and depth estimation, significantly enhancing model training with synthetic data.
How It Works
DatasetDM leverages diffusion models, specifically Stable Diffusion 1.4, to generate synthetic images. It incorporates a P-Decoder for generating segmentation masks and utilizes GPT-4 to enhance prompt diversity, leading to more varied and realistic synthetic data. This approach allows for targeted data generation for specific tasks and datasets, improving the efficiency and effectiveness of data augmentation.
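The data flow described above can be sketched in a few lines of Python. This is a minimal, hypothetical illustration, not the project's actual API: the names `diversify_prompt`, `generate_image_and_features`, and `PDecoder` are stand-ins, and both the diffusion model and the decoder are stubbed. A real run would generate images with Stable Diffusion 1.4 via `diffusers` and decode its intermediate features with the trained P-Decoder.

```python
# Hypothetical sketch of a DatasetDM-style generation loop.
# All model calls are stubs; only the pipeline shape
# (prompt -> image + features -> annotation) is illustrated.
import random


def diversify_prompt(base: str, styles=("photo of", "painting of")) -> str:
    # Stand-in for GPT-4 prompt enrichment: vary phrasing for diversity.
    return f"a {random.choice(styles)} {base}"


def generate_image_and_features(prompt: str, size: int = 8):
    # Stub for the diffusion model: returns a placeholder RGB image and a
    # per-pixel feature map (the real model exposes U-Net/attention features).
    image = [[(0, 0, 0) for _ in range(size)] for _ in range(size)]
    features = [[hash((prompt, y, x)) % 100 / 100 for x in range(size)]
                for y in range(size)]
    return image, features


class PDecoder:
    """Stub perception decoder: thresholds features into a binary mask."""

    def __init__(self, threshold: float = 0.5):
        self.threshold = threshold

    def __call__(self, features):
        return [[1 if v > self.threshold else 0 for v in row]
                for row in features]


def synthesize(base_prompt: str, n: int, decoder: PDecoder):
    # Generate n (image, annotation) pairs for one base prompt.
    samples = []
    for _ in range(n):
        prompt = diversify_prompt(base_prompt)
        image, features = generate_image_and_features(prompt)
        mask = decoder(features)  # segmentation-style annotation
        samples.append({"prompt": prompt, "image": image, "mask": mask})
    return samples


dataset = synthesize("a dog in a park", n=4, decoder=PDecoder())
print(len(dataset), len(dataset[0]["mask"]))
```

In the real system, the same decoder head can be swapped or extended per task (segmentation masks, depth maps), which is what makes the targeted data generation described above possible.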
Quick Start & Requirements
Create a conda environment:
conda create -n DatasetDM python=3.8
Install PyTorch 1.9.1 with CUDA 11.1:
pip install torch==1.9.1+cu111 torchvision==0.10.1+cu111 torchaudio==0.9.1 -f https://download.pytorch.org/whl/torch_stable.html
Then install the remaining requirements:
python -m pip install -r requirements.txt
Pre-trained checkpoints are expected under ./dataset/ckpts. A specific version of diffusers (0.3.0) is recommended; alternatively, use the modified copy in ./model/diffusers.
Highlighted Details
Maintenance & Community
The project is associated with NeurIPS 2023. No specific community links (Discord, Slack) or active maintenance signals are provided in the README.
Licensing & Compatibility
The README does not explicitly state the license. Compatibility for commercial use or closed-source linking is not specified.
Limitations & Caveats
The project notes that newer Diffusers versions may cause errors and recommends pinning the older version 0.3.0 or using the project's modified copy. The code release was originally announced for within three months of September 2023.