ChatSim by yifanlu0227

Scene simulation for autonomous driving via LLM-agent collaboration

Created 2 years ago

417 stars

Top 70.3% on SourcePulse

Project Summary

ChatSim provides an editable scene simulation framework for autonomous driving, leveraging LLM-agent collaboration to modify driving scenarios. It targets researchers and developers in autonomous driving and computer vision who need to generate and test diverse driving situations. The system allows for intuitive scene editing through natural language prompts, enabling rapid iteration and evaluation of driving policies.

How It Works

ChatSim integrates Large Language Models (LLMs) with specialized agents to interpret user prompts and modify 3D driving scenes. Backgrounds are rendered with either McNeRF or 3D Gaussian Splatting, and foreground objects are rendered in Blender. The framework supports both foreground object insertion and background scene manipulation, so an inserted object and the scene around it can each be edited.

Quick Start & Requirements

Installation: Clone the repository recursively (git clone --recursive ...), set up a conda environment (conda create -n chatsim python=3.9 git-lfs), and install dependencies via requirements.txt. Specific background rendering engines (McNeRF or 3D Gaussian Splatting) and inpainting tools require additional setup steps. Blender 3.5.1+ is also required.
Prerequisites: Ubuntu >= 20.04, Python >= 3.8, PyTorch >= 1.13, CUDA >= 11.6, and an OpenAI API key (or an alternative). Data preparation involves downloading the Waymo dataset and its associated calibration files.
Resources: Requires significant disk space (several GB) for the Waymo data and 3D assets. A CUDA-capable GPU is essential for rendering and training.
Documentation: arXiv, Project Page, Video
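
The version minimums above can be checked before starting the longer setup steps. The following is a minimal sketch, not part of ChatSim: the function names are hypothetical, and it assumes the API key is exposed through the OPENAI_API_KEY environment variable and that Blender is reachable as blender on PATH, neither of which the README summary confirms.

```python
import os
import re
import shutil
import sys

# Minimums copied from the prerequisites listed above.
MINIMUMS = {"python": "3.8", "torch": "1.13", "cuda": "11.6"}


def parse_version(text):
    """Turn '1.13.1+cu116' into (1, 13, 1); anything after the numeric prefix is ignored."""
    match = re.match(r"\d+(?:\.\d+)*", text.strip())
    if match is None:
        raise ValueError(f"no version number in {text!r}")
    return tuple(int(part) for part in match.group(0).split("."))


def meets_minimum(found, minimum):
    """Compare versions numerically, padding the shorter one with zeros."""
    a, b = parse_version(found), parse_version(minimum)
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


def check_environment():
    """Return check name -> True/False, or None when PyTorch is not installed."""
    report = {}
    current_python = ".".join(str(n) for n in sys.version_info[:3])
    report["python"] = meets_minimum(current_python, MINIMUMS["python"])
    try:
        import torch
        report["torch"] = meets_minimum(torch.__version__, MINIMUMS["torch"])
        cuda = torch.version.cuda  # None on CPU-only builds
        report["cuda"] = meets_minimum(cuda, MINIMUMS["cuda"]) if cuda else False
    except ImportError:
        report["torch"] = None
        report["cuda"] = None
    # Assumption: Blender is installed system-wide and named "blender".
    report["blender_on_path"] = shutil.which("blender") is not None
    # Assumption: the key is read from this environment variable.
    report["openai_key_set"] = bool(os.environ.get("OPENAI_API_KEY"))
    return report
```

Calling check_environment() inside the activated conda environment returns a dictionary that can be printed line by line. Versions are compared as integer tuples rather than strings, so that 3.10 is correctly treated as newer than 3.8.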

Highlighted Details

Integrates 3D Gaussian Splatting for significantly faster background rendering (50 frames in ~30s).
Supports parallel Blender rendering for foreground elements, achieving 50 frames in ~5 minutes.
Offers an optional trajectory tracking module for smoother, more realistic agent motion.
Includes optional skydome estimation for realistic lighting.
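
The two timings quoted above work out to roughly 0.6 s per frame for the 3D Gaussian Splatting background and roughly 6 s per frame for parallel Blender foreground rendering. The short calculation below makes that explicit. The extrapolation to longer clips is an assumption: it treats cost as linear in frame count and runs the two stages one after the other, which the source does not state.

```python
def seconds_per_frame(frames, total_seconds):
    """Average wall-clock cost of one frame."""
    return total_seconds / frames


def estimate_seconds(frames, per_frame):
    """Linear extrapolation; assumes cost scales with frame count (not stated in the source)."""
    return frames * per_frame


background = seconds_per_frame(50, 30)      # 3D Gaussian Splatting: 50 frames in ~30 s
foreground = seconds_per_frame(50, 5 * 60)  # parallel Blender: 50 frames in ~5 min

print(f"background: {background:.1f} s/frame")
print(f"foreground: {foreground:.1f} s/frame")
print(f"foreground is about {foreground / background:.0f}x slower per frame")

# Hypothetical 200-frame clip, with both stages run in sequence (an assumption).
total = estimate_seconds(200, background + foreground)
print(f"200 frames: about {total / 60:.0f} min")
```

On these figures, foreground rendering rather than background rendering dominates the total time for a clip.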

Maintenance & Community

The project is associated with CVPR 2024. Links to community resources like Discord/Slack are not explicitly provided in the README.

Licensing & Compatibility

The project's code is likely under a permissive license. The bundled 3D assets were collected from the internet; the authors state that they made efforts to address copyright and ask to be contacted if any asset infringes. Commercial use of the assets may therefore require further verification.

Limitations & Caveats

3D Gaussian Splatting may exhibit artifacts with strong perspective shifts. The use_surrounding_lighting feature for foreground agents is currently limited to single-vehicle insertions and can impact rendering speed. The multi-round wrapper code is still under development.

Health Check

Last Commit

1 year ago

Responsiveness

Inactive

Star History

6 stars in the last 30 days