StateTransformer by Tsinghua-MARS-Lab

Research paper and code for a mixture-of-experts motion planner for autonomous driving

created 2 years ago
283 stars

Top 93.3% on sourcepulse

Project Summary

StateTransformer-2 (STR2) addresses the generalization limitations of data-driven motion planners for autonomous driving. It is a scalable, decoder-only planner that pairs a Vision Transformer (ViT) encoder with a Mixture-of-Experts (MoE) causal Transformer backbone, aimed at researchers and engineers working on autonomous driving. Routing computation across experts during training improves generalization, yielding stronger performance on complex and few-shot driving scenarios.
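
To make the routing idea concrete, here is a minimal, self-contained sketch of a top-k MoE feed-forward block in PyTorch. All names and sizes are hypothetical illustrations, not the actual STR2 modules (the repository builds on Mixtral-style layers):

```python
import torch
import torch.nn as nn

class TopKMoE(nn.Module):
    """Illustrative top-k mixture-of-experts feed-forward block.

    Hypothetical names and sizes, not the actual STR2 module. Each token
    is routed to its top-k experts; expert outputs are combined with the
    router's softmax weights.
    """
    def __init__(self, d_model=256, d_ff=1024, n_experts=8, k=2):
        super().__init__()
        self.k = k
        self.router = nn.Linear(d_model, n_experts)  # token -> expert logits
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model))
            for _ in range(n_experts)
        )

    def forward(self, x):                            # x: (batch, seq, d_model)
        logits = self.router(x)                      # (B, S, n_experts)
        weights, idx = logits.topk(self.k, dim=-1)   # route each token to top-k experts
        weights = weights.softmax(dim=-1)            # normalize over the selected experts
        out = torch.zeros_like(x)
        for slot in range(self.k):
            for e, expert in enumerate(self.experts):
                mask = idx[..., slot] == e           # tokens assigned to expert e in this slot
                if mask.any():
                    out[mask] += weights[..., slot][mask].unsqueeze(-1) * expert(x[mask])
        return out
```

Only the selected experts run for each token, which is what lets the backbone scale capacity without a proportional increase in per-token compute.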

How It Works

STR2 employs a ViT encoder to embed rasterized representations of the environment and an MoE causal Transformer to plan motion autoregressively. This architecture allows the model to learn the distinct rewards that underlie driving demonstrations. The MoE design mitigates modality collapse and balances competing rewards through expert routing during training, contributing to more robust and generalized planning.
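
The overall data flow can be sketched as follows. This is an illustrative outline under assumed shapes, with a plain Transformer decoder standing in for the MoE causal backbone; the module names (STR2Sketch, pose_in, etc.) are hypothetical:

```python
import torch
import torch.nn as nn

class STR2Sketch(nn.Module):
    """Illustrative STR2-style data flow (hypothetical names and shapes):
    rasterized scene -> ViT patch encoder -> context tokens ->
    causal Transformer (MoE in the real model) -> trajectory tokens."""

    def __init__(self, d_model=256, patch=16, n_layers=4):
        super().__init__()
        # ViT-style patch embedding for the environmental raster
        self.patchify = nn.Conv2d(3, d_model, kernel_size=patch, stride=patch)
        enc = nn.TransformerEncoderLayer(d_model, nhead=8, batch_first=True)
        self.vit = nn.TransformerEncoder(enc, num_layers=n_layers)
        # Plain decoder standing in for the MoE causal backbone
        dec = nn.TransformerDecoderLayer(d_model, nhead=8, batch_first=True)
        self.planner = nn.TransformerDecoder(dec, num_layers=n_layers)
        self.pose_in = nn.Linear(4, d_model)   # (x, y, heading, speed) -> token
        self.pose_out = nn.Linear(d_model, 4)

    def forward(self, raster, past_poses):
        # raster: (B, 3, H, W); past_poses: (B, T, 4)
        ctx = self.patchify(raster).flatten(2).transpose(1, 2)  # (B, patches, d_model)
        ctx = self.vit(ctx)
        tgt = self.pose_in(past_poses)
        t = tgt.size(1)
        causal = torch.triu(torch.full((t, t), float("-inf"), device=tgt.device), diagonal=1)
        h = self.planner(tgt, ctx, tgt_mask=causal)
        # At inference, predictions are fed back step by step (autoregressive)
        return self.pose_out(h)  # per-step next-pose predictions
```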

Quick Start & Requirements

  • Installation: Install PyTorch with CUDA (e.g., torch==1.9.0+cu111), then pip install -r requirements.txt. Install the package itself with pip install -e . from the repository root. NuPlan-Devkit requires additional packages (aioboto3, retry, aiofiles, bokeh==2.4.1). A consolidated command sketch follows this list.
  • Dependencies: PyTorch with CUDA, NuScenes and Waymo datasets (specific processing scripts provided).
  • Dataset: The NuPlan dataset can be downloaded from the URL provided in the README. Data processing scripts (generation.py) convert the raw .db files into .pkl and .arrow formats.
  • Links: Demonstration Video, Hugging Face Checkpoints.
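
As referenced above, a consolidated install sketch. Exact versions and steps should be taken from the repository's README; the cu111 wheel index below is an assumption based on the torch version listed:

```bash
# Hedged quick-start sketch; defer to the repository README for exact steps.
pip install torch==1.9.0+cu111 -f https://download.pytorch.org/whl/torch_stable.html
pip install -r requirements.txt
pip install -e .
# Extras required by NuPlan-Devkit:
pip install aioboto3 retry aiofiles bokeh==2.4.1
```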

Highlighted Details

  • Achieves state-of-the-art performance on the NuPlan dataset in both closed-loop and open-loop evaluations.
  • Demonstrates consistent accuracy improvements with increased data and model size, indicating strong scalability.
  • Utilizes FlashAttention-2 for optimized attention computation.
  • Supports training with various Mixtral backbone sizes (see the configuration sketch after this list).
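
For illustration, a small Mixtral-style MoE backbone can be configured through Hugging Face transformers. The sizes below are illustrative, not the authors' released checkpoints:

```python
# Illustrative only: a small Mixtral-style MoE backbone. The real STR2
# checkpoints and sizes come from the authors' Hugging Face releases.
from transformers import MixtralConfig, MixtralModel

config = MixtralConfig(
    hidden_size=256,
    intermediate_size=1024,
    num_hidden_layers=4,
    num_attention_heads=8,
    num_key_value_heads=8,
    num_local_experts=8,      # experts per MoE layer
    num_experts_per_tok=2,    # top-k routing
)
backbone = MixtralModel(config)  # standard attention by default

# With the flash-attn package installed (Ampere+ GPU), recent transformers
# versions can select FlashAttention-2:
# from transformers import AutoModel
# backbone = AutoModel.from_config(config, attn_implementation="flash_attention_2")
```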

Maintenance & Community

The project is associated with Tsinghua-MARS-Lab, with Qiao Sun as the primary author. STR2 builds on the earlier StateTransformer work; that code remains available in the repository via a specific commit hash.

Licensing & Compatibility

The README does not explicitly state a license. Compatibility for commercial use or closed-source linking is not specified.

Limitations & Caveats

The LiAuto dataset used in the paper is not publicly available. The NuPlan dataset requires significant preprocessing with the provided scripts, and the processing pipeline is complex. The project also depends on specific PyTorch and CUDA versions.

Health Check

  • Last commit: 8 months ago
  • Responsiveness: 1 week
  • Pull Requests (30d): 0
  • Issues (30d): 2
  • Star History: 22 stars in the last 90 days
