StateTransformer by Tsinghua-MARS-Lab

Research paper and code for a mixture-of-experts motion planner for autonomous driving

created 2 years ago
283 stars

Top 93.3% on sourcepulse

Project Summary

StateTransformer-2 (STR2) addresses the generalization limitations of data-driven motion planners for autonomous driving. It is a scalable, decoder-only planner that pairs a Vision Transformer (ViT) encoder with a Mixture-of-Experts (MoE) causal Transformer backbone, aimed at researchers and engineers working on autonomous driving. Routing computation across experts during training improves generalization, yielding stronger performance on complex and few-shot driving scenarios.
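
To make the routing idea concrete, here is a minimal, self-contained sketch of a top-k MoE feed-forward block in PyTorch. All names and sizes are hypothetical illustrations, not the actual STR2 modules (the repository builds on Mixtral-style layers):

```python
import torch
import torch.nn as nn

class TopKMoE(nn.Module):
    """Illustrative top-k mixture-of-experts feed-forward block.

    Hypothetical names and sizes, not the actual STR2 module. Each token
    is routed to its top-k experts; expert outputs are combined with the
    router's softmax weights.
    """
    def __init__(self, d_model=256, d_ff=1024, n_experts=8, k=2):
        super().__init__()
        self.k = k
        self.router = nn.Linear(d_model, n_experts)  # token -> expert logits
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model))
            for _ in range(n_experts)
        )

    def forward(self, x):                            # x: (batch, seq, d_model)
        logits = self.router(x)                      # (B, S, n_experts)
        weights, idx = logits.topk(self.k, dim=-1)   # route each token to top-k experts
        weights = weights.softmax(dim=-1)            # normalize over the selected experts
        out = torch.zeros_like(x)
        for slot in range(self.k):
            for e, expert in enumerate(self.experts):
                mask = idx[..., slot] == e           # tokens assigned to expert e in this slot
                if mask.any():
                    out[mask] += weights[..., slot][mask].unsqueeze(-1) * expert(x[mask])
        return out
```

Only the selected experts run for each token, which is what lets the backbone scale capacity without a proportional increase in per-token compute.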

How It Works

STR2 employs a ViT encoder to embed rasterized representations of the environment and an MoE causal Transformer to plan motion autoregressively. This architecture allows the model to learn the distinct rewards that underlie driving demonstrations. The MoE design mitigates modality collapse and balances competing rewards through expert routing during training, contributing to more robust and generalized planning.
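
The overall data flow can be sketched as follows. This is an illustrative outline under assumed shapes, with a plain Transformer decoder standing in for the MoE causal backbone; the module names (STR2Sketch, pose_in, etc.) are hypothetical:

```python
import torch
import torch.nn as nn

class STR2Sketch(nn.Module):
    """Illustrative STR2-style data flow (hypothetical names and shapes):
    rasterized scene -> ViT patch encoder -> context tokens ->
    causal Transformer (MoE in the real model) -> trajectory tokens."""

    def __init__(self, d_model=256, patch=16, n_layers=4):
        super().__init__()
        # ViT-style patch embedding for the environmental raster
        self.patchify = nn.Conv2d(3, d_model, kernel_size=patch, stride=patch)
        enc = nn.TransformerEncoderLayer(d_model, nhead=8, batch_first=True)
        self.vit = nn.TransformerEncoder(enc, num_layers=n_layers)
        # Plain decoder standing in for the MoE causal backbone
        dec = nn.TransformerDecoderLayer(d_model, nhead=8, batch_first=True)
        self.planner = nn.TransformerDecoder(dec, num_layers=n_layers)
        self.pose_in = nn.Linear(4, d_model)   # (x, y, heading, speed) -> token
        self.pose_out = nn.Linear(d_model, 4)

    def forward(self, raster, past_poses):
        # raster: (B, 3, H, W); past_poses: (B, T, 4)
        ctx = self.patchify(raster).flatten(2).transpose(1, 2)  # (B, patches, d_model)
        ctx = self.vit(ctx)
        tgt = self.pose_in(past_poses)
        t = tgt.size(1)
        causal = torch.triu(torch.full((t, t), float("-inf"), device=tgt.device), diagonal=1)
        h = self.planner(tgt, ctx, tgt_mask=causal)
        # At inference, predictions are fed back step by step (autoregressive)
        return self.pose_out(h)  # per-step next-pose predictions
```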

Quick Start & Requirements

  • Installation: Install PyTorch with CUDA (e.g., torch==1.9.0+cu111), then pip install -r requirements.txt. Install the package itself with pip install -e . from the repository root. NuPlan-Devkit requires additional packages (aioboto3, retry, aiofiles, bokeh==2.4.1). A consolidated command sketch follows this list.
  • Dependencies: PyTorch with CUDA, NuScenes and Waymo datasets (specific processing scripts provided).
  • Dataset: The NuPlan dataset can be downloaded from the URL provided in the README. Data processing scripts (generation.py) convert the raw .db files into .pkl and .arrow formats.
  • Links: Demonstration Video, Hugging Face Checkpoints.
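
As referenced above, a consolidated install sketch. Exact versions and steps should be taken from the repository's README; the cu111 wheel index below is an assumption based on the torch version listed:

```bash
# Hedged quick-start sketch; defer to the repository README for exact steps.
pip install torch==1.9.0+cu111 -f https://download.pytorch.org/whl/torch_stable.html
pip install -r requirements.txt
pip install -e .
# Extras required by NuPlan-Devkit:
pip install aioboto3 retry aiofiles bokeh==2.4.1
```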

Highlighted Details

  • Achieves state-of-the-art performance on the NuPlan dataset in both closed-loop and open-loop evaluations.
  • Demonstrates consistent accuracy improvements with increased data and model size, indicating strong scalability.
  • Utilizes FlashAttention-2 for optimized attention computation.
  • Supports training with various Mixtral backbone sizes (see the configuration sketch after this list).
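
For illustration, a small Mixtral-style MoE backbone can be configured through Hugging Face transformers. The sizes below are illustrative, not the authors' released checkpoints:

```python
# Illustrative only: a small Mixtral-style MoE backbone. The real STR2
# checkpoints and sizes come from the authors' Hugging Face releases.
from transformers import MixtralConfig, MixtralModel

config = MixtralConfig(
    hidden_size=256,
    intermediate_size=1024,
    num_hidden_layers=4,
    num_attention_heads=8,
    num_key_value_heads=8,
    num_local_experts=8,      # experts per MoE layer
    num_experts_per_tok=2,    # top-k routing
)
backbone = MixtralModel(config)  # standard attention by default

# With the flash-attn package installed (Ampere+ GPU), recent transformers
# versions can select FlashAttention-2:
# from transformers import AutoModel
# backbone = AutoModel.from_config(config, attn_implementation="flash_attention_2")
```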

Maintenance & Community

The project is associated with Tsinghua-MARS-Lab, with Qiao Sun as the primary author. STR2 builds on the earlier StateTransformer work; that code remains available in the repository via a specific commit hash.

Licensing & Compatibility

The README does not explicitly state a license. Compatibility for commercial use or closed-source linking is not specified.

Limitations & Caveats

The LiAuto dataset used in the paper is not publicly available. The NuPlan dataset requires significant preprocessing with the provided scripts, and the processing pipeline is complex. The project also depends on specific PyTorch and CUDA versions.

Health Check

  • Last commit: 8 months ago
  • Responsiveness: 1 week
  • Pull Requests (30d): 0
  • Issues (30d): 2
  • Star History: 22 stars in the last 90 days
