MapTR by hustvl

Research paper for online HD map construction

created 3 years ago
1,348 stars

Top 30.4% on sourcepulse

Project Summary

MapTR is an end-to-end framework for online vectorized High-Definition (HD) map construction, targeting researchers and developers in autonomous driving. It offers a unified approach to model map elements as point sets with permutation equivalence, enabling accurate shape description and stable learning, while achieving real-time inference speeds.

How It Works

MapTR uses permutation-equivalent modeling: each map element is represented as a point set together with the group of point orderings that describe the same shape, so no single arbitrary vertex order is imposed during training. A hierarchical query embedding scheme encodes this structure at the instance level and the point level, and hierarchical bipartite matching assigns predictions to ground truth, first per element and then per point. MapTRv2 adds auxiliary one-to-many matching and dense supervision to accelerate convergence, and the same formulation handles map elements of arbitrary shape.
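
The point-level half of this idea can be illustrated with a minimal sketch. It is not the repository's code: the function names are hypothetical, and it assumes that an open polyline has two equivalent orderings (forward and reverse), that a closed polygon with N vertices has 2N (any start vertex, either direction), and that the cost is a plain Manhattan distance. Instance-level (Hungarian) matching is left out.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def equivalent_permutations(points: List[Point], closed: bool) -> List[List[Point]]:
    """Return every ordering of `points` that describes the same shape.

    Hypothetical helper for illustration; not part of the MapTR codebase.
    """
    forward = list(points)
    backward = forward[::-1]
    if not closed:
        # An open polyline reads the same from either end: 2 orderings.
        return [forward, backward]
    # A closed polygon may start at any vertex, in either direction: 2 * N orderings.
    n = len(forward)
    orderings = []
    for base in (forward, backward):
        for shift in range(n):
            orderings.append(base[shift:] + base[:shift])
    return orderings


def l1_distance(a: List[Point], b: List[Point]) -> float:
    """Sum of per-point Manhattan distances between two equally long point lists."""
    return sum(abs(ax - bx) + abs(ay - by) for (ax, ay), (bx, by) in zip(a, b))


def point_level_cost(pred: List[Point], gt: List[Point], closed: bool) -> float:
    """Cost of `pred` against the best-matching equivalent ordering of `gt`."""
    return min(l1_distance(pred, perm) for perm in equivalent_permutations(gt, closed))


# A lane divider predicted end-to-start still matches its ground truth.
divider_gt = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
divider_pred = [(2.0, 0.0), (1.0, 0.0), (0.0, 0.0)]
print(point_level_cost(divider_pred, divider_gt, closed=False))

# A pedestrian crossing predicted from a different start vertex also matches.
crossing_gt = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
crossing_pred = [(1.0, 1.0), (0.0, 1.0), (0.0, 0.0), (1.0, 0.0)]
print(point_level_cost(crossing_pred, crossing_gt, closed=True))
```

Both calls evaluate to zero because the predictions differ from the ground truth only in vertex ordering. A loss tied to one fixed ordering would penalize these predictions even though the shapes are identical, which is the ambiguity the permutation-equivalent formulation is meant to remove.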

Quick Start & Requirements

  • Installation: Follow instructions in the README for setup.
  • Prerequisites: PyTorch, mmdetection3d, and specific dataset formats (nuScenes, Argoverse2). GPU acceleration is essential.
  • Resources: Training and inference require significant GPU memory (e.g., 10GB+ for MapTR-tiny, 20GB+ for MapTRv2) and compute.
  • Links: MapTRv2 Branch, VMA (Annotation Framework)

Highlighted Details

  • Reports state-of-the-art results on the nuScenes and Argoverse2 datasets.
  • MapTRv2 demonstrates improved performance and faster convergence, with options for centerline modeling.
  • Supports various BEV encoders including GKT, bevpool, and BEVFusion.
  • Extensible to a general map annotation framework (VMA).

Maintenance & Community

Development has continued with MapTRv2 and with related works such as VAD and DiffusionDrive. The README links to the arXiv preprints and to the accepted publications (MapTR at ICLR'23 as a Spotlight, MapTRv2 in IJCV'24). Community interaction channels are not explicitly listed.

Licensing & Compatibility

The repository is described as permissively licensed, which would allow commercial use and integration with closed-source systems. However, the README does not state the license terms, and a public release with academic citations does not by itself grant those rights. Check the repository's license file before relying on them.

Limitations & Caveats

The project focuses primarily on camera-based input; code for the LiDAR modality is noted as planned or incomplete. Some speed figures (FPS) are marked "WIP" (work in progress) for specific configurations.

Health Check

  • Last commit: 5 months ago
  • Responsiveness: 1 week
  • Pull requests (30d): 0
  • Issues (30d): 4
  • Star history: 82 stars in the last 90 days
