MapTR by hustvl

Research paper for online HD map construction

Created 4 years ago

1,524 stars

Top 26.4% on SourcePulse

Project Summary

MapTR is an end-to-end framework for online vectorized High-Definition (HD) map construction, targeting researchers and developers in autonomous driving. It offers a unified approach to model map elements as point sets with permutation equivalence, enabling accurate shape description and stable learning, while achieving real-time inference speeds.
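To make "point sets with permutation equivalence" concrete, here is a minimal sketch of which point orderings describe the same map element. The function name `equivalent_permutations` and the sample coordinates are illustrative assumptions, not part of the MapTR codebase. An open polyline has two equivalent orderings (forward and reverse), while a closed polygon with N vertices has 2N (any starting vertex, either direction).

```python
from typing import List, Tuple

Point = Tuple[float, float]


def equivalent_permutations(points: List[Point], closed: bool) -> List[List[Point]]:
    """Return every ordering of `points` that describes the same shape.

    Hypothetical helper for illustration only.
    """
    directions = [list(points), list(reversed(points))]
    if not closed:
        # An open polyline reads the same in either direction.
        return directions
    orderings = []
    for direction in directions:
        for start in range(len(points)):
            # Rotate the closed loop so it begins at a different vertex.
            orderings.append(direction[start:] + direction[:start])
    return orderings


# Made-up sample elements: a lane divider (open) and a pedestrian crossing (closed).
lane_divider = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.5)]
crossing = [(0.0, 0.0), (2.0, 0.0), (2.0, 1.0), (0.0, 1.0)]

print(len(equivalent_permutations(lane_divider, closed=False)))  # 2
print(len(equivalent_permutations(crossing, closed=True)))       # 8
```

Treating all of these orderings as equally valid targets avoids penalizing a prediction that traces the right shape from a different starting point or in the opposite direction.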

How It Works

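MapTR employs a permutation-equivalent modeling strategy: each map element is a point set with a group of equivalent permutations, rather than a single fixed point order. A polyline can be traversed in either direction, and a polygon can start at any vertex in either direction, so no one ordering is imposed as the only correct target. A hierarchical query embedding scheme encodes this structure at the instance level and the point level, and hierarchical bipartite matching assigns predictions to ground-truth elements first per instance and then per point. The point-set formulation lets the method handle map elements of arbitrary shape. MapTRv2 adds auxiliary one-to-many matching and dense supervision to speed up convergence.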
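The query scheme and the two-level matching can be sketched in plain Python. This is a simplified illustration under stated assumptions, not the repository's implementation: all function names are hypothetical, the instance cost uses only point positions (no classification term), every element is assumed to have the same number of points, and a brute-force search stands in for the Hungarian algorithm that a real system would use.

```python
from itertools import permutations
from typing import List, Sequence, Tuple

Point = Tuple[float, float]
Element = Tuple[List[Point], bool]  # (ordered points, is_closed_polygon)


def hierarchical_queries(instance_q: Sequence[Sequence[float]],
                         point_q: Sequence[Sequence[float]]) -> List[List[List[float]]]:
    """Add each instance embedding to each shared point embedding."""
    return [[[a + b for a, b in zip(qi, qp)] for qp in point_q] for qi in instance_q]


def equivalent_orderings(points: List[Point], closed: bool) -> List[List[Point]]:
    """All point orderings that describe the same shape."""
    directions = [list(points), list(reversed(points))]
    if not closed:
        return directions
    return [d[s:] + d[:s] for d in directions for s in range(len(points))]


def l1(a: List[Point], b: List[Point]) -> float:
    """Sum of Manhattan distances between corresponding points."""
    return sum(abs(ax - bx) + abs(ay - by) for (ax, ay), (bx, by) in zip(a, b))


def point_level_match(pred: List[Point], gt: Element) -> Tuple[float, List[Point]]:
    """Pick the equivalent ground-truth ordering closest to the prediction."""
    points, closed = gt
    candidates = ((l1(pred, o), o) for o in equivalent_orderings(points, closed))
    return min(candidates, key=lambda t: t[0])


def instance_level_match(preds: List[List[Point]], gts: List[Element]) -> List[Tuple[int, int]]:
    """One-to-one assignment as (pred_index, gt_index) pairs.

    Brute force, so only suitable for tiny examples; assumes len(preds) >= len(gts).
    """
    cost = [[point_level_match(p, g)[0] for g in gts] for p in preds]
    best = min(permutations(range(len(preds)), len(gts)),
               key=lambda assign: sum(cost[p][g] for g, p in enumerate(assign)))
    return [(p, g) for g, p in enumerate(best)]


# Made-up data: ground truth is one open polyline and one closed polygon.
gts = [
    ([(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0)], False),
    ([(0.0, 2.0), (2.0, 2.0), (2.0, 3.0), (0.0, 3.0)], True),
]
preds = [
    [(2.0, 3.0), (0.0, 3.0), (0.0, 2.0), (2.0, 2.5)],  # polygon, different start vertex
    [(3.0, 0.0), (2.0, 0.0), (1.0, 0.0), (0.0, 0.0)],  # polyline, reversed direction
]

print(instance_level_match(preds, gts))  # [(1, 0), (0, 1)]
```

In this example the reversed polyline still matches its ground truth at zero cost, and the polygon prediction is matched despite starting at a different vertex, which is the practical effect of scoring against every equivalent ordering.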

Quick Start & Requirements

Installation: Follow instructions in the README for setup.
Prerequisites: PyTorch, mmdetection3d, and specific dataset formats (nuScenes, Argoverse2). GPU acceleration is essential.
Resources: Training and inference require significant GPU memory (e.g., 10GB+ for MapTR-tiny, 20GB+ for MapTRv2) and compute.
Links: MapTRv2 Branch, VMA (Annotation Framework)

Highlighted Details

Reports state-of-the-art performance on the nuScenes and Argoverse2 datasets at the time of publication.
MapTRv2 demonstrates improved performance and faster convergence, with options for centerline modeling.
Supports various BEV encoders including GKT, bevpool, and BEVFusion.
Extensible to a general map annotation framework (VMA).

Maintenance & Community

Development has continued with MapTRv2, and the authors' related works include VAD and DiffusionDrive. The README links to arXiv preprints and accepted publications (MapTR: ICLR'23 Spotlight; MapTRv2: IJCV'24). Community interaction channels are not explicitly listed.

Licensing & Compatibility

The repository is described as permissively licensed, which would allow commercial use and integration with closed-source systems. The README does not state the specific license terms, however, and an open-source release with academic citations does not by itself imply a permissive license, so confirm the terms against the repository's license file before commercial use.

Limitations & Caveats

The project focuses primarily on camera-based input; code for the LiDAR modality is noted as a future or incomplete feature. Some performance metrics (FPS) are marked "WIP" (work in progress) for specific configurations.
