End-to-end autonomous driving with a VLA model
OpenDriveVLA aims to provide an end-to-end solution for autonomous driving using a large vision-language-action (VLA) model. The project targets researchers and developers in autonomous driving and AI, offering a unified framework that processes visual, linguistic, and action data within a single model.
How It Works
The project leverages a large vision-language-action model architecture, integrating components from established libraries like LLaVA-NeXT, Qwen2.5, and UniAD. This approach allows for a holistic understanding of the driving environment and the generation of appropriate driving actions, potentially simplifying the complex pipeline of traditional autonomous driving systems.
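To make the VLA pattern concrete, here is a minimal, illustrative sketch of the general idea: visual features are projected into the language model's token space, fused with text tokens by a transformer backbone, and decoded into a short waypoint trajectory as the driving action. All class names, dimensions, and the action format below are assumptions for illustration, not the project's actual implementation.

```python
import torch
import torch.nn as nn

class MinimalVLA(nn.Module):
    """Illustrative vision-language-action sketch (not OpenDriveVLA itself):
    project image features into the language model's token space, fuse them
    with text tokens, and regress a short waypoint trajectory as the action."""

    def __init__(self, vision_dim=768, hidden_dim=512, vocab_size=32000, num_waypoints=6):
        super().__init__()
        self.num_waypoints = num_waypoints
        self.vision_proj = nn.Linear(vision_dim, hidden_dim)        # visual tokens -> LM space
        self.text_embed = nn.Embedding(vocab_size, hidden_dim)      # stand-in for LLM embeddings
        layer = nn.TransformerEncoderLayer(d_model=hidden_dim, nhead=8, batch_first=True)
        self.backbone = nn.TransformerEncoder(layer, num_layers=2)  # stand-in for the LLM backbone
        self.action_head = nn.Linear(hidden_dim, num_waypoints * 2) # (x, y) per future waypoint

    def forward(self, vision_feats, text_ids):
        # vision_feats: (B, N_img, vision_dim); text_ids: (B, N_txt)
        tokens = torch.cat([self.vision_proj(vision_feats), self.text_embed(text_ids)], dim=1)
        fused = self.backbone(tokens)
        # Pool the fused sequence and decode the trajectory ("action").
        return self.action_head(fused.mean(dim=1)).view(-1, self.num_waypoints, 2)

model = MinimalVLA()
traj = model(torch.randn(1, 32, 768), torch.randint(0, 32000, (1, 16)))
print(traj.shape)  # torch.Size([1, 6, 2])
```

In the real system the embedding and backbone would be a pretrained VLM (e.g. Qwen2.5 via LLaVA-NeXT) and the perception features would come from a driving stack such as UniAD; the sketch only shows how the three modalities meet in one forward pass.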
Quick Start & Requirements
Environment setup instructions are provided, including customized builds of mmcv and mmdet3d for compatibility with PyTorch 2.1.2, Transformers, and DeepSpeed. Inference code, checkpoints, and training code are planned for release.
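Since the customized builds must line up with specific framework versions, a quick sanity check of the installed environment can save debugging time. The snippet below is a hypothetical check; only the package names and the PyTorch 2.1.2 pin come from the summary above, and the actual pins should be taken from the project's own setup scripts.

```python
# Hypothetical environment check; consult the repository's setup files for the real pins.
import importlib.metadata as md

expected = {"torch": "2.1.2", "transformers": None, "deepspeed": None,
            "mmcv": None, "mmdet3d": None}

for pkg, pin in expected.items():
    try:
        version = md.version(pkg)
    except md.PackageNotFoundError:
        print(f"{pkg}: NOT INSTALLED")
        continue
    status = "OK" if pin is None or version == pin else f"expected {pin}"
    print(f"{pkg}: {version} ({status})")
```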
Highlighted Details
Maintenance & Community
The project is actively under development, with a roadmap that includes releasing the model code, checkpoints, inference code, and training code.
Licensing & Compatibility
The project's license is not specified in the provided README.
Limitations & Caveats
The release of core components such as inference code, checkpoints, and training code is still pending.