RenzKa/simlingo: Vision-only autonomous driving with language-action alignment
Top 85.6% on SourcePulse
SimLingo addresses vision-only closed-loop autonomous driving by integrating language understanding and action generation. It targets researchers and engineers, offering state-of-the-art driving performance alongside multimodal AI capabilities, including visual question answering (VQA) and instruction following.
How It Works
This project implements a Vision-Language-Action (VLA) model within the CARLA simulator, building upon the CARLA Garage framework. It leverages the PDM-lite expert for data collection and introduces "Action Dreaming" for enhanced language-action alignment. The approach enables a vision-only system to perform complex driving tasks and respond to linguistic queries or instructions.
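To make the VLA idea concrete, here is a minimal, purely illustrative sketch of the pipeline shape described above: a vision encoder and a language encoder are fused, and a policy head decodes a short waypoint trajectory. All function names, dimensions, and the random "policy head" are hypothetical; this is not SimLingo's actual architecture or API.

```python
import numpy as np

rng = np.random.default_rng(0)

def encode_image(image):
    """Toy vision encoder: global-average-pool an HxWx3 image into a feature vector."""
    return image.mean(axis=(0, 1))  # shape (3,)

def encode_instruction(text, vocab_dim=8):
    """Toy language encoder: hash words into a bag-of-words vector."""
    vec = np.zeros(vocab_dim)
    for word in text.lower().split():
        vec[hash(word) % vocab_dim] += 1.0
    return vec

def plan_actions(image, instruction, horizon=4):
    """Fuse vision and language features, then decode a flattened (x, y) waypoint trajectory."""
    feat = np.concatenate([encode_image(image), encode_instruction(instruction)])
    W = rng.standard_normal((horizon * 2, feat.shape[0])) * 0.01  # stand-in for a learned policy head
    return W @ feat  # shape (horizon * 2,)

image = rng.random((64, 64, 3))
waypoints = plan_actions(image, "turn left at the next intersection").reshape(-1, 2)
print(waypoints.shape)  # (4, 2)
```

The point of the sketch is only the data flow: one conditioning path for pixels, one for language, and a single action decoder, which is what lets a vision-only system also answer queries or follow instructions.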
Quick Start & Requirements
To get started: set up CARLA (setup_carla.sh), create a Conda environment (environment.yaml), and install PyTorch (2.2.0) and Flash-attn (2.7.0.post2). Set the environment variables CARLA_ROOT, WORK_DIR, and PYTHONPATH. The dataset is hosted at https://huggingface.co/datasets/RenzKa/simlingo.
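The steps above can be sketched as a shell session. Script names and package versions come from the README; the Conda environment name and all paths (including the PYTHONPATH entries) are placeholders you must adjust.

```shell
# Fetch the CARLA simulator (script name from the README)
bash setup_carla.sh

# Create and activate the Conda environment ("simlingo" name is an assumption)
conda env create -f environment.yaml
conda activate simlingo

# Pinned dependencies per the README
pip install torch==2.2.0 flash-attn==2.7.0.post2

# Required environment variables (paths are placeholders)
export CARLA_ROOT=/path/to/carla
export WORK_DIR=/path/to/simlingo
export PYTHONPATH=$PYTHONPATH:$CARLA_ROOT:$WORK_DIR
```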
Maintenance & Community
The README provides no direct links to community channels (Discord, Slack) or a roadmap. Maintenance status is uncertain, with a note indicating potential future cleanup of evaluation scripts.
Licensing & Compatibility
The repository's license is not specified in the README, making its terms for use, modification, and distribution unclear. Commercial use compatibility is therefore undetermined.
Limitations & Caveats
The released model and dataset are reproductions, leading to slight deviations from original paper results. Language evaluation scripts may be subject to future cleanup. Data generation scripts for VQA and commentary are tightly coupled to specific simulator state information, limiting their reusability with custom datasets. The Bench2Drive benchmark is noted as a "training" benchmark due to potential data leakage.