Senna  by hustvl

Autonomous driving research paper integrating vision-language models

created 9 months ago
430 stars

Top 70.1% on sourcepulse

GitHubView on GitHub
Project Summary

Senna integrates Large Vision-Language Models (VLMs) with end-to-end autonomous driving systems to enhance planning safety, robustness, and generalization. It targets researchers and developers in autonomous driving and AI, offering improved cross-scenario transferability and state-of-the-art planning performance.

How It Works

Senna bridges VLMs with traditional end-to-end driving models by leveraging a multi-stage fine-tuning process. It uses a VLM (specifically LLaVA-v1.6-34b) to generate scene descriptions and planning explanations, which are then used to fine-tune an end-to-end driving model. This approach aims to imbue the driving system with a more nuanced understanding of driving scenarios, leading to safer and more robust planning decisions.

Quick Start & Requirements

  • Install: git clone git@github.com:hustvl/Senna.git, conda create -n senna python=3.10 -y, conda activate senna, pip install -r requirements.txt.
  • Prerequisites: Python 3.10, LLaVA-v1.6-34b for data generation.
  • Resources: Full-parameter fine-tuning is recommended but requires significant GPU memory; LoRA fine-tuning is an alternative for limited hardware (e.g., 24GB GPU).
  • Links: arXiv Paper, HuggingFace Models.

Highlighted Details

  • Achieves SOTA planning performance.
  • Demonstrates strong cross-scenario generalization and transferability.
  • Supports multi-view inputs (6 views).
  • Offers both full-parameter and LoRA fine-tuning options.

Maintenance & Community

The project is associated with Huazhong University of Science and Technology and Horizon Robotics. The code and weights were released in December 2024.

Licensing & Compatibility

The repository does not explicitly state a license in the README. Compatibility for commercial use or closed-source linking is not specified.

Limitations & Caveats

The README does not specify a license, which may impact commercial adoption. The data generation script relies on LLaVA-v1.6-34b, a large model that requires substantial computational resources.

Health Check
Last commit

7 months ago

Responsiveness

1 week

Pull Requests (30d)
0
Issues (30d)
3
Star History
61 stars in the last 90 days

Explore Similar Projects

Feedback? Help us improve.