Autonomous driving research paper integrating vision-language models
Senna integrates Large Vision-Language Models (VLMs) with end-to-end autonomous driving systems to enhance planning safety, robustness, and generalization. It targets researchers and developers in autonomous driving and AI, offering improved cross-scenario transferability and state-of-the-art planning performance.
How It Works
Senna bridges VLMs with traditional end-to-end driving models through a multi-stage fine-tuning process. A VLM (specifically LLaVA-v1.6-34b) generates scene descriptions and planning explanations, which are then used to fine-tune an end-to-end driving model. The goal is to give the driving system a more nuanced understanding of driving scenarios, leading to safer and more robust planning decisions.
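The summary above does not document the data-generation step in code, so the following is a minimal sketch only: it assumes the Hugging Face port of the model (`llava-hf/llava-v1.6-34b-hf`) and the standard `transformers` LLaVA-NeXT interface. The prompt wording and the `describe_scene` helper are illustrative, not Senna's actual code.

```python
# Minimal sketch (not Senna's actual pipeline): use LLaVA-v1.6-34b to
# produce a scene description for one camera frame. Such descriptions
# can then serve as supervision when fine-tuning the driving model.
import torch
from PIL import Image
from transformers import LlavaNextForConditionalGeneration, LlavaNextProcessor

MODEL_ID = "llava-hf/llava-v1.6-34b-hf"  # HF port of the VLM named above

processor = LlavaNextProcessor.from_pretrained(MODEL_ID)
model = LlavaNextForConditionalGeneration.from_pretrained(
    MODEL_ID, torch_dtype=torch.float16, device_map="auto"
)

def describe_scene(image_path: str) -> str:
    """Return a free-form description of a driving scene (illustrative prompt)."""
    image = Image.open(image_path)
    conversation = [{
        "role": "user",
        "content": [
            {"type": "image"},
            {"type": "text", "text": (
                "Describe this driving scene: traffic participants, road "
                "layout, and anything relevant to planning the next maneuver."
            )},
        ],
    }]
    # Build the model-specific chat prompt, then run a single generation pass.
    prompt = processor.apply_chat_template(conversation, add_generation_prompt=True)
    inputs = processor(images=image, text=prompt, return_tensors="pt").to(model.device)
    output_ids = model.generate(**inputs, max_new_tokens=256)
    return processor.decode(output_ids[0], skip_special_tokens=True)
```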
Quick Start & Requirements
```bash
git clone git@github.com:hustvl/Senna.git
cd Senna
conda create -n senna python=3.10 -y
conda activate senna
pip install -r requirements.txt
```
Highlighted Details
Maintenance & Community
The project is associated with Huazhong University of Science and Technology and Horizon Robotics. The code and weights were released in December 2024.
Licensing & Compatibility
The repository does not explicitly state a license in the README. Compatibility for commercial use or closed-source linking is not specified.
Limitations & Caveats
The README does not specify a license, which may impact commercial adoption. The data generation script relies on LLaVA-v1.6-34b, a large model that requires substantial computational resources.
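One possible mitigation for the compute requirement (an assumption on our part, not part of the Senna repo) is to load the 34B VLM with 4-bit quantization via `bitsandbytes`, which `transformers` supports out of the box:

```python
# Hypothetical mitigation, not from the Senna repo: load the 34B VLM in
# 4-bit precision so it fits on a single high-memory GPU.
import torch
from transformers import (
    BitsAndBytesConfig,
    LlavaNextForConditionalGeneration,
    LlavaNextProcessor,
)

quant_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.float16,  # 4-bit weights, fp16 compute
)

processor = LlavaNextProcessor.from_pretrained("llava-hf/llava-v1.6-34b-hf")
model = LlavaNextForConditionalGeneration.from_pretrained(
    "llava-hf/llava-v1.6-34b-hf",
    quantization_config=quant_config,
    device_map="auto",  # shard across available GPUs if one is not enough
)
```

Quantization trades some output quality for memory, so results may differ from the full-precision model used in the paper.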