Agent-Driver by physical-superintelligence-lab

Agent-Driver: LLM agent for autonomous driving research

Created 2 years ago

292 stars

Top 90.5% on SourcePulse

Project Summary

Agent-Driver uses a Large Language Model (LLM) as a cognitive agent for autonomous driving. The approach aims to bring human-like reasoning and experiential knowledge into driving systems, and it targets researchers and developers in the autonomous driving domain. The claimed benefit is more nuanced, human-like driving behavior and better performance than traditional perception-prediction-planning pipelines.

How It Works

Agent-Driver uses an LLM to process driving scenarios. It combines three components: a tool library accessed through function calls, a cognitive memory that stores common-sense knowledge, and a reasoning engine that performs chain-of-thought reasoning, task planning, motion planning, and self-reflection. This LLM-centric design is intended to bring common sense and robust reasoning to decision-making and motion planning, making both more human-like.
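The flow described above (tool calls, memory lookup, reasoning, motion planning, self-reflection) can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not the project's code: the names TOOLS, get_leading_object, Memory, and plan are hypothetical, the LLM is replaced by any callable that maps a prompt to text, and the six waypoints at 0.5 s spacing and the 2 m safety gap are illustrative choices.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

Waypoint = Tuple[float, float]

# Hypothetical tool library: functions are registered by name so that a
# function call issued by the LLM (name + arguments) can be dispatched.
TOOLS: Dict[str, Callable[[dict], dict]] = {}


def tool(fn: Callable[[dict], dict]) -> Callable[[dict], dict]:
    """Register a function in the tool library under its own name."""
    TOOLS[fn.__name__] = fn
    return fn


@tool
def get_leading_object(scene: dict) -> dict:
    """Return the nearest object ahead of the ego vehicle, or {} if none."""
    ahead = [obj for obj in scene["objects"] if obj["x"] > 0]
    return min(ahead, key=lambda obj: obj["x"]) if ahead else {}


@dataclass
class Memory:
    """Toy cognitive memory holding (situation, advice) pairs."""
    entries: List[Tuple[str, str]] = field(default_factory=list)

    def retrieve(self, query: str) -> str:
        # Rank entries by word overlap with the query; this is a stand-in
        # for whatever retrieval the real system uses.
        words = set(query.split())
        best = max(self.entries,
                   key=lambda e: len(words & set(e[0].split())),
                   default=("", ""))
        return best[1]


def plan(scene: dict, memory: Memory, llm: Callable[[str], str]) -> List[Waypoint]:
    # 1. Tool use: gather the information the decision needs.
    lead = TOOLS["get_leading_object"](scene)
    # 2. Memory: fetch advice for the current situation.
    advice = memory.retrieve("leading vehicle" if lead else "clear road")
    # 3. Reasoning and task planning: ask the LLM for a high-level decision.
    prompt = f"Lead object: {lead}. Advice: {advice}. Answer STOP or GO."
    decision = llm(prompt).strip().upper()
    # 4. Motion planning: six waypoints at an assumed 0.5 s spacing.
    speed = 0.0 if decision == "STOP" else scene["ego_speed"]
    trajectory = [(speed * 0.5 * k, 0.0) for k in range(1, 7)]
    # 5. Self-reflection: clamp waypoints that would come within an
    #    assumed 2 m of the leading object.
    if lead:
        limit = lead["x"] - 2.0
        trajectory = [(min(x, limit), y) for x, y in trajectory]
    return trajectory


scene = {"ego_speed": 8.0,
         "objects": [{"x": 10.0, "y": 0.0}, {"x": -5.0, "y": 1.0}]}
memory = Memory([("leading vehicle ahead", "keep a safe gap"),
                 ("clear road", "maintain speed")])
print(plan(scene, memory, llm=lambda prompt: "GO"))
```

Passing the LLM in as a plain callable keeps the sketch runnable offline; a real client call would replace the lambda.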

Quick Start & Requirements

Installation: Clone the repository and install dependencies via pip install -r requirements.txt.
Data: Requires the pre-cached nuScenes data, which can be downloaded from Google Drive.
OpenAI API: An OpenAI API account with associated API keys is mandatory for fine-tuning and inference. Keys must be configured in agentdriver/llm_core/api_keys.py.
Fine-tuning: Requires fine-tuning a GPT-based motion planner using OpenAI's API, which incurs costs. The project provides scripts to automate this process.
Inference: Can be performed via a Jupyter notebook (agentdriver/unit_test/test_lanuage_agent.ipynb) or by running inference scripts.
Evaluation: Requires running inference scripts to generate predictions, followed by an evaluation script.
Resources: Fine-tuning on 10% of the data is estimated to cost under 10 USD.
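To see how the fine-tuning cost scales with the fraction of data used, a back-of-the-envelope estimate can help. The sketch below assumes billing is proportional to training tokens times epochs; the function name and every number in the example call are placeholders, not figures from the project or current OpenAI pricing.

```python
def estimate_finetune_cost(total_tokens: int, data_fraction: float,
                           epochs: int, usd_per_1k_tokens: float) -> float:
    """Estimate the cost in USD of fine-tuning on a fraction of the data.

    Assumes cost = billed tokens / 1000 * price per 1k tokens, where
    billed tokens = total_tokens * data_fraction * epochs.
    """
    if not 0.0 < data_fraction <= 1.0:
        raise ValueError("data_fraction must be in (0, 1]")
    billed_tokens = total_tokens * data_fraction * epochs
    return billed_tokens / 1000 * usd_per_1k_tokens


# Placeholder inputs: a 10M-token training set, 10% of it, one epoch,
# and a hypothetical price of 0.008 USD per 1k training tokens.
cost = estimate_finetune_cost(total_tokens=10_000_000, data_fraction=0.1,
                              epochs=1, usd_per_1k_tokens=0.008)
print(f"{cost:.2f} USD")
```

Under these assumptions the cost grows linearly with data_fraction, so training on the full dataset would cost roughly ten times the 10% estimate.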

Highlighted Details

Reported to outperform state-of-the-art driving methods on the nuScenes benchmark by a significant margin.
Reported to offer better interpretability and few-shot learning than those methods.
Integrates a cognitive memory and a reasoning engine with self-reflection capabilities.
Leverages LLMs for end-to-end driving decision-making.

Maintenance & Community

The project is associated with an arXiv preprint. The README provides no links to community channels or a roadmap.

Licensing & Compatibility

The README does not explicitly state the license type. Compatibility for commercial use or closed-source linking is not specified.

Limitations & Caveats

Because the system relies on OpenAI's API, it depends on that service's availability and pricing. Fine-tuning incurs API costs (estimated at under 10 USD for 10% of the data). The project is presented as an arXiv preprint, which suggests it may be at an early research stage.

Health Check

Last Commit: 1 month ago
Responsiveness: Inactive

Star History

4 stars in the last 30 days