Agent-Driver by USC-GVL

Agent-Driver: LLM agent for autonomous driving research

created 1 year ago
268 stars

Top 96.5% on sourcepulse

Project Summary

Agent-Driver uses a Large Language Model (LLM) as a cognitive agent for autonomous driving, with the aim of bringing human-like reasoning and experiential knowledge into the driving system. It targets researchers and developers in the autonomous driving domain. The claimed benefits are more nuanced, human-like driving behavior and better performance than traditional perception-prediction-planning pipelines.

How It Works

Agent-Driver utilizes LLMs to process driving scenarios, incorporating a versatile tool library accessible via function calls, a cognitive memory for common sense, and a reasoning engine for chain-of-thought processing, task planning, motion planning, and self-reflection. This LLM-centric design allows for intuitive common sense and robust reasoning, enabling a more human-like approach to decision-making and motion planning.
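The interplay of the three components can be illustrated with a minimal sketch. It is not the project's actual code: ToolLibrary, CognitiveMemory, run_step, and the keyword-overlap retrieval are all hypothetical names and simplifications chosen for illustration, and the LLM is stood in for by any callable that maps a prompt string to a response string.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass
class ToolLibrary:
    """Hypothetical registry that exposes tools by name, mimicking function calls."""
    tools: Dict[str, Callable[..., str]] = field(default_factory=dict)

    def register(self, name: str, fn: Callable[..., str]) -> None:
        self.tools[name] = fn

    def call(self, name: str, **kwargs) -> str:
        if name not in self.tools:
            raise KeyError(f"unknown tool: {name}")
        return self.tools[name](**kwargs)


@dataclass
class CognitiveMemory:
    """Hypothetical memory: stores text snippets and retrieves by word overlap."""
    entries: List[str] = field(default_factory=list)

    def add(self, text: str) -> None:
        self.entries.append(text)

    def retrieve(self, query: str, k: int = 1) -> List[str]:
        words = set(query.lower().split())
        ranked = sorted(
            self.entries,
            key=lambda entry: len(words & set(entry.lower().split())),
            reverse=True,
        )
        return ranked[:k]


def run_step(
    llm: Callable[[str], str],
    tools: ToolLibrary,
    memory: CognitiveMemory,
    tool_requests: List[Tuple[str, dict]],
) -> str:
    """Gather observations via tools, recall experience, then ask the LLM to reason."""
    observations = [tools.call(name, **args) for name, args in tool_requests]
    recalled = memory.retrieve(" ".join(observations), k=1)
    prompt = (
        "Observations:\n" + "\n".join(observations)
        + "\nRelevant experience:\n" + "\n".join(recalled)
        + "\nThink step by step, then output a driving plan."
    )
    return llm(prompt)
```

In this sketch the prompt asks for step-by-step reasoning before a plan, which loosely mirrors the chain-of-thought stage described above; self-reflection and motion planning are omitted for brevity.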

Quick Start & Requirements

  • Installation: Clone the repository and install dependencies via pip install -r requirements.txt.
  • Data: Requires the pre-cached nuScenes data, which can be downloaded from Google Drive.
  • OpenAI API: An OpenAI API account with associated API keys is mandatory for fine-tuning and inference. Keys must be configured in agentdriver/llm_core/api_keys.py.
  • Fine-tuning: Requires fine-tuning a GPT-based motion planner using OpenAI's API, which incurs costs. The project provides scripts to automate this process.
  • Inference: Can be performed via a Jupyter notebook (agentdriver/unit_test/test_lanuage_agent.ipynb) or by running inference scripts.
  • Evaluation: Requires running inference scripts to generate predictions, followed by an evaluation script.
  • Resources: Fine-tuning on 10% of the data is estimated to cost under 10 USD.
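The evaluation step compares predicted trajectories against ground truth. The summary does not say which metrics the project's evaluation script computes, so the sketch below assumes an average L2 (Euclidean) distance over waypoints, a metric commonly used in open-loop planning evaluation. The function name and the waypoint format, a sequence of (x, y) tuples in meters, are hypothetical.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float]


def average_l2_error(predicted: Sequence[Point], ground_truth: Sequence[Point]) -> float:
    """Mean Euclidean distance between matching waypoints (assumed metric)."""
    if len(predicted) != len(ground_truth):
        raise ValueError("trajectories must have the same number of waypoints")
    if not predicted:
        raise ValueError("trajectories must not be empty")
    total = sum(math.dist(p, g) for p, g in zip(predicted, ground_truth))
    return total / len(predicted)
```

For example, a prediction of [(0.0, 0.0), (3.0, 4.0)] against a ground truth of [(0.0, 0.0), (0.0, 0.0)] gives per-waypoint errors of 0 and 5, so the average is 2.5.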

Highlighted Details

  • Outperforms state-of-the-art driving methods on the nuScenes benchmark by a significant margin.
  • Demonstrates superior interpretability and few-shot learning capabilities.
  • Integrates a cognitive memory and a reasoning engine with self-reflection capabilities.
  • Leverages LLMs for end-to-end driving decision-making.

Maintenance & Community

The project is associated with an arXiv preprint. Links to community channels or roadmaps are not explicitly provided in the README.

Licensing & Compatibility

The README does not explicitly state the license type. Compatibility for commercial use or closed-source linking is not specified.

Limitations & Caveats

The system's reliance on OpenAI's API means it is dependent on their service availability and pricing. Fine-tuning requires financial investment. The project is presented as an arXiv preprint, suggesting it may be in an early research stage.

Health Check

  • Last commit: 1 year ago
  • Responsiveness: 1 day
  • Pull requests (30d): 0
  • Issues (30d): 0
  • Star history: 17 stars in the last 90 days
