TravelUAV  by prince687028

Realistic UAV Vision-Language Navigation platform and benchmark

Created 1 year ago
255 stars

Top 98.8% on SourcePulse

Project Summary

TravelUAV provides a platform, a benchmark, and a methodology for realistic Unmanned Aerial Vehicle (UAV) Vision-Language Navigation (VLN). It targets researchers and engineers working on embodied AI, robotics, and autonomous systems. The goal is more robust and practical UAV navigation, guided by natural language instructions, in complex real-world scenarios.

How It Works

The project provides a realistic UAV simulation platform built on AirSim, together with the UAV-Flow benchmark for language-conditioned UAV imitation learning. Its navigation method uses a Multimodal Large Language Model (MLLM) to interpret language instructions and visual inputs. The approach aims to narrow the gap between simulated and real-world UAV operation by emphasizing realistic environment interaction and visual perception.

Quick Start & Requirements

  • Primary Install: Create a conda environment (conda create -n llamauav python=3.10 -y), activate it (conda activate llamauav), and install PyTorch with CUDA 11.8 support (pip install torch==2.0.1 torchvision==0.15.2 torchaudio==2.0.2 --index-url https://download.pytorch.org/whl/cu118). Install other dependencies via pip install -r requirement.txt. Follow LLaMA-UAV setup instructions separately.
  • Prerequisites: Python 3.10, PyTorch with CUDA 11.8, AirSim Python API (requires a specific fix), GroundingDINO model (groundingdino_swint_ogc.pth), downloaded simulator environments (CARLA, closeloop, extra_envs), and LLaMA-UAV model dependencies.
  • Setup: Expect significant setup time. You must download large simulator environments and models, install the dependencies, and configure the AirSim environment server paths (AirVLNSimulatorServerTool.py).
  • Links: Paper: https://arxiv.org/abs/2410.07087, UAV-Flow Benchmark: https://prince687028.github.io/UAV-Flow.
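
Before starting the large downloads, a small preflight check can confirm that the pieces listed above are in place. The sketch below is illustrative and not part of the repository. The function name check_prerequisites, its parameters, the ./weights directory, and the assumption that the AirSim Python API is importable as airsim are all hypothetical; only the Python version, the PyTorch package names, and the weight filename groundingdino_swint_ogc.pth come from the list above.

```python
import sys
from importlib.util import find_spec
from pathlib import Path

# Values taken from the requirements list above; everything else is an assumption.
REQUIRED_PYTHON = (3, 10)
# Import names are assumed to match the package names (airsim is a guess).
REQUIRED_MODULES = ("torch", "torchvision", "torchaudio", "airsim")
WEIGHTS_FILE = "groundingdino_swint_ogc.pth"


def check_prerequisites(weights_dir, python_version=None, module_finder=find_spec):
    """Return a list of human-readable problems; an empty list means all checks passed."""
    problems = []

    # Compare only major.minor, since the quick start pins python=3.10.
    version = tuple(python_version or sys.version_info[:2])
    if version != REQUIRED_PYTHON:
        problems.append(f"Python {version[0]}.{version[1]} found, 3.10 expected")

    # find_spec locates a top-level module without importing it.
    for name in REQUIRED_MODULES:
        if module_finder(name) is None:
            problems.append(f"module not installed: {name}")

    # The GroundingDINO checkpoint must be downloaded separately.
    if not (Path(weights_dir) / WEIGHTS_FILE).is_file():
        problems.append(f"missing weights: {WEIGHTS_FILE}")

    return problems


if __name__ == "__main__":
    # Hypothetical location; point this at wherever the weights were downloaded.
    for problem in check_prerequisites("./weights"):
        print("WARNING:", problem)
```

The check does not import torch, so it runs quickly and cannot verify the CUDA 11.8 build; it only reports whether the modules and the weight file can be found.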

Highlighted Details

  • Introduces UAV-Flow, the first real-world benchmark specifically for language-conditioned UAV imitation learning.
  • Provides a complete system including a realistic UAV simulation platform, a benchmark dataset, and an MLLM-based navigation method.
  • Addresses challenges in realistic UAV vision-language navigation, moving beyond simpler simulated environments.

Maintenance & Community

The README gives no details about maintainers, community channels (such as Discord or Slack), or a roadmap. The project is based on the AirVLN and LLaMA-VID repositories.

Licensing & Compatibility

The README does not specify a software license. This lack of explicit licensing information presents a significant adoption blocker, particularly for commercial use or integration into closed-source projects, as terms of use and distribution are undefined.

Limitations & Caveats

The setup process appears complex, requiring specific versions of PyTorch with CUDA, external model downloads, and careful configuration of the AirSim simulator. The project relies on external, potentially large, simulator environment downloads and specific model weights, which are not directly included. The absence of a stated license is a critical caveat.

Health Check

  • Last Commit: 4 months ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 13 stars in the last 30 days
