ROS-LLM by Auromix

ROS framework for embodied intelligence

  • Created: 2 years ago
  • Stars: 695
  • Top 49.0% on SourcePulse

Project Summary

ROS-LLM is a framework for integrating Large Language Models (LLMs) into ROS-based robots, enabling natural language control and decision-making for embodied intelligence applications. It targets roboticists and developers seeking to quickly add conversational AI and LLM-driven behaviors to their robots, with a stated goal of enabling operation in under ten minutes.

How It Works

The framework leverages LLMs like GPT-4 and ChatGPT to interpret natural language commands and translate them into robot actions, including motion and navigation. It provides a simplified interface for integrating robot-specific functions, allowing LLMs to manage tasks based on their interpretation of user input. This approach aims to abstract complex robot control logic behind an LLM interface for rapid prototyping and interaction.
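As a rough illustration of this idea, the sketch below shows how an LLM's structured output might be dispatched onto robot-specific functions. This is a hypothetical, dependency-free example, not the project's actual API: the function names, the JSON call format, and the `/cmd_vel`-style stub are all assumptions made for clarity.

```python
import json

def move_robot(linear_x: float, angular_z: float) -> str:
    # Stand-in for publishing a geometry_msgs/Twist on /cmd_vel in ROS;
    # here it just returns a string describing the command.
    return f"cmd_vel: linear.x={linear_x}, angular.z={angular_z}"

# Registry of robot-specific functions the LLM is allowed to invoke.
ROBOT_FUNCTIONS = {"move_robot": move_robot}

def dispatch(llm_output: str) -> str:
    """Parse an LLM 'function call' (JSON) and invoke the matching robot function."""
    call = json.loads(llm_output)
    fn = ROBOT_FUNCTIONS[call["name"]]
    return fn(**call["arguments"])

# Example: the LLM has interpreted "drive forward slowly" as this call.
print(dispatch('{"name": "move_robot", "arguments": {"linear_x": 0.2, "angular_z": 0.0}}'))
```

The key design point is that the robot exposes a small whitelist of named functions, and the LLM only selects among them with arguments, rather than emitting arbitrary control code.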

Quick Start & Requirements

  • Install: Clone the repository, run dependencies_install.sh, set your OpenAI API key via config_openai_api_key.sh, optionally configure AWS credentials for cloud ASR, install openai-whisper and setuptools-rust, then build the ROS workspace with colcon build.
  • Prerequisites: ROS, Python, OpenAI API key. Optional AWS configuration for cloud ASR.
  • Demo: Source workspace setup and launch chatgpt_with_turtle_robot.launch.py.
  • Links: GitHub Repo
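Put together, the steps above might look like the following shell session. The script names are taken from the summary, but the workspace layout, repository URL, and launch package name are assumptions; check the repository's README for the exact commands.

```shell
# Assumed workspace layout; script names are from the project summary.
mkdir -p ~/ros_llm_ws/src && cd ~/ros_llm_ws/src
git clone https://github.com/Auromix/ROS-LLM.git
cd ROS-LLM

# Install dependencies and configure the OpenAI API key.
bash dependencies_install.sh
bash config_openai_api_key.sh
# Optional: configure AWS credentials for cloud ASR (see the repo for the script).

# Local ASR support.
pip install openai-whisper setuptools-rust

# Build the workspace and run the turtle demo.
cd ~/ros_llm_ws
colcon build
source install/setup.bash
ros2 launch chatgpt_with_turtle_robot.launch.py  # package argument omitted; see the repo
```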

Highlighted Details

  • ROS Integration for broad robot compatibility.
  • Supports LLMs like GPT-4 and ChatGPT.
  • Facilitates natural language interaction and LLM-based control.
  • Simplified extensibility for custom robot functions.
  • Quick development target of under ten minutes.

Maintenance & Community

The project is maintained by Auromix. Contributions are welcome. Further development plans include agent mechanisms, feedback channels, navigation interfaces, sensor input, vision model integration (e.g., PaLM-E), and continuous optimization.

Licensing & Compatibility

Licensed under the Apache License, Version 2.0. This license permits commercial use and linking with closed-source software.

Limitations & Caveats

The framework's core functionality relies on external LLM APIs (e.g., OpenAI), incurring potential costs and latency. Optional AWS configuration is noted for ASR, suggesting potential dependencies on cloud services for certain features. Future development plans indicate that navigation and vision-based inputs are not yet fully integrated.

Health Check

  • Last Commit: 2 years ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 12 stars in the last 30 days
