llama_ros by mgonzs13

ROS 2 integration for GGUF LLMs and VLMs

Created 3 years ago

257 stars

Top 98.3% on SourcePulse

Project Summary

This repository provides ROS 2 packages that integrate llama.cpp and llava.cpp, and with them GGUF-format LLMs and VLMs, into robotics applications. It targets ROS 2 developers who want to run language and vision models directly inside their robotic systems, with features such as LoRA adapters that can be loaded and rescaled at runtime and multimodal input.

How It Works

The project exposes llama.cpp and llava.cpp functionality through ROS 2 nodes (llama_node, llava_node). It loads models in the GGUF format and supports features such as GBNF grammars for constrained generation and speculative decoding for faster inference. This makes LLM and VLM capabilities, including image and audio processing, usable within ROS 2 workflows.
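
To make the speculative decoding idea concrete, here is a minimal sketch of the greedy variant, assuming toy "models" that are plain Python functions from a token list to the next token. The names speculative_decode, target_model, and draft_model are illustrative only and are not part of llama_ros or llama.cpp; a real engine verifies all drafted positions in one batched forward pass rather than in a loop.

```python
from typing import Callable, List

# A toy "model": maps the tokens so far to the next token (assumption).
NextToken = Callable[[List[str]], str]


def greedy_decode(target: NextToken, prompt: List[str], max_new: int) -> List[str]:
    """Baseline: the target model generates one token at a time."""
    tokens = list(prompt)
    for _ in range(max_new):
        tokens.append(target(tokens))
    return tokens


def speculative_decode(target: NextToken, draft: NextToken,
                       prompt: List[str], max_new: int, k: int = 4) -> List[str]:
    """Greedy speculative decoding: draft proposes k tokens, target verifies."""
    tokens = list(prompt)
    limit = len(prompt) + max_new
    while len(tokens) < limit:
        # 1. The cheap draft model proposes up to k tokens autoregressively.
        proposal: List[str] = []
        for _ in range(min(k, limit - len(tokens))):
            proposal.append(draft(tokens + proposal))
        # 2. The target checks each proposed position in order.
        for i, tok in enumerate(proposal):
            expected = target(tokens + proposal[:i])
            if expected != tok:
                # 3. First mismatch: keep the agreed prefix plus the
                #    target's own token, and discard the rest of the draft.
                tokens.extend(proposal[:i])
                tokens.append(expected)
                break
        else:
            # Every drafted token matched, so all of them are accepted.
            tokens.extend(proposal)
    return tokens


SENTENCE = "the robot moves to the kitchen and stops".split()


def target_model(seq: List[str]) -> str:
    return SENTENCE[len(seq) % len(SENTENCE)]


def draft_model(seq: List[str]) -> str:
    # Agrees with the target except at every third position.
    n = len(seq)
    return "<?>" if n % 3 == 2 else SENTENCE[n % len(SENTENCE)]


if __name__ == "__main__":
    print(" ".join(speculative_decode(target_model, draft_model, ["the"], 10)))
```

With greedy verification the result is identical to decoding with the target model alone; the speedup comes only from accepting several drafted tokens per verification step.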

Quick Start & Requirements

Installation: Requires ROS 2 and Python; the CUDA Toolkit is optional. Clone the repository, synchronize Python dependencies with uv sync, install ROS dependencies with rosdep, and build with colcon build. To enable CUDA support, build with colcon build --cmake-args -DGGML_CUDA=ON. Docker images are also available for several ROS 2 distros.
Prerequisites: CUDA Toolkit (for GPU acceleration), ROS 2 (Humble, Iron, Jazzy, Kilted, Rolling).
Links:
- Official Docs: https://mgonzs13.github.io/llama_ros/latest
- Docker Hub: https://hub.docker.com/r/mgons/llama_ros/tags
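
The installation steps described above can be laid out as an ordered command plan. The sketch below only builds and prints the commands (a dry run) and makes several assumptions not stated in this summary: the clone URL, the ~/ros2_ws workspace path, the directory in which uv sync is run, and the exact rosdep arguments. Check the official docs before running anything.

```python
from typing import List

# Assumed clone URL; this summary does not state it explicitly.
REPO_URL = "https://github.com/mgonzs13/llama_ros.git"


def build_plan(workspace: str = "~/ros2_ws", cuda: bool = False) -> List[str]:
    """Return the shell commands for a source install, in order (dry run)."""
    colcon = "colcon build"
    if cuda:
        # CUDA flag as given in the installation notes.
        colcon += " --cmake-args -DGGML_CUDA=ON"
    return [
        f"git clone {REPO_URL} {workspace}/src/llama_ros",
        # Assumption: uv sync is run from inside the cloned repository.
        f"cd {workspace}/src/llama_ros && uv sync",
        # Assumption: common rosdep invocation for a colcon workspace.
        f"cd {workspace} && rosdep install --from-paths src --ignore-src -r -y",
        f"cd {workspace} && {colcon}",
    ]


if __name__ == "__main__":
    for step in build_plan(cuda=True):
        print(step)
```

Keeping the plan as data rather than executing it directly makes it easy to review the commands, or to drop the CUDA flag on machines without a GPU.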

Highlighted Details

Multimodal Support: Integrates llava.cpp for Visual Language Models (VLMs), enabling image and audio input processing.
Speculative Decoding: Accelerates text generation by having a smaller draft model propose several tokens, which the main model then verifies in parallel.
LoRA Adapters: Supports loading LoRA adapters and changing their scale at runtime, so model behavior can be adapted without retraining.
LangChain Integration: Offers ROS 2 clients and LangChain integrations for LLM/VLM functionalities, RAG, embeddings, and reranking.
ROS 2 CLI: Includes ros2 llama launch and ros2 llama prompt commands for streamlined interaction.
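
As an illustration of what "scaling" a LoRA adapter means, the following sketch uses the usual LoRA formulation, in which the effective weight is W + scale * (B @ A). It is a toy in pure Python, not the llama_ros or llama.cpp implementation; apply_lora and matmul are hypothetical helper names, and scale is assumed to fold in any alpha/rank factor.

```python
from typing import List

Matrix = List[List[float]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Plain matrix product of a (n x m) and b (m x p)."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def apply_lora(w: Matrix, a: Matrix, b: Matrix, scale: float) -> Matrix:
    """Return W + scale * (B @ A), with B of shape (out x r) and A (r x in)."""
    delta = matmul(b, a)  # low-rank update with the same shape as W
    return [[w_ij + scale * d_ij for w_ij, d_ij in zip(w_row, d_row)]
            for w_row, d_row in zip(w, delta)]


if __name__ == "__main__":
    W = [[1.0, 0.0], [0.0, 1.0]]   # base weight (2 x 2)
    B = [[1.0], [0.0]]             # (2 x 1), rank r = 1
    A = [[0.0, 2.0]]               # (1 x 2)
    print(apply_lora(W, A, B, scale=0.0))  # scale 0 leaves W unchanged
    print(apply_lora(W, A, B, scale=0.5))  # adapter applied at half strength
```

Because the base weights are untouched, changing scale only re-weights the low-rank update, which is why an adapter can be dialed up, down, or off without retraining.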

Maintenance & Community

The project shows signs of active maintenance with CI/CD pipelines across multiple ROS 2 distributions and recent commits. It lists multiple contributors, indicating a collaborative effort. No specific community channels (like Discord/Slack) are detailed in the README.

Licensing & Compatibility

The project is released under the MIT License, which is permissive for commercial use and integration into closed-source projects.

Limitations & Caveats

GPU acceleration requires installing the CUDA Toolkit manually and building with -DGGML_CUDA=ON. Speculative decoding is not compatible with embedding or reranking models and requires context.n_parallel to be set to 1. Running large language models typically demands substantial computational resources (CPU, RAM, VRAM).
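
The speculative decoding constraints above can be checked before launch. This is a minimal sketch over a plain dictionary: only context.n_parallel comes from the text, while the speculative, embedding, and reranking keys and the function name check_speculative_config are hypothetical placeholders, not the project's actual parameter names.

```python
from typing import Any, Dict, List


def check_speculative_config(cfg: Dict[str, Any]) -> List[str]:
    """Return a list of problems; an empty list means the config looks fine."""
    problems: List[str] = []
    # Hypothetical key: whether speculative decoding is requested at all.
    if not cfg.get("speculative", False):
        return problems
    # Hypothetical keys marking an embedding or reranking model.
    if cfg.get("embedding", False) or cfg.get("reranking", False):
        problems.append(
            "speculative decoding cannot be combined with embedding or "
            "reranking models")
    # A missing value is flagged too, since the default is not stated here.
    if cfg.get("context", {}).get("n_parallel") != 1:
        problems.append("speculative decoding requires context.n_parallel: 1")
    return problems


if __name__ == "__main__":
    bad = {"speculative": True, "embedding": True, "context": {"n_parallel": 4}}
    for problem in check_speculative_config(bad):
        print(problem)
```

Returning a list of messages rather than raising on the first issue lets a launch script report every conflict at once.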
