OM1 by OpenMind

Modular AI runtime for robots

Created 8 months ago
316 stars

Top 85.4% on SourcePulse

Project Summary

OpenMind/OM1 is a modular AI runtime designed to simplify the creation and deployment of multimodal AI agents for robots and digital environments. It targets developers building human-focused robots, enabling them to easily integrate diverse data inputs and control physical actions, with a focus on upgradeability and adaptability across various hardware platforms.

How It Works

OM1 uses a modular architecture written in Python, so new data sources and sensors can be added as separate modules. Hardware is integrated through plugins that connect to middleware such as ROS2, Zenoh, and CycloneDDS; Zenoh is the recommended choice. The system takes diverse inputs (web data, sensors, voice) and translates them into actions (motion, speech) via pre-configured endpoints for various AI models, including OpenAI's GPT-4o and multiple vision-language models (VLMs). A web-based debugger, WebSim, provides visual monitoring of the system's operation.
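
The input-to-action flow described above can be illustrated with a minimal sketch. Every name here (Runtime, register_input, register_action, tick, stub_model) is hypothetical and chosen for illustration; none of it is OM1's actual API, and the stub model stands in for a remote endpoint such as GPT-4o.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Hypothetical names throughout: this is an illustration, not OM1's API.


@dataclass
class Runtime:
    """Holds pluggable input readers and action handlers."""

    inputs: Dict[str, Callable[[], str]] = field(default_factory=dict)
    actions: Dict[str, Callable[[str], str]] = field(default_factory=dict)

    def register_input(self, name: str, reader: Callable[[], str]) -> None:
        self.inputs[name] = reader

    def register_action(self, name: str, handler: Callable[[str], str]) -> None:
        self.actions[name] = handler

    def tick(self, model: Callable[[str], Dict[str, str]]) -> List[str]:
        # 1. Gather one observation from every registered input.
        observations = [f"{name}: {read()}" for name, read in self.inputs.items()]
        prompt = "\n".join(observations)
        # 2. Ask the model which actions to take, as {action_name: argument}.
        commands = model(prompt)
        # 3. Dispatch each command to its handler; unknown actions are skipped.
        results = []
        for action_name, argument in commands.items():
            handler = self.actions.get(action_name)
            if handler is not None:
                results.append(handler(argument))
        return results


def stub_model(prompt: str) -> Dict[str, str]:
    """Rule-based stand-in for a remote LLM endpoint (assumption)."""
    if "hello" in prompt.lower():
        return {"speak": "Hi there!", "move": "wave"}
    return {"move": "idle"}


def build_demo_runtime() -> Runtime:
    runtime = Runtime()
    runtime.register_input("voice", lambda: "Hello robot")
    runtime.register_input("battery", lambda: "87%")
    runtime.register_action("speak", lambda text: f"[speech] {text}")
    runtime.register_action("move", lambda name: f"[motion] {name}")
    return runtime


if __name__ == "__main__":
    for line in build_demo_runtime().tick(stub_model):
        print(line)
```

Keeping inputs and actions in separate registries is one way to let a new sensor or actuator be added without touching the decision loop, which is the property the modular design is aiming for.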

Quick Start & Requirements

  • Install: Clone the repository, initialize submodules, and use uv for environment management and installation.
  • Prerequisites: The uv package manager, plus portaudio and ffmpeg (macOS/Linux). An OpenMind API key is required and can be set in config/spot.json5 or in a .env file.
  • Launch: Start the example Spot agent with: uv run src/run.py spot
  • Docs: A technical paper, documentation, and a Getting Started guide are available.
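
As a rough illustration of the .env option mentioned in the prerequisites, the sketch below resolves an API key from the process environment first and then from a .env file. The variable name OM_API_KEY and both function names are assumptions made for this example; check the project's documentation for the exact key name and for how OM1 itself loads it.

```python
import os
from pathlib import Path
from typing import Dict, Mapping, Optional


def parse_dotenv(text: str) -> Dict[str, str]:
    """Parse simple KEY=VALUE lines, ignoring blanks and # comments."""
    values: Dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def resolve_api_key(
    dotenv_path: str = ".env",
    var_name: str = "OM_API_KEY",  # assumed name, not confirmed by this summary
    environ: Optional[Mapping[str, str]] = None,
) -> Optional[str]:
    """Return the key from the environment, else from the .env file, else None."""
    environ = os.environ if environ is None else environ
    if environ.get(var_name):
        return environ[var_name]
    path = Path(dotenv_path)
    if path.is_file():
        return parse_dotenv(path.read_text(encoding="utf-8")).get(var_name) or None
    return None


if __name__ == "__main__":
    key = resolve_api_key()
    print("API key found" if key else "No API key configured")
```

Checking the environment before the file lets a deployment override a developer's local .env without editing it.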

Highlighted Details

  • Supports integration with various robot hardware via plugins, assuming a high-level SDK for elemental movement and action commands.
  • Offers pre-configured endpoints for popular AI services like OpenAI's GPT-4o and multiple VLMs.
  • Includes WebSim for real-time visual debugging and monitoring.
  • Designed for extensibility, allowing custom agents and robot configurations through JSON5 files.

Maintenance & Community

  • The project is hosted on GitHub with a contributing guide. Community interaction channels include X and Discord.

Licensing & Compatibility

  • Licensed under the MIT License, a permissive license allowing free use, modification, and distribution, suitable for commercial use and closed-source linking.

Limitations & Caveats

  • Interfacing with new robot hardware requires a suitable Hardware Abstraction Layer (HAL); if one doesn't exist, traditional robotics approaches such as reinforcement learning (RL) and simulation environments may be needed to create one.
  • The project is primarily developed on specific macOS and Linux platforms, though it aims for broader compatibility.
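
To make the HAL requirement more concrete, here is a minimal sketch of an adapter that maps high-level commands onto a vendor SDK. RobotHAL, FakeVendorSDK, FakeVendorHAL, and the command strings are all hypothetical; a real plugin would wrap the manufacturer's SDK and follow whatever interface OM1 actually defines.

```python
from abc import ABC, abstractmethod
from typing import Callable, Dict, List


class RobotHAL(ABC):
    """Hypothetical interface: turn a high-level command into robot motion."""

    @abstractmethod
    def execute(self, command: str) -> bool:
        """Run the command; return False if it is not supported."""


class FakeVendorSDK:
    """Stand-in for a manufacturer's high-level SDK (assumption)."""

    def __init__(self) -> None:
        self.log: List[str] = []

    def walk(self, meters: float) -> None:
        self.log.append(f"walk {meters}")

    def sit(self) -> None:
        self.log.append("sit")


class FakeVendorHAL(RobotHAL):
    """Adapter that routes command strings to SDK calls."""

    def __init__(self, sdk: FakeVendorSDK) -> None:
        self._commands: Dict[str, Callable[[], None]] = {
            "move forwards": lambda: sdk.walk(0.5),
            "sit": sdk.sit,
        }

    def execute(self, command: str) -> bool:
        handler = self._commands.get(command.strip().lower())
        if handler is None:
            return False  # Unsupported command: report it instead of guessing.
        handler()
        return True


if __name__ == "__main__":
    sdk = FakeVendorSDK()
    hal = FakeVendorHAL(sdk)
    print(hal.execute("Sit"), hal.execute("fly"), sdk.log)
```

The adapter only works when the SDK already exposes elemental movements such as walking or sitting; without that layer, the low-level control would have to be built first.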

Health Check

  • Last commit: 1 day ago
  • Responsiveness: Inactive
  • Pull requests (30d): 40
  • Issues (30d): 15
  • Star history: 75 stars in the last 30 days
