OM1 by OpenMind

Modular AI runtime for robots

Created 1 year ago

2,457 stars

Top 18.6% on SourcePulse

Project Summary

OpenMind/OM1 is a modular AI runtime that simplifies building and deploying multimodal AI agents for robots and digital environments. It targets developers of human-focused robots, letting them integrate diverse data inputs and control physical actions, with an emphasis on upgradeability and adaptability across hardware platforms.

How It Works

OM1 uses a modular architecture written in Python, which makes it straightforward to add new data sources and sensors. Hardware is integrated through plugins that connect to middleware such as ROS2, Zenoh, and CycloneDDS; Zenoh is the recommended option. The system takes diverse inputs (web data, sensors, voice) and translates them into actions (motion, speech) via pre-configured endpoints for AI models, including OpenAI's GPT-4o and multiple vision-language models (VLMs). WebSim, a web-based debugger, provides visual monitoring of the system as it runs.
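
The input-to-action flow described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not OM1's actual API: every name here (Observation, Command, fuse, rule_based_model, run_tick) is hypothetical, and a simple rule-based function stands in for a real model endpoint such as GPT-4o.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Observation:
    """One reading from an input plugin (hypothetical type, not from OM1)."""
    source: str  # e.g. "voice", "camera", "web"
    text: str    # the reading rendered as text


@dataclass
class Command:
    """One instruction for an action plugin (hypothetical type)."""
    kind: str      # e.g. "speak", "move"
    argument: str


def fuse(observations: List[Observation]) -> str:
    """Merge all readings into a single prompt string."""
    return "\n".join(f"[{o.source}] {o.text}" for o in observations)


def rule_based_model(prompt: str) -> List[Command]:
    """Stand-in for a remote model endpoint: reacts to a greeting."""
    if "hello" in prompt.lower():
        return [Command("speak", "Hello!"), Command("move", "wave")]
    return [Command("speak", "I did not catch that.")]


def run_tick(
    inputs: List[Callable[[], Observation]],
    model: Callable[[str], List[Command]],
    actions: Dict[str, Callable[[str], None]],
) -> List[Command]:
    """Poll every input, query the model, dispatch each command."""
    prompt = fuse([read() for read in inputs])
    commands = model(prompt)
    for command in commands:
        handler = actions.get(command.kind)
        if handler is not None:  # ignore commands with no registered plugin
            handler(command.argument)
    return commands


if __name__ == "__main__":
    log: List[str] = []
    run_tick(
        inputs=[lambda: Observation("voice", "Hello robot")],
        model=rule_based_model,
        actions={
            "speak": lambda s: log.append(f"say {s}"),
            "move": lambda m: log.append(f"do {m}"),
        },
    )
    print(log)
```

Keeping inputs, the model, and actions as plain callables means any of the three can be swapped without touching the others, which is the property a plugin-based design is after.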

Quick Start & Requirements

Install: Clone the repository, initialize submodules, and use uv for environment management and installation.
Prerequisites: uv package manager, portaudio and ffmpeg (macOS/Linux). An OpenMind API key is required, configured via config/spot.json5 or a .env file.
Launch: Run uv run src/run.py spot for the example Spot agent.
Docs: Technical Paper, Documentation, and a Getting Started guide are available.
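
As a rough sketch of the two key locations mentioned above, the snippet below looks for an API key in an already-parsed config first and in .env-style text second. It is illustrative only: the field name api_key, the variable name OM_API_KEY, and the lookup order are assumptions, not confirmed OM1 behavior, and the tiny parser handles only plain KEY=VALUE lines.

```python
from typing import Dict, Mapping, Optional


def parse_dotenv(text: str) -> Dict[str, str]:
    """Parse simple KEY=VALUE lines; skip blanks and # comments."""
    values: Dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip("'\"")
    return values


def resolve_api_key(
    config: Mapping[str, str],
    dotenv: Mapping[str, str],
    env_name: str = "OM_API_KEY",  # assumed variable name
) -> Optional[str]:
    """Return the key from the config if set, else from the .env values."""
    return config.get("api_key") or dotenv.get(env_name) or None


if __name__ == "__main__":
    env_values = parse_dotenv("# local secrets\nOM_API_KEY='abc123'\n")
    print(resolve_api_key({}, env_values))
```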

Highlighted Details

Supports integration with various robot hardware via plugins, assuming a high-level SDK for elemental movement and action commands.
Offers pre-configured endpoints for popular AI services like OpenAI's GPT-4o and multiple VLMs.
Includes WebSim for real-time visual debugging and monitoring.
Designed for extensibility, allowing custom agents and robot configurations through JSON5 files.

Maintenance & Community

The project is hosted on GitHub with a contributing guide. Community interaction channels include X and Discord.

Licensing & Compatibility

Licensed under the MIT License, a permissive license allowing free use, modification, and distribution, suitable for commercial use and closed-source linking.

Limitations & Caveats

Interfacing with new robot hardware requires a suitable Hardware Abstraction Layer (HAL). If none exists, building one may call for traditional robotics approaches such as reinforcement learning (RL) and simulation environments. The project is primarily developed on specific macOS and Linux platforms, though it aims for broader compatibility.
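
To make the HAL requirement concrete, here is a minimal sketch of what such a layer might look like from the runtime's side. All names (HardwareAbstractionLayer, SimulatedRobot, dispatch) are hypothetical and not taken from OM1; the simulated robot only records commands and stands in for a vendor SDK.

```python
from abc import ABC, abstractmethod
from typing import List


class HardwareAbstractionLayer(ABC):
    """Hypothetical HAL interface: accepts named, high-level commands."""

    @abstractmethod
    def supported_commands(self) -> List[str]:
        """List the command names this hardware can carry out."""

    @abstractmethod
    def execute(self, command: str) -> None:
        """Carry out one supported command."""


class SimulatedRobot(HardwareAbstractionLayer):
    """In-memory stand-in for a vendor SDK; records what it was asked to do."""

    def __init__(self) -> None:
        self.history: List[str] = []

    def supported_commands(self) -> List[str]:
        return ["sit", "stand", "walk"]

    def execute(self, command: str) -> None:
        self.history.append(command)


def dispatch(hal: HardwareAbstractionLayer, command: str) -> bool:
    """Forward a command only if the HAL supports it; report success."""
    if command not in hal.supported_commands():
        return False
    hal.execute(command)
    return True


if __name__ == "__main__":
    robot = SimulatedRobot()
    print(dispatch(robot, "sit"), dispatch(robot, "backflip"), robot.history)
```

The agent side only ever sees command names, so porting to new hardware comes down to supplying another subclass.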

Health Check

Last Commit: 22 hours ago
Responsiveness: Inactive
Pull Requests (30d): 378
Issues (30d): 97

Star History

220 stars in the last 30 days
