MultiGen  by LiXiaoYaoCareFree

On-premise AI agent platform enabling multimodal collaboration

Created 4 months ago
271 stars

Top 94.9% on SourcePulse

Summary

MultiGen is a general-purpose AI agent system designed for fully private, on-premise deployment. It supports multimodal agent collaboration while keeping data inside the user's own infrastructure, using sandboxed execution and flexible LLM integration to offer a self-hosted alternative to cloud-based solutions.

How It Works

A Planner agent decomposes the user's goal into a sequence of executable steps, and a ReAct agent processes each step by iteratively reasoning and acting with the available tools. A key security feature is the sandboxed execution environment: every action involving shell commands, browser interactions, or file operations runs inside an isolated Docker container, which prevents direct access to the host system. MultiGen natively supports the Agent-to-Agent (A2A) protocol and the Model Context Protocol (MCP) for integration with other agents and external services.
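The Planner + ReAct flow described above can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration: the names `plan`, `react_step`, `run`, and `StepResult` are hypothetical and are not taken from the MultiGen codebase, the planner is a trivial string splitter standing in for an LLM call, and the tools are plain functions rather than sandboxed actions.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class StepResult:
    step: str
    observation: str


def plan(goal: str) -> List[str]:
    """Hypothetical planner: a real one would ask an LLM to decompose the goal.

    Here the goal is simply split on the word "then".
    """
    return [part.strip() for part in goal.split(" then ") if part.strip()]


def react_step(
    step: str,
    tools: Dict[str, Callable[[str], str]],
    max_iters: int = 3,
) -> StepResult:
    """Hypothetical ReAct loop: reason (pick a tool), act (call it), observe."""
    observation = ""
    for _ in range(max_iters):
        # "Reason": stand-in policy that picks the first tool named in the step.
        tool_name = next((name for name in tools if name in step), None)
        if tool_name is None:
            observation = "no suitable tool"
            break
        # "Act" and "observe": in the system described above, this call would
        # be proxied to an isolated sandbox instead of running in-process.
        observation = tools[tool_name](step)
        if observation:
            break  # stop once the tool produced a result
    return StepResult(step=step, observation=observation)


def run(goal: str, tools: Dict[str, Callable[[str], str]]) -> List[StepResult]:
    """Plan the goal, then execute each step with the ReAct loop."""
    return [react_step(step, tools) for step in plan(goal)]
```

Calling `run("search docs then write summary", tools)` with a `tools` dictionary containing `search` and `write` functions yields one `StepResult` per planned step, in order.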

Quick Start & Requirements

Deployment uses Docker Compose. Prerequisites are Docker (>= 20.10), Docker Compose (>= 2.0), and an API key for any OpenAI-compatible LLM. Clone the master branch for local development or evaluation, or the online branch for production deployments. Configuration consists of setting environment variables (e.g., API keys, Tencent COS credentials) and editing api/config.yaml with the LLM and MCP server details. The full stack is launched with docker compose up -d --build.
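As a small aid for the prerequisites above, the following Python sketch checks a version string against the stated minimums. It is not part of MultiGen; the helper names `parse_version` and `meets_minimum` are hypothetical, and the sample strings only illustrate the kind of text printed by docker --version and docker compose version.

```python
import re
from typing import Tuple

# Minimum versions as stated in the prerequisites.
MIN_DOCKER = (20, 10)
MIN_COMPOSE = (2, 0)


def parse_version(text: str) -> Tuple[int, ...]:
    """Extract the first dotted version number (major.minor[.patch]) from text."""
    match = re.search(r"(\d+)\.(\d+)(?:\.(\d+))?", text)
    if match is None:
        raise ValueError(f"no version number found in {text!r}")
    return tuple(int(group) for group in match.groups() if group is not None)


def meets_minimum(version_text: str, minimum: Tuple[int, ...]) -> bool:
    """Compare only as many components as the minimum specifies."""
    return parse_version(version_text)[: len(minimum)] >= minimum


if __name__ == "__main__":
    # Illustrative inputs; replace with the output from your own machine.
    print(meets_minimum("Docker version 24.0.7, build afdd53b", MIN_DOCKER))
    print(meets_minimum("Docker Compose version v2.24.5", MIN_COMPOSE))
```

Versions are compared as integer tuples rather than as strings so that, for example, 20.9 is correctly treated as older than 20.10.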

Highlighted Details

  • Architecture: Planner + ReAct dual-agent flow for goal decomposition and execution.
  • Security: Actions are proxied to isolated Docker sandboxes (Ubuntu + Chrome + VNC), ensuring host isolation.
  • Interoperability: Native MCP and A2A support enables tool orchestration and agent delegation.
  • Multimodality: Integrated tools for image, video, 3D model generation, and Text-to-Speech (TTS).
  • LLM Flexibility: Compatible with any OpenAI-compatible LLM endpoint, configurable via config.yaml.
  • Deployment: Single docker compose command deploys the entire stack, including UI, API, database, and sandbox.
  • Observability: Real-time execution progress and results are streamed via SSE to the UI; sessions are fully replayable and stored in PostgreSQL.
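The observability point above mentions streaming over SSE (Server-Sent Events). As a rough sketch of how a client might consume such a stream, the Python function below parses SSE lines into events. The event name and payload in the usage note are assumptions for illustration; the summary does not document MultiGen's actual event schema, and `parse_sse` is a hypothetical helper, not part of the project.

```python
from typing import Dict, Iterable, Iterator, List


def parse_sse(lines: Iterable[str]) -> Iterator[Dict[str, str]]:
    """Parse Server-Sent Events lines into {"event": ..., "data": ...} dicts.

    Handles the "event" and "data" fields, comment lines, and multi-line
    data; other fields such as "id" and "retry" are ignored in this sketch.
    """
    event = "message"  # default event type when no "event:" field is sent
    data: List[str] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line ends the current event; dispatch only if it has data.
            if data:
                yield {"event": event, "data": "\n".join(data)}
            event, data = "message", []
            continue
        if line.startswith(":"):
            continue  # comment line, often used as a keep-alive
        name, _, value = line.partition(":")
        if value.startswith(" "):
            value = value[1:]  # a single leading space after the colon is dropped
        if name == "event":
            event = value
        elif name == "data":
            data.append(value)
```

For example, feeding the lines `event: step`, `data: {"status": "running"}`, and a blank line produces a single event whose type is `step` and whose data is the JSON text, which the caller can then decode with `json.loads`.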

Maintenance & Community

Contributions are welcomed via issues and pull requests. Specific community channels (e.g., Discord, Slack) are not detailed in the README.

Licensing & Compatibility

Released under the MIT License, permitting commercial use and integration into closed-source projects. Users are responsible for ensuring compliance with the terms of any integrated third-party LLM or MCP services.

Limitations & Caveats

The master branch is designated for local use only; the online branch is the production-ready version. Features such as long-term memory/RAG plugins, multi-user workspace permissions, and a plugin marketplace are planned but not yet implemented.

Health Check

  • Last Commit: 1 week ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 273 stars in the last 30 days
