Multi-Agent-GPT by YangXuanyi

A multimodal expert assistant GPT platform

Created 2 years ago
254 stars

Top 99.0% on SourcePulse

Project Summary

Multi-Agent-GPT is a multimodal expert assistant platform built on Retrieval-Augmented Generation (RAG) and agent frameworks. It targets AI assistants that can process and interact with multiple data modalities, including text and images, and it aims to support local deployment and private database construction for tighter data security and control.

How It Works

The platform combines RAG with a flexible agent architecture to support conversations across multiple data types. The agent draws on a suite of tools for web searching, image generation, and image captioning, backed by integrations with ChatGPT, DALL-E, Google Search, and BLIP, which enable context-aware, multimodal interactions.
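The flow described above (retrieve context, pick a tool, run it) can be illustrated with a minimal sketch. This is not the project's code: the names Tool, retrieve, choose_tool, answer, and TOOLS are hypothetical, the tools are stubs that return placeholder strings, and a keyword rule stands in for the tool choice that a language model would normally make.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Tool:
    """A named capability the agent can call (hypothetical structure)."""
    name: str
    description: str
    run: Callable[[str], str]


def retrieve(query: str, documents: List[str], top_k: int = 2) -> List[str]:
    """Toy retrieval: rank documents by how many query words they share."""
    words = set(query.lower().split())
    scored = [(len(words & set(doc.lower().split())), doc) for doc in documents]
    scored = [pair for pair in scored if pair[0] > 0]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in scored[:top_k]]


def choose_tool(query: str) -> str:
    """Stand-in for the model's tool choice: simple keyword rules."""
    text = query.lower()
    if "draw" in text or "generate an image" in text:
        return "image_generation"
    if "describe" in text or "caption" in text:
        return "image_captioning"
    return "web_search"


def answer(query: str, tools: Dict[str, Tool], documents: List[str]) -> dict:
    """Retrieve supporting context, then dispatch the query to one tool."""
    context = retrieve(query, documents)
    tool = tools[choose_tool(query)]
    return {"tool": tool.name, "context": context, "result": tool.run(query)}


# Stub tools: a real setup would call a search API, an image model, and BLIP.
TOOLS = {
    "web_search": Tool("web_search", "Search the web",
                       lambda q: f"[search results for: {q}]"),
    "image_generation": Tool("image_generation", "Generate an image",
                             lambda q: f"[generated image for: {q}]"),
    "image_captioning": Tool("image_captioning", "Caption an image",
                             lambda q: f"[caption for: {q}]"),
}
```

Calling answer("describe what BLIP does", TOOLS, notes) with a small list of note strings returns the chosen tool name, the retrieved notes, and the stub result in one dictionary, which keeps the retrieval and tool-dispatch steps easy to inspect separately.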

Quick Start & Requirements

  • Primary install / run command: After creating a Python 3.10 conda environment (conda create -n agent python=3.10) and activating it (conda activate agent), install dependencies using pip install -r ./requirements.txt.
  • Non-default prerequisites and dependencies: Requires Python 3.10, Anaconda, manual download and local placement of BLIP model files into the Models/BLIP directory, and API keys configured in a .env file for services like OpenAI and Google Search.
  • Demo: Launch the UI by running python ./web.py, which provides a local URL for browser access.
  • Links: A link to the BLIP model website and a roadmap are mentioned, but direct URLs are not provided in the README.
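Because the setup depends on manually placed model files and a hand-written .env file, a small pre-flight check can catch missing pieces before launching web.py. The sketch below is an illustration rather than part of the project: parse_env and check_setup are hypothetical helpers, and the key names OPENAI_API_KEY and GOOGLE_API_KEY are assumptions, since the summary does not list the exact variable names the project expects.

```python
from pathlib import Path


def parse_env(text):
    """Parse KEY=VALUE lines from .env-style text, skipping blanks and comments."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip("'\"")
    return values


def check_setup(root, required_keys):
    """Return a list of human-readable problems; an empty list means ready."""
    root = Path(root)
    problems = []

    # The BLIP weights must be downloaded by hand into Models/BLIP.
    blip_dir = root / "Models" / "BLIP"
    if not blip_dir.is_dir() or not any(blip_dir.iterdir()):
        problems.append(f"BLIP model files not found in {blip_dir}")

    # API keys are expected in a .env file at the project root.
    env_file = root / ".env"
    if env_file.is_file():
        env = parse_env(env_file.read_text(encoding="utf-8"))
    else:
        env = {}
        problems.append(f"{env_file} is missing")
    for key in required_keys:
        if not env.get(key):
            problems.append(f"{key} is not set in .env")
    return problems


if __name__ == "__main__":
    # Key names are assumptions; use whatever the project's .env expects.
    for problem in check_setup(".", ["OPENAI_API_KEY", "GOOGLE_API_KEY"]):
        print("missing:", problem)
```

Run from the repository root, the script prints one line per missing item and prints nothing when the model directory and all listed keys are present.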

Highlighted Details

  • Multimodality: Supports text and image modalities, with planned integration for audio and video processing.
  • Agent Framework: Features single/multi-turn chat capabilities and a core agent structure.
  • Tool Integrations: Includes tools for web searching, image generation (via DALL-E), and image captioning (via BLIP).
  • RAG & Deployment: Aims to incorporate private database functionality and offline deployment capabilities.
  • Tech Stack: Built using Python, PyTorch, LangChain, and Gradio.

Maintenance & Community

The project indicates active maintenance ("Maintained? - yes") and welcomes contributions ("PR's welcome"). Specific details regarding notable contributors, sponsorships, or community channels (e.g., Discord, Slack) are not present in the provided README content.

Licensing & Compatibility

The project is licensed under the MIT License, a permissive license that allows commercial use and integration into closed-source applications, provided the copyright and license notice are retained.

Limitations & Caveats

Functionalities for audio and video processing, as well as RAG features for private databases and offline deployment, are marked as incomplete or under development. The setup requires manual downloading of the BLIP model and configuration of API keys.

Health Check

  • Last Commit: 1 year ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 1
  • Star History: 5 stars in the last 30 days
