SDK for building AI-based conversational avatars
Top 74.9% on sourcepulse
This project provides a framework for building AI-powered conversational avatars, targeting developers and metaverse enthusiasts. It enables rapid creation of interactive avatars capable of real-time voice conversation, facial expressions, and platform integration, significantly reducing development time for complex AI-driven characters.
How It Works
AIAvatarKit orchestrates multiple AI services, including LLMs (ChatGPT, Claude, Gemini, Dify), Text-to-Speech (VOICEVOX, Azure, Google, OpenAI), and Speech-to-Text (Azure, Google, OpenAI). It uses a modular design, allowing users to swap components and integrate custom behaviors. The framework handles audio input/output, speech processing, LLM interaction, and output synthesis (voice, facial expressions via OSC for VRChat), creating a cohesive conversational experience.
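The orchestration described above can be sketched as a simple swappable pipeline. The class and method names below are illustrative only, not AIAvatarKit's actual API; they show the pattern of wiring STT, LLM, and TTS stages behind interchangeable interfaces.

```python
from dataclasses import dataclass
from typing import Protocol

# Hypothetical interfaces standing in for pluggable backends
# (e.g. Azure/Google/OpenAI STT, ChatGPT/Claude/Gemini, VOICEVOX TTS).
class SpeechToText(Protocol):
    def transcribe(self, audio: bytes) -> str: ...

class ChatModel(Protocol):
    def reply(self, text: str) -> str: ...

class TextToSpeech(Protocol):
    def synthesize(self, text: str) -> bytes: ...

@dataclass
class AvatarPipeline:
    """Wires STT -> LLM -> TTS; any stage can be swapped independently."""
    stt: SpeechToText
    llm: ChatModel
    tts: TextToSpeech

    def handle_turn(self, audio_in: bytes) -> bytes:
        text = self.stt.transcribe(audio_in)   # speech -> text
        answer = self.llm.reply(text)          # text -> response
        return self.tts.synthesize(answer)     # response -> audio

# Trivial stub components, so the sketch runs without external services.
class EchoSTT:
    def transcribe(self, audio: bytes) -> str:
        return audio.decode("utf-8")

class UppercaseLLM:
    def reply(self, text: str) -> str:
        return text.upper()

class BytesTTS:
    def synthesize(self, text: str) -> bytes:
        return text.encode("utf-8")

pipeline = AvatarPipeline(stt=EchoSTT(), llm=UppercaseLLM(), tts=BytesTTS())
print(pipeline.handle_turn(b"hello avatar"))  # b'HELLO AVATAR'
```

Because each stage depends only on a structural interface, replacing one backend (say, a different TTS engine) never touches the other stages; this is the property that lets the framework mix providers and custom behaviors.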
Quick Start & Requirements
Install the latest tagged release directly from GitHub:

pip install git+https://github.com/uezo/aiavatarkit.git@v0.6.5

Run the repository's run.py example after installation
and setup.

Highlighted Details
Maintenance & Community
The project is actively maintained by uezo. The README does not mention community channels (Discord/Slack) or a public roadmap.
Licensing & Compatibility
The project is licensed under the MIT License, permitting commercial use and integration with closed-source applications.
Limitations & Caveats
The README notes that PyPI versions may lag behind GitHub releases during transition periods. Some features, such as "Animation" and "Raspberry Pi" deployment, are marked "Now writing...", indicating incomplete implementation or documentation. Support for Claude and Gemini is limited to their native APIs (not Amazon Bedrock or Vertex AI) unless an API proxy such as LiteLLM is used.
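The LiteLLM workaround mentioned above relies on the proxy exposing an OpenAI-compatible endpoint, so Bedrock- or Vertex-hosted models can be reached through the framework's existing OpenAI path. A minimal sketch of the request shape; the base URL, port, and model alias are assumptions for illustration, not taken from the README:

```python
import json

# Assumption: a LiteLLM proxy is running locally on its default port 4000
# and has a model alias (here "bedrock-claude") mapped to a Bedrock model.
LITELLM_BASE_URL = "http://localhost:4000/v1"

def chat_request(model: str, user_text: str) -> dict:
    """Build an OpenAI-style chat completion payload aimed at the proxy."""
    return {
        "url": f"{LITELLM_BASE_URL}/chat/completions",
        "body": {
            "model": model,
            "messages": [{"role": "user", "content": user_text}],
        },
    }

req = chat_request("bedrock-claude", "Hello!")
print(json.dumps(req["body"]))
```

In practice the body would be POSTed to the URL with an HTTP client; the point is only that the proxy speaks the standard OpenAI chat format, which is what makes it usable where only the OpenAI API is supported.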