aiavatarkit by uezo

SDK for building AI-based conversational avatars

Created 2 years ago

503 stars

Top 61.9% on SourcePulse

Project Summary

This project provides a framework for building AI-powered conversational avatars, aimed at developers and metaverse enthusiasts. It supports real-time voice conversation, facial expressions, and platform integration, so an interactive avatar can be assembled from existing components rather than built from scratch.

How It Works

AIAvatarKit orchestrates multiple AI services, including LLMs (ChatGPT, Claude, Gemini, Dify), Text-to-Speech (VOICEVOX, Azure, Google, OpenAI), and Speech-to-Text (Azure, Google, OpenAI). It uses a modular design, allowing users to swap components and integrate custom behaviors. The framework handles audio input/output, speech processing, LLM interaction, and output synthesis (voice, facial expressions via OSC for VRChat), creating a cohesive conversational experience.
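The swappable-component design described above can be illustrated with a minimal sketch. This is not AIAvatarKit's actual API: the `SpeechToText`, `LanguageModel`, `TextToSpeech`, and `Pipeline` names are hypothetical, and the stand-in classes do no real speech or LLM work. It only shows how a speech-to-text, LLM, and text-to-speech stage might be chained behind small interfaces so that each can be replaced independently.

```python
from dataclasses import dataclass
from typing import Protocol


# Hypothetical interfaces, one per stage. Not taken from AIAvatarKit.
class SpeechToText(Protocol):
    def transcribe(self, audio: bytes) -> str: ...


class LanguageModel(Protocol):
    def reply(self, text: str) -> str: ...


class TextToSpeech(Protocol):
    def synthesize(self, text: str) -> bytes: ...


@dataclass
class Pipeline:
    """Chains the three stages for a single conversational turn."""
    stt: SpeechToText
    llm: LanguageModel
    tts: TextToSpeech

    def turn(self, audio: bytes) -> bytes:
        text = self.stt.transcribe(audio)    # audio -> user text
        answer = self.llm.reply(text)        # user text -> response text
        return self.tts.synthesize(answer)   # response text -> audio


# Trivial stand-ins so the sketch runs without any external service.
class EchoSTT:
    def transcribe(self, audio: bytes) -> str:
        return audio.decode("utf-8")


class UpperLLM:
    def reply(self, text: str) -> str:
        return text.upper()


class BytesTTS:
    def synthesize(self, text: str) -> bytes:
        return text.encode("utf-8")


pipeline = Pipeline(stt=EchoSTT(), llm=UpperLLM(), tts=BytesTTS())
result = pipeline.turn(b"hello")
```

Replacing `UpperLLM` with a class that calls a real LLM service would leave the other two stages untouched, which is the point of keeping each stage behind its own interface.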

Quick Start & Requirements

Install: pip install git+https://github.com/uezo/aiavatarkit.git@v0.6.5
Requirements: Python 3.10+, VOICEVOX API, OpenAI API key (for ChatGPT and Speech-to-Text).
Quick Start: See README for example run.py and setup.
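Before installing, the stated requirements can be checked with a small preflight script. This is a sketch under assumptions: `check_requirements` is a hypothetical helper, and reading the key from an `OPENAI_API_KEY` environment variable is a common convention, not something this summary confirms AIAvatarKit does. It does not check whether a VOICEVOX server is reachable.

```python
import os
import sys


def check_requirements(version_info=sys.version_info, env=os.environ):
    """Return a list of unmet requirements (empty if all checks pass)."""
    problems = []
    # The summary states Python 3.10+ is required.
    if tuple(version_info[:2]) < (3, 10):
        problems.append("Python 3.10 or newer is required")
    # Assumption: the OpenAI key is supplied via this environment variable.
    if not env.get("OPENAI_API_KEY"):
        problems.append("OPENAI_API_KEY is not set")
    return problems
```

Calling `check_requirements()` with no arguments inspects the running interpreter and the current environment; passing explicit values makes the function easy to test.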

Highlighted Details

Supports multiple LLMs (ChatGPT, Claude, Gemini, Dify), with further models reachable via LiteLLM or custom implementations, alongside multiple TTS and STT services.
Enables real-time facial expression control for VRChat using Avatar OSC.
Features include vision processing (screenshots/camera input for AI), function calling, long-term memory, and custom behavior hooks.
Offers RESTful API and WebSocket server options for distributed deployment and client interaction.

Maintenance & Community

The project is actively maintained by uezo. The README does not describe community channels (Discord/Slack) or a roadmap.

Licensing & Compatibility

The project is licensed under the MIT License, permitting commercial use and integration with closed-source applications.

Limitations & Caveats

The README notes that PyPI versions may lag behind GitHub releases during transition periods. Some features, such as "Animation" and "Raspberry Pi" deployment, are marked "Now writing...", indicating that they are incomplete. Support for Claude and Gemini is limited to specific APIs (not Amazon Bedrock or Vertex AI) unless an API proxy such as LiteLLM is used.

Health Check

Last Commit: 6 days ago
Responsiveness: Inactive

Star History

49 stars in the last 30 days