aiavatarkit by uezo

SDK for building AI-based conversational avatars

created 2 years ago
389 stars

Top 74.9% on sourcepulse

Project Summary

This project provides a framework for building AI-powered conversational avatars, targeting developers and metaverse enthusiasts. It bundles real-time voice conversation, facial expressions, and platform integration, so an interactive avatar can be assembled from existing parts rather than built from scratch.

How It Works

AIAvatarKit orchestrates multiple AI services, including LLMs (ChatGPT, Claude, Gemini, Dify), Text-to-Speech (VOICEVOX, Azure, Google, OpenAI), and Speech-to-Text (Azure, Google, OpenAI). It uses a modular design, allowing users to swap components and integrate custom behaviors. The framework handles audio input/output, speech processing, LLM interaction, and output synthesis (voice, facial expressions via OSC for VRChat), creating a cohesive conversational experience.
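The swap-a-component idea described above can be sketched roughly as follows. This is an illustration only, not AIAvatarKit's actual API: the `SpeechToText`, `ChatModel`, `TextToSpeech`, and `ConversationPipeline` names are hypothetical, and the fake components stand in for real STT, LLM, and TTS services.

```python
from dataclasses import dataclass
from typing import Protocol


# Hypothetical interfaces; AIAvatarKit's real class names may differ.
class SpeechToText(Protocol):
    def transcribe(self, audio: bytes) -> str: ...


class ChatModel(Protocol):
    def reply(self, text: str) -> str: ...


class TextToSpeech(Protocol):
    def synthesize(self, text: str) -> bytes: ...


# Stand-in components; real ones would call a cloud or local service.
class FakeSTT:
    def transcribe(self, audio: bytes) -> str:
        return audio.decode("utf-8")


class FakeLLM:
    def reply(self, text: str) -> str:
        return f"You said: {text}"


class FakeTTS:
    def synthesize(self, text: str) -> bytes:
        return text.encode("utf-8")


@dataclass
class ConversationPipeline:
    """Chains the three stages: audio in -> text -> answer -> audio out."""
    stt: SpeechToText
    llm: ChatModel
    tts: TextToSpeech

    def respond(self, audio: bytes) -> bytes:
        text = self.stt.transcribe(audio)
        answer = self.llm.reply(text)
        return self.tts.synthesize(answer)


pipeline = ConversationPipeline(FakeSTT(), FakeLLM(), FakeTTS())
print(pipeline.respond(b"hello"))  # b'You said: hello'
```

Because each stage only has to satisfy a small interface, replacing one provider with another means passing a different object to the pipeline without touching the other stages.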

Quick Start & Requirements

  • Install: pip install git+https://github.com/uezo/aiavatarkit.git@v0.6.5
  • Requirements: Python 3.10+, VOICEVOX API, OpenAI API key (for ChatGPT and Speech-to-Text).
  • Quick Start: See README for example run.py and setup.

Highlighted Details

  • Supports multiple LLMs (ChatGPT, Claude, Gemini, Dify) and TTS/STT services via LiteLLM and custom implementations.
  • Enables real-time facial expression control for VRChat using Avatar OSC.
  • Features include vision processing (screenshots/camera input for AI), function calling, long-term memory, and custom behavior hooks.
  • Offers RESTful API and WebSocket server options for distributed deployment and client interaction.

Maintenance & Community

The project is actively maintained by uezo. The README does not mention community channels (Discord/Slack) or a roadmap.

Licensing & Compatibility

The project is licensed under the MIT License, permitting commercial use and integration with closed-source applications.

Limitations & Caveats

The README notes that PyPI versions may lag behind GitHub releases during transition periods. Some sections, such as "Animation" and "Raspberry Pi" deployment, are marked "Now writing...", which suggests they are not yet complete. Built-in support for Claude and Gemini is limited to specific APIs and does not cover Amazon Bedrock or Vertex AI unless you go through an API proxy such as LiteLLM.

Health Check

  • Last commit: 1 day ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 14
  • Issues (30d): 3
  • Star History: 68 stars in the last 90 days
