ChatdollKit  by uezo

3D virtual assistant SDK for voice-enabled chatbots using 3D models

created 5 years ago
965 stars

Top 39.0% on sourcepulse

Project Summary

ChatdollKit is a Unity-based SDK for creating voice-enabled 3D chatbot avatars. It targets developers and creators looking to integrate generative AI, 3D model animation, and speech technologies into interactive virtual agents for platforms like PC, mobile, VR, AR, and WebGL. The SDK aims to simplify the complex process of building expressive and responsive AI characters.

How It Works

ChatdollKit orchestrates interactions between Large Language Models (LLMs) for dialogue, Speech-to-Text (STT) for input, and Text-to-Speech (TTS) for output. It synchronizes these with 3D model animations and facial expressions, allowing AI-driven characters to respond dynamically to user input. Key features include LLM integration (ChatGPT, Gemini, Claude), various TTS/STT providers, and robust 3D model control for lip-sync, facial expressions, and animations, all managed within the Unity engine.
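The turn-by-turn flow described above (speech in, LLM reply, speech and animation out) can be illustrated with a small, language-neutral sketch. This is not ChatdollKit code and does not use its API: ChatdollKit itself runs inside Unity, and every name here (`run_turn`, `AvatarLog`, the `fake_*` providers, the `"smile"` cue) is a hypothetical stand-in chosen only to show the order of the stages.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class AvatarLog:
    """Records, in order, what a hypothetical avatar would be asked to do."""
    events: List[str] = field(default_factory=list)

    def emit(self, kind: str, value: str) -> None:
        self.events.append(f"{kind}:{value}")


def run_turn(audio: bytes,
             stt: Callable[[bytes], str],
             llm: Callable[[str], str],
             tts: Callable[[str], bytes],
             avatar: AvatarLog) -> bytes:
    """One dialogue turn: user speech in, reply audio and avatar cues out."""
    text = stt(audio)              # Speech-to-Text: captured audio -> user text
    avatar.emit("heard", text)
    reply = llm(text)              # LLM: user text -> reply text
    avatar.emit("face", "smile")   # placeholder facial-expression cue (assumed)
    voice = tts(reply)             # Text-to-Speech: reply text -> reply audio
    avatar.emit("speak", reply)    # a real avatar would drive lip-sync from the audio
    return voice


# Stub providers standing in for real STT, LLM, and TTS services.
def fake_stt(audio: bytes) -> str:
    return audio.decode("utf-8")


def fake_llm(prompt: str) -> str:
    return f"You said: {prompt}"


def fake_tts(text: str) -> bytes:
    return text.encode("utf-8")


if __name__ == "__main__":
    avatar = AvatarLog()
    run_turn(b"hello", fake_stt, fake_llm, fake_tts, avatar)
    for event in avatar.events:
        print(event)
```

Passing the three providers in as plain callables mirrors the idea that the LLM, STT, and TTS services are interchangeable while the surrounding turn logic stays the same.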

Quick Start & Requirements

  • Installation: Import ChatdollKit.unitypackage into a Unity project.
  • Prerequisites: Unity (non-SRP project template), Burst, UniTask (v2.5.4+), uLipSync (v3.1.0+), UniVRM (v0.127.2+), JSON.NET, and optionally Azure Speech SDK.
  • Setup: Import the dependencies, add the AIAvatarVRM prefab, configure the LLM and speech services with API keys, and set up animations via ModelController.
  • Demo: A WebGL demo is available. A YouTube video walks through setting up the demo scene with ChatGPT.
  • Docs: Comprehensive setup and feature documentation is provided within the README.

Highlighted Details

  • Supports multiple LLMs (ChatGPT, Gemini, Claude, Dify) with function calling and multimodal capabilities.
  • Enables dynamic 3D model expression, including lip-sync, facial expressions, and animations synchronized with speech.
  • Offers extensive platform compatibility (Windows, Mac, Linux, iOS, Android, WebGL, VR, AR).
  • Includes dynamic language switching, long-term memory integration, and wake word detection.

Maintenance & Community

The project is actively maintained by uezo. The README does not list community links; the project is oriented toward developer integration.

Licensing & Compatibility

The project's license is not explicitly stated in the provided README text. Compatibility for commercial use would depend on the specific license terms.

Limitations & Caveats

  • Unity Scriptable Render Pipeline (SRP) projects are not supported due to UniVRM limitations.
  • WebGL builds have specific requirements: use UniTask for async/await and ChatdollMicrophone for microphone input. Compressed audio formats are not supported in WebGL builds.
  • Some features like microphone control in WebGL or specific STT/TTS integrations might require platform-specific adjustments.

Health Check

  • Last commit: 1 week ago
  • Responsiveness: 1 day
  • Pull Requests (30d): 11
  • Issues (30d): 1

Star History

77 stars in the last 90 days
