z-waif  by SugarcaneDefender

Local AI companion program for personal use

Created 1 year ago
389 stars

Top 73.8% on SourcePulse

Project Summary

This project provides a fully local, open-source framework for creating personalized AI companions, often referred to as "waifus." It targets users interested in virtual companionship, VTubing, and integrating AI into their digital lives, offering a cohesive front-end to manage various AI models and modules for enhanced interaction and customization.

How It Works

z-waif acts as an orchestrator, integrating multiple open-source AI tools like Oobabooga for language models, RVC for voice cloning, and Whisper for speech-to-text. It leverages a custom Retrieval-Augmented Generation (RAG) system for long-term memory, allowing the AI to recall past conversations and lorebook entries. The architecture emphasizes modularity, enabling integration with platforms like VTube Studio, Discord, and Minecraft, with a focus on local execution for privacy and control.
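The summary does not describe how the project's RAG system is built, so the following is only a minimal sketch of the general idea: store past messages and lorebook entries, retrieve the ones most similar to a new message, and prepend them to the prompt. The names `MemoryStore` and `build_prompt` are hypothetical, and the bag-of-words cosine similarity is a stand-in for whatever retrieval method z-waif actually uses.

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class MemoryStore:
    """Hypothetical long-term memory: stores text and recalls similar entries."""

    def __init__(self) -> None:
        self._entries: list[tuple[str, Counter]] = []

    def add(self, text: str) -> None:
        self._entries.append((text, Counter(tokenize(text))))

    def recall(self, query: str, k: int = 2) -> list[str]:
        """Return up to k stored entries that share vocabulary with the query."""
        q = Counter(tokenize(query))
        scored = [(cosine(q, vec), text) for text, vec in self._entries]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [text for score, text in scored[:k] if score > 0]


def build_prompt(memory: MemoryStore, user_message: str) -> str:
    """Prepend recalled memories to the user's message (assumed prompt layout)."""
    recalled = memory.recall(user_message)
    context = "\n".join(f"- {m}" for m in recalled)
    return f"Relevant memories:\n{context}\n\nUser: {user_message}"
```

A real system would more likely use learned embeddings and a vector index, but the retrieve-then-prompt flow would look broadly similar.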

Quick Start & Requirements

  • Installation: Follow provided documentation and YouTube tutorials.
  • Prerequisites:
    • Windows 10/11 recommended; Mac and Linux are also supported.
    • Recommended: NVIDIA GPU with 16GB+ VRAM.
    • Minimum: a GPU of any brand with 8GB+ VRAM.
    • Requires Oobabooga, RVC, and Whisper.
  • Resources: Setup time and resource footprint depend on the specific AI models used.
  • Links: YouTube Showcase, Documentation, Website, Discord
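The GPU guidance above can be expressed as a small check. This is an illustrative sketch only: `hardware_tier` and its constants are hypothetical names, not part of z-waif, and the VRAM figure has to be supplied by the caller rather than detected.

```python
MIN_VRAM_GB = 8           # minimum stated for a GPU of any brand
RECOMMENDED_VRAM_GB = 16  # recommended amount, stated for NVIDIA GPUs


def hardware_tier(vram_gb: float, vendor: str) -> str:
    """Classify a GPU against the stated minimum and recommended specs."""
    if vram_gb < MIN_VRAM_GB:
        return "below minimum"
    if vram_gb >= RECOMMENDED_VRAM_GB and vendor.strip().lower() == "nvidia":
        return "recommended"
    return "minimum"
```

For example, `hardware_tier(12, "AMD")` falls in the "minimum" tier, while `hardware_tier(24, "NVIDIA")` meets the recommended spec.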

Highlighted Details

  • VTuber Integration: Supports VTube Studio with emote and animation synchronization.
  • Enhanced Memory: Features custom RAG for long-term conversation recall and lorebook integration.
  • Modularity: Includes modules for Discord, multimodal vision, alarms, and game integration (Minecraft via Baritone/Wurst).
  • Local Execution: All AI processing is performed locally for enhanced privacy and control.
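To illustrate what a modular, locally run pipeline of this kind might look like, here is a minimal sketch in which speech-to-text, language-model, and voice backends are plain callables, and optional modules subscribe to each reply. The `Orchestrator` class and all of its fields are assumptions made for illustration; they are not z-waif's actual classes or APIs.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Orchestrator:
    """Hypothetical pipeline: audio in -> text -> reply -> audio out."""

    transcribe: Callable[[bytes], str]  # stand-in for a speech-to-text backend
    generate: Callable[[str], str]      # stand-in for a language-model backend
    speak: Callable[[str], bytes]       # stand-in for a voice backend
    listeners: list[Callable[[str], None]] = field(default_factory=list)

    def register(self, listener: Callable[[str], None]) -> None:
        """Attach an optional module (e.g. a chat bridge or avatar controller)."""
        self.listeners.append(listener)

    def handle_audio(self, audio: bytes) -> bytes:
        """Run one conversational turn and notify every registered module."""
        text = self.transcribe(audio)
        reply = self.generate(text)
        for listener in self.listeners:
            listener(reply)
        return self.speak(reply)
```

Because each backend is just a callable, any stage can be swapped or stubbed out without touching the others, which is the property the modularity bullet describes.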

Maintenance & Community

The project is actively maintained, with recent updates (v1.11) including Ollama image model options and bug fixes. There are community forks offering additional features like Twitch integration and Russian language support. A Discord server is available for community support.

Licensing & Compatibility

The project is described as open-source, but its README does not explicitly state a license. Because it builds on other open-source tools, whether it can be used in commercial or closed-source applications depends both on the project's own license, once confirmed, and on the licenses of those dependencies.

Limitations & Caveats

The project is described as being in an "early access state," with potential for mild bugs, jankiness, or obtuse elements. Some advanced features, like multiprocessed RAG and streaming LLM/TTS, are still under development.

Health Check

  • Last commit: 2 months ago
  • Responsiveness: 1 day
  • Pull requests (30d): 0
  • Issues (30d): 1
  • Star history: 44 stars in the last 30 days
