AIwaifu by HRNPH

Open-source project for creating a customizable AI waifu

Created 2 years ago

501 stars

Top 62.1% on SourcePulse

1 Expert Loves This Project

Teknium

Cofounder of Nous Research

Project Summary

AI Waifu is an open-source, customizable AI companion inspired by Neuro-sama, aimed at users who want to build and deploy their own AI waifu. The project describes it as finetunable, talkable, streamable, modifiable, and even "lewdable," and every component is self-hosted.

How It Works

The project chains several AI models. A language model (Pygmalion 1.3B) generates the conversation, and its English responses are translated to Japanese with Facebook's NLLB 600M model. A VITS-based TTS model then produces the Japanese speech output. Voice conversion is handled by Sovits, which requires a compiled monotonic_align module. Inference runs in a separate HTTP server, so the models can be hosted on a different machine from the client, such as a home server.
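The split between client and inference server can be illustrated with a minimal sketch using only the Python standard library. This is not the project's actual server: the function names, the JSON fields (`prompt`, `reply_en`, `reply_ja`), and the single POST endpoint are all assumptions made for illustration, and the model stages are replaced with stubs.

```python
import json
import threading
from http.client import HTTPConnection
from http.server import BaseHTTPRequestHandler, HTTPServer


def generate_reply(prompt: str) -> str:
    """Stub standing in for the language model (hypothetical)."""
    return f"You said: {prompt}"


def translate_to_japanese(text: str) -> str:
    """Stub standing in for the English-to-Japanese translation model."""
    return f"[ja] {text}"


def run_pipeline(prompt: str) -> dict:
    """Chain the stages in the order described above."""
    reply_en = generate_reply(prompt)
    reply_ja = translate_to_japanese(reply_en)
    # A real server would also synthesize speech from reply_ja at this point.
    return {"reply_en": reply_en, "reply_ja": reply_ja}


class InferenceHandler(BaseHTTPRequestHandler):
    """Accepts a JSON body with a 'prompt' field and returns the pipeline output."""

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        payload = json.loads(self.rfile.read(length) or b"{}")
        body = json.dumps(run_pipeline(payload.get("prompt", ""))).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the example quiet


def start_server(port: int = 0) -> HTTPServer:
    """Start the stub inference server in a background thread (port 0 = any free port)."""
    server = HTTPServer(("127.0.0.1", port), InferenceHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def ask(host: str, port: int, prompt: str) -> dict:
    """Client side: send a prompt to the inference server and decode the reply."""
    conn = HTTPConnection(host, port, timeout=5)
    try:
        body = json.dumps({"prompt": prompt}).encode("utf-8")
        conn.request("POST", "/", body=body,
                     headers={"Content-Type": "application/json"})
        return json.loads(conn.getresponse().read())
    finally:
        conn.close()
```

Because the client only needs a host and port, pointing `ask` at another machine on the network is enough to move the heavy models off the desktop that runs the avatar.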

Quick Start & Requirements

Installation: Clone the repository, install Poetry, and run poetry install. Compile the monotonic_align module separately.
Prerequisites: Python 3.8.x, C/C++ build tools, CMake, Git LFS.
Runtime: Minimum 12GB RAM (16GB recommended) for the inference server. For GPU inference, a minimum of 8GB VRAM (Nvidia GPU only) is required.
Integration: Requires VTube Studio and the desktop audio plugin by Lua Lucky, configured to use a specific API port.
Links: GitHub Repository
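The stated requirements can be turned into a quick pre-flight check. The sketch below is hypothetical and not part of the project: the function name and the idea of passing memory sizes in by hand are assumptions, and only the thresholds (Python 3.8.x, 12GB minimum and 16GB recommended RAM, 8GB VRAM for GPU inference) come from the list above.

```python
import sys

# Thresholds taken from the requirements listed above.
REQUIRED_PYTHON = (3, 8)
MIN_RAM_GB = 12
RECOMMENDED_RAM_GB = 16
MIN_VRAM_GB = 8


def check_requirements(python_version, ram_gb, vram_gb=None):
    """Return (errors, warnings) for the given environment.

    python_version: a tuple such as sys.version_info
    ram_gb:         system memory in GB, supplied by the caller
    vram_gb:        Nvidia GPU memory in GB, or None for CPU-only inference
    """
    errors, warnings = [], []
    if tuple(python_version[:2]) != REQUIRED_PYTHON:
        errors.append("Python 3.8.x is required")
    if ram_gb < MIN_RAM_GB:
        errors.append(f"at least {MIN_RAM_GB}GB RAM is required")
    elif ram_gb < RECOMMENDED_RAM_GB:
        warnings.append(f"{RECOMMENDED_RAM_GB}GB RAM is recommended")
    if vram_gb is not None and vram_gb < MIN_VRAM_GB:
        errors.append(f"GPU inference needs at least {MIN_VRAM_GB}GB VRAM")
    return errors, warnings


if __name__ == "__main__":
    # Example values; substitute the real memory sizes of the target machine.
    errs, warns = check_requirements(sys.version_info, ram_gb=16, vram_gb=8)
    for message in errs:
        print("ERROR:", message)
    for message in warns:
        print("WARNING:", message)
```

The check does not detect C/C++ build tools, CMake, Git LFS, or the GPU vendor; those still need to be verified by hand.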

Highlighted Details

Finetunable and customizable AI waifu.
Supports talkable, flirtable, streamable, and modifiable interactions.
Leverages open-source models, explicitly avoiding proprietary ones like ChatGPT.
Japanese TTS output is a deliberate choice for "cuteness."

Maintenance & Community

The project encourages community contributions through issues and pull requests. Discussions on model performance are hosted on GitHub.

Licensing & Compatibility

The project states "Everything We Made Is OpenSourced, Free & Customizable To the Very Core." The README does not explicitly name a license, so the open-source framing alone does not establish which terms apply; confirm the license in the repository before reuse. Compatibility with commercial or closed-source applications is not detailed.

Limitations & Caveats

The README warns that components may be unstable ("Sometime shit can be broke"). GPU inference is limited to Nvidia hardware. The TTS model is currently Japanese-only, so English responses are translated to Japanese by a separate model before speech synthesis.

Health Check

Last Commit

1 year ago

Responsiveness

Inactive

Star History

1 star in the last 30 days