Neuro by kimjammer

Local AI Vtuber recreation of Neuro-Sama

Created 2 years ago

1,385 stars

Top 29.0% on SourcePulse

Project Summary

This project recreates the AI VTuber Neuro-Sama, enabling users to run a similar system on consumer hardware using local LLMs. It targets VTubers, streamers, and AI enthusiasts looking for an interactive, voice-driven AI companion with real-time speech processing and VTuber integration. The primary benefit is a highly customizable and locally-hosted AI personality.

How It Works

The system integrates real-time Speech-to-Text (STT) and Text-to-Speech (TTS) using the KoljaB/RealtimeSTT and KoljaB/RealtimeTTS libraries, respectively. It leverages an OpenAI-compatible API endpoint for LLM interaction, allowing flexibility in model choice (e.g., Llama 3 8B Instruct via text-generation-webui). Multimodality is supported via custom servers like Neuro-LLM-Server, enabling visual input processing. State and data are managed through a shared signals object, with modular components running in separate threads for extensibility.

Quick Start & Requirements

Install: Follow detailed instructions in the README for text-generation-webui, an LLM, and Vtube Studio.
Prerequisites: Nvidia GPU (12GB VRAM recommended), Python 3.11.9, PyTorch 2.2.2 with CUDA 11.8, pip install torch==2.2.2 torchvision==0.17.2 torchaudio==2.2.2 --index-url https://download.pytorch.org/whl/cu118, DeepSpeed (via AllTalkTTS wheels), Twitch developer account credentials, a voice reference WAV file.
Setup: Requires configuring .env and constants.py files, including audio device selection and API keys.
Links: DEMO VIDEO, neurofrontend

Highlighted Details

Real-time STT and TTS for natural voice interaction.
Flexible LLM integration with text-generation-webui or any OpenAI-compatible endpoint.
Long-term memory and RAG capabilities, with automatic memory generation.
Multimodal support via custom servers like Neuro-LLM-Server (e.g., MiniCPM-Llama3-V-2_5-int4).
VTuber integration with Vtube Studio via virtual audio cables for lip-sync.

Maintenance & Community

Project is experimental and educational.
Ko-fi tips are appreciated for support.
Links to community channels are not explicitly provided in the README.

Licensing & Compatibility

The README does not explicitly state a license. It mentions "see LICENSE for the repository license," but no LICENSE file is present in the provided context.
Attribution is appreciated in derivative works.

Limitations & Caveats

The project is experimental and created for educational/recreational purposes, with no guarantee against "non-vile responses." Content filtering is minimal (currently only "turkey"). Twitch bans are possible for unsafe content. Discord integration attempts were deemed unusable due to platform limitations.

Neuro by kimjammer

Explore Similar Projects

Stream-Omni by ictnlp

llama-assistant by nrl-ai

z-waif by SugarcaneDefender

org-ai by rksm

jarvis by llm-guy

my-neuro by morettt

ai_virtual_mate_web by swordswind

gp.nvim by Robitx

Qwen3-Omni by QwenLM

py-gpt by szczyglis-dev

ultravox by fixie-ai

chatgpt-web-midjourney-proxy by Dooy