vits_with_chatgpt-gpt3 by Paraworks

Chatbot with text-to-speech using VITS and optional ChatGPT/ChatGLM

Created 3 years ago

387 stars

Top 74.1% on SourcePulse

Project Summary

This repository provides a text-to-speech (TTS) system built on VITS that integrates with large language models such as GPT-3.5/GPT-3 and ChatGLM for conversational AI applications. It targets developers and researchers building interactive voice agents or chatbots.

How It Works

The system uses the VITS (Variational Inference with adversarial learning for end-to-end Text-to-Speech) model for speech synthesis. It calls external LLMs through their APIs to generate text responses, then passes each response to the VITS model to produce speech output. The architecture supports custom chat servers and lets users configure different LLM backends and speech processing pipelines.
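The flow described above (user text, then LLM reply, then synthesized speech) can be sketched in a few lines. This is only an illustration under stated assumptions, not the repository's code: `echo_backend`, `silent_synthesizer`, `BACKENDS`, `VoiceReply`, and `respond` are all hypothetical names, and the two stand-in functions replace the real LLM API call and the real VITS inference.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical type aliases (not from the repository):
# a chat backend maps user text to reply text,
# a synthesizer maps reply text to audio samples.
ChatBackend = Callable[[str], str]
Synthesizer = Callable[[str], List[float]]


@dataclass
class VoiceReply:
    """A reply as both text and synthesized audio samples."""
    text: str
    audio: List[float]


def echo_backend(prompt: str) -> str:
    """Stand-in for an LLM API call; a real backend would query GPT or ChatGLM."""
    return f"You said: {prompt}"


def silent_synthesizer(text: str) -> List[float]:
    """Stand-in for VITS inference; returns one zero-valued sample per character."""
    return [0.0] * len(text)


# Registry of selectable backends; real entries would wrap API clients.
BACKENDS: Dict[str, ChatBackend] = {"echo": echo_backend}


def respond(prompt: str, backend_name: str, synthesize: Synthesizer) -> VoiceReply:
    """Generate a text reply with the chosen backend, then synthesize it."""
    try:
        backend = BACKENDS[backend_name]
    except KeyError:
        raise ValueError(f"unknown backend: {backend_name!r}") from None
    reply = backend(prompt)
    return VoiceReply(text=reply, audio=synthesize(reply))
```

Keeping the backend and the synthesizer as plain callables means either side can be swapped without touching `respond`, which is one plausible way to support several LLM backends behind a single pipeline.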

Quick Start & Requirements

Installation: Clone the repository and install dependencies using pip install -r requirements.txt.
Prerequisites: Python 3.8+, Anaconda, Git, FFmpeg. Optional: CUDA for GPU acceleration, pyopenjtalk for Japanese speech synthesis (requires CMake).
Setup: The README outlines detailed setup steps for Linux and Windows, including environment creation and dependency installation.
Links: Hugging Face Repo
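Before installing, it can help to confirm that the listed prerequisites are present. The sketch below is an assumption-laden helper, not part of the repository: `check_prerequisites` is a hypothetical name, and it assumes Git and FFmpeg are exposed on the PATH as `git` and `ffmpeg`.

```python
import shutil
import sys
from typing import Dict, Sequence, Tuple

# Minimum Python version taken from the prerequisites list above.
MIN_PYTHON: Tuple[int, int] = (3, 8)
# Assumed executable names; adjust if your system uses different ones.
REQUIRED_TOOLS: Sequence[str] = ("git", "ffmpeg")


def check_prerequisites(
    tools: Sequence[str] = REQUIRED_TOOLS,
    min_python: Tuple[int, int] = MIN_PYTHON,
) -> Dict[str, bool]:
    """Return a mapping from each requirement to whether it is satisfied."""
    key = f"python>={min_python[0]}.{min_python[1]}"
    report = {key: sys.version_info[:2] >= min_python}
    for tool in tools:
        # shutil.which returns None when the executable is not on PATH.
        report[tool] = shutil.which(tool) is not None
    return report


if __name__ == "__main__":
    for name, ok in check_prerequisites().items():
        print(f"{name}: {'found' if ok else 'MISSING'}")
```

Other tools, such as `conda` or `cmake` for the optional pyopenjtalk build, could be added to the `tools` argument in the same way.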

Highlighted Details

Supports multiple LLM backends: GPT-3.5/GPT-3 API, ChatGLM.
Offers a web UI for chatbot configuration and VITS model loading.
Includes an ONNX export tool for the VITS model.
Provides guidance on handling Japanese text processing for VITS.

Maintenance & Community

Information regarding maintainers, community channels, or roadmaps is not explicitly detailed in the provided README.

Licensing & Compatibility

The repository's licensing is not specified in the README. Compatibility for commercial use or closed-source linking would depend on the underlying licenses of VITS and the LLM APIs used.

Limitations & Caveats

The README notes that using pyopenjtalk for Japanese synthesis may yield suboptimal results, suggesting an alternative cleaner. It also warns about potential issues with specific dependency versions (e.g., protobuf, transformers) when using ChatGLM. The project's status (e.g., alpha, beta) and long-term maintenance are not clear.
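Since the caveat concerns dependency versions, a quick way to see what is actually installed is to query package metadata. This is a minimal sketch using the standard library's `importlib.metadata`; `installed_versions` is a hypothetical helper, and no specific version pins are assumed here because the summary does not state them.

```python
from importlib import metadata
from typing import Dict, Iterable, Optional


def installed_versions(packages: Iterable[str]) -> Dict[str, Optional[str]]:
    """Map each package name to its installed version, or None if absent."""
    versions: Dict[str, Optional[str]] = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    # Packages named in the caveat above.
    for name, version in installed_versions(["protobuf", "transformers"]).items():
        print(f"{name}: {version or 'not installed'}")
```

The output can then be compared by hand against whatever versions the README recommends for ChatGLM.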

Health Check

Last Commit: 2 years ago
Responsiveness: Inactive

Star History

1 star in the last 30 days