bark.cpp by PABannier

C/C++ library for fast, local text-to-speech generation using Suno AI's Bark model

Created 2 years ago

851 stars

Top 42.0% on SourcePulse

Project Summary

This project provides a pure C/C++ implementation of Suno AI's Bark text-to-speech model, targeting developers and researchers who need efficient, real-time, multilingual speech generation. It pursues performance through CPU and GPU backends, AVX-family SIMD support on x86, and several weight formats: 4-bit, 5-bit, and 8-bit integer quantization alongside F16 and F32 floating-point precision.

How It Works

The implementation builds on the GGML tensor library, which provides CPU and GPU (CUDA, Metal) acceleration. It supports multiple quantization strategies to reduce memory footprint and improve inference speed; the codec model is left unquantized to preserve audio quality. The architecture is designed for minimal dependencies and cross-platform compatibility.
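To make the quantization trade-off concrete, here is a minimal sketch of block-wise symmetric 8-bit quantization. It is an illustration only, not code from bark.cpp or GGML: the block size of 32, the `QuantBlock` layout, and the function names `quantize_q8` and `dequantize_q8` are all assumptions made for this example. Each block stores one floating-point scale plus small integers, which is where the memory saving comes from.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Assumption for this sketch: 32 weights share one scale factor.
constexpr std::size_t kBlockSize = 32;

// Hypothetical storage layout: one scale plus 32 signed 8-bit values.
struct QuantBlock {
    float scale;                // real value represented by one integer step
    std::int8_t q[kBlockSize];  // quantized weights in [-127, 127]
};

// Quantize a weight vector whose length is a multiple of kBlockSize.
std::vector<QuantBlock> quantize_q8(const std::vector<float>& w) {
    assert(w.size() % kBlockSize == 0);
    std::vector<QuantBlock> out(w.size() / kBlockSize);
    for (std::size_t b = 0; b < out.size(); ++b) {
        const float* src = w.data() + b * kBlockSize;

        // The largest magnitude in the block maps to the integer 127.
        float amax = 0.0f;
        for (std::size_t i = 0; i < kBlockSize; ++i)
            amax = std::max(amax, std::fabs(src[i]));

        const float scale = amax / 127.0f;
        const float inv = scale > 0.0f ? 1.0f / scale : 0.0f;  // all-zero block
        out[b].scale = scale;
        for (std::size_t i = 0; i < kBlockSize; ++i)
            out[b].q[i] = static_cast<std::int8_t>(std::lround(src[i] * inv));
    }
    return out;
}

// Reconstruct approximate float weights from the quantized blocks.
std::vector<float> dequantize_q8(const std::vector<QuantBlock>& blocks) {
    std::vector<float> out;
    out.reserve(blocks.size() * kBlockSize);
    for (const QuantBlock& blk : blocks)
        for (std::size_t i = 0; i < kBlockSize; ++i)
            out.push_back(blk.scale * static_cast<float>(blk.q[i]));
    return out;
}
```

Calling `dequantize_q8(quantize_q8(w))` returns weights that differ from the originals by at most about half of each block's scale. That rounding error is the quality cost quantization pays for smaller storage, and it is the kind of loss the project avoids in the codec model by leaving it unquantized.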

Quick Start & Requirements

Install: Clone the repository (git clone --recursive), update submodules, build with CMake (mkdir build && cd build && cmake .. && cmake --build . --config Release).
Dependencies: Python 3 for model conversion and weight downloading (pip install -r requirements.txt). GPU acceleration is optional and needs either an NVIDIA GPU with CUDA or Metal on Apple hardware.
Setup: Requires downloading and converting model weights.
Docs: Google Colab Demo

Highlighted Details

Pure C/C++ implementation with minimal dependencies.
Supports AVX, AVX2, AVX512 for x86 architectures.
Offers 4-bit, 5-bit, and 8-bit integer quantization, plus F16 and F32 floating-point precision.
Compatible with Metal and CUDA backends for GPU acceleration.
Supports Bark Small and Bark Large models.
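The weight formats listed above differ mainly in how many bytes each weight occupies. The sketch below estimates storage for a given weight count under a simplified scheme in which every 32 quantized weights share one 16-bit scale. The block size, scale width, `WeightFormat` enum, and function names are assumptions for illustration; the actual on-disk formats used by bark.cpp may differ.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical labels for the formats named in the list above.
enum class WeightFormat { Q4, Q5, Q8, F16, F32 };

// Assumptions for this sketch: 32 weights per block, one 16-bit scale per
// quantized block. Float formats carry no per-block scale.
constexpr std::uint64_t kBlockSize = 32;
constexpr std::uint64_t kScaleBits = 16;

// Total bits needed to store one block of kBlockSize weights.
inline std::uint64_t bits_per_block(WeightFormat f) {
    switch (f) {
        case WeightFormat::Q4:  return kBlockSize * 4 + kScaleBits;
        case WeightFormat::Q5:  return kBlockSize * 5 + kScaleBits;
        case WeightFormat::Q8:  return kBlockSize * 8 + kScaleBits;
        case WeightFormat::F16: return kBlockSize * 16;
        case WeightFormat::F32: return kBlockSize * 32;
    }
    return 0;  // unreachable for valid enum values
}

// Bytes needed for n_weights, rounding up to whole blocks.
inline std::uint64_t footprint_bytes(std::uint64_t n_weights, WeightFormat f) {
    const std::uint64_t bits = bits_per_block(f);
    assert(bits % 8 == 0);  // every block in this scheme is a whole number of bytes
    const std::uint64_t n_blocks = (n_weights + kBlockSize - 1) / kBlockSize;
    return n_blocks * (bits / 8);
}
```

Under these assumptions the integer formats cost 4.5, 5.5, and 8.5 bits per weight, against 16 and 32 bits for F16 and F32. Passing a model's real weight count to `footprint_bytes` gives a rough size comparison between formats; no parameter count is supplied here because the source does not state one.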

Maintenance & Community

The project is community-driven, welcoming contributions via bug reports, feature requests, and pull requests.

Licensing & Compatibility

The repository does not explicitly state a license in the provided README. Compatibility for commercial use or closed-source linking is not specified.

Limitations & Caveats

The README describes the project as under active development, with plans to support additional models such as AudioCraft and AudioLDM2, but the health check below lists the last commit as 1 year ago and responsiveness as inactive. The missing license statement may also hinder commercial adoption.

Health Check

Last Commit: 1 year ago
Responsiveness: Inactive

Star History: 3 stars in the last 30 days