obs-localvocal by royshil

OBS plugin for local speech recognition and captioning

created 2 years ago
843 stars

Top 43.1% on sourcepulse

Project Summary

LocalVocal is an OBS Studio plugin that provides real-time, local speech-to-text transcription and translation using AI models. It targets streamers, content creators, and users with accessibility needs who want on-device captioning and translation without cloud services, which preserves privacy and avoids ongoing costs.

How It Works

The plugin uses Whisper.cpp to transcribe audio to text on the CPU, with optional GPU acceleration via CUDA, ROCm, Vulkan, or Metal. Translation is handled by CTranslate2. Running both stages locally keeps performance high, supports a wide range of languages, and leaves model selection flexible, including custom GGML models.
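Real-time transcription of this kind generally works by slicing the incoming audio into short windows and passing each window to the model. The Python sketch below illustrates only that buffering step. The function name `chunk_audio`, the 3-second window, and the 0.5-second overlap are illustrative assumptions and are not taken from the plugin's source; the 16 kHz sample rate is the input rate Whisper models expect.

```python
from typing import Iterator, List

SAMPLE_RATE = 16_000  # Whisper models expect 16 kHz mono audio


def chunk_audio(
    samples: List[float],
    window_s: float = 3.0,   # assumed window length, not the plugin's setting
    overlap_s: float = 0.5,  # assumed overlap, not the plugin's setting
    sample_rate: int = SAMPLE_RATE,
) -> Iterator[List[float]]:
    """Yield overlapping windows of audio samples (hypothetical helper)."""
    window = int(window_s * sample_rate)
    step = window - int(overlap_s * sample_rate)
    if step <= 0:
        raise ValueError("overlap must be shorter than the window")
    for start in range(0, len(samples), step):
        chunk = samples[start:start + window]
        if chunk:
            yield chunk
        # Stop once the current window has reached the end of the buffer.
        if start + window >= len(samples):
            break
```

Overlapping the windows slightly is one common way to avoid cutting a word in half at a boundary; a real pipeline would also need to merge the duplicated text that the overlap produces.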

Quick Start & Requirements

  • Installation: Download pre-built releases for Windows, macOS, or Linux from the releases page. To build from source, use the platform-specific build scripts (.github/scripts/build-macos, .github/scripts/build-linux, Build-Windows.ps1).
  • Prerequisites:
    • OBS Studio installed.
    • For GPU acceleration: an NVIDIA GPU with CUDA drivers, an AMD GPU with ROCm, or a Vulkan-compatible GPU.
    • macOS has separate builds for Intel and Apple Silicon.
    • Linux requires libssl-dev.
  • Resources: The plugin ships with the tiny.en Whisper model; larger models can be downloaded. Performance depends heavily on CPU/GPU capabilities.
  • Links: Releases, Usage Tutorials.

Highlighted Details

  • Supports real-time transcription in 100 languages and translation to major languages.
  • Outputs captions to screen, text files, or directly to RTMP streams.
  • Offers various acceleration options: CUDA, hipBLAS (AMD ROCm), Apple Arm64, Vulkan, AVX, SSE.
  • Allows users to bring their own GGML Whisper models.
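The text-file output listed above makes it possible for other tools to consume captions. As a rough sketch, the Python helper below reads the most recent caption from such a file. The helper name `latest_caption`, the file name, and the assumption that each caption occupies its own line are all hypothetical; the plugin's actual file layout may differ.

```python
from pathlib import Path
from typing import Optional


def latest_caption(path: Path) -> Optional[str]:
    """Return the last non-empty line of a caption file, or None.

    Assumes one caption per line (an assumption, not confirmed by the README).
    """
    if not path.exists():
        return None
    lines = path.read_text(encoding="utf-8").splitlines()
    for line in reversed(lines):
        text = line.strip()
        if text:
            return text
    return None


if __name__ == "__main__":
    # "captions.txt" is a placeholder path chosen for illustration.
    print(latest_caption(Path("captions.txt")))
```

Polling the file this way is simple but re-reads the whole file on every call, so it suits short caption logs rather than long-running sessions.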

Maintenance & Community

  • Actively maintained with regular releases.
  • Community support channels are not explicitly mentioned in the README.

Licensing & Compatibility

  • Licensed under the MIT License.
  • Permits commercial use and integration with closed-source applications.

Limitations & Caveats

  • AMD ROCm and Vulkan acceleration are noted as experimental.
  • Building on non-Ubuntu Linux distributions may require manual dependency management and CMake configuration.
  • The README mentions potential folder name mismatches when packaging macOS builds.

Health Check

  • Last commit: 5 months ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 3

Star History

  • 85 stars in the last 90 days
