Unity3D bindings for local speech-to-text inference
This project provides Unity3D bindings for whisper.cpp, enabling local, offline speech-to-text inference within Unity applications. It targets game developers and researchers who want to integrate automatic speech recognition (ASR) directly into their projects without relying on cloud services. The primary benefit is high-performance, multilingual transcription and translation running entirely on the user's device.
How It Works
The project builds on whisper.cpp, a C++ implementation of OpenAI's Whisper model that is optimized for performance. It uses GGML for efficient CPU and GPU (Vulkan/Metal) inference. The Unity bindings expose whisper.cpp functionality through C# scripts, allowing developers to load models, process audio from microphones or files, and receive transcribed text or translations. Keeping all processing local avoids network round-trips and keeps audio data on the user's device.
Quick Start & Requirements
Package git URL: https://github.com/Macoron/whisper.unity.git?path=/Packages/com.whisper.unity
ggml-tiny.bin model included; larger models can be downloaded and placed in StreamingAssets.
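As a minimal sketch of how the package URL might be used, assuming it is meant to be added as a Unity Package Manager git dependency: the fragment below shows an entry in a project's Packages/manifest.json. The key com.whisper.unity is an assumption taken from the path segment of the URL, and a real manifest would contain other dependencies alongside it.

```json
{
  "dependencies": {
    "com.whisper.unity": "https://github.com/Macoron/whisper.unity.git?path=/Packages/com.whisper.unity"
  }
}
```

The same URL can usually be pasted into the Package Manager window's "Add package from git URL" field instead of editing the file by hand.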
Highlighted Details
Maintenance & Community
The project is maintained by Macoron. Further community engagement details (Discord, Slack, roadmap) are not explicitly provided in the README.
Licensing & Compatibility
Licensed under the MIT License. This license permits commercial use and integration into closed-source projects. The underlying whisper.cpp and OpenAI Whisper code/weights are also MIT licensed.
Limitations & Caveats
WebGL platform is not currently supported. CUDA acceleration is deprecated in favor of Vulkan; users requiring CUDA must use older releases. Metal support requires Apple Silicon (M1 or newer) for optimal performance, falling back to CPU on older hardware.