whisper.net  by sandrohanea

.NET library for speech-to-text using Whisper models

created 2 years ago
773 stars

Top 46.1% on sourcepulse

Project Summary

Whisper.net provides .NET bindings for OpenAI's Whisper speech-to-text model, built on the whisper.cpp backend. It targets .NET developers who want to add automatic speech recognition (ASR) to their applications, and it offers cross-platform support and optional hardware acceleration.

How It Works

Whisper.net is a C# wrapper around the whisper.cpp library that lets .NET applications run Whisper models. It supports several runtimes: CPU (with and without AVX), NVIDIA CUDA, Apple CoreML, Intel OpenVINO, and Vulkan. The library automatically selects the most appropriate runtime based on the host system's capabilities and user-defined priorities, so the same application code can be deployed across different hardware.
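To make the wrapper idea concrete, here is a minimal transcription sketch as a .NET console program with top-level statements. It assumes the Whisper.net package and at least one runtime package are referenced, that a GGML model already exists at the hypothetical path `ggml-base.bin`, and that the hypothetical `audio.wav` is 16 kHz WAV audio. Check the project documentation for the exact API of the version you install.

```csharp
using System;
using System.IO;
using Whisper.net;

// Hypothetical local paths, chosen for illustration only.
const string modelPath = "ggml-base.bin";
const string audioPath = "audio.wav";

// Load the GGML model; the native runtime is chosen by the library.
using var factory = WhisperFactory.FromPath(modelPath);

// Build a processor that detects the spoken language automatically.
using var processor = factory.CreateBuilder()
    .WithLanguage("auto")
    .Build();

// Assumption: the file contains 16 kHz WAV audio.
using var audio = File.OpenRead(audioPath);

// Segments are streamed back as they are recognized.
await foreach (var segment in processor.ProcessAsync(audio))
{
    Console.WriteLine($"{segment.Start} -> {segment.End}: {segment.Text}");
}
```

Nothing in this sketch names a runtime: with the selection behavior described above, the same code should run whether a CPU or GPU runtime ends up being loaded.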

Quick Start & Requirements

  • Install via NuGet: Install-Package Whisper.net.AllRuntimes or add <PackageReference Include="Whisper.net.AllRuntimes" Version="1.8.1" /> to your .csproj.
  • Prerequisites:
    • Windows: Microsoft Visual C++ Redistributable (VS 2019+ x64).
    • Linux: libstdc++6, glibc 2.31.
    • macOS: TBD.
    • CPU Runtimes: AVX, AVX2, FMA, F16C instructions required for default CPU runtime; Whisper.net.Runtime.NoAvx available otherwise.
    • CUDA Runtime: NVIDIA GPU with CUDA support, CUDA Toolkit (>= 12.1).
    • OpenVINO Runtime: OpenVINO Toolkit (>= 2024.4).
    • Vulkan Runtime: Vulkan Toolkit (>= 1.3.290.0).
  • Documentation: https://github.com/sandrohanea/whisper.net/tree/main/docs

Highlighted Details

  • Supports multiple hardware acceleration runtimes (CUDA, CoreML, OpenVINO, Vulkan) with automatic selection.
  • Integrates with NAudio for audio processing and supports channel diarization.
  • Provides a Hugging Face downloader for GGML models.
  • Offers a ChatGPT-based assistant for code-related queries.

Maintenance & Community

  • Active development with regular releases.
  • Community support via GitHub Issues.

Licensing & Compatibility

  • MIT License. Permissive for commercial use and closed-source linking.

Limitations & Caveats

The macOS prerequisites for native runtimes are listed as "TBD". The project's versioning scheme is tied to whisper.cpp commits, requiring users to check the submodule for specific backend versions.

Health Check

  • Last commit: 1 week ago
  • Responsiveness: 1 week
  • Pull requests (30d): 3
  • Issues (30d): 2
  • Star history: 60 stars in the last 90 days
