sherpa by k2-fsa

Speech-to-text server framework with next-gen Kaldi

Created 3 years ago
882 stars

Top 40.6% on SourcePulse

Project Summary

Sherpa is an open-source speech-to-text inference framework for deploying end-to-end (E2E) models, specifically transducer and CTC-based architectures. It targets developers and researchers who need to integrate pre-trained speech recognition models into applications, and it provides both C++ and Python APIs.

How It Works

Sherpa uses PyTorch as its core inference engine and focuses on optimized deployment of E2E models. Because it handles inference only, applications can integrate pre-trained speech recognition models directly, without depending on a model training pipeline.
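
To make the CTC side of this concrete, here is a minimal sketch of greedy (best-path) CTC decoding, the simplest way to turn per-frame model outputs into a token sequence. It is illustrative only and is not Sherpa's implementation or API: the function name `ctc_greedy_decode`, the toy vocabulary, the frame scores, and the choice of index 0 as the blank token are all assumptions made for this example, and plain Python lists stand in for the tensors a PyTorch model would produce.

```python
from typing import List, Sequence


def ctc_greedy_decode(
    frame_scores: Sequence[Sequence[float]], blank_id: int = 0
) -> List[int]:
    """Greedy CTC decoding (hypothetical helper, not part of Sherpa).

    Takes the highest-scoring token for each frame, merges consecutive
    repeats, and drops the blank token.
    """
    tokens: List[int] = []
    prev = blank_id
    for scores in frame_scores:
        # Index of the best-scoring token for this frame.
        best = max(range(len(scores)), key=scores.__getitem__)
        # Emit only when the token is not blank and differs from the
        # previous frame's token (repeat merging).
        if best != blank_id and best != prev:
            tokens.append(best)
        prev = best
    return tokens


# Toy vocabulary and scores, assumed for illustration; index 0 is the blank.
vocab = ["<blk>", "h", "i"]
frames = [
    [0.1, 0.8, 0.1],    # h
    [0.2, 0.7, 0.1],    # h (repeat, merged)
    [0.9, 0.05, 0.05],  # blank
    [0.1, 0.1, 0.8],    # i
    [0.2, 0.1, 0.7],    # i (repeat, merged)
]

ids = ctc_greedy_decode(frames)
print("".join(vocab[t] for t in ids))  # prints "hi"
```

A blank frame between two identical tokens keeps both, which is how CTC represents doubled letters. Transducer models are decoded differently, with a prediction network conditioned on previously emitted tokens, so this sketch covers only the CTC case.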

Quick Start & Requirements

Highlighted Details

  • Supports both transducer and CTC-based end-to-end models.
  • Provides both C++ and Python APIs for broad integration.
  • Focuses exclusively on inference and deployment of pre-trained models.
  • Offers alternative implementations (sherpa-onnx, sherpa-ncnn) for mobile and embedded systems.

Maintenance & Community

  • Project is hosted by k2-fsa.
  • Further community and roadmap details are available via the official documentation.

Licensing & Compatibility

  • The README excerpt summarized here does not state a license. Check the repository for licensing terms and commercial-use compatibility.

Limitations & Caveats

The project explicitly states it is not for model training or fine-tuning; users interested in those aspects should refer to the icefall project.

Health Check

  • Last commit: 1 day ago
  • Responsiveness: 1 day
  • Pull requests (30d): 9
  • Issues (30d): 0
  • Star history: 17 stars in the last 30 days
