DeepSpeech by mozilla

Open-source speech-to-text engine for on-device inference

created 9 years ago
26,541 stars

Top 1.5% on sourcepulse

Project Summary

DeepSpeech is an open-source, embedded speech-to-text engine designed for real-time, offline operation on a wide range of hardware, from Raspberry Pi to high-power GPUs. It targets developers and researchers needing on-device transcription capabilities.

How It Works

The engine utilizes a machine learning model based on Baidu's Deep Speech research paper, implemented using Google's TensorFlow. This approach allows for efficient, on-device processing without requiring cloud connectivity.
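
As a rough illustration of offline inference, the sketch below prepares a WAV file and hands it to the engine. It assumes the `deepspeech` Python package (0.9.x bindings) and `numpy` are installed, that the audio is 16-bit mono PCM, and that `model_path` points to a locally downloaded model file; the helper names `load_pcm16_mono` and `transcribe` are hypothetical and not part of the project.

```python
import array
import sys
import wave


def load_pcm16_mono(path, expected_rate=16000):
    """Read a WAV file and return its samples as an array of signed 16-bit ints."""
    with wave.open(path, "rb") as wav:
        if wav.getnchannels() != 1:
            raise ValueError("expected mono audio")
        if wav.getsampwidth() != 2:
            raise ValueError("expected 16-bit samples")
        if wav.getframerate() != expected_rate:
            raise ValueError(f"expected {expected_rate} Hz audio")
        frames = wav.readframes(wav.getnframes())
    samples = array.array("h")
    samples.frombytes(frames)
    if sys.byteorder == "big":
        samples.byteswap()  # WAV sample data is little-endian
    return samples


def transcribe(model_path, wav_path):
    """Run offline inference. Assumes `deepspeech` and `numpy` are installed."""
    # Imported lazily so the loader above works without the package present.
    import numpy as np
    from deepspeech import Model

    model = Model(model_path)  # model_path: hypothetical local model file
    samples = load_pcm16_mono(wav_path, expected_rate=model.sampleRate())
    return model.stt(np.array(samples, dtype=np.int16))
```

No network call is involved: the model file is read from disk and decoding runs in-process, which is what makes the on-device claim possible.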

Quick Start & Requirements

Highlighted Details

  • Supports on-device, real-time transcription.
  • Uses a trained model based on Baidu's Deep Speech research paper.
  • Implemented with TensorFlow for easier development.
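
Real-time transcription usually means feeding audio to the engine in small chunks rather than as one buffer. The sketch below is one way that might look with the streaming calls in the `deepspeech` 0.9.x Python bindings (`createStream`, `feedAudioContent`, `intermediateDecode`, `finishStream`). The helper names, the 20 ms chunk length, and `model_path` are assumptions made for illustration.

```python
def iter_chunks(samples, chunk_size):
    """Yield consecutive slices of at most chunk_size samples."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    for start in range(0, len(samples), chunk_size):
        yield samples[start:start + chunk_size]


def stream_transcribe(model_path, samples, chunk_ms=20):
    """Feed 16-bit mono samples to a streaming session, chunk by chunk.

    Assumes `deepspeech` and `numpy` are installed and that `samples`
    is already at the model's sample rate.
    """
    # Imported lazily so iter_chunks is usable without the package present.
    import numpy as np
    from deepspeech import Model

    model = Model(model_path)  # model_path: hypothetical local model file
    chunk_size = model.sampleRate() * chunk_ms // 1000
    stream = model.createStream()
    for chunk in iter_chunks(samples, chunk_size):
        stream.feedAudioContent(np.array(chunk, dtype=np.int16))
        # A live caption display could show the partial result here.
        partial = stream.intermediateDecode()
    return stream.finishStream()
```

In a live setting the chunks would come from a microphone callback instead of a preloaded buffer; the loop body stays the same.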

Maintenance & Community

Licensing & Compatibility

  • License: MPL 2.0.
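  • Compatibility: File-level (weak) copyleft. MPL-licensed code can be combined with commercial and closed-source applications, but modifications to the MPL-covered files themselves must remain under MPL 2.0 when distributed.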

Limitations & Caveats

The project's README does not detail specific performance benchmarks or known limitations regarding accuracy across different languages or accents.

Health Check

  • Last commit: 1 month ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 0

Star History

340 stars in the last 90 days
