DeepSpeech by mozilla

Open-source speech-to-text engine for on-device inference

Created 10 years ago

26,766 stars

Top 1.8% on SourcePulse

15 Experts Love This Project

joewalnes: Head of Experimental Projects at Stripe
omarsar: Founder of DAIR.AI
simonw: Coauthor of Django
maxbrunsfeld: Cofounder of Zed
and 11 more

Project Summary

DeepSpeech is an open-source, embedded speech-to-text engine designed for real-time, offline operation on a wide range of hardware, from Raspberry Pi to high-power GPUs. It targets developers and researchers needing on-device transcription capabilities.

How It Works

The engine uses a machine learning model based on Baidu's Deep Speech research paper, implemented with Google's TensorFlow. Inference runs entirely on the device, so no cloud connectivity is required.
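To make the model's output stage more concrete: the Deep Speech paper trains a recurrent network with the CTC loss, so the network emits one probability row per audio frame over the alphabet plus a "blank" symbol. The sketch below shows greedy (best-path) CTC decoding on toy data. It is a simplification for illustration only: the alphabet, the blank index, and the `peaked` helper are hypothetical, and the engine itself uses a beam search decoder rather than this greedy rule.

```python
from itertools import groupby

# Hypothetical toy alphabet; index 0 is reserved here for the CTC blank symbol.
BLANK = 0
ALPHABET = ["<blank>", " ", "a", "c", "t"]


def ctc_greedy_decode(frame_probs):
    """Best-path CTC decoding: argmax per frame, merge repeats, drop blanks."""
    best_path = [max(range(len(row)), key=row.__getitem__) for row in frame_probs]
    collapsed = [index for index, _ in groupby(best_path)]
    return "".join(ALPHABET[index] for index in collapsed if index != BLANK)


def peaked(index, size=len(ALPHABET)):
    """Build a toy probability row with most of its mass on one symbol."""
    rest = 0.1 / (size - 1)
    return [0.9 if i == index else rest for i in range(size)]


# Seven frames whose most likely symbols are: c c <blank> a a <blank> t
frames = [peaked(i) for i in [3, 3, 0, 2, 2, 0, 4]]
print(ctc_greedy_decode(frames))  # prints "cat"
```

The blank symbol is what lets a repeated letter survive the merge step: the frame sequence a, blank, a decodes to "aa", while a, a decodes to "a".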

Quick Start & Requirements

Installation: Typically via pip or Docker.
Prerequisites: TensorFlow, Python. Specific hardware requirements depend on model size and performance needs.
Documentation: https://deepspeech.readthedocs.io/
Latest Release: https://github.com/mozilla/DeepSpeech/releases/latest
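As a rough illustration of what usage looks like after a pip install, here is a minimal sketch using the `deepspeech` Python package (it also assumes `numpy` is installed). The function names `read_wav_pcm16` and `transcribe` are made up for this example, and the model and scorer paths are placeholders for files downloaded from the releases page. The audio is assumed to be mono 16-bit PCM at the sample rate the model expects.

```python
import wave


def read_wav_pcm16(path):
    """Read a mono 16-bit PCM WAV file and return (sample_rate, raw_bytes)."""
    with wave.open(path, "rb") as wav:
        if wav.getnchannels() != 1 or wav.getsampwidth() != 2:
            raise ValueError("expected mono 16-bit PCM audio")
        return wav.getframerate(), wav.readframes(wav.getnframes())


def transcribe(model_path, wav_path, scorer_path=None):
    """Transcribe one WAV file; all three paths are supplied by the caller."""
    # Imported lazily so the WAV helper above works without these packages.
    import numpy as np
    from deepspeech import Model

    model = Model(model_path)
    if scorer_path:
        # The external scorer (language model) is optional.
        model.enableExternalScorer(scorer_path)

    rate, pcm = read_wav_pcm16(wav_path)
    if rate != model.sampleRate():
        raise ValueError(f"model expects {model.sampleRate()} Hz, got {rate} Hz")

    # stt() takes a 1-D array of 16-bit samples and returns the transcript.
    return model.stt(np.frombuffer(pcm, dtype=np.int16))
```

A call would look like `transcribe("model.pbmm", "audio.wav", "model.scorer")`, with the file names replaced by real paths. Resampling is deliberately left out; a file at the wrong sample rate raises an error instead of being converted silently.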

Highlighted Details

Supports on-device, real-time transcription.
Model trained with machine learning techniques based on Baidu's Deep Speech research paper.
Implemented with Google's TensorFlow to simplify the implementation.
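For the real-time case, the Python package exposes a streaming interface in which audio is fed in small pieces rather than as one buffer. The sketch below shows one way this might be wired up. The helper names `iter_chunks` and `stream_transcribe`, the 20 ms chunk size, and the 16 kHz default are assumptions for illustration; `model` is expected to be an already loaded `deepspeech.Model`, and `numpy` must be installed.

```python
def iter_chunks(pcm_bytes, sample_rate=16000, chunk_ms=20):
    """Yield consecutive slices of 16-bit mono PCM, each about chunk_ms long."""
    bytes_per_chunk = sample_rate * chunk_ms // 1000 * 2  # 2 bytes per sample
    for start in range(0, len(pcm_bytes), bytes_per_chunk):
        yield pcm_bytes[start:start + bytes_per_chunk]


def stream_transcribe(model, pcm_bytes, sample_rate=16000):
    """Feed audio to a streaming session chunk by chunk; return the final text."""
    # Imported lazily so iter_chunks can be used without numpy installed.
    import numpy as np

    stream = model.createStream()
    for chunk in iter_chunks(pcm_bytes, sample_rate):
        stream.feedAudioContent(np.frombuffer(chunk, dtype=np.int16))
        # stream.intermediateDecode() could be called here for a partial result.
    return stream.finishStream()
```

In a live application the chunks would come from a microphone callback instead of a prerecorded byte string. The chunk size trades latency against per-call overhead, so the 20 ms value here is only a starting point.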

Maintenance & Community

The project is maintained by Mozilla.
Contribution guidelines: CONTRIBUTING.rst
Support information: SUPPORT.rst

Licensing & Compatibility

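License: MPL 2.0.
Compatibility: MPL 2.0 is a file-level (weak) copyleft license rather than a permissive one. It can be used in commercial and closed-source applications, but modifications to MPL-licensed files must be made available under the MPL when distributed.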

Limitations & Caveats

The project's README does not detail specific performance benchmarks or known limitations regarding accuracy across different languages or accents.

Health Check

Last Commit: 1 year ago
Responsiveness: 1 day
Pull Requests (30d): 0
Issues (30d): 0

Star History

39 stars in the last 30 days

Explore Similar Projects

Starred by Patrick von Platen (Author of Hugging Face Diffusers; Research Engineer at Mistral).

Squeezeformer by kssteven418

Speech recognition model based on an efficient Transformer architecture

Created 4 years ago

Updated 3 years ago

deepspeech-german by AASHISHAG

ASR module using Mozilla DeepSpeech for German speech

Created 7 years ago

Updated 3 years ago

ltu by YuanGongND

Audio/speech LLM for perception and understanding, supporting open-ended questions

Created 3 years ago

Updated 2 years ago

edgedict by theblackcat102

RNN-Transducer for online speech recognition

Created 6 years ago

Updated 5 years ago

VITA-Audio by VITA-MLLM

Speech model for fast audio-text token generation

Created 1 year ago

Updated 1 year ago

SLAM-LLM by X-LANCE

MLLM toolkit for speech, language, audio, and music processing

Created 2 years ago

Updated 5 months ago

athena by athena-team

Open-source speech processing engine for industrial/academic use

Created 6 years ago

Updated 3 years ago

openWakeWord by dscripka

Open-source wakeword detection library for voice-enabled apps

Created 4 years ago

Updated 6 months ago

Starred by Andrej Karpathy (Founder of Eureka Labs; Formerly at Tesla, OpenAI; Author of CS 231n), Jeff Hammerbacher (Cofounder of Cloudera), and 1 more.

moonshine by moonshine-ai

Speech-to-text models optimized for fast, accurate ASR on edge devices

Created 1 year ago

Updated 21 hours ago

Starred by Chaoyu Yang (Founder of Bento), Cristóbal Valenzuela (Cofounder of Runway), and 8 more.

speech-to-text-wavenet by buriburisuri

Speech recognition using WaveNet in TensorFlow

Created 9 years ago

Updated 4 years ago

Starred by Travis Fischer (Founder of Agentic), Andrew Kane (Author of pgvector), and 1 more.

TTS by mozilla

Deep learning library for text-to-speech generation

Created 8 years ago

Updated 2 years ago

Starred by Luis Capelo (Cofounder of Lightning AI).

FunASR by modelscope

Speech recognition toolkit for bridging research and industrial applications

Created 3 years ago

Updated 1 day ago
