neonbjb/ocotillo
Speech recognition model built on PyTorch
Top 99.1% on SourcePulse
Ocotillo provides a performant and user-friendly speech recognition solution, targeting developers and researchers who need accurate English transcription. It simplifies the process of integrating state-of-the-art speech-to-text capabilities into applications, offering significant speed improvements through TorchScript optimization.
How It Works
Ocotillo uses a wav2vec2 model, specifically jbetker/wav2vec2-large-robust-ft-libritts-voxpopuli, fine-tuned for speech recognition and punctuation prediction. Its main advantage comes from TorchScript tracing, which records the PyTorch model as a serialized graph that runs in PyTorch's C++ runtime without the Python interpreter. This reduces per-call overhead and speeds up inference, especially on GPUs.
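To illustrate what TorchScript tracing involves, here is a minimal sketch. It assumes a small stand-in network (the hypothetical TinyAcousticModel below) rather than Ocotillo's actual wav2vec2 model, and it does not reflect how Ocotillo's own code performs the tracing. It only shows the general pattern: trace with an example input, serialize, reload, and check that the traced module matches the eager one.

```python
import io

import torch
import torch.nn as nn


class TinyAcousticModel(nn.Module):
    """Hypothetical stand-in for a wav2vec2-style network (not Ocotillo's model)."""

    def __init__(self, vocab_size: int = 32):
        super().__init__()
        self.encoder = nn.Conv1d(1, 16, kernel_size=10, stride=5)
        self.head = nn.Linear(16, vocab_size)

    def forward(self, audio: torch.Tensor) -> torch.Tensor:
        # audio: (batch, samples) -> logits: (batch, frames, vocab_size)
        features = torch.relu(self.encoder(audio.unsqueeze(1)))
        return self.head(features.transpose(1, 2))


torch.manual_seed(0)
model = TinyAcousticModel().eval()
example_audio = torch.randn(1, 16000)  # one second of fake 16 kHz audio

# Tracing runs the model once and records the executed operations as a graph.
with torch.no_grad():
    traced = torch.jit.trace(model, example_audio)

# The traced module can be serialized and reloaded without the Python class.
buffer = io.BytesIO()
torch.jit.save(traced, buffer)
buffer.seek(0)
reloaded = torch.jit.load(buffer)

# The reloaded graph should reproduce the eager model's outputs.
with torch.no_grad():
    eager_logits = model(example_audio)
    traced_logits = reloaded(example_audio)

print(traced_logits.shape)
```

Tracing records only the operations executed for the example input, so data-dependent control flow in a model is not captured. That is one reason to compare traced and eager outputs before relying on the traced version.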
Quick Start & Requirements
git clone https://github.com/neonbjb/ocotillo.git
cd ocotillo
python setup.py install
Demo notebook: asr_demo.ipynb
Highlighted Details
transcribe.py), and a Python API (Transcriber class).
Maintenance & Community
Licensing & Compatibility
Limitations & Caveats
The project is focused on English speech recognition and has not been tested on embedded hardware like the Raspberry Pi. The licensing status requires clarification for commercial adoption.
3 years ago
Inactive