Real-time silent speech recognition tool
Top 59.1% on SourcePulse
Chaplin is a real-time, fully local visual speech recognition (VSR) tool that transcribes silently mouthed words by reading lips. It is designed for users interested in silent communication or exploring advanced VSR technologies.
How It Works
Chaplin uses a pre-trained model from the Auto-AVSR project, trained on the Lip Reading Sentences 3 (LRS3) dataset. It relies on the MediaPipe framework for lip detection and on Ollama for language modeling; together these transcribe lip movements in real time.
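The summary does not say how the language-model step is wired up. One plausible reading is that raw lip-reading output is sent to a locally running Ollama server for cleanup. The sketch below illustrates that idea under stated assumptions: the endpoint is Ollama's default local /api/generate route, while the prompt wording and the function names build_payload and correct_transcript are hypothetical and not taken from Chaplin.

```python
import json
import urllib.request

# Default address of a locally running Ollama server.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_payload(raw_text: str, model: str = "llama3.2") -> dict:
    """Build a non-streaming /api/generate request body.

    The prompt wording is hypothetical, not taken from Chaplin.
    """
    prompt = (
        "The following text came from a lip-reading model and may contain "
        "errors. Return only the corrected sentence.\n\n" + raw_text
    )
    return {"model": model, "prompt": prompt, "stream": False}


def correct_transcript(raw_text: str, url: str = OLLAMA_URL,
                       timeout: float = 30.0) -> str:
    """Send the raw transcript to Ollama and return the model's reply."""
    body = json.dumps(build_payload(raw_text)).encode("utf-8")
    request = urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}
    )
    # Supplying a request body makes urllib issue a POST.
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read())["response"].strip()
```

Calling correct_transcript("I WANT TO BYE A COFFEE") needs a running Ollama server with the llama3.2 model already pulled; build_payload can be inspected without any server.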
Quick Start & Requirements
Requires Ollama with the llama3.2 model and uv (Python package manager).
Run with: sudo uv run --with-requirements requirements.txt --python 3.12 main.py config_filename=./configs/LRS3_V_WER19.1.ini detector=mediapipe
Requires the LRS3_V_WER19.1 and lm_en_subword model components in the specified directory structure.
Highlighted Details
Maintenance & Community
No specific community channels or maintenance details are provided in the README.
Licensing & Compatibility
The repository does not explicitly state a license. Compatibility for commercial use or closed-source linking is not specified.
Limitations & Caveats
The project requires specific model downloads and manual placement. The use of sudo for the run command suggests potential permission issues or system-level integration. No performance benchmarks or accuracy metrics are provided.
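Because setup depends on external tools and manually placed files, a small preflight check can catch a missing piece before launching. The following is a minimal sketch, not part of the project: the function name preflight is hypothetical, the tool names uv and ollama are assumed to be the command-line binaries, and the config path is the one that appears in the run command. It does not check the model components, since their directory layout is not given here.

```python
import shutil
from pathlib import Path


def preflight(project_dir, config="configs/LRS3_V_WER19.1.ini",
              tools=("uv", "ollama")):
    """Return a list of problems; an empty list means every check passed."""
    problems = []
    # Each required command-line tool must be discoverable on PATH.
    for tool in tools:
        if shutil.which(tool) is None:
            problems.append(f"'{tool}' not found on PATH")
    # The config file named in the run command must exist in the checkout.
    config_path = Path(project_dir) / config
    if not config_path.is_file():
        problems.append(f"missing config file: {config_path}")
    return problems
```

Running preflight(".") from the repository root returns a list of messages to print; an empty list means the tools and config file were found.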
7 months ago
Inactive