transcriber_app by davabase

Real-time speech-to-text transcription app

Created 3 years ago

432 stars

Top 68.7% on SourcePulse

1 Expert Loves This Project

Georgi Gerganov

Author of llama.cpp, whisper.cpp

Project Summary

This project is a real-time speech-to-text transcription application for users who need live transcription, such as content creators, journalists, and people who rely on accessibility tools. It uses OpenAI's Whisper model for transcription and Flet for a cross-platform GUI.

How It Works

The application builds its interface with Flet, which lets users select an audio input source and adjust transcription parameters. Transcription runs on OpenAI's Whisper model, offered in several sizes (e.g., Tiny, Base, Small, Medium, Large) that trade speed for accuracy. The app can also translate speech to English, and its behavior is tuned through settings such as transcribe_rate, seconds_of_silence_between_lines, and max_record_time.
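As a rough illustration of the Whisper side of this flow, the sketch below picks a model size and chooses between transcription and translation to English. It is not taken from the project's source: the helper names build_transcribe_options and transcribe_file are hypothetical, and the code assumes the openai-whisper package, whose load_model and transcribe calls are used as documented upstream.

```python
# Model names accepted by whisper.load_model for the sizes listed above.
WHISPER_MODELS = ("tiny", "base", "small", "medium", "large")


def build_transcribe_options(translate: bool) -> dict:
    """Hypothetical helper: map the app's translate toggle to Whisper's task option."""
    return {"task": "translate" if translate else "transcribe"}


def transcribe_file(path: str, model_name: str = "base", translate: bool = False) -> str:
    """Hypothetical helper: transcribe one audio file and return its text."""
    if model_name not in WHISPER_MODELS:
        raise ValueError(f"unknown model size: {model_name!r}")
    # Imported lazily so the helpers above work without Whisper installed.
    import whisper

    model = whisper.load_model(model_name)
    result = model.transcribe(path, **build_transcribe_options(translate))
    return result["text"]
```

Smaller models load and run faster at some cost in accuracy, which is the trade-off the model-size setting exposes.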

Quick Start & Requirements

Install: pip install -r requirements.txt (after activating a Python 3.7 virtual environment).
Prerequisites: Python 3.7, Flet, OpenAI Whisper.
Building: Uses cx_Freeze for creating executables.
Docs: https://flet.dev/, https://github.com/openai/whisper

Highlighted Details

Real-time transcription with adjustable parameters for silence detection and recording duration.
Supports translation from various languages to English.
Configurable audio models (Tiny to Large) for performance tuning.
GUI window transparency and custom text backgrounds for overlay use.
Saves transcriptions to transcription.txt and settings to transcriber_settings.yaml.
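The adjustable parameters could interact roughly as in the sketch below. Only the three setting names come from this summary. Their meanings, the default values, and the helper functions are assumptions made for illustration, not the project's actual logic.

```python
from dataclasses import dataclass


@dataclass
class TranscriberSettings:
    # Setting names come from the summary; defaults and semantics are assumed.
    transcribe_rate: float = 0.5  # assumed: seconds between transcription passes
    seconds_of_silence_between_lines: float = 0.5  # assumed: pause that ends a line
    max_record_time: float = 30.0  # assumed: longest audio span kept for one line


def is_transcription_due(settings: TranscriberSettings, seconds_since_last_pass: float) -> bool:
    """Hypothetical check: has enough time passed to run Whisper again?"""
    return seconds_since_last_pass >= settings.transcribe_rate


def should_start_new_line(
    settings: TranscriberSettings,
    seconds_since_last_speech: float,
    seconds_recorded: float,
) -> bool:
    """Hypothetical check: end the line after a long pause or when the recording cap is hit."""
    if seconds_since_last_speech >= settings.seconds_of_silence_between_lines:
        return True
    return seconds_recorded >= settings.max_record_time
```

Under these assumptions, a shorter silence threshold yields more, shorter lines, while max_record_time bounds how much audio is re-transcribed for any one line.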

Maintenance & Community

The project is maintained by davabase. No specific community channels or roadmap are detailed in the README.

Licensing & Compatibility

The code is public domain, allowing for unrestricted use, modification, and distribution, including commercial applications.

Limitations & Caveats

The setup specifies Python 3.7, which is end-of-life. Building executables uses cx_Freeze due to reported compatibility issues between PyInstaller and PyTorch. Performance is dependent on the chosen Whisper model and system resources.

Health Check

Last Commit

3 years ago

Responsiveness

Inactive

Star History

1 star in the last 30 days