whisperIMEplus by woheller69

Android IME for offline voice recognition and translation

Created 11 months ago

338 stars

Top 81.6% on SourcePulse

Project Summary

This project provides an Android Input Method Editor (IME) and voice input service built on the Whisper speech-to-text engine. It targets Android users who want privacy-preserving voice input: recognition runs offline on the device, and the service integrates with existing Android features and other applications. The primary benefit is locally processed voice recognition that does not depend on cloud-based services.

How It Works

Whisper+ uses the Whisper engine for voice recognition, so all processing happens offline. It operates as an IME, letting users dictate text directly into applications, and can also be configured as a system-wide RecognitionService. It can additionally run as a standalone app that translates speech in supported languages directly to English, which extends its use beyond dictation.

Quick Start & Requirements

Initial setup requires downloading the Whisper model from Hugging Face. To use the app for system-wide voice input, users may need to enable USB debugging and run this ADB command: adb shell settings put secure voice_recognition_service org.woheller69.whisperplus/com.whisperonnx.WhisperRecognitionService. Voice recognition itself runs offline.
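The ADB step above can be sketched as a small shell script. This is an illustration only, not something shipped with the project: the function names (build_set_cmd, build_get_cmd, apply_setting) are hypothetical, and the read-back check assumes the standard Android `settings get` shell command. The component name is copied verbatim from the command quoted above.

```shell
#!/bin/sh
# Component name copied from the command quoted in the README summary.
SERVICE="org.woheller69.whisperplus/com.whisperonnx.WhisperRecognitionService"

# Print the adb command that sets the system-wide recognition service.
build_set_cmd() {
    printf 'adb shell settings put secure voice_recognition_service %s\n' "$SERVICE"
}

# Print the adb command that reads the setting back for verification.
build_get_cmd() {
    printf 'adb shell settings get secure voice_recognition_service\n'
}

# Apply the setting on a connected device and confirm that it took effect.
# Returns non-zero if adb is missing, the put fails, or the value differs.
apply_setting() {
    if ! command -v adb >/dev/null 2>&1; then
        echo "adb not found; install Android platform-tools first" >&2
        return 1
    fi
    adb shell settings put secure voice_recognition_service "$SERVICE" || return 1
    # Strip carriage returns that some adb versions append to output.
    current=$(adb shell settings get secure voice_recognition_service | tr -d '\r')
    [ "$current" = "$SERVICE" ]
}
```

With USB debugging enabled and a device connected, calling apply_setting would run the command and verify the result; build_set_cmd only prints the command, which is useful for reviewing it before running it.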

Highlighted Details

Offline voice recognition powered by the Whisper engine.
Functions as both an Android IME and a system-wide RecognitionService.
Standalone app capability for translating supported languages to English.
Supports voice input calls via Android intents (RecognizerIntent.ACTION_RECOGNIZE_SPEECH).
Allows predefining two languages for quick switching.
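As a rough way to exercise the intent support listed above, the sketch below fires the speech recognition intent from a development machine over ADB. It assumes the documented Android values for RecognizerIntent.ACTION_RECOGNIZE_SPEECH ("android.speech.action.RECOGNIZE_SPEECH") and for the language model extra; the function names are hypothetical, and how this app responds to the intent is not described in the summary.

```shell
#!/bin/sh
# String value of RecognizerIntent.ACTION_RECOGNIZE_SPEECH.
ACTION="android.speech.action.RECOGNIZE_SPEECH"
# String value of RecognizerIntent.EXTRA_LANGUAGE_MODEL; "free_form" is
# the value of RecognizerIntent.LANGUAGE_MODEL_FREE_FORM.
EXTRA_LANGUAGE_MODEL="android.speech.extra.LANGUAGE_MODEL"

# Print the adb command that starts an activity handling the intent.
build_recognize_cmd() {
    printf 'adb shell am start -a %s --es %s free_form\n' "$ACTION" "$EXTRA_LANGUAGE_MODEL"
}

# Launch the intent on a connected device, if adb is available.
launch_recognizer() {
    if ! command -v adb >/dev/null 2>&1; then
        echo "adb not found; install Android platform-tools first" >&2
        return 1
    fi
    adb shell am start -a "$ACTION" --es "$EXTRA_LANGUAGE_MODEL" free_form
}
```

Starting the intent this way only shows whether some installed activity handles it; the recognized text is normally returned to the calling app through an activity result, which a shell invocation cannot receive.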

Maintenance & Community

No specific details regarding project maintainers, community channels (like Discord/Slack), or roadmap are provided in the README.

Licensing & Compatibility

The project is licensed under GPLv3. It incorporates code and models from various sources, including MIT-licensed projects (WhisperIME, Whisper ONNX models, Whisper-Android, OpenAI Whisper, Hugging Face models) and Apache-2.0 (Opencc4j). GPLv3 is a copyleft license: derivative works that are distributed must also be released under GPLv3 with source code available. This does not prohibit commercial use, but it does prevent incorporating the code into closed-source applications.

Limitations & Caveats

A critical limitation is that the application will cease functioning on certified Android devices starting in 2026/2027 due to Google's new developer identity submission requirements, which the developers oppose. Each recording is limited to a maximum of 30 seconds.

Explore Similar Projects

voice-input by futo-org

typeflux by mylxsw

Transcribro by soupslurpr

whisperIME by woheller69

macparakeet by moona3k

pindrop by watzon

voxt by hehehai

ollama-voice-mac by apeatling

FluidVoice by altic-dev

VoiceInk by Beingpax

RTranslator by niedev

sherpa-onnx by k2-fsa