whisperIMEplus by woheller69

Android IME for offline voice recognition and translation

Created 8 months ago
260 stars

Top 97.6% on SourcePulse

Project Summary

This project provides an Android Input Method Editor (IME) and voice input service built on the Whisper speech-to-text engine. It targets Android users who want privacy-preserving voice input: recognition runs offline, and the service integrates with existing Android features and other applications. The primary benefit is locally processed voice recognition that does not depend on cloud-based services.

How It Works

Whisper+ uses the Whisper engine for voice recognition, so all processing happens offline. It operates as an IME, letting users dictate text directly into applications, and can also be configured as the system-wide RecognitionService. In addition, its standalone mode can translate speech in supported languages directly to English, which extends it beyond plain dictation.

Quick Start & Requirements

Initial setup involves downloading the Whisper model from Hugging Face. For system-wide voice input, users may need to enable USB debugging and run the following ADB command:

    adb shell settings put secure voice_recognition_service org.woheller69.whisperplus/com.whisperonnx.WhisperRecognitionService

Voice recognition operates offline, ensuring privacy.
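
The ADB step in this section can be wrapped in a small shell script, sketched below. The component name is copied from the command quoted in this section. The helper names `set_voice_service` and `get_voice_service` and the `DRY_RUN` variable are illustrative assumptions, not part of the project. The sketch assumes `adb` is on the PATH and a device with USB debugging enabled is connected.

```shell
#!/bin/sh
# Component name as quoted in the setup instructions.
COMPONENT="org.woheller69.whisperplus/com.whisperonnx.WhisperRecognitionService"

# Hypothetical helper: with DRY_RUN=1 it only prints the command,
# otherwise it runs it against the connected device.
set_voice_service() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "adb shell settings put secure voice_recognition_service $COMPONENT"
    else
        adb shell settings put secure voice_recognition_service "$COMPONENT"
    fi
}

# Hypothetical helper: reads the setting back so the change can be confirmed.
get_voice_service() {
    adb shell settings get secure voice_recognition_service
}
```

Running `set_voice_service` followed by `get_voice_service` should echo the component name back if the setting was accepted. Setting `DRY_RUN=1` first shows the command without touching the device.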

Highlighted Details

  • Offline voice recognition powered by the Whisper engine.
  • Functions as both an Android IME and a system-wide RecognitionService.
  • Standalone app capability for translating supported languages to English.
  • Supports voice input calls via Android intents (RecognizerIntent.ACTION_RECOGNIZE_SPEECH).
  • Allows predefining two languages for quick switching.
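
As a rough illustration of the intent-based entry point listed above, the sketch below fires the speech-recognition intent from a development machine over ADB. The action and extra strings are the Android platform values behind `RecognizerIntent.ACTION_RECOGNIZE_SPEECH`, `RecognizerIntent.EXTRA_LANGUAGE_MODEL`, and `RecognizerIntent.LANGUAGE_MODEL_FREE_FORM`. The helper name `launch_recognizer` and the `DRY_RUN` variable are illustrative assumptions. Which app answers the intent depends on what is installed on the device, and the recognized text is returned only to an app that starts the activity for a result, not to the shell.

```shell
#!/bin/sh
# Platform string values of the RecognizerIntent constants.
ACTION="android.speech.action.RECOGNIZE_SPEECH"
EXTRA_LANGUAGE_MODEL="android.speech.extra.LANGUAGE_MODEL"

# Hypothetical helper: with DRY_RUN=1 it only prints the command,
# otherwise it launches the recognizer UI on the connected device.
launch_recognizer() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "adb shell am start -a $ACTION --es $EXTRA_LANGUAGE_MODEL free_form"
    else
        adb shell am start -a "$ACTION" --es "$EXTRA_LANGUAGE_MODEL" free_form
    fi
}
```

This is only a way to check that some recognizer responds to the intent. Within an Android app, the same intent would normally be launched with the activity-result APIs so the transcription comes back to the caller.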

Maintenance & Community

No specific details regarding project maintainers, community channels (like Discord/Slack), or roadmap are provided in the README.

Licensing & Compatibility

The project is licensed under GPLv3. It incorporates code and models from various sources, including MIT-licensed projects (WhisperIME, Whisper ONNX models, Whisper-Android, OpenAI Whisper, Hugging Face models) and the Apache-2.0-licensed Opencc4j. GPLv3 is a copyleft license: distributed derivative works must also be released under GPLv3, which may rule out use in closed-source applications or in commercial products that cannot publish their source.

Limitations & Caveats

A critical limitation is that the application will cease functioning on certified Android devices starting in 2026/2027 due to Google's new developer identity submission requirements, which the developers oppose. Each recording is limited to a maximum of 30 seconds.

Health Check

  • Last commit: 2 weeks ago
  • Responsiveness: Inactive
  • Pull requests (30d): 0
  • Issues (30d): 4
  • Star history: 51 stars in the last 30 days
