whisper_android  by vilassn

Android app for offline speech recognition using OpenAI Whisper

created 1 year ago
496 stars

Top 63.4% on sourcepulse

Project Summary

This repository provides two Android applications for offline speech recognition using OpenAI's Whisper model via TensorFlow Lite. It targets Android developers who want on-device automatic speech recognition (ASR) in their own applications, and it offers both a Java and a native C++ implementation so they can trade ease of integration against performance.

How It Works

The project leverages TensorFlow Lite to run quantized Whisper models on Android devices. It offers two implementations: one using the TensorFlow Lite Java API for straightforward integration within Java-based Android projects, and another using the TensorFlow Lite Native API (C++) for potentially higher performance and lower overhead. A Python script is included to convert Whisper models to the TFLite format.

Quick Start & Requirements

  • Install: Open the whisper_java or whisper_native folder in Android Studio, then build and run on a device or emulator.
  • Prerequisites: Android device/emulator, Android Studio.
  • Resources: Pre-built APKs are available in demo_and_apk.
  • Docs: Integration guide and code snippets are provided within the README.

Highlighted Details

  • Offline speech-to-text using OpenAI Whisper.
  • Two Android app implementations: TensorFlow Lite Java API and Native API.
  • Python script for Whisper model conversion to TFLite.
  • Includes pre-built APKs for easy deployment.

Maintenance & Community

  • Project maintained by vilassn.
  • Contact via email (vilassninawe@gmail.com) for inquiries.
  • PayPal link provided for project support.

Licensing & Compatibility

  • License not explicitly stated in the README.
  • Compatibility for commercial use or closed-source linking is not specified.

Limitations & Caveats

The README does not specify which Whisper model sizes are supported, and it provides no performance benchmarks. It also lacks explicit licensing information, which may impact commercial adoption. Audio files must be in the expected format (16 kHz sample rate, mono, 16-bit) for transcription to work.
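
The audio format requirement above (16 kHz, mono, 16-bit) can be checked before a file is handed to the recognizer. The following is a minimal sketch, not code from the repository: the class WavFormatCheck and its methods isWhisperReady and header are hypothetical names introduced here, and the check assumes the input is a standard RIFF/WAVE file with uncompressed PCM data. It uses only java.nio, so it should also work on Android.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper (not part of whisper_android): inspects a WAV header
// and reports whether it declares 16 kHz, mono, 16-bit PCM audio.
class WavFormatCheck {

    static boolean isWhisperReady(byte[] wav) {
        if (wav.length < 12) return false;
        if (!tag(wav, 0).equals("RIFF") || !tag(wav, 8).equals("WAVE")) return false;

        ByteBuffer buf = ByteBuffer.wrap(wav).order(ByteOrder.LITTLE_ENDIAN);
        int pos = 12; // first chunk starts right after "RIFF<size>WAVE"
        while (pos + 8 <= wav.length) {
            String id = tag(wav, pos);
            int size = buf.getInt(pos + 4);
            if (id.equals("fmt ")) {
                if (size < 16 || pos + 8 + 16 > wav.length) return false;
                int format = buf.getShort(pos + 8) & 0xFFFF;    // 1 = PCM
                int channels = buf.getShort(pos + 10) & 0xFFFF;
                int sampleRate = buf.getInt(pos + 12);
                int bits = buf.getShort(pos + 22) & 0xFFFF;
                return format == 1 && channels == 1
                        && sampleRate == 16000 && bits == 16;
            }
            // Reject sizes that would run past the end of the data.
            if (size < 0 || size > wav.length - pos - 8) return false;
            pos += 8 + size + (size & 1); // chunks are padded to even length
        }
        return false; // no "fmt " chunk found
    }

    private static String tag(byte[] b, int off) {
        return new String(b, off, 4, StandardCharsets.US_ASCII);
    }

    // Builds a 44-byte PCM WAV header with an empty data chunk,
    // used here only to exercise the check without a real file.
    static byte[] header(int channels, int sampleRate, int bits) {
        int blockAlign = channels * bits / 8;
        ByteBuffer b = ByteBuffer.allocate(44).order(ByteOrder.LITTLE_ENDIAN);
        b.put("RIFF".getBytes(StandardCharsets.US_ASCII)).putInt(36)
         .put("WAVE".getBytes(StandardCharsets.US_ASCII))
         .put("fmt ".getBytes(StandardCharsets.US_ASCII)).putInt(16)
         .putShort((short) 1).putShort((short) channels).putInt(sampleRate)
         .putInt(sampleRate * blockAlign).putShort((short) blockAlign)
         .putShort((short) bits)
         .put("data".getBytes(StandardCharsets.US_ASCII)).putInt(0);
        return b.array();
    }

    public static void main(String[] args) {
        System.out.println(isWhisperReady(header(1, 16000, 16)));
        System.out.println(isWhisperReady(header(2, 44100, 16)));
    }
}
```

In an app, the bytes would come from the audio file the user selected. A file that fails the check would need to be resampled or downmixed first, which this sketch does not attempt.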

Health Check

  • Last commit: 5 months ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 1
  • Issues (30d): 0

Star History

75 stars in the last 90 days
