Android-MVVM-Architecture-Android-Voice-AI-SDK by ahmedeltaher

Voice AI conversation pipeline for Android apps

Created 9 years ago
2,557 stars

Top 17.7% on SourcePulse

Project Summary

This Android library provides a reusable, voice-driven AI conversation pipeline that lets Android developers add voice assistant capabilities to their apps quickly. It wraps speech-to-text, AI understanding via Anthropic Claude, and text-to-speech in an MVVM architecture, and it ships Jetpack Compose UI components.

How It Works

The SDK orchestrates a multi-stage pipeline: it captures microphone audio, runs Voice Activity Detection (VAD), transcribes speech to text (STT) with a pluggable engine, sends the transcript to Anthropic Claude for a response, and converts the reply back to speech (TTS). A VoiceAISession manages this process and is configured through a builder, which allows swapping STT/TTS engines and supports on-device emotion detection.
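
The stages above can be sketched in plain Kotlin. This is only an illustration of the flow (VAD, then STT, then the model, then TTS), not the SDK's API: SttEngine, ChatModel, TtsEngine, VoicePipeline, and isSpeech are hypothetical names, and the energy-threshold VAD is a deliberately naive stand-in for whatever the library actually uses.

```kotlin
import kotlin.math.abs

// Hypothetical interfaces for illustration; the SDK's real types may differ.
fun interface SttEngine { fun transcribe(audio: ShortArray): String }
fun interface ChatModel { fun reply(transcript: String): String }
fun interface TtsEngine { fun speak(text: String): ByteArray }

// Naive energy-based VAD: a frame counts as speech when its mean
// absolute amplitude exceeds the threshold (an assumed value).
fun isSpeech(frame: ShortArray, threshold: Double = 500.0): Boolean {
    if (frame.isEmpty()) return false
    val meanAbs = frame.sumOf { abs(it.toInt()).toDouble() } / frame.size
    return meanAbs > threshold
}

class VoicePipeline(
    private val stt: SttEngine,
    private val model: ChatModel,
    private val tts: TtsEngine,
) {
    // Returns synthesized audio, or null when the frame contains no speech.
    fun process(frame: ShortArray): ByteArray? {
        if (!isSpeech(frame)) return null
        val transcript = stt.transcribe(frame)
        val answer = model.reply(transcript)
        return tts.speak(answer)
    }
}

fun main() {
    // Stub engines stand in for real STT, Claude, and TTS calls.
    val pipeline = VoicePipeline(
        stt = SttEngine { "hello" },
        model = ChatModel { "You said: $it" },
        tts = TtsEngine { it.toByteArray() },
    )
    val loud = ShortArray(160) { 2000.toShort() }
    val silent = ShortArray(160)
    println(pipeline.process(loud)?.decodeToString()) // prints: You said: hello
    println(pipeline.process(silent))                 // prints: null
}
```

Because each stage sits behind a single-method interface, swapping an engine in this sketch means passing a different implementation to the constructor, which mirrors the pluggable-engine idea described above.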

Quick Start & Requirements

  • Installation: Add com.sdk:voice-ai-sdk:1.0.0 to build.gradle.kts.
  • Permissions: Declare RECORD_AUDIO, INTERNET, ACCESS_NETWORK_STATE in AndroidManifest.xml.
  • API Key: Store Anthropic API key in local.properties (e.g., ANTHROPIC_API_KEY=sk-ant-...) and expose via BuildConfig.
  • Hilt: Annotate Application with @HiltAndroidApp and Activity with @AndroidEntryPoint if using Hilt.
  • Prerequisites: Android Studio Meerkat or newer, minimum SDK 24 (Android 7.0), Kotlin 2.0+ (the project uses 2.3.21), and an Anthropic API key.
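
The installation and API key steps might look roughly like the following module-level build.gradle.kts. The dependency coordinate, the key name, and minSdk 24 come from the list above; the plugin set, namespace, and compileSdk are assumptions for the sketch and should be adapted to the actual project.

```kotlin
// app/build.gradle.kts (sketch, not the project's actual build file)
import java.util.Properties

plugins {
    id("com.android.application")
    id("org.jetbrains.kotlin.android")
}

// Read local.properties if it exists; the file should stay out of version control.
val localProps = Properties().apply {
    val file = rootProject.file("local.properties")
    if (file.exists()) file.inputStream().use { load(it) }
}

android {
    namespace = "com.example.voicedemo" // hypothetical
    compileSdk = 35                     // assumption

    defaultConfig {
        minSdk = 24
        // Exposes the key as BuildConfig.ANTHROPIC_API_KEY (empty if unset).
        buildConfigField(
            "String",
            "ANTHROPIC_API_KEY",
            "\"${localProps.getProperty("ANTHROPIC_API_KEY", "")}\"",
        )
    }

    // BuildConfig generation must be enabled explicitly on recent AGP versions.
    buildFeatures { buildConfig = true }
}

dependencies {
    implementation("com.sdk:voice-ai-sdk:1.0.0")
}
```

Note that BuildConfig fields are compiled into the APK and can be extracted from it, so this keeps the key out of source control but does not hide it from someone who inspects the shipped app.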

Highlighted Details

  • Swappable Engines: Supports pluggable STT (Android built-in, Whisper) and TTS (Android built-in, ElevenLabs) engines.
  • On-Device Emotion Detection: Analyzes audio for emotional cues on the device and can inject the detected emotion into the AI context for adaptive responses.
  • Security Features: Includes PII redaction, encrypted key storage via Android Keystore, and optional certificate pinning.
  • UI Components: Provides ready-to-use Jetpack Compose UI elements for voice interaction.
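
To make the PII redaction idea concrete, here is a minimal sketch of masking a transcript before it leaves the device. It is not the SDK's implementation: redactPii and both patterns are hypothetical, and the two regexes cover only simple email and phone formats.

```kotlin
// Illustrative patterns only; real PII detection needs much broader coverage.
private val EMAIL = Regex("""[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}""")
private val PHONE = Regex("""\+?\d[\d\s().-]{7,}\d""")

// Replaces matched spans with placeholders before the text is sent to a remote API.
fun redactPii(text: String): String =
    text.replace(EMAIL, "[EMAIL]").replace(PHONE, "[PHONE]")

fun main() {
    println(redactPii("Mail jane.doe@example.com or call +1 555-123-4567."))
    // prints: Mail [EMAIL] or call [PHONE].
}
```

Running redaction on the transcript, between the STT and model stages, means the raw values never reach the network in this sketch.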

Maintenance & Community

No specific details on contributors, sponsorships, or community channels are provided in the README.

Licensing & Compatibility

  • License: MIT License.
  • Compatibility: Permissive MIT license allows for commercial use and integration into closed-source projects.

Limitations & Caveats

  • External Dependencies: Core functionality relies on external APIs (Anthropic Claude, optionally Whisper, ElevenLabs), requiring API keys and network connectivity.
  • Android Specific: Designed exclusively for the Android platform.

Health Check

  • Last Commit: 1 week ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 1
  • Issues (30d): 0

Star History

  • 6 stars in the last 30 days
