izwi by izwi-ai

Local-first audio AI engine for private, low-latency workflows

Created 2 months ago
269 stars

Top 95.6% on SourcePulse

Summary

Izwi is a local-first audio AI engine for private, on-device transcription, text-to-speech (TTS), and voice AI workflows. It targets users who need data privacy, low latency, and independence from cloud services, and it exposes an OpenAI-compatible API for integration with existing tools.

How It Works

Izwi runs as a self-hosted inference engine that performs all core audio AI tasks on the user's own hardware. The architecture prioritizes privacy: no data leaves the local machine. Key features include real-time voice interaction, long-form automatic speech recognition (ASR) with automatic chunking and overlap handling, and voice cloning and voice design, all exposed through the OpenAI-compatible API.
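Because the API is described as OpenAI-compatible, a client would presumably send the same request shapes that OpenAI's audio endpoints accept. The following is a minimal sketch under stated assumptions, not Izwi's documented interface: the base URL http://localhost:8080 is hypothetical (the summary gives no host or port), the path /v1/audio/speech is taken from OpenAI's API rather than from the Izwi README, and the model and voice names are placeholders. The function only builds the request and does not contact any server.

```python
import json
import urllib.request

# Assumption: the summary gives no host or port; replace with your server's address.
BASE_URL = "http://localhost:8080"


def build_speech_request(text, model, voice, base_url=BASE_URL):
    """Build (but do not send) a POST request shaped like OpenAI's /v1/audio/speech."""
    payload = {"model": model, "input": text, "voice": voice}
    return urllib.request.Request(
        url=f"{base_url}/v1/audio/speech",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Placeholder model and voice names, not confirmed by the source.
req = build_speech_request(
    "Hello from a local server.",
    model="example-tts-model",
    voice="example-voice",
)
print(req.get_method(), req.full_url)
```

To actually send the request, pass req to urllib.request.urlopen and write the response bytes to an audio file, once the real address and model names have been checked against the Izwi documentation.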

Quick Start & Requirements

Installers are provided for macOS (.dmg), Linux (installed with dpkg), and Windows (installer). Start the server with izwi serve, download models with izwi pull, then run tasks such as TTS (izwi tts) or transcription (izwi transcribe). Installation and getting-started guides are available at izwiai.com/docs/installation and izwiai.com/docs/getting-started.

Highlighted Details

  • Comprehensive audio AI suite: Supports TTS, ASR, speaker diarization, voice cloning, voice design, and chat.
  • OpenAI-compatible API: Facilitates easy integration into existing applications and workflows.
  • Advanced ASR: Handles long audio recordings automatically by chunking and stitching transcripts, with configurable parameters.
  • Extensive model support: Includes models like Qwen3-TTS, Whisper-Large-v3-Turbo, Parakeet-TDT, and various LLMs for chat.
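
The chunk-and-stitch approach mentioned for long recordings can be illustrated with a generic sketch. The summary does not describe Izwi's actual algorithm or its configurable parameters, so the function names, the parameter names (chunk_s, overlap_s, max_overlap), and the word-matching stitch rule below are all assumptions chosen for illustration: audio is split into overlapping windows, each window is transcribed separately, and words repeated across a window boundary are dropped when the transcripts are joined.

```python
def chunk_windows(total_s, chunk_s, overlap_s):
    """Return (start, end) windows in seconds that cover [0, total_s] with overlap."""
    if not 0 <= overlap_s < chunk_s:
        raise ValueError("overlap_s must be non-negative and smaller than chunk_s")
    windows, start = [], 0.0
    step = chunk_s - overlap_s  # how far each new window advances
    while start < total_s:
        end = min(start + chunk_s, total_s)
        windows.append((start, end))
        if end >= total_s:
            break  # the last window reached the end of the audio
        start += step
    return windows


def stitch(prev_words, next_words, max_overlap=20):
    """Join two word lists, dropping the longest prefix of next_words
    that repeats the suffix of prev_words (hypothetical stitch rule)."""
    limit = min(max_overlap, len(prev_words), len(next_words))
    for k in range(limit, 0, -1):
        if prev_words[-k:] == next_words[:k]:
            return prev_words + next_words[k:]
    return prev_words + next_words  # no repeated words found at the boundary


windows = chunk_windows(total_s=70, chunk_s=30, overlap_s=5)
merged = stitch("the quick brown fox".split(), "brown fox jumps over".split())
print(windows)
print(" ".join(merged))
```

Exact word matching is the simplest possible stitch rule; a real engine would more likely align on timestamps or tolerate small transcription differences in the overlap region.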

Maintenance & Community

The project acknowledges contributions from organizations like Alibaba (Qwen3-TTS), NVIDIA (Parakeet), and Google (Gemma). Specific details regarding active maintenance, community channels (e.g., Discord, Slack), or a public roadmap are not provided in the README.

Licensing & Compatibility

Izwi is licensed under the Apache 2.0 license. This permissive license allows commercial use and integration into closed-source projects, provided the license text and attribution notices are retained and modified files are marked as changed.

Limitations & Caveats

As a local-first engine, Izwi's performance and resource usage depend directly on the user's hardware. The README does not state hardware requirements, unsupported platforms, or performance ceilings for specific models.

Health Check

Last commit: 1 day ago
Responsiveness: Inactive
Pull requests (30d): 43
Issues (30d): 6
Star history: 118 stars in the last 30 days
