speakr  by murtaza-nasir

Self-hosted web app for audio transcription, summarization, and chat

Created 4 months ago
2,017 stars

Top 22.1% on SourcePulse

GitHubView on GitHub
Project Summary

Speakr is a self-hosted web application for transcribing audio recordings, generating summaries and titles, and interacting with the content via chat. It targets individuals and teams seeking to securely manage and analyze their audio data, offering a private alternative to cloud-based transcription services.

How It Works

Speakr leverages OpenAI-compatible APIs for both Speech-to-Text (STT) and Large Language Models (LLMs). Users upload audio files, which are processed in the background. STT APIs convert audio to text, and LLMs then generate summaries, titles, and provide conversational interaction based on the transcript. The architecture supports configurable transcription and output languages, user-specific prompts, and integration of user professional context for more relevant AI responses.

Quick Start & Requirements

  • Installation: Docker is the only currently functional installation method.
    • Clone the repository: git clone https://github.com/murtaza-nasir/speakr.git
    • Configure docker-compose.yml with API keys (OpenAI-compatible for STT and LLM) and desired models.
    • Start with docker compose up -d.
  • Prerequisites: Python 3.8+, pip, venv, Docker, and API keys for STT (e.g., Whisper) and LLM (e.g., OpenRouter, OpenAI).
  • Setup: Requires configuring API endpoints and keys.
  • Links: GitHub Repository

Highlighted Details

  • Supports multilingual transcription and AI output.
  • Offers interactive chat with transcript content.
  • Includes user authentication, account management, and an admin dashboard.
  • Metadata editing for recordings (titles, participants, dates, notes).
  • Customizable summarization prompts and user professional context for AI.

Maintenance & Community

The project is maintained by Murtaza Nasir. Feedback, bug reports, and feature suggestions are welcomed via GitHub Issues. A Contributor License Agreement (CLA) will be required for future code contributions.

Licensing & Compatibility

Dual-licensed under GNU Affero General Public License v3.0 (AGPLv3) and a separate commercial license. AGPLv3 requires sharing source code of modified versions if accessed over a network. Commercial licensing is available for proprietary integration.

Limitations & Caveats

Local development and Linux systemd deployment methods are explicitly stated as not currently working. Users must rely on the Docker installation. The AGPLv3 license has significant implications for commercial use, requiring source code disclosure of network-accessible modifications.

Health Check
Last Commit

18 hours ago

Responsiveness

1 day

Pull Requests (30d)
2
Issues (30d)
25
Star History
359 stars in the last 30 days

Explore Similar Projects

Starred by Victor Taelin Victor Taelin(Author of Bend, Kind, HVM) and Eric Zhu Eric Zhu(Coauthor of AutoGen; Research Scientist at Microsoft Research).

chat-with-gpt by cogentapps

0.0%
2k
Open-source ChatGPT app with voice
Created 2 years ago
Updated 1 year ago
Feedback? Help us improve.