intrascribe by weynechen

Local-first, privacy-focused speech-to-text and summarization platform for internal networks

Created 5 months ago

597 stars

Top 54.6% on SourcePulse

Project Summary

IntraScribe is a self-hosted, privacy-focused speech-to-text and summarization platform designed for internal network deployment in enterprises, schools, and government organizations. It offers real-time transcription, speaker diarization, batch processing, AI-powered summarization, and title generation, with a fully decoupled architecture allowing for flexible integration of various audio capture and transmission methods. The platform prioritizes data privacy and compliance by keeping all data within the local network.

How It Works

IntraScribe uses a modular architecture. For real-time transcription, the browser streams audio to the backend over WebRTC, and results are returned through Server-Sent Events (SSE). For higher-quality, structured output, the audio is cached, uploaded to Supabase Storage, then diarized with pyannote.audio and re-transcribed. AI summarization and title generation go through LiteLLM, which allows configurable models and fallback strategies. Persistence and real-time updates are handled by Supabase: Postgres for data, Auth for authentication, Storage for files, and Realtime for event subscriptions.
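
To illustrate the SSE leg of the real-time path, here is a minimal sketch of how a transcription result could be framed as a Server-Sent Events message and read back on the client side. The function names, the `transcript` event name, and the payload fields (`text`, `is_final`) are assumptions made for illustration; they are not taken from the IntraScribe codebase.

```python
import json


def format_sse(event: str, payload: dict) -> str:
    """Serialize one Server-Sent Events frame.

    A frame is a block of "field: value" lines terminated by a blank line.
    json.dumps escapes newlines, so the payload always fits on one data line.
    """
    data = json.dumps(payload, ensure_ascii=False)
    return f"event: {event}\ndata: {data}\n\n"


def parse_sse(frame: str) -> tuple[str, dict]:
    """Parse a single frame produced by format_sse (the client-side view)."""
    event, data_lines = "message", []  # "message" is the SSE default event type
    for line in frame.splitlines():
        if line.startswith("event:"):
            event = line[len("event:"):].strip()
        elif line.startswith("data:"):
            data_lines.append(line[len("data:"):].strip())
    return event, json.loads("\n".join(data_lines))


# Hypothetical partial transcription result; field names are illustrative only.
frame = format_sse("transcript", {"text": "hello world", "is_final": False})
```

In a browser, the same frame would typically be consumed through the standard `EventSource` API rather than parsed by hand.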

Quick Start & Requirements

Installation: Clone the repository, set up Supabase locally, configure environment variables (.env.local for web, .env for backend), install backend dependencies with uv, and start the backend and frontend.
Prerequisites:
- NVIDIA GPU with CUDA (CPU fallback available but untested).
- Node.js 18+, Python 3.10+, uv.
- Ollama with a model like qwen3:8b (configurable).
- FFmpeg.
- Supabase CLI.
- Hugging Face token for pyannote.audio.
Setup: Initial Supabase setup and model downloads can be time-consuming. Local HTTPS setup with mkcert is recommended for intra-network use.
Links: Supabase Local Development, uv Installation, mkcert.
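
Before starting the setup, the prerequisites above can be verified with a short preflight script. This is a sketch, not part of the project: the executable names in `REQUIRED_TOOLS` are assumed to be the usual CLI names for each tool, and the check does not cover the GPU, the Hugging Face token, or tool versions such as Node.js 18+.

```python
import shutil
import sys

# Executable names are assumptions: the usual CLI names for each prerequisite.
REQUIRED_TOOLS = ["ffmpeg", "node", "uv", "supabase", "ollama"]
MIN_PYTHON = (3, 10)


def check_prerequisites(tools=REQUIRED_TOOLS, which=shutil.which):
    """Return a list of human-readable problems; an empty list means all checks passed."""
    problems = []
    if sys.version_info < MIN_PYTHON:
        problems.append(f"Python {MIN_PYTHON[0]}.{MIN_PYTHON[1]}+ required")
    for tool in tools:
        # shutil.which returns None when the executable is not on PATH.
        if which(tool) is None:
            problems.append(f"{tool} not found on PATH")
    return problems


if __name__ == "__main__":
    for problem in check_prerequisites():
        print("MISSING:", problem)
```

The `which` parameter exists only so the lookup can be swapped out in tests; by default it uses `shutil.which` from the standard library.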

Highlighted Details

Supports local, offline, and privacy-sensitive deployments.
Features team collaboration with account systems and template sharing.
Decoupled frontend allows integration with various hardware and transmission protocols.
Editable transcriptions with preserved timestamps and speaker information.

Maintenance & Community

MIT License.
TODO section mentions plans for hardware integration and AI dialogue features.

Licensing & Compatibility

MIT License, generally permissive for commercial use and closed-source linking.

Limitations & Caveats

The project has primarily been tested on Ubuntu 22.04.
Speaker diarization may fail if the Hugging Face token is not configured or if the models require authorization; in that case the output falls back to a single speaker.
Audio processing failures may occur if FFmpeg is not installed or not in the system's PATH.
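
The single-speaker fallback described above can be sketched as follows. This is an illustration under stated assumptions, not the project's implementation: `Segment`, `diarize_with_fallback`, and the `SPEAKER_00` label are hypothetical, and the `diarize` callable stands in for whatever wraps pyannote.audio in the real backend.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Segment:
    speaker: str
    start: float  # seconds
    end: float    # seconds


def diarize_with_fallback(
    audio_path: str,
    duration: float,
    diarize: Callable[[str], List[Segment]],
) -> List[Segment]:
    """Run `diarize`; on failure or empty output, attribute all audio to one speaker."""
    try:
        segments = diarize(audio_path)
        if segments:
            return segments
    except Exception as exc:  # e.g. missing token or unauthorized model
        print(f"diarization failed, using single speaker: {exc}")
    # Hypothetical default label covering the whole recording.
    return [Segment(speaker="SPEAKER_00", start=0.0, end=duration)]
```

Passing the diarizer in as a callable keeps the fallback logic testable without a GPU, a token, or model downloads.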

Health Check

Last Commit: 3 months ago
Responsiveness: Inactive
Pull Requests (30d): 0
Issues (30d): 1

Star History

10 stars in the last 30 days

Explore Similar Projects

leopard by Picovoice

Private, on-device speech-to-text engine

Created 6 years ago

Updated 1 week ago

amical by amicalhq

Local-first AI dictation and note-taking app

Created 8 months ago

Updated 2 days ago

Starred by Michael Han (Cofounder of Unsloth).

FluidVoice by altic-dev

macOS app for local voice-to-text transcription with AI enhancement

Created 3 months ago

Updated 1 day ago

Starred by Jeff Hammerbacher (Cofounder of Cloudera) and Jeffrey Morgan (Cofounder of Ollama).

transcriptionstream by transcriptionstream

Self-hosted service for offline transcription, diarization, and LLM summarization

Created 2 years ago

Updated 1 year ago

Dia-TTS-Server by devnen

Self-host a powerful TTS model with an OpenAI-compatible API

Created 8 months ago

Updated 7 months ago

Speech-Translate by Dadangdut33

Speech-to-text app using Whisper for transcription and translation

Created 3 years ago

Updated 2 years ago

Starred by Emile Vauge (Founder of Traefik).

Scriberr by rishikanthc

Self-hosted app for local AI audio transcription

Created 1 year ago

Updated 4 days ago

Starred by Omar Sanseviero (DevRel at Google DeepMind).

whisper-plus by kadirnar

Speech-to-text toolkit for enhanced audio processing

Created 2 years ago

Updated 1 month ago

Starred by Abubakar Abid (Cofounder of Gradio).

Chatterbox-TTS-Server by devnen

Self-host a powerful TTS server with a web UI and API

Created 7 months ago

Updated 3 weeks ago

Whisper-WebUI by jhj0517

Web UI for Whisper-based subtitle generation

Created 2 years ago

Updated 1 week ago

Starred by Amin Ahmad (Cofounder of Vectara) and Jeremy Howard (Cofounder of fast.ai).

whisper_streaming by ufal

Real-time streaming for long speech-to-text transcription/translation

Created 2 years ago

Updated 2 months ago

Starred by Georgios Konstantopoulos (CTO, General Partner at Paradigm).

meeting-minutes by Zackriya-Solutions

Local AI meeting assistant for real-time transcription and summarization

Created 1 year ago

Updated 1 week ago
