Scriberr by rishikanthc

Self-hosted app for local AI audio transcription

created 10 months ago
1,030 stars

Top 37.1% on sourcepulse

Project Summary

Scriberr is a self-hostable AI audio transcription application designed for users who need to transcribe and optionally summarize audio files locally. It leverages OpenAI's Whisper models via the WhisperX engine for high-performance transcription and supports offline speaker diarization, making it suitable for researchers, content creators, and developers requiring private and customizable transcription workflows.

How It Works

Scriberr uses the WhisperX engine for local audio transcription and runs on either CPU or NVIDIA GPU. It uses HuggingFace models for offline speaker diarization and can summarize transcripts through Ollama or the OpenAI API, with user-defined prompts. The application is packaged with Docker Compose to simplify deployment and management.

Quick Start & Requirements

  • Install/Run: Clone the repository, configure .env variables, and run with docker-compose up -d (CPU) or docker-compose -f docker-compose.yml -f docker-compose.gpu.yml up -d (GPU).
  • Prerequisites: Docker and Docker Compose. NVIDIA GPU and NVIDIA Container Toolkit for GPU acceleration. HuggingFace API Key for speaker diarization model download.
  • Setup: Access the web interface at http://localhost:3000.
  • Docs: Installation Instructions
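
The two launch modes and the prerequisite check above can be sketched as a pair of small shell helpers. This is a minimal sketch, not part of Scriberr: the function names `compose_args` and `missing_tools` are hypothetical, and the only project-specific details are the Compose file names quoted in the Install/Run step.

```shell
#!/bin/sh
# Sketch only: compose_args and missing_tools are hypothetical helper names,
# not commands shipped with Scriberr.

# Print the docker-compose arguments for the requested mode.
# The Compose file names are the ones given in the Install/Run step.
compose_args() {
  case "$1" in
    cpu) echo "up -d" ;;
    gpu) echo "-f docker-compose.yml -f docker-compose.gpu.yml up -d" ;;
    *)   echo "unknown mode: $1 (expected cpu or gpu)" >&2; return 1 ;;
  esac
}

# Print each named tool that is not found on PATH, one per line.
# Prints nothing when every tool is available.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || echo "$tool"
  done
}
```

From the cloned repository, with .env already configured, `missing_tools docker docker-compose` would list anything still to be installed, and `docker-compose $(compose_args gpu)` would start the GPU stack (use `cpu` for the default stack). Keeping the mode selection in one function means the GPU override file is only ever passed together with the base file.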

Highlighted Details

  • Fast local transcription using WhisperX.
  • Supports CPU and NVIDIA GPU acceleration.
  • Offline speaker diarization with HuggingFace models.
  • Customizable transcript summarization via Ollama or OpenAI API.
  • Responsive, mobile-ready UI with a glassmorphism design.

Maintenance & Community

The project is under active development, and recent updates have introduced breaking changes. Contributions are welcome via pull requests and issues.

Licensing & Compatibility

Licensed under the MIT License, permitting commercial use and integration with closed-source projects.

Limitations & Caveats

Recent updates include breaking changes, so back up your data and reinstall carefully when upgrading. GPU acceleration requires an NVIDIA GPU and the NVIDIA Container Toolkit. Speaker diarization requires accepting the HuggingFace model terms and providing an API key during initial setup.

Health Check

  • Last commit: 4 weeks ago
  • Responsiveness: 1 day
  • Pull Requests (30d): 0
  • Issues (30d): 20

Star History

224 stars in the last 90 days
