aTrain by JuergenFleiss

GUI tool for offline speech transcription and speaker diarization

Created 2 years ago

1,050 stars

Top 35.6% on SourcePulse

Project Summary

aTrain is a GUI tool for offline, privacy-preserving speech-to-text transcription and speaker diarization, designed for researchers and users needing to process sensitive audio data without cloud uploads. It leverages state-of-the-art models for fast, accurate transcriptions across 99 languages and integrates with qualitative analysis software.

How It Works

aTrain utilizes the faster-whisper implementation for high-quality, accelerated transcription and pyannote.audio for speaker diarization. This approach ensures local processing for privacy and GDPR compliance, offering significant speedups over standard implementations, especially when using NVIDIA GPUs.
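A transcription model produces timestamped text segments and a diarization model produces timestamped speaker turns, so the two outputs have to be aligned before a speaker-labeled transcript exists. The sketch below shows one common way to do that, by giving each segment the speaker whose turns overlap it longest. It is an illustration only and not aTrain's actual code: the `Segment` and `Turn` classes, the `assign_speakers` function, and the sample data are all hypothetical, and no faster-whisper or pyannote.audio calls are made.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """A transcribed span of audio (hypothetical stand-in for transcription output)."""
    start: float  # seconds
    end: float    # seconds
    text: str


@dataclass
class Turn:
    """A span attributed to one speaker (hypothetical stand-in for diarization output)."""
    start: float  # seconds
    end: float    # seconds
    speaker: str


def overlap(a_start, a_end, b_start, b_end):
    """Length in seconds of the intersection of two time intervals."""
    return max(0.0, min(a_end, b_end) - max(a_start, b_start))


def assign_speakers(segments, turns, unknown="UNKNOWN"):
    """Pair each segment with the speaker whose turns overlap it the most."""
    labeled = []
    for seg in segments:
        totals = {}
        for turn in turns:
            shared = overlap(seg.start, seg.end, turn.start, turn.end)
            if shared > 0:
                totals[turn.speaker] = totals.get(turn.speaker, 0.0) + shared
        # Fall back to a placeholder label when no turn overlaps the segment.
        speaker = max(totals, key=totals.get) if totals else unknown
        labeled.append((speaker, seg))
    return labeled


# Made-up sample data for illustration.
segments = [
    Segment(0.0, 4.0, "Thanks for joining the interview."),
    Segment(4.2, 9.0, "Happy to be here."),
    Segment(20.0, 22.0, "(inaudible)"),
]
turns = [
    Turn(0.0, 4.1, "SPEAKER_00"),
    Turn(4.1, 9.5, "SPEAKER_01"),
]

for speaker, seg in assign_speakers(segments, turns):
    print(f"[{seg.start:6.1f}s] {speaker}: {seg.text}")
```

Summing overlap per speaker, rather than taking the first overlapping turn, keeps the label stable when a segment straddles a short interruption by another speaker.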

Quick Start & Requirements

Install: Windows users can use the Microsoft Store or download an installer from the BANDAS-Center Website. Beta versions are available for macOS (Apple Silicon) and Debian.
Prerequisites: An NVIDIA GPU with the CUDA toolkit installed is recommended for significantly faster transcription.
Links: Microsoft Store, BANDAS-Center Website
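To see whether a machine is likely to meet the GPU recommendation, one rough check is whether the NVIDIA command-line tools are on the PATH. The sketch below assumes the usual tool names (`nvidia-smi`, installed with the NVIDIA driver, and `nvcc`, installed with the CUDA toolkit); the `find_nvidia_tools` helper is hypothetical and is not how aTrain itself detects a GPU. A missing entry may only mean the tool is installed outside the PATH.

```python
import shutil


def find_nvidia_tools():
    """Report which NVIDIA command-line tools are visible on PATH."""
    return {
        "nvidia-smi": shutil.which("nvidia-smi") is not None,  # installed with the driver
        "nvcc": shutil.which("nvcc") is not None,              # installed with the CUDA toolkit
    }


tools = find_nvidia_tools()
for name, found in tools.items():
    print(f"{name}: {'found' if found else 'not found on PATH'}")
```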

Highlighted Details

Fast transcription: processing takes roughly three times the audio duration on modern CPUs and as little as 20% of the audio duration on NVIDIA GPUs.
Speaker diarization capabilities using pyannote.audio.
Supports 99 languages with varying transcription quality.
Outputs compatible with MAXQDA, ATLAS.ti, and NVivo, including timestamped audio playback.
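The timing figures above translate into a quick planning estimate. The sketch below simply multiplies the audio length by the two quoted ratios (3x on CPU, 0.2x on GPU); the `estimate_minutes` helper is hypothetical, and real throughput will vary with hardware, model size, and audio quality.

```python
# Ratios of processing time to audio duration, taken from the figures quoted above.
# Treat them as rough approximations, not guarantees.
TIME_RATIO = {"cpu": 3.0, "gpu": 0.2}


def estimate_minutes(audio_minutes, device):
    """Rough processing time in minutes for a recording of the given length."""
    return audio_minutes * TIME_RATIO[device]


for device in ("cpu", "gpu"):
    print(f"60 min of audio on {device}: ~{estimate_minutes(60, device):.0f} min")
```

Under these ratios, a one-hour interview would take about three hours on a CPU and about twelve minutes on an NVIDIA GPU.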

Maintenance & Community

Developed by researchers at the University of Graz and tested by Know-Center Graz. A developer wiki is available for contributions.

Licensing & Compatibility

The README does not explicitly state the license. Compatibility for commercial use or closed-source linking is not specified.

Limitations & Caveats

Beta versions are available for macOS and Debian, suggesting potential stability issues on these platforms. The roadmap indicates ongoing development, with features like batch processing and customizable settings still planned.

Explore Similar Projects

speechlib by NavodPeiris

Whisper-transcription_and_diarization-speaker-identification- by lablab-ai

AudioToText by Carleslc

nlp by Majdoddin

transcriber_app by davabase

speech-to-text by reriiasu

whisper-plus by kadirnar

Scriberr by rishikanthc

noScribe by kaixxx

whisper-asr-webservice by ahmetoner

RealtimeSTT by KoljaB

buzz by chidiwilliams