CLI tool for meeting transcription and analysis using Gemini models
This project provides an AI-powered tool for transcribing and analyzing meeting recordings, targeting users who need to extract insights from audio and video content. It leverages Google's Gemini models to offer features like speaker diarization, meeting summaries, action item extraction, and video analysis, aiming to streamline post-meeting workflows.
How It Works
Offmute runs a multi-stage pipeline. It first analyzes the recording by extracting screenshots and splitting the audio into chunks, then generates initial descriptions of the visual and audio content. Transcription and diarization process the audio chunks with context from earlier chunks so that speaker identities and conversational flow stay consistent, and progress is reported in real time. For report generation it uses a technique it calls "Spreadfill": it first creates a report structure of headings, then fills each section independently using the full context. This keeps sections coherent and detailed, and the report is updated incrementally as each section completes.
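The two ideas above, context-aware chunk processing and Spreadfill, can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not Offmute's actual code: `generate` is a hypothetical stand-in for a model call, chunks are represented by placeholder strings, the function names are invented for this example, and the headings are passed in rather than produced by the model.

```python
from typing import Callable, Dict, List, Optional

# Hypothetical stand-in for a model call: takes a prompt, returns text.
Generate = Callable[[str], str]


def transcribe_chunks(
    chunks: List[str],
    generate: Generate,
    tail_chars: int = 200,
    on_progress: Optional[Callable[[int, int], None]] = None,
) -> List[str]:
    """Transcribe chunks in order, feeding each call the tail of the previous transcript."""
    transcripts: List[str] = []
    context = ""
    for i, chunk in enumerate(chunks):
        prompt = (
            f"Previous transcript tail:\n{context}\n\n"
            f"Transcribe chunk {i}: {chunk}"
        )
        text = generate(prompt)
        transcripts.append(text)
        # Carry speaker labels and conversational flow into the next chunk.
        context = text[-tail_chars:]
        if on_progress is not None:
            on_progress(i + 1, len(chunks))
    return transcripts


def spreadfill(headings: List[str], full_context: str, generate: Generate) -> Dict[str, str]:
    """Create the report structure first, then fill each section independently."""
    report: Dict[str, str] = {heading: "" for heading in headings}
    for heading in headings:
        # Every section sees the full context, not the other sections' text.
        prompt = f"Context:\n{full_context}\n\nWrite the section titled: {heading}"
        report[heading] = generate(prompt)  # the report is updated incrementally
    return report
```

Because each section is filled from the same full context rather than from the preceding sections, the fill calls in this sketch are independent of one another and could in principle be run in any order.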
Quick Start & Requirements
npx offmute <Meeting_Location> [options]
Highlighted Details
Maintenance & Community
Licensing & Compatibility
Limitations & Caveats
The author describes the project as an "experiment" and remarks "Maybe I went a little overboard though," which suggests possible scope creep and limited stability guarantees. The "Experimental Tier" explicitly uses a preview model, which may be unstable or change without notice.
2 months ago
Inactive