muesli by pHequals7

Local dictation and meeting transcription for macOS

Created 2 months ago
492 stars

Top 62.3% on SourcePulse

Summary

Muesli is a native macOS application providing on-device dictation and meeting transcription, prioritizing user privacy and eliminating cloud costs. It targets macOS users seeking a secure, efficient alternative to cloud-based transcription services, offering real-time speech-to-text and AI-powered note generation directly on Apple Silicon hardware.

How It Works

Muesli is written entirely in Swift and uses Apple's CoreML and Metal frameworks so that all audio processing runs on-device. Dictation performs low-latency (~0.13s) speech-to-text on the Neural Engine. Meeting transcription captures both microphone and system audio, uses Voice Activity Detection (VAD) to split the stream into chunks at natural speech boundaries for real-time transcription, and applies speaker diarization to identify participants. AI-powered meeting summarization is built in and can use either local or cloud models.
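To illustrate what chunking at speech boundaries means, here is a minimal sketch in Python. It is not Muesli's implementation (the app is written in Swift, and its VAD is not described here); it assumes a simple energy-threshold detector, and the function names, the threshold, and the silence duration are all hypothetical choices made for the example.

```python
import math


def frame_rms(frame):
    """Root-mean-square energy of one frame of float samples in [-1, 1]."""
    if not frame:
        return 0.0
    return math.sqrt(sum(s * s for s in frame) / len(frame))


def split_at_speech_boundaries(samples, sample_rate, frame_ms=30,
                               energy_threshold=0.01, min_silence_ms=300):
    """Return (start, end) sample-index pairs, one per speech chunk.

    Assumption: a frame counts as speech when its RMS energy reaches
    energy_threshold. A chunk is closed once min_silence_ms of
    consecutive quiet frames follow it.
    """
    frame_len = max(1, sample_rate * frame_ms // 1000)
    min_silence_frames = max(1, min_silence_ms // frame_ms)

    chunks = []
    chunk_start = None    # sample index where the open chunk began
    last_voiced_end = 0   # sample index just past the last voiced frame
    silent_run = 0        # consecutive quiet frames inside an open chunk

    for start in range(0, len(samples), frame_len):
        frame = samples[start:start + frame_len]
        if frame_rms(frame) >= energy_threshold:
            if chunk_start is None:
                chunk_start = start
            last_voiced_end = start + len(frame)
            silent_run = 0
        elif chunk_start is not None:
            silent_run += 1
            if silent_run >= min_silence_frames:
                # Enough silence: close the chunk at the last voiced frame.
                chunks.append((chunk_start, last_voiced_end))
                chunk_start = None
                silent_run = 0

    # Flush a chunk that is still open when the audio ends.
    if chunk_start is not None:
        chunks.append((chunk_start, last_voiced_end))
    return chunks


# Synthetic signal at 1 kHz: silence, speech, a long pause, speech, short tail.
signal = [0.0] * 300 + [0.5] * 600 + [0.0] * 600 + [0.5] * 300 + [0.0] * 90
print(split_at_speech_boundaries(signal, sample_rate=1000))
# prints [(300, 900), (1500, 1800)]
```

Each returned range could then be handed to a speech-to-text model as soon as it closes, which is what makes transcription incremental instead of waiting for the whole recording.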

Quick Start & Requirements

  • Installation: Download the .dmg from the Releases page or install via Homebrew (brew tap pHequals7/muesli && brew install --cask muesli).
  • Build from Source: Requires macOS 14.2+ and Xcode 16+. Clone the repository and use provided scripts (./scripts/build_native_app.sh).
  • Prerequisites: Transcription models download automatically on first use (e.g., Parakeet v3 is ~450MB).

Highlighted Details

  • Native Swift Architecture: Pure Swift application utilizing CoreML and Metal, avoiding Python dependencies or complex IPC.
  • Diverse ASR Models: Supports multiple on-device speech-to-text engines including Parakeet TDT (Neural Engine), Cohere Transcribe 2B (CoreML), Whisper variants (CoreML/ANE via WhisperKit), and Qwen3 ASR (CoreML).
  • Advanced Meeting Features: Real-time VAD-driven transcription, speaker diarization, simultaneous mic/system audio capture, camera-based meeting detection, calendar integration, and export options (PDF/Markdown).
  • Agent CLI: Includes a muesli-cli tool for programmatic interaction, exposing transcripts and notes as JSON for integration with AI coding agents.
  • Privacy-Focused: All audio processing occurs locally, ensuring data never leaves the device unless explicitly configured for cloud summarization.
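As a rough idea of how an agent or script might consume JSON transcripts, the sketch below formats a transcript into readable lines. The JSON schema is not documented in this summary, so the field names (`segments`, `speaker`, `start`, `text`) are hypothetical, and the sketch deliberately does not invoke muesli-cli because its commands and flags are not listed here.

```python
import json


def format_transcript(raw_json):
    """Render transcript segments as 'mm:ss Speaker: text' lines.

    Assumption: the input is an object with a 'segments' list whose items
    carry 'start' (seconds), 'speaker', and 'text'. This schema is
    illustrative only, not the tool's documented output.
    """
    data = json.loads(raw_json)
    lines = []
    for seg in data.get("segments", []):
        minutes, seconds = divmod(int(seg.get("start", 0)), 60)
        speaker = seg.get("speaker", "Unknown")
        text = seg.get("text", "").strip()
        lines.append(f"{minutes:02d}:{seconds:02d} {speaker}: {text}")
    return lines


# Hand-written sample input standing in for real CLI output.
sample = json.dumps({
    "segments": [
        {"start": 0.0, "speaker": "Speaker 1", "text": "Let's get started."},
        {"start": 75.4, "speaker": "Speaker 2", "text": " Sounds good. "},
    ]
})
for line in format_transcript(sample):
    print(line)
# prints:
# 00:00 Speaker 1: Let's get started.
# 01:15 Speaker 2: Sounds good.
```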

Maintenance & Community

The README does not list maintainers, community channels (such as Discord or Slack), or other project health signals.

Licensing & Compatibility

  • License: MIT License.
  • Compatibility: The MIT License is permissive, allowing commercial use and integration into closed-source projects.

Limitations & Caveats

The application runs only on macOS and is primarily optimized for Apple Silicon hardware because it relies on CoreML and the Neural Engine. Building from source requires macOS 14.2+ and Xcode 16+.

Health Check

  • Last Commit: 1 day ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 70
  • Issues (30d): 34
  • Star History: 358 stars in the last 30 days
