dsnote by mkiol

Linux app for offline note-taking, reading, and translation

Created 4 years ago
1,133 stars

Top 33.9% on SourcePulse

Project Summary

Speech Note is an application for Linux desktops and Sailfish OS that provides offline note-taking, reading, and translation through Speech-to-Text (STT), Text-to-Speech (TTS), and Machine Translation (MT) engines. All data is processed locally, which suits users who need private speech recognition, speech synthesis, and translation without an internet connection.

How It Works

The application has a modular architecture that supports multiple engines: STT (Coqui STT, Vosk, Whisper.cpp, Faster Whisper, April-ASR), TTS (espeak-ng, Piper, RHVoice, Coqui TTS, Mimic 3, WhisperSpeech), and MT (Bergamot Translator). Users select and download models per language and can choose whichever engine performs best for their needs. All processing runs locally, which preserves privacy and allows offline use.
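
The grouping of engines by task can be pictured as a small lookup table. The following is only an illustrative Python sketch, not the application's actual code: the engine names come from the summary above, while `Task`, `ENGINES`, `engines_for`, and `task_of` are hypothetical names introduced for this example.

```python
from enum import Enum


class Task(Enum):
    STT = "speech-to-text"
    TTS = "text-to-speech"
    MT = "machine translation"


# Engine names are taken from the summary; the mapping itself is illustrative.
ENGINES = {
    Task.STT: ["Coqui STT", "Vosk", "Whisper.cpp", "Faster Whisper", "April-ASR"],
    Task.TTS: ["espeak-ng", "Piper", "RHVoice", "Coqui TTS", "Mimic 3", "WhisperSpeech"],
    Task.MT: ["Bergamot Translator"],
}


def engines_for(task: Task) -> list[str]:
    """Return a copy of the engine names listed for a task."""
    return list(ENGINES[task])


def task_of(engine: str) -> Task:
    """Find which task an engine belongs to; raise KeyError if it is unknown."""
    for task, names in ENGINES.items():
        if engine in names:
            return task
    raise KeyError(engine)
```

For instance, `task_of("Piper")` returns `Task.TTS`, and `engines_for(Task.MT)` returns a one-element list, reflecting that Bergamot Translator is the only MT engine listed.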

Quick Start & Requirements

  • Installation: Primarily via Flatpak; distribution packages also exist:
    • Flatpak (Base): flatpak install net.mkiol.SpeechNote
    • Flatpak (NVIDIA add-on): flatpak install net.mkiol.SpeechNote.Addon.nvidia
    • Arch Linux (AUR): dsnote or dsnote-git
    • openSUSE: zypper in speechnote
  • Dependencies: Flatpak packages include heavy libraries like CUDA, ROCm, Torch, and Python. GPU acceleration add-ons are available for NVIDIA (recommended) and AMD (not recommended).
  • Resources: Base Flatpak download is 0.9 GiB, unpacking to 3.2 GiB. NVIDIA add-on adds 3.7 GiB download / 6.4 GiB unpacked.
  • Docs: https://github.com/mkiol/dsnote
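
To estimate the disk space needed before installing, the sizes quoted above can be summed. The Python sketch below does only that arithmetic; the `Package` class and `totals` helper are hypothetical names for this example, and the figures exclude any STT, TTS, or MT models downloaded later inside the app.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Package:
    name: str
    download_gib: float
    unpacked_gib: float


# Sizes as quoted in the Resources bullet above.
BASE = Package("net.mkiol.SpeechNote", 0.9, 3.2)
NVIDIA = Package("net.mkiol.SpeechNote.Addon.nvidia", 3.7, 6.4)


def totals(packages):
    """Sum download and unpacked sizes, rounded to one decimal place."""
    download = round(sum(p.download_gib for p in packages), 1)
    unpacked = round(sum(p.unpacked_gib for p in packages), 1)
    return download, unpacked


download, unpacked = totals([BASE, NVIDIA])
print(f"download: {download} GiB, unpacked: {unpacked} GiB")
# prints "download: 4.6 GiB, unpacked: 9.6 GiB"
```

The Base package with the NVIDIA add-on therefore needs roughly 4.6 GiB of download and 9.6 GiB on disk.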

Highlighted Details

  • Supports over 60 languages for STT, TTS, and MT.
  • Offers both "Base" (full features) and "Tiny" (basic features, smaller footprint) Flatpak packages.
  • Extensive model browser for downloading STT, TTS, and MT models directly within the app.
  • Custom model support via editing models.json.
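
Hand-editing a JSON file such as models.json is easy to get wrong, so a backup and a syntax check are worth having. The Python sketch below assumes nothing about the file's schema or location (neither is described here); `backup` and `is_valid_json` are hypothetical helpers, and the path must be supplied by the user.

```python
import json
import shutil
from pathlib import Path


def backup(path: Path) -> Path:
    """Copy the file to '<name>.bak' next to it and return the backup path."""
    target = path.with_name(path.name + ".bak")
    shutil.copy2(path, target)
    return target


def is_valid_json(path: Path) -> bool:
    """Return True if the file parses as JSON; checks syntax only, not schema."""
    try:
        with path.open(encoding="utf-8") as fh:
            json.load(fh)
    except json.JSONDecodeError:
        return False
    return True
```

A typical flow would be to call `backup(path)` before editing and `is_valid_json(path)` afterwards, restoring the backup if the check fails.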

Maintenance & Community

  • Project hosted on GitHub and GitLab.
  • Contributions welcome via PR/MR or issue reporting.
  • Translations managed via Transifex.
  • Support options include starring the repo, writing reviews, and donations via ko-fi or Liberapay.

Licensing & Compatibility

  • Speech Note is licensed under the Mozilla Public License Version 2.0.
  • Dependencies use a mix of MPL 2.0, Apache 2.0, MIT, BSD, LGPL, and GPL licenses. Notably, RHVoice and espeak-ng are GPL, and Mimic 3 is AGPL-3.0, which may have implications for linking in closed-source applications.

Limitations & Caveats

  • Faster Whisper, Coqui TTS, and Mimic 3 models are available only on the x86-64 architecture.
  • The AMD add-on is large, offers limited benefits, and may cause issues with ROCm 6.x.
  • Some experimental models are marked as "likely doesn't work well."
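
The architecture restriction above can be expressed as a simple filter. This is a minimal Python sketch and not the application's own logic: `X86_64_ONLY` lists the three engines named above, `available` is a hypothetical helper, and treating "x86_64" and "amd64" as the x86-64 machine names is an assumption about what `platform.machine()` reports.

```python
import platform

# Engines the summary lists as x86-64 only.
X86_64_ONLY = {"Faster Whisper", "Coqui TTS", "Mimic 3"}

# Assumed spellings of x86-64 as reported by platform.machine().
X86_64_NAMES = {"x86_64", "amd64"}


def is_x86_64(machine: str) -> bool:
    return machine.lower() in X86_64_NAMES


def available(engines, machine=None):
    """Drop x86-64-only engines when the machine is not x86-64."""
    machine = platform.machine() if machine is None else machine
    if is_x86_64(machine):
        return list(engines)
    return [e for e in engines if e not in X86_64_ONLY]
```

For example, `available(["Piper", "Coqui TTS"], "aarch64")` keeps only Piper, while the same call with "x86_64" keeps both.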

Health Check

  • Last commit: 3 weeks ago
  • Responsiveness: 1 day
  • Pull requests (30d): 0
  • Issues (30d): 11
  • Star history: 82 stars in the last 30 days
