jakovius
Voice-typing and dictation software for Linux desktops
Top 98.0% on SourcePulse
Summary
VOXD is a user-friendly, open-source dictation software for Linux, enabling speech-to-text input in any application. It targets Linux users seeking efficient, hands-free typing, offering local, offline voice processing and optional AI-driven text post-processing. The primary benefit is seamless integration of voice input into the desktop environment without relying on cloud services for core transcription.
How It Works
VOXD uses Whisper.cpp for fast, local, offline automatic speech recognition (ASR). It simulates keyboard input via ydotool, so transcribed text appears directly in whichever application has focus, including on Wayland. Optional AI Post-Processing (AIPP) integrates with local (llama.cpp, Ollama) or cloud LLMs to refine transcripts into formats like poems or code. Multiple interfaces (CLI, GUI, tray, and a beta VAD mode) cater to diverse needs.
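The flow described above (transcribe, optionally refine, then type) can be sketched roughly as follows. This is a minimal illustration and not VOXD's actual code. The function names, the `llama3` model name, and the prompt layout are hypothetical, the transcription step is left as an injected callable instead of a real Whisper.cpp binding, and the Ollama endpoint is assumed to be the default local one.

```python
import json
import subprocess
import urllib.request
from typing import Callable, List, Optional

# Assumption: Ollama's default local endpoint; not taken from the VOXD README.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_type_command(text: str) -> List[str]:
    """Return the argv that asks ydotool's `type` subcommand to type `text`."""
    # Caveat: text starting with "-" may be parsed as an option by ydotool.
    return ["ydotool", "type", text]


def build_aipp_request(transcript: str, instruction: str, model: str) -> urllib.request.Request:
    """Build (but do not send) a non-streaming Ollama generate request."""
    payload = {
        "model": model,  # hypothetical model name supplied by the caller
        "prompt": f"{instruction}\n\n{transcript}",
        "stream": False,  # ask for one JSON object instead of a stream
    }
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def post_process(transcript: str, instruction: str, model: str = "llama3") -> str:
    """Send the transcript to a local Ollama server and return the refined text."""
    request = build_aipp_request(transcript, instruction, model)
    with urllib.request.urlopen(request, timeout=60) as response:
        return json.loads(response.read())["response"].strip()


def dictate(
    transcribe: Callable[[], str],
    refine: Optional[Callable[[str], str]] = None,
    run: Callable[..., object] = subprocess.run,
) -> str:
    """Transcribe once, optionally refine the text, then type it via ydotool."""
    text = transcribe().strip()
    if refine is not None:
        text = refine(text)
    if text:  # skip typing when nothing was recognized
        run(build_type_command(text), check=True)
    return text
```

With ydotool set up and an Ollama server running, a call such as `dictate(my_transcriber, refine=lambda t: post_process(t, "Fix punctuation."))` would type the refined transcript into the focused window, where `my_transcriber` stands in for whatever function returns the recognized text.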
Quick Start & Requirements
Install prebuilt packages (.deb, .rpm, .pkg.tar.zst) from GitHub Releases. Alternatively, clone the repo and run ./setup.sh (requires sudo and a system reboot for ydotool on Wayland). pipx installation is also supported.
ydotool is essential for simulated typing on Wayland and requires specific setup and a reboot. No GPU is needed; VOXD runs on older CPUs. Optional: llama.cpp, Ollama, or cloud API keys for AIPP.
Highlighted Details
Simulated typing into the focused application via ydotool.
Beta voice activity detection (VAD) mode (--flux).
Maintenance & Community
No specific details regarding maintainers, community channels (e.g., Discord, Slack), or project roadmap were found in the provided README.
Licensing & Compatibility
VOXD is MIT licensed. The ydotool dependency (simulated typing) is AGPLv3. MIT permits commercial use and closed-source integration. AGPLv3 may impose obligations if ydotool's functionality is integral to a distributed application.
Limitations & Caveats
--flux (VAD) mode is beta.
Simulated typing on Wayland requires ydotool setup and a system reboot.
Local AI post-processing requires llama.cpp or Ollama.
uninstall.sh.
7 months ago
Inactive