jarvis by llm-guy

Local voice-controlled AI assistant

Created 5 months ago
263 stars

Top 97.0% on SourcePulse

View on GitHub
Summary

Jarvis is a voice-activated AI assistant designed for local, private operation, addressing the need for a hands-free, privacy-conscious interface. It utilizes a wake word to initiate command processing via a local LLM (Qwen via Ollama) and LangChain, responding audibly through TTS. The system supports dynamic tool-calling, offering a flexible and offline-capable conversational AI experience.

How It Works

Jarvis continuously listens for the wake word "Jarvis" via microphone input. Upon detection, it enters an active "conversation mode," records the user's spoken command, and processes it with the Qwen 3 (1.7b) LLM, orchestrated through Ollama and LangChain. A key feature is its support for tool-calling, which lets the LLM invoke dynamic functions such as retrieving the current time in a specified city. Responses are synthesized with pyttsx3 text-to-speech, whose voice properties can be customized. After 30 seconds of user inactivity, the assistant automatically falls back to wake-word listening, keeping idle resource usage low. This local-first architecture keeps user data on-device and avoids dependence on external cloud services.
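
This flow can be pieced together from off-the-shelf libraries. The following is a minimal sketch, not the repository's actual code: it assumes speech_recognition for speech-to-text (the summary does not name the repo's STT backend, and the recognize_google call shown here is a network-backed stand-in that a fully offline setup would replace with a local engine), langchain_ollama's ChatOllama for the LLM, and pyttsx3 for speech output. The WAKE_WORD constant and helper names are illustrative.

    import pyttsx3
    import speech_recognition as sr
    from langchain_ollama import ChatOllama

    WAKE_WORD = "jarvis"                  # illustrative constant
    llm = ChatOllama(model="qwen3:1.7b")  # assumes Ollama is serving this model locally
    tts = pyttsx3.init()
    recognizer = sr.Recognizer()

    def speak(text: str) -> None:
        """Speak a reply aloud via pyttsx3."""
        tts.say(text)
        tts.runAndWait()

    def transcribe(source, timeout=None) -> str:
        """Record one utterance and return its lowercased transcript ('' on failure)."""
        try:
            audio = recognizer.listen(source, timeout=timeout, phrase_time_limit=10)
            return recognizer.recognize_google(audio).lower()  # stand-in STT backend
        except (sr.WaitTimeoutError, sr.UnknownValueError, sr.RequestError):
            return ""

    with sr.Microphone() as mic:
        recognizer.adjust_for_ambient_noise(mic)
        while True:
            # Passive phase: wait for the wake word.
            if WAKE_WORD not in transcribe(mic):
                continue
            speak("Yes?")
            # Active phase: fall back to wake-word listening after 30 s of silence.
            command = transcribe(mic, timeout=30)
            if not command:
                continue
            reply = llm.invoke(command)   # plain LLM call; the repo layers tool-calling on top
            speak(reply.content)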

Quick Start & Requirements

  • Install: pip install -r requirements.txt
  • Model Setup: Ensure the qwen3:1.7b model is downloaded and available in your Ollama instance (see the command sequence after this list).
  • Run: Execute the main script using python main.py.
  • Prerequisites: Requires microphone access for voice input and the Ollama service running locally with the specified Qwen model.
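
A typical end-to-end setup, assuming Ollama's standard CLI and the repository's requirements.txt:

    ollama pull qwen3:1.7b           # fetch the model into the local Ollama instance
    pip install -r requirements.txt  # install the Python dependencies
    python main.py                   # start the assistant; say "Jarvis" to activate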

Highlighted Details

  • Voice Activation: Features a dedicated wake word ("Jarvis") for hands-free interaction, allowing seamless activation.
  • Local LLM: Employs the Qwen 3 (1.7b) model via Ollama, ensuring all processing occurs locally for enhanced privacy and offline capability.
  • LangChain Integration: Utilizes LangChain for natural language understanding, prompt management, and dynamic tool-calling (sketched after this list).
  • TTS Responses: Generates audible replies using the pyttsx3 library, providing a natural conversational output.
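
As a concrete illustration of the tool-calling path, here is a hedged sketch using LangChain's @tool decorator and ChatOllama.bind_tools. The current_time function and its city-to-timezone map are hypothetical stand-ins for whatever tools the repository actually registers:

    from datetime import datetime
    from zoneinfo import ZoneInfo

    from langchain_core.tools import tool
    from langchain_ollama import ChatOllama

    # Hypothetical lookup; the repo's tool may resolve cities differently.
    CITY_TZ = {"london": "Europe/London", "tokyo": "Asia/Tokyo", "new york": "America/New_York"}

    @tool
    def current_time(city: str) -> str:
        """Return the current local time in the given city."""
        tz = CITY_TZ.get(city.lower())
        if tz is None:
            return f"Unknown city: {city}"
        return datetime.now(ZoneInfo(tz)).strftime("%H:%M")

    llm = ChatOllama(model="qwen3:1.7b").bind_tools([current_time])
    msg = llm.invoke("What time is it in Tokyo?")
    for call in msg.tool_calls:                   # the model decides whether to call the tool
        print(current_time.invoke(call["args"]))  # e.g. args == {"city": "Tokyo"}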

Maintenance & Community

No information provided in the README.

Licensing & Compatibility

No information provided in the README.

Limitations & Caveats

The system requires a working microphone and a locally running Ollama service with the exact qwen3:1.7b model. LLM inference speed depends directly on the user's hardware. The current implementation is hard-wired to specific models and libraries, so swapping them out will likely require code changes.

Health Check

  • Last Commit: 4 months ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 9 stars in the last 30 days
