GLaDOS by dnhkng

AI-powered personality core for an interactive, embodied assistant

Created 2 years ago

5,262 stars

Top 9.4% on SourcePulse

3 Experts Love This Project

Gabriel Almeida

Cofounder of Langflow

Thomas Wolf

Cofounder of Hugging Face

Michael Han

Cofounder of Unsloth

Project Summary

This project aims to build a physical, interactive AI that embodies GLaDOS from the Portal series. It targets hobbyists and developers interested in embodied AI and robotics. The result is a conversational agent with a physical presence and low-latency voice interaction; vision capabilities are a possible future addition.

How It Works

The system uses a low-latency pipeline. Audio is recorded continuously into a buffer while voice activity detection runs on it. When the speaker stops talking, the buffered speech is transcribed and sent to a local LLM. The LLM output is passed sentence by sentence to a text-to-speech engine, so speech generation and playback overlap, which reduces the delay before the first spoken word. The architecture keeps dependencies minimal to suit constrained hardware and avoids large frameworks such as PyTorch.
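
The sentence-by-sentence hand-off described above can be sketched as follows. This is a minimal illustration, not the project's actual code: the names iter_sentences, fake_tts, and run_pipeline are hypothetical, the regex-based sentence splitter is an assumption, and fake_tts stands in for a real text-to-speech engine. It shows one thread synthesizing sentences as soon as they complete while a second thread plays what is already queued.

```python
import queue
import re
import threading
from typing import Iterable, Iterator

# Assumed sentence boundary: '.', '!' or '?' followed by whitespace.
_SENTENCE_END = re.compile(r"(?<=[.!?])\s+")


def iter_sentences(chunks: Iterable[str]) -> Iterator[str]:
    """Yield complete sentences as soon as they appear in a stream of text chunks."""
    buffer = ""
    for chunk in chunks:
        buffer += chunk
        parts = _SENTENCE_END.split(buffer)
        # Every part except the last is a finished sentence.
        for sentence in parts[:-1]:
            if sentence.strip():
                yield sentence.strip()
        buffer = parts[-1]
    # Flush whatever is left when the stream ends.
    if buffer.strip():
        yield buffer.strip()


def fake_tts(sentence: str) -> str:
    """Hypothetical stand-in for a TTS engine: returns a label instead of audio."""
    return f"<audio:{sentence}>"


def run_pipeline(chunks: Iterable[str]) -> list[str]:
    """Synthesize sentences in one thread while another thread 'plays' them."""
    audio_queue: queue.Queue = queue.Queue()
    played: list[str] = []

    def synthesize() -> None:
        for sentence in iter_sentences(chunks):
            audio_queue.put(fake_tts(sentence))
        audio_queue.put(None)  # Sentinel: no more audio will arrive.

    def play() -> None:
        while (clip := audio_queue.get()) is not None:
            # A real player would write the clip to the sound device here.
            played.append(clip)

    producer = threading.Thread(target=synthesize)
    consumer = threading.Thread(target=play)
    producer.start()
    consumer.start()
    producer.join()
    consumer.join()
    return played


if __name__ == "__main__":
    # Simulated LLM output arriving in arbitrary chunks.
    print(run_pipeline(["Hello", " there. How", " are you? Fine"]))
```

Because playback of the first sentence can start while later sentences are still being generated, the perceived delay depends on the time to the first complete sentence rather than on the length of the full reply.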

Quick Start & Requirements

Installation: Clone the repository, then run python scripts/install.py (or scripts\install.py on Windows).
Prerequisites: Ollama for LLM hosting and Python 3.12. NVIDIA GPUs may need the CUDA drivers and toolkit; other accelerators need a matching ONNX Runtime build. The PortAudio library is required on Linux.
Running: Run uv run glados for the default interface, or uv run glados tui for the text UI.
Resources: Requires an LLM (e.g., llama3.2 via Ollama) and an OpenAI-compatible TTS server. Performance is highly dependent on hardware acceleration.
Docs: https://github.com/dnhkng/GLaDOS
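
Before running the installer, it can help to confirm that the listed prerequisites are present. The script below is a small hypothetical preflight check, not part of the repository: the preflight function name is invented for illustration, it assumes that the uv and ollama executables are expected on PATH, and it assumes that Python 3.12 or newer is acceptable (the listing only names 3.12).

```python
import shutil
import sys


def preflight(required_python=(3, 12), tools=("uv", "ollama")) -> dict[str, bool]:
    """Report whether the interpreter version and required tools look usable."""
    # Assumption: any interpreter at or above required_python is acceptable.
    report = {"python": sys.version_info[:2] >= required_python}
    for tool in tools:
        # shutil.which returns None when the executable is not on PATH.
        report[tool] = shutil.which(tool) is not None
    return report


if __name__ == "__main__":
    for name, ok in preflight().items():
        print(f"{name}: {'ok' if ok else 'missing'}")
```

The check only looks for executables and the interpreter version; it does not verify GPU drivers, ONNX Runtime, or PortAudio.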

Highlighted Details

Aims for sub-600ms response latency.
Supports various local LLMs (via Ollama) and TTS voices (Kokoro).
Experimental support for running on an 8GB SBC (RK3588 NPU).
Future plans include VLM integration for vision and a custom vector DB for memory.

Maintenance & Community

Active development with community support via Discord.
Project sponsorship is available.

Licensing & Compatibility

The README does not explicitly state a license, so the repository's licensing terms are unspecified.

Limitations & Caveats

The project is in active, experimental development, particularly the SBC implementation, and the maintainers do not guarantee support for complex setup issues. Users may encounter segfaults and need significant troubleshooting, especially on non-standard hardware. Without suitable audio hardware or configuration, voice interruption loops can occur.

Health Check

Last Commit

1 month ago

Responsiveness

1 day

Star History

66 stars in the last 30 days