Owl by OwlAIProject

Wearable AI captures life experiences, running locally

Created 2 years ago

628 stars

Top 52.7% on SourcePulse

Project Summary

Owl is a personal, always-on wearable AI system designed for continuous life logging and proactive assistance. It targets individuals interested in memory augmentation, productivity enhancement, and exploring novel human-computer interactions through multimodal data capture and local AI inference. The project aims to provide a transparent and open platform for developing and deploying such systems.

How It Works

Owl operates through a distributed architecture comprising wearable capture devices, an AI server, and presentation clients. Wearable devices (e.g., ESP32, Sony Spresense, Apple Watch) capture audio and location data, with plans for image and video. This data is streamed or chunked to a central AI server. The server processes the data using flexible inference options, supporting local models via Ollama or commercial APIs like GPT-4 and Whisper. It employs VAD-based endpointing for conversation segmentation and utilizes background processing queues for transcription, summarization, and data storage.

Quick Start & Requirements

Server Setup: Instructions provided for macOS, Linux, Windows, and Docker.
Capture Devices: Supports custom ESP platforms, Sony Spresense, and Apple Watch. Reference hardware "Bee" is available for community testing.
Clients: Native iOS and web interfaces are available; Android support is planned.
Dependencies: Python, FastAPI, Ollama (for local models), or API keys for commercial services. Specific hardware may require custom firmware.
Resources: Server hosting and potential GPU for local inference.
Documentation: Setup Guide, Technical Guide

Highlighted Details

Multimodal Capture: Supports audio and location, with image and video planned.
Flexible Inference: Integrates with Ollama for local LLMs/VLMs and commercial APIs (GPT-4, Deepgram).
Wearable Support: Broad device compatibility, including custom hardware like the "Bee" device with 50-hour battery life.
Data Flow Transparency: Detailed "Tour de Source" explains data path from capture to processed output.

Maintenance & Community

Key Contributors: Ethan Sutin, Bart Trzynadlowski.
Community: Primarily via Discord.
Roadmap: Not explicitly detailed, but feature development is ongoing.

Licensing & Compatibility

License: Not explicitly stated in the README. Compatibility for commercial use or closed-source linking is undetermined.

Limitations & Caveats

The project is experimental, with ongoing development for vision/video capture and Android support.
Conversation detection relies on VAD, which is noted as a "naive and unreliable heuristic."
Security recommendations highlight potential data exposure if not hosted securely (e.g., over HTTP).

Health Check

Last Commit

1 year ago

Responsiveness

Inactive

Pull Requests (30d)

0

Issues (30d)

0

Star History

3 stars in the last 30 days

Explore Similar Projects

Starred by

Jason Miller

Jason Miller(Author of Preact).

swama by Trans-N-ai

High-performance LLM inference engine for macOS

Created 8 months ago

Updated 2 weeks ago

sagittarius by gregsadetsky

Web tool for multimodal interaction with GPT-4 and Gemini

Created 2 years ago

Updated 2 years ago

onju-voice by justLV

Hackable AI home assistant platform using Google Nest Mini form factor

Created 2 years ago

Updated 1 year ago

natively-cluely-ai-assistant by evinjohnn

Real-time, privacy-first AI assistant for live conversations

Created 4 weeks ago

Updated 1 day ago

OpenEmbodied by gizwits

AI IoT solution for commercial use

Created 10 months ago

Updated 2 months ago

fay-ue5 by xszyou

UE5 project for digital human integration

Created 2 years ago

Updated 1 year ago

xiaozhi-esp32-server-java by joey-zhou

Java server for ESP32 device management, offering a full-stack solution

Created 1 year ago

Updated 1 day ago

Starred by

Chip Huyen

Chip Huyen(Author of "AI Engineering", "Designing Machine Learning Systems").

Open-LLM-VTuber by Open-LLM-VTuber

Voice-interactive AI companion with Live2D avatar, runs locally

Created 2 years ago

Updated 2 weeks ago

Starred by

Gabriel Almeida

Gabriel Almeida(Cofounder of Langflow),

Lysandre Debut

Lysandre Debut(Chief Open-Source Officer at Hugging Face), and

1 more.

glass by pickle-com

Desktop AI assistant for real-time context understanding

Created 7 months ago

Updated 4 months ago

chatgpt-web-midjourney-proxy by Dooy

One-UI for multimodal AI tasks

Created 2 years ago

Updated 3 weeks ago

Starred by

Jeff Hammerbacher

Jeff Hammerbacher(Cofounder of Cloudera),

Luis Capelo

Luis Capelo(Cofounder of Lightning AI), and

10 more.

MiniCPM-o by OpenBMB

MLLM for vision, speech, and multimodal live streaming on your phone

Created 2 years ago

Updated 2 days ago

Starred by

Chip Huyen

Chip Huyen(Author of "AI Engineering", "Designing Machine Learning Systems").

Fay by xszyou

Open-source MCP framework for digital humans and LLM integration

Created 3 years ago

Updated 3 weeks ago

Feedback? Help us improve.