lucida by claritylab

Speech and vision personal assistant

Created 11 years ago

4,793 stars

Top 10.2% on SourcePulse

Project Summary

Lucida is a speech- and vision-based intelligent personal assistant that integrates backend services such as Automatic Speech Recognition (ASR), Image Matching (IMM), and Question Answering (QA). Users interact with these services through a unified interface, and developers can customize the assistant or extend it with new services. The project targets developers and researchers who want to build or enhance multimodal AI assistants.

How It Works

Lucida employs a modular architecture where backend services communicate via Thrift RPC. A central "command center" (CMD) orchestrates requests, determining the necessary services based on user input and routing queries through a defined service graph. This graph, specified in configuration files, dictates the data flow between services, allowing for complex interactions and dependencies. Services can be written in various languages (C++, Java, Python) and integrated by implementing the Thrift interface and configuring the command center.
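The routing idea can be illustrated with a minimal Python sketch. It is not Lucida's implementation: the graph edges, the `SERVICE_GRAPH` and `HANDLERS` names, and the stand-in handler functions are all hypothetical, and plain function calls stand in for the Thrift RPC calls a real deployment would make.

```python
from typing import Callable, Dict, List

# Hypothetical service graph: each service maps to the services that receive
# its output. The acronyms come from the summary; the edges are invented.
SERVICE_GRAPH: Dict[str, List[str]] = {
    "ASR": ["QA"],  # transcribed speech is forwarded to question answering
    "IMM": ["QA"],  # an image-match label is forwarded to question answering
    "QA": [],       # terminal node
}


# Stand-ins for remote services (assumption: a real setup would call them
# over Thrift RPC rather than in-process).
def fake_asr(payload: str) -> str:
    return f"text({payload})"


def fake_imm(payload: str) -> str:
    return f"label({payload})"


def fake_qa(payload: str) -> str:
    return f"answer({payload})"


HANDLERS: Dict[str, Callable[[str], str]] = {
    "ASR": fake_asr,
    "IMM": fake_imm,
    "QA": fake_qa,
}


def route(start: str, query: str) -> List[str]:
    """Walk the graph from `start`, passing each result to the successors.

    Returns a trace of "service:result" strings in visiting order.
    """
    trace: List[str] = []

    def visit(name: str, payload: str) -> None:
        result = HANDLERS[name](payload)
        trace.append(f"{name}:{result}")
        for successor in SERVICE_GRAPH[name]:
            visit(successor, result)

    visit(start, query)
    return trace


if __name__ == "__main__":
    for step in route("ASR", "audio"):
        print(step)
```

Keeping the graph as data rather than hard-coding the call order reflects the point made above: changing the data flow means editing configuration, not service code.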

Quick Start & Requirements

Local Development: Run make local to install dependencies and compile services. make start_all starts all services, accessible at http://localhost:3000/.
Docker Deployment: Refer to tools/deploy/ for Docker and Kubernetes deployment instructions.
Prerequisites: Ubuntu 16.04 users may need to set LD_LIBRARY_PATH. Individual services may have additional dependencies of their own.

Highlighted Details

Supports adding custom backend services by implementing Thrift interfaces and configuring the command center.
Utilizes Thrift for efficient, language-neutral inter-service communication, supporting both Apache and Facebook Thrift implementations.
The command center uses a query classifier to route requests based on input type (text, image) and content, enabling dynamic service orchestration.
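As a rough illustration of routing by input type, the sketch below picks services from whichever inputs are present. It is a rule-based stand-in, not the project's classifier: the function name `choose_services` and the specific rules are assumptions, and it ignores the content-based part of the classification entirely.

```python
from typing import List, Optional


def choose_services(
    text: Optional[str] = None, image: Optional[bytes] = None
) -> List[str]:
    """Pick backend services from the kinds of input present in a query.

    Hypothetical rules: an image goes to image matching (IMM), text goes to
    question answering (QA), and a query with both visits IMM first.
    """
    has_text = bool(text and text.strip())
    has_image = bool(image)
    if not has_text and not has_image:
        raise ValueError("query must contain text, an image, or both")

    services: List[str] = []
    if has_image:
        services.append("IMM")
    if has_text:
        services.append("QA")
    return services


if __name__ == "__main__":
    print(choose_services(text="What is this building?", image=b"\x89PNG"))
```

A content-aware classifier would replace the two boolean checks with a model or keyword rules, while the return value (an ordered list of service names) could stay the same.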

Maintenance & Community

Contributions are welcome; see the CONTRIBUTING section for details. The project is released under a BSD license, and submodules may carry their own licenses.

Licensing & Compatibility

BSD license. Submodules may have different licenses.

Limitations & Caveats

The REST API for the command center is in active development and subject to change. Adding or removing services requires careful modification of Makefiles and configurations to maintain dependency integrity. The project's complexity in managing dependencies and service graphs might pose a challenge for new contributors.

Health Check

Last Commit

3 years ago

Responsiveness

Inactive

Star History

1 star in the last 30 days