AI device template for voice assistants
This project provides a template for building AI-powered voice assistants, inspired by devices like the Humane AI Pin and Rabbit R1. It targets developers and power users looking to create custom AI experiences with voice input, text-to-speech, image processing, and function calling capabilities, leveraging a variety of leading AI models.
How It Works
The assistant integrates multiple AI services for its core functionality. Voice transcription is handled by OpenAI's Whisper or Groq's Whisper models; text-to-speech uses OpenAI's TTS models; image processing runs through OpenAI's GPT-4 Vision or Fal.ai's Llava-Next; and function calling with dynamic UI rendering is managed by OpenAI's GPT-3.5-Turbo. Configuration is centralized in app/config.tsx, which lets users select the provider and model for each feature and optionally enable rate limiting (via Upstash) and Langchain tracing.
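The template's own code is not shown in the README, but the described pipeline maps roughly onto the official openai Node SDK as in the sketch below. The file names and the show_weather_card tool are hypothetical placeholders used only to illustrate the transcription → function calling → text-to-speech flow, not code from the template.

```typescript
import OpenAI from "openai";
import fs from "fs";

const openai = new OpenAI(); // reads OPENAI_API_KEY from the environment

// 1. Transcribe the recorded audio with Whisper.
const transcription = await openai.audio.transcriptions.create({
  file: fs.createReadStream("recording.webm"), // hypothetical input file
  model: "whisper-1",
});

// 2. Ask GPT-3.5-Turbo for a reply, exposing a tool the UI could render dynamically.
const completion = await openai.chat.completions.create({
  model: "gpt-3.5-turbo",
  messages: [{ role: "user", content: transcription.text }],
  tools: [
    {
      type: "function",
      function: {
        name: "show_weather_card", // illustrative tool, not part of the template
        description: "Render a weather card in the UI",
        parameters: {
          type: "object",
          properties: { city: { type: "string" } },
          required: ["city"],
        },
      },
    },
  ],
});

// 3. Speak the assistant's text reply with OpenAI TTS.
const reply = completion.choices[0].message.content ?? "";
const speech = await openai.audio.speech.create({
  model: "tts-1",
  voice: "alloy",
  input: reply,
});
fs.writeFileSync("reply.mp3", Buffer.from(await speech.arrayBuffer()));
```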
Quick Start & Requirements
Install dependencies with npm install or bun install, then start the development server with npm run dev or bun dev and open http://localhost:3000. API keys for the chosen providers (OpenAI, Groq, Fal.ai, and optionally Upstash and Langchain) need to be configured before the assistant will work.
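Provider and model selection happens in app/config.tsx. The README does not show the file's exact shape, but a configuration along the following lines is implied; the field names below are illustrative, not the template's actual keys.

```typescript
// app/config.tsx -- illustrative sketch only; real option names may differ.
export const config = {
  transcription: { provider: "openai" as "openai" | "groq", model: "whisper-1" },
  textToSpeech: { provider: "openai", model: "tts-1", voice: "alloy" },
  vision: { provider: "openai" as "openai" | "fal", model: "gpt-4-vision-preview" },
  functionCalling: { provider: "openai", model: "gpt-3.5-turbo" },
  rateLimiting: { enabled: false },     // optional, backed by Upstash
  langchainTracing: { enabled: false }, // optional Langchain tracing
};
```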
Maintenance & Community
The project is maintained by the developer behind Developers Digest. Support options include Patreon and Buy Me A Coffee. Links to the developer's website, GitHub, and Twitter are provided for engagement and updates.
Licensing & Compatibility
The repository does not explicitly state a license in the provided README. Users should verify licensing terms before commercial use or integration into closed-source projects.
Limitations & Caveats
Text-to-speech and function calling currently support only OpenAI as a provider. The project is inspired by commercial AI devices but remains a developer template, requiring significant setup and API key management.
The repository was last updated roughly a year ago and appears inactive.