bolna by voxos-ai

Open-source platform for building voice-driven multimodal agents

created 1 year ago
421 stars

Top 71.0% on sourcepulse

Project Summary

Bolna is an end-to-end, open-source framework for building voice-first, multimodal conversational AI agents. It targets developers and researchers who want to build production-ready voice applications quickly, and it supports initiating phone calls, real-time transcription, LLM-driven conversations, and text-to-speech synthesis.

How It Works

Bolna orchestrates a pipeline of specialized components for voice interactions. Each stage is delegated to a provider: telephony (e.g., Twilio), automatic speech recognition, or ASR (e.g., Deepgram), large language models, or LLMs (e.g., OpenAI, Mistral via LiteLLM), and text-to-speech, or TTS (e.g., ElevenLabs, AWS Polly). Agents are configured via JSON, which defines task flows, toolchains (parallel or sequential processing), and provider-specific settings, so components can be swapped without changing the agent's overall structure.
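
The idea of a JSON-configured, sequential toolchain can be sketched as follows. This is a minimal illustration, not Bolna's actual schema or API: the field names (agent_name, toolchain, execution, pipeline, providers) and the stub stage functions are assumptions made up for this example, and the stubs stand in for real provider calls.

```python
import json
from typing import Callable, Dict

# Hypothetical agent configuration. The field names are illustrative only
# and are NOT taken from Bolna's real JSON schema.
AGENT_CONFIG = json.loads("""
{
  "agent_name": "demo-agent",
  "toolchain": {
    "execution": "sequential",
    "pipeline": ["transcriber", "llm", "synthesizer"]
  },
  "providers": {
    "transcriber": "deepgram",
    "llm": "openai",
    "synthesizer": "elevenlabs"
  }
}
""")


# Stub stages standing in for real ASR, LLM, and TTS provider calls.
def transcriber(audio: str) -> str:
    # Pretend to turn audio into text by stripping a fake audio marker.
    return audio.replace("<audio>", "").strip()


def llm(text: str) -> str:
    # Pretend to generate a reply.
    return f"You said: {text}"


def synthesizer(text: str) -> str:
    # Pretend to turn text into speech by wrapping it in a fake marker.
    return f"<speech>{text}</speech>"


STAGES: Dict[str, Callable[[str], str]] = {
    "transcriber": transcriber,
    "llm": llm,
    "synthesizer": synthesizer,
}


def run_turn(config: dict, audio_in: str) -> str:
    """Run one conversational turn through the configured pipeline."""
    toolchain = config["toolchain"]
    if toolchain["execution"] != "sequential":
        raise NotImplementedError("this sketch only covers sequential toolchains")
    data = audio_in
    for stage_name in toolchain["pipeline"]:
        # Each stage consumes the previous stage's output.
        data = STAGES[stage_name](data)
    return data


if __name__ == "__main__":
    print(run_turn(AGENT_CONFIG, "<audio>hello"))
```

Because the stage order lives in the configuration rather than in code, reordering or replacing a stage in this sketch only requires editing the JSON, which is the kind of modularity the description above refers to.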

Quick Start & Requirements

  • Install/Run: Local setup uses Docker. Build images with docker-compose build --no-cache <twilio-app | plivo-app> and run with docker-compose up <twilio-app | plivo-app>.
  • Prerequisites: Requires Docker, a .env file with provider API keys (Twilio/Plivo, Deepgram, LLM provider, TTS provider), and ngrok for tunneling.
  • Resources: The local setup runs four Docker containers: telephony server, Bolna server, ngrok, and Redis.
  • Docs: https://github.com/bolna-ai/bolna
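
Since the local setup depends on a .env file holding several provider keys, a quick pre-flight check can catch missing entries before the containers are built. The sketch below is one way to do that. The key names in REQUIRED_KEYS are placeholders chosen for illustration; the actual variable names expected by Bolna should be taken from its documentation.

```python
from typing import Dict, Iterable, List

# Placeholder key names for illustration only; check the project's
# documentation for the variable names it actually expects.
REQUIRED_KEYS = [
    "TWILIO_ACCOUNT_SID",
    "TWILIO_AUTH_TOKEN",
    "DEEPGRAM_AUTH_TOKEN",
    "OPENAI_API_KEY",
    "ELEVENLABS_API_KEY",
]


def parse_env(text: str) -> Dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    values: Dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(env: Dict[str, str], required: Iterable[str]) -> List[str]:
    """Return required keys that are absent or set to an empty value."""
    return [key for key in required if not env.get(key)]


if __name__ == "__main__":
    try:
        with open(".env", encoding="utf-8") as handle:
            env = parse_env(handle.read())
    except FileNotFoundError:
        env = {}
    missing = missing_keys(env, REQUIRED_KEYS)
    if missing:
        print("Missing or empty keys:", ", ".join(missing))
    else:
        print("All required keys are set.")
```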

Highlighted Details

  • Supports multiple telephony providers (Twilio, Plivo) for initiating calls.
  • Integrates with various ASR, LLM, and TTS providers through a unified interface, powered by LiteLLM for LLMs.
  • Agent behavior and conversation flow are defined declaratively using JSON configurations.
  • Offers extensibility for adding new telephony providers by implementing custom input/output handlers and a dedicated server.
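
To make the last point concrete, the following sketch shows what a pair of input/output handlers for a new telephony provider might look like. The class names (InputHandler, OutputHandler, LoopbackInput, LoopbackOutput) and the relay function are hypothetical and do not reflect Bolna's real base classes; the loopback implementations simply replace a real telephony connection with in-memory data.

```python
from abc import ABC, abstractmethod
from collections import deque
from typing import Callable, Deque, Iterable, List, Optional


class InputHandler(ABC):
    """Hypothetical interface: pulls audio chunks from a telephony source."""

    @abstractmethod
    def receive(self) -> Optional[bytes]:
        """Return the next chunk, or None when the stream has ended."""


class OutputHandler(ABC):
    """Hypothetical interface: pushes audio chunks back to the caller."""

    @abstractmethod
    def send(self, chunk: bytes) -> None:
        """Deliver one chunk to the telephony provider."""


class LoopbackInput(InputHandler):
    """In-memory stand-in for a provider's inbound audio stream."""

    def __init__(self, chunks: Iterable[bytes]) -> None:
        self._queue: Deque[bytes] = deque(chunks)

    def receive(self) -> Optional[bytes]:
        return self._queue.popleft() if self._queue else None


class LoopbackOutput(OutputHandler):
    """In-memory stand-in that records everything it is asked to send."""

    def __init__(self) -> None:
        self.sent: List[bytes] = []

    def send(self, chunk: bytes) -> None:
        self.sent.append(chunk)


def relay(source: InputHandler, sink: OutputHandler,
          transform: Callable[[bytes], bytes]) -> int:
    """Move chunks from source to sink, applying transform to each one."""
    count = 0
    while (chunk := source.receive()) is not None:
        sink.send(transform(chunk))
        count += 1
    return count


if __name__ == "__main__":
    sink = LoopbackOutput()
    handled = relay(LoopbackInput([b"hello", b"world"]), sink, bytes.upper)
    print(handled, sink.sent)
```

Keeping the provider-specific code behind two small interfaces means the rest of the pipeline does not need to know which telephony service is in use.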

Maintenance & Community

  • Community support is available through Discord and the project documentation.
  • Contributions are welcomed via issues and pull requests.

Licensing & Compatibility

  • The repository is described as open-source, but the README does not state a specific license. It does mention managed hosted offerings.

Limitations & Caveats

  • Because the README does not state the license type, the terms for commercial use or closed-source linking are unclear.
  • Local setup requires configuring multiple external service API keys and using ngrok for external access.

Health Check

  • Last commit: 9 months ago
  • Responsiveness: 1 day
  • Pull requests (30d): 0
  • Issues (30d): 0
  • Star history: 4 stars in the last 90 days
