speech-assistant-openai-realtime-api-python by twilio-samples

Speech assistant using Twilio Voice and OpenAI Realtime API

Created 1 year ago

338 stars

Top 81.6% on SourcePulse

Project Summary

This project demonstrates a real-time AI voice assistant for phone calls using Twilio Voice Media Streams and OpenAI's Realtime API. It targets developers building interactive voice applications who want to integrate advanced conversational AI. The primary benefit is enabling natural, two-way voice conversations between callers and an AI assistant.

How It Works

The application holds two WebSocket connections open at the same time: one to Twilio's Media Streams and one to OpenAI's Realtime API. Caller audio received from Twilio is forwarded to OpenAI, which interprets the speech and generates a spoken response. That response audio is sent back through Twilio to the caller as it is produced. Streaming in both directions avoids intermediate storage and batch processing, which keeps latency low.
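The relay between the two connections can be sketched as a pair of message translators. This is a minimal illustration, not the repository's code: the function names are hypothetical, the event names (`input_audio_buffer.append`, `response.audio.delta`) are assumed to match the Realtime API version in use, and both sides are assumed to be configured for the same audio format so the base64 payload can be passed through unchanged.

```python
import json
from typing import Optional


def twilio_to_openai(twilio_message: str) -> Optional[str]:
    """Turn a Twilio Media Streams 'media' message into a Realtime API
    audio-append event. Returns None for non-audio messages."""
    data = json.loads(twilio_message)
    if data.get("event") != "media":
        return None
    # The base64 payload is forwarded as-is (assumes matching audio formats).
    return json.dumps({
        "type": "input_audio_buffer.append",
        "audio": data["media"]["payload"],
    })


def openai_to_twilio(openai_message: str, stream_sid: str) -> Optional[str]:
    """Turn a Realtime API audio delta into a Twilio 'media' message.
    Returns None for events that carry no audio."""
    event = json.loads(openai_message)
    if event.get("type") != "response.audio.delta" or "delta" not in event:
        return None
    return json.dumps({
        "event": "media",
        "streamSid": stream_sid,  # identifies the call's media stream
        "media": {"payload": event["delta"]},
    })
```

In a running server, each function would sit inside its own receive loop, one reading from the Twilio socket and one reading from the OpenAI socket, so audio flows in both directions concurrently.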

Quick Start & Requirements

Install dependencies: pip install -r requirements.txt
Run the application: python main.py
Prerequisites: Python 3.9+, Twilio account and number, OpenAI account and API Key, ngrok for local tunneling.
Setup involves configuring Twilio to point to your ngrok URL and setting the OpenAI API key in a .env file.
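As a rough sketch of the `.env` step, the snippet below parses simple `KEY=VALUE` lines and fails fast when the key is missing. The variable name `OPENAI_API_KEY` and both helper functions are assumptions made for illustration; the project may load its configuration differently (for example, through a dotenv library).

```python
from typing import Dict, Mapping


def parse_dotenv(text: str) -> Dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    values: Dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def require_openai_key(env: Mapping[str, str]) -> str:
    """Return the API key, or raise so a missing key is caught at startup."""
    key = env.get("OPENAI_API_KEY", "").strip()  # assumed variable name
    if not key:
        raise RuntimeError("OPENAI_API_KEY is not set; add it to the .env file")
    return key
```

Checking the key at startup surfaces a configuration mistake immediately, rather than when the first call arrives and the connection to OpenAI is refused.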

Highlighted Details

Real-time, two-way audio streaming between Twilio and OpenAI.
Supports AI preemption (interrupt handling) via input_audio_buffer.speech_started and conversation.item.truncate.
Outbound calling is out of scope for this sample, but the README links to a separate demo.
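The interruption handling mentioned above can be sketched as follows. When a `input_audio_buffer.speech_started` event arrives while the assistant is still speaking, the server can tell OpenAI how much of the response was actually played and ask Twilio to drop any audio it has buffered. The function name, its parameters, and the timestamp bookkeeping are hypothetical; the `conversation.item.truncate` fields and Twilio's `clear` message are assumed to match the API versions in use.

```python
from typing import Any, Dict, Optional, Tuple


def build_interrupt_messages(
    latest_media_ts_ms: int,
    response_start_ts_ms: Optional[int],
    last_assistant_item_id: Optional[str],
    stream_sid: str,
) -> Optional[Tuple[Dict[str, Any], Dict[str, Any]]]:
    """Build the messages to send when the caller starts talking over the
    assistant. Returns None if no assistant response is currently playing."""
    if response_start_ts_ms is None or last_assistant_item_id is None:
        return None

    # How much of the assistant's audio the caller has heard so far.
    elapsed_ms = max(0, latest_media_ts_ms - response_start_ts_ms)

    # Sent to OpenAI: cut the assistant's item at the point of interruption.
    truncate_event = {
        "type": "conversation.item.truncate",
        "item_id": last_assistant_item_id,
        "content_index": 0,
        "audio_end_ms": elapsed_ms,
    }
    # Sent to Twilio: discard audio that is queued but not yet played.
    clear_event = {"event": "clear", "streamSid": stream_sid}
    return truncate_event, clear_event
```

Truncating the item keeps the model's view of the conversation consistent with what the caller actually heard, so its next reply does not assume the interrupted sentence was finished.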

Maintenance & Community

No specific contributors, sponsorships, or community links (Discord/Slack) are mentioned in the README.

Licensing & Compatibility

The README does not explicitly state a license. Compatibility for commercial use or closed-source linking is not specified.

Limitations & Caveats

The project handles inbound calls only; outbound calling is not directly supported. Local development relies on ngrok to expose the server to Twilio, so a production deployment needs a publicly reachable host and additional configuration that the sample does not cover.
