openai-voice-agent-sdk-sample by openai

Conversational AI voice agent SDK sample

Created 1 year ago

255 stars

Top 98.8% on SourcePulse

Project Summary

This repository is a sample application showing how to add voice capabilities to an application with the OpenAI Agents SDK. It is aimed at developers who want a customizable starting point for building voice-enabled conversational assistants.

How It Works

The project is split into two parts. A Python backend built on FastAPI exposes a WebSocket endpoint for real-time, bidirectional communication. A Next.js frontend connects to that WebSocket server, sends user input, and renders the AI's output. Keeping one connection open in both directions lets the backend stream responses as they are produced, and the split lets developers customize the agent logic and the user-facing interface independently.
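The bidirectional streaming flow described above can be sketched without any web framework. The following is a minimal illustration, not the repository's code: the two asyncio queues stand in for the receive and send sides of a WebSocket connection, and the message shapes (`user_message`, `delta`, `done`), the `None` disconnect sentinel, and the `fake_agent_reply` helper are all hypothetical names chosen for this example.

```python
import asyncio
import json
from typing import AsyncIterator


async def fake_agent_reply(text: str) -> AsyncIterator[str]:
    """Hypothetical stand-in for the agent: yields the reply word by word."""
    for word in f"You said: {text}".split():
        await asyncio.sleep(0)  # hand control back to the event loop
        yield word


async def handle_connection(inbox: asyncio.Queue, outbox: asyncio.Queue) -> None:
    """Read client messages from inbox and stream reply events to outbox."""
    while True:
        raw = await inbox.get()
        if raw is None:  # assumed sentinel meaning "client disconnected"
            break
        message = json.loads(raw)
        # Forward each chunk as soon as it is produced instead of buffering.
        async for chunk in fake_agent_reply(message["text"]):
            await outbox.put(json.dumps({"type": "delta", "text": chunk}))
        await outbox.put(json.dumps({"type": "done"}))


async def demo() -> list:
    """Simulate one client turn and collect the events the server emitted."""
    inbox: asyncio.Queue = asyncio.Queue()
    outbox: asyncio.Queue = asyncio.Queue()
    await inbox.put(json.dumps({"type": "user_message", "text": "hello"}))
    await inbox.put(None)
    await handle_connection(inbox, outbox)
    events = []
    while not outbox.empty():
        events.append(json.loads(outbox.get_nowait()))
    return events


if __name__ == "__main__":
    for event in asyncio.run(demo()):
        print(event)
```

In a real server the same loop would sit inside the WebSocket endpoint handler, with reads and writes on the socket replacing the two queues.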

Quick Start & Requirements

Install/Run: Execute make sync to install dependencies, followed by make serve to run the application. The app will be accessible at http://localhost:3000.
Prerequisites: An OpenAI API key is required. Additionally, Node.js, npm, and the uv package manager must be installed on your system.
Setup: Set your OpenAI API key either globally via the OPENAI_API_KEY environment variable or locally in a .env file at the project root.
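The two ways of supplying the key can be illustrated with a small standard-library sketch. It is an assumption-laden example rather than the sample's actual loading code: the helper names `parse_dotenv` and `resolve_api_key` are hypothetical, the parser handles only simple KEY=VALUE lines, and giving the environment variable precedence over the .env file is a choice made here, not something the README states.

```python
import os
from typing import Mapping, Optional


def parse_dotenv(text: str) -> dict:
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and one layer of optional quotes.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def resolve_api_key(
    environ: Mapping[str, str] = os.environ,
    dotenv_text: str = "",
) -> Optional[str]:
    """Return the key from the environment, else from .env contents, else None."""
    # Assumption: the environment variable wins over the .env file.
    return environ.get("OPENAI_API_KEY") or parse_dotenv(dotenv_text).get(
        "OPENAI_API_KEY"
    )
```

To use it against a real file, read the .env file at the project root into a string (if it exists) and pass that string as `dotenv_text`.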

Highlighted Details

Robust multi-turn conversation handling for natural dialogue flow.
Integrated push-to-talk audio mode for intuitive voice input.
Support for function calling, allowing the AI to execute external tools or actions.
Real-time streaming of responses and tool calls, enhancing user experience with immediate feedback.
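Function calling, as listed above, comes down to the model naming a tool and supplying JSON arguments, which the application then executes. The sketch below shows that dispatch step in plain Python. It does not use the OpenAI Agents SDK's own tool mechanism; the `TOOLS` registry, the `tool` decorator, the `get_weather` function, and the shape of the `call` dictionary are all hypothetical and exist only for illustration.

```python
import json
from typing import Callable, Dict

# Hypothetical registry mapping tool names to Python callables.
TOOLS: Dict[str, Callable[..., object]] = {}


def tool(fn: Callable[..., object]) -> Callable[..., object]:
    """Register a function under its own name so it can be looked up later."""
    TOOLS[fn.__name__] = fn
    return fn


@tool
def get_weather(city: str) -> str:
    """Example tool with a hard-coded answer; a real tool would call an API."""
    return f"Sunny in {city}"


def run_tool_call(call: dict) -> dict:
    """Execute one tool call of the assumed form {"name": ..., "arguments": "<json>"}."""
    name = call["name"]
    if name not in TOOLS:
        return {"name": name, "error": f"unknown tool: {name}"}
    args = json.loads(call["arguments"])  # arguments arrive as a JSON string
    return {"name": name, "output": TOOLS[name](**args)}


if __name__ == "__main__":
    print(run_tool_call({"name": "get_weather", "arguments": '{"city": "Paris"}'}))
```

Returning an error record for an unknown tool, rather than raising, keeps the conversation loop running so the failure can be reported back to the model or the user.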

Maintenance & Community

Contributions via issues or pull requests are welcomed, though not all suggestions are guaranteed to be reviewed. No specific community channels or roadmap links are provided in the README.

Licensing & Compatibility

This project is licensed under the MIT License, which is permissive for commercial use and integration into closed-source projects.

Limitations & Caveats

The sample application is intended as a starting point rather than a finished product. Contributions are not guaranteed a review, and no support is offered beyond the provided code.

Explore Similar Projects

typeflux by mylxsw

skills by elevenlabs

voice-assistant-whisper-chatgpt by bhattbhavesh91

Patter by PatterAI

claude-phone by theNetworkChuck

pipecat-examples by pipecat-ai

ElatoAI by akdeb

speech-assistant-openai-realtime-api-node by twilio-samples

friday-tony-stark-demo by SAGAR-TAMANG

elevenlabs-python by elevenlabs

Android-MVVM-Architecture-Android-Voice-AI-SDK by ahmedeltaher

dograh by dograh-hq