openai-voice-agent-sdk-sample  by openai

Conversational AI voice agent SDK sample

Created 11 months ago
251 stars

Top 99.9% on SourcePulse

Project Summary

This repository is a sample application that shows how to add voice capabilities to an application using the OpenAI Agents SDK. It is aimed at developers who want a customizable starting point for building voice-enabled conversational assistants.

How It Works

The project has two parts. A Python backend built on FastAPI exposes a WebSocket endpoint for real-time, bidirectional communication. A Next.js frontend connects to that WebSocket server, handles user input, and renders the AI's output. This split suits streaming data and interactive conversational flows, and it lets developers customize both the AI's reasoning capabilities and the user-facing interface.
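
The request/stream loop described above can be sketched as follows. This is a minimal illustration using only the standard library, not the sample's actual code: FakeSocket, stream_reply, serve_connection, and the message shapes ("delta", "done") are all hypothetical. The receive_json and send_json method names were chosen to match those on FastAPI's WebSocket object, so the same loop could in principle sit inside a WebSocket route.

```python
import asyncio
import json


class FakeSocket:
    """Stand-in for a WebSocket connection (hypothetical, for illustration only)."""

    def __init__(self, incoming):
        self._incoming = list(incoming)
        self.sent = []

    async def receive_json(self):
        if not self._incoming:
            raise ConnectionError("client disconnected")
        return self._incoming.pop(0)

    async def send_json(self, payload):
        self.sent.append(payload)


async def stream_reply(text):
    """Yield a reply word by word to imitate a streaming model response."""
    for word in text.split():
        await asyncio.sleep(0)  # hand control back to the event loop between chunks
        yield word


async def serve_connection(socket):
    """Read client messages until disconnect and stream a reply for each one."""
    try:
        while True:
            message = await socket.receive_json()
            async for chunk in stream_reply(f"You said: {message['text']}"):
                await socket.send_json({"type": "delta", "text": chunk})
            await socket.send_json({"type": "done"})
    except ConnectionError:
        pass  # the client went away; end the session quietly


socket = FakeSocket([{"type": "user_message", "text": "hello there"}])
asyncio.run(serve_connection(socket))
print(json.dumps(socket.sent))
```

Sending each chunk as its own message, followed by an explicit end-of-turn marker, is one way to let a frontend render partial output as it arrives.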

Quick Start & Requirements

  • Install/Run: Run make sync to install dependencies, then make serve to start the application. The app is then accessible at http://localhost:3000.
  • Prerequisites: An OpenAI API key is required. Additionally, Node.js, npm, and the uv package manager must be installed on your system.
  • Setup: Set your OpenAI API key either globally via the OPENAI_API_KEY environment variable or locally in a .env file at the project root.
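
The two ways of supplying the key can be sketched as a small lookup. This is an illustration under stated assumptions, not how the sample itself loads configuration: read_dotenv and resolve_api_key are hypothetical helpers, the .env parser handles only simple KEY=VALUE lines, and giving the environment variable precedence over the .env file is an assumption.

```python
import os
from pathlib import Path


def read_dotenv(path):
    """Parse simple KEY=VALUE lines, ignoring blank lines and # comments."""
    values = {}
    file = Path(path)
    if not file.is_file():
        return values
    for line in file.read_text().splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip("'\"")
    return values


def resolve_api_key(env=None, dotenv_path=".env"):
    """Return the key from the environment, else from the .env file (assumed order)."""
    env = os.environ if env is None else env
    key = env.get("OPENAI_API_KEY") or read_dotenv(dotenv_path).get("OPENAI_API_KEY")
    if not key:
        raise RuntimeError("OPENAI_API_KEY is not set in the environment or in .env")
    return key
```

Calling resolve_api_key() at startup would fail early with a clear message if neither source provides a key.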

Highlighted Details

  • Robust multi-turn conversation handling for natural dialogue flow.
  • Integrated push-to-talk audio mode for intuitive voice input.
  • Support for function calling, allowing the AI to execute external tools or actions.
  • Real-time streaming of responses and tool calls, enhancing user experience with immediate feedback.
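
To make the function-calling item concrete, here is a minimal sketch of dispatching a tool call by name. It is a generic illustration, not the Agents SDK's API (the SDK has its own way of declaring tools, which is not shown here): TOOLS, tool, get_weather, run_tool_call, and the result shapes are all hypothetical.

```python
import json

TOOLS = {}


def tool(func):
    """Register a function so it can be looked up by its name."""
    TOOLS[func.__name__] = func
    return func


@tool
def get_weather(city: str) -> str:
    # Hypothetical tool with a canned answer; a real one would call an API.
    return f"It is sunny in {city}."


def run_tool_call(name, arguments_json):
    """Run the named tool with JSON-encoded keyword arguments."""
    if name not in TOOLS:
        return {"type": "tool_error", "name": name, "error": "unknown tool"}
    try:
        result = TOOLS[name](**json.loads(arguments_json))
    except (TypeError, ValueError) as exc:
        # Covers malformed JSON and arguments that do not match the signature.
        return {"type": "tool_error", "name": name, "error": str(exc)}
    return {"type": "tool_result", "name": name, "output": result}


print(run_tool_call("get_weather", '{"city": "Paris"}'))
```

Returning errors as data rather than raising keeps the conversation going when a tool call is malformed, which matters when results are streamed to a user interface.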

Maintenance & Community

Contributions via issues or pull requests are welcomed, though not all suggestions are guaranteed to be reviewed. No specific community channels or roadmap links are provided in the README.

Licensing & Compatibility

This project is licensed under the MIT License, which is permissive for commercial use and integration into closed-source projects.

Limitations & Caveats

The sample application is intended as a foundational starting point. Contributions may not always receive a review, and the project does not offer extensive support beyond the provided code.

Health Check

  • Last Commit: 10 months ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 3 stars in the last 30 days
