Node.js framework for building realtime multimodal AI agents
This project provides a Node.js framework for building real-time, multimodal AI agents, such as conversational voice agents that process audio and text input. It targets developers who want to integrate AI capabilities into real-time communication platforms, and it is built as a framework for server-side participants.
How It Works
The framework uses a worker-based architecture in which server-side processes (workers) manage and orchestrate AI agents. Each agent is defined as a function that composes plugins for specific tasks such as Speech-to-Text (STT), Large Language Model (LLM) inference, and Text-to-Speech (TTS). This modular design allows flexible integration with different AI service providers. A new phrase endpointing model is also available for improved turn detection.
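The composition described above can be sketched roughly as follows. This is a self-contained illustration, not the @livekit/agents API: `fakeStt`, `fakeLlm`, `fakeTts`, `composeAgent`, and `runWorker` are hypothetical names, and the stubs return canned values where a real plugin would call a provider.

```javascript
// Illustrative sketch only: these stubs stand in for real STT/LLM/TTS plugins
// and do not use the @livekit/agents API. All names here are hypothetical.

// Stub STT plugin: turns an "audio" object into text.
const fakeStt = {
  async transcribe(audio) {
    return audio.transcript; // a real plugin would call a speech service
  },
};

// Stub LLM plugin: produces a reply for the given text.
const fakeLlm = {
  async complete(prompt) {
    return `You said: ${prompt}`;
  },
};

// Stub TTS plugin: wraps the reply text as if it were synthesized audio.
const fakeTts = {
  async synthesize(text) {
    return { format: 'pcm', text };
  },
};

// An agent is a function that composes whichever plugins it is given.
function composeAgent({ stt, llm, tts }) {
  return async function handleTurn(audio) {
    const userText = await stt.transcribe(audio);
    const replyText = await llm.complete(userText);
    return tts.synthesize(replyText);
  };
}

// A minimal "worker": hands each incoming job to the agent function in turn.
async function runWorker(agent, jobs) {
  const results = [];
  for (const job of jobs) {
    results.push(await agent(job));
  }
  return results;
}
```

Because the agent only depends on the shape of each plugin (`transcribe`, `complete`, `synthesize` in this sketch), swapping providers means passing a different object to `composeAgent`, which is the kind of flexibility the modular design is aiming for.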
Quick Start & Requirements
Install the core package: pnpm install @livekit/agents
Install a provider plugin (example): pnpm install @livekit/agents-plugin-openai
Set the environment variables LIVEKIT_URL, LIVEKIT_API_KEY, LIVEKIT_API_SECRET, and provider-specific keys (e.g., OPENAI_API_KEY).
Start the agent: node my_agent.js start
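Since the agent depends on several environment variables, it can help to check them before starting. The following is a minimal sketch under that assumption; `REQUIRED_ENV` and `missingEnv` are hypothetical names and not part of the SDK, and OPENAI_API_KEY is only relevant when the OpenAI plugin is used.

```javascript
// Variable names come from the quick start above. OPENAI_API_KEY is an
// example of a provider-specific key; adjust the list for other plugins.
const REQUIRED_ENV = [
  'LIVEKIT_URL',
  'LIVEKIT_API_KEY',
  'LIVEKIT_API_SECRET',
  'OPENAI_API_KEY',
];

// Hypothetical helper: returns the names that are unset or empty in `env`.
function missingEnv(required, env) {
  return required.filter((name) => !env[name]);
}

// Hypothetical helper: logs any missing names and reports whether all are set.
function checkEnv(env = process.env) {
  const missing = missingEnv(REQUIRED_ENV, env);
  if (missing.length > 0) {
    console.error(`Missing environment variables: ${missing.join(', ')}`);
    return false;
  }
  return true;
}
```

One way to use this would be to call `checkEnv()` at the top of my_agent.js and exit early when it returns false, so that a missing key is reported before the worker tries to connect.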
Highlighted Details
Maintenance & Community
Licensing & Compatibility
Limitations & Caveats
This SDK is in beta, so APIs may change and bugs may be present. The Python version of the framework is considered more mature and is recommended for production use.