gemini-2-live-api-demo by ViaAnthroposBenevolentia

Vanilla JS web interface for Gemini 2.0 multimodal API

Created 1 year ago

387 stars

Top 74.1% on SourcePulse


1 Expert Loves This Project

Paige Bailey

DevRel Lead at Google DeepMind

Project Summary

This project offers a vanilla JavaScript web interface for interacting with the Gemini 2.0 Flash Multimodal API, enabling real-time text, audio, video, and screen sharing inputs, along with audio responses and function calling. It's designed for developers and users who want a lightweight, dependency-free client for exploring Gemini's advanced multimodal capabilities.

How It Works

The client uses modern browser APIs (WebRTC, WebSockets, and Web Audio) to communicate with the Gemini API in real time. Audio capture and playback, video streaming, and screen sharing are all handled directly in the browser, which keeps server-side complexity to a minimum: a static file server is all that is needed. Because it is written in vanilla JavaScript with no external libraries, the client has a small footprint and runs in any modern browser.
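The browser-side flow described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the helper names are hypothetical, and the endpoint URL, model name, and message field names (setup, realtime_input, media_chunks) are assumptions based on the publicly documented Gemini Live WebSocket protocol, so they should be checked against the current API reference. The sketch shows how Float32 samples from Web Audio might be converted to 16-bit PCM and wrapped in a JSON message for the socket.

```javascript
// Assumed endpoint for the Gemini Live WebSocket API (verify against current docs).
const LIVE_ENDPOINT =
  'wss://generativelanguage.googleapis.com/ws/' +
  'google.ai.generativelanguage.v1alpha.GenerativeService.BidiGenerateContent';

// Hypothetical helper: append the AI Studio API key as a query parameter.
function buildLiveUrl(apiKey) {
  return `${LIVE_ENDPOINT}?key=${encodeURIComponent(apiKey)}`;
}

// Hypothetical helper: first message sent after the socket opens.
// The model name and field names are assumptions.
function buildSetupMessage(model = 'models/gemini-2.0-flash-exp') {
  return {
    setup: { model, generation_config: { response_modalities: ['AUDIO'] } },
  };
}

// Convert Web Audio samples (Float32, range -1..1) to little-endian 16-bit PCM bytes.
function floatTo16BitPCM(samples) {
  const view = new DataView(new ArrayBuffer(samples.length * 2));
  for (let i = 0; i < samples.length; i++) {
    const s = Math.max(-1, Math.min(1, samples[i])); // clamp to the valid range
    view.setInt16(i * 2, s < 0 ? s * 0x8000 : s * 0x7fff, true);
  }
  return new Uint8Array(view.buffer);
}

// Base64-encode raw bytes; works in browsers (btoa) and falls back to Buffer in Node.
function toBase64(bytes) {
  let binary = '';
  for (const b of bytes) binary += String.fromCharCode(b);
  return typeof btoa === 'function'
    ? btoa(binary)
    : Buffer.from(binary, 'binary').toString('base64');
}

// Hypothetical helper: wrap one chunk of microphone audio in a realtime input message.
function buildAudioChunkMessage(samples, sampleRate = 16000) {
  return {
    realtime_input: {
      media_chunks: [
        {
          mime_type: `audio/pcm;rate=${sampleRate}`,
          data: toBase64(floatTo16BitPCM(samples)),
        },
      ],
    },
  };
}

// Browser-only usage (not executed here):
// const ws = new WebSocket(buildLiveUrl(apiKey));
// ws.onopen = () => ws.send(JSON.stringify(buildSetupMessage()));
// ws.send(JSON.stringify(buildAudioChunkMessage(float32Chunk)));
```

Keeping the message builders as pure functions, separate from the WebSocket itself, makes them easy to test outside the browser.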

Quick Start & Requirements

Install/Run: Serve index.html using a local HTTP server (e.g., python -m http.server 8000 or npx http-server -p 8000).
Prerequisites: Modern web browser, Google AI Studio API key. Optional: Deepgram API key for transcription.
Demo: A live demo is hosted on GitHub Pages.

Highlighted Details

Full multimodal input: text, audio (with interruption support), webcam video, and screen sharing.
Real-time audio responses and optional real-time transcription via Deepgram.
Function calling support for Gemini 2.0 Flash.
Built entirely with vanilla JavaScript, requiring no external libraries.
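To illustrate the function calling feature listed above, here is a minimal sketch of how a client might route a tool call from the model to a local JavaScript function and build the reply. The tool registry, the get_weather function, and handleToolCall are hypothetical names invented for this example. The message shapes (functionCalls in the incoming call, tool_response and function_responses in the reply) are assumptions about the Live API protocol and may not match what the project actually sends.

```javascript
// Hypothetical registry of local functions the model is allowed to call.
const tools = {
  // Illustrative stub; a real tool would fetch live data.
  get_weather: ({ city }) => ({ city, forecast: 'sunny' }),
};

// Run every requested function and build a single reply message.
// Field names are assumed, not taken from the project.
function handleToolCall(toolCall, registry = tools) {
  const function_responses = toolCall.functionCalls.map(({ id, name, args }) => {
    // Only accept functions defined directly on the registry object.
    const known = Object.prototype.hasOwnProperty.call(registry, name);
    const response = known
      ? { result: registry[name](args || {}) }
      : { error: `Unknown function: ${name}` };
    return { id, name, response };
  });
  return { tool_response: { function_responses } };
}

// Example: a tool call as it might arrive from the server.
const reply = handleToolCall({
  functionCalls: [{ id: 'call-1', name: 'get_weather', args: { city: 'Paris' } }],
});
console.log(JSON.stringify(reply));
```

Returning an error object for unknown names, rather than throwing, lets the model see the failure and recover within the same conversation.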

Maintenance & Community

Contributions are welcome via issues and pull requests.

Licensing & Compatibility

License: MIT.
Compatibility: Suitable for commercial use and integration into closed-source projects due to the permissive MIT license.

Limitations & Caveats

The project is a simplified client and may not expose all advanced features or configurations of the Gemini API. Deepgram integration for transcription is optional and requires a separate API key.

Health Check

Last Commit: 10 months ago
Responsiveness: Inactive

Star History

1 star in the last 30 days