audioflare by seanoliver

AI audio playground using Cloudflare AI Workers

created 1 year ago
442 stars

Top 68.9% on sourcepulse

Project Summary

Audioflare is an AI-powered audio processing playground for developers and researchers exploring Cloudflare's AI Workers. It offers a unified interface that transcribes an audio file, then summarizes, sentiment-scores, and translates the resulting text, demonstrating a practical multi-step AI workflow within the Cloudflare ecosystem.

How It Works

The project orchestrates a series of Cloudflare AI Workers to process audio. It begins with speech-to-text transcription using OpenAI's Whisper model (run as a Cloudflare worker), then summarizes the transcript with Meta's Llama 2 model, scores its sentiment with Hugging Face's DistilBERT, and translates it into nine languages with Meta's m2m100 model. Cloudflare's AI Gateway sits in front of these worker calls, providing analytics, logging, caching, and rate limiting.
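To make the chain concrete, here is a minimal TypeScript sketch of the same four steps against Cloudflare's Workers AI REST API. It is not code from the repository: the model IDs and payload shapes follow Cloudflare's published catalog as best understood, ACCOUNT_ID and API_TOKEN are placeholders, and error handling is kept to a bare minimum.

```typescript
// Hypothetical sketch of the audioflare pipeline via the Workers AI REST API.
// ACCOUNT_ID and API_TOKEN are placeholders; model IDs follow Cloudflare's catalog.
const ACCOUNT_ID = "<cloudflare-account-id>";
const API_TOKEN = "<cloudflare-api-token>";
const BASE = `https://api.cloudflare.com/client/v4/accounts/${ACCOUNT_ID}/ai/run`;
const AUTH = { Authorization: `Bearer ${API_TOKEN}` };

async function run<T>(model: string, body: BodyInit, isJson = true): Promise<T> {
  const res = await fetch(`${BASE}/${model}`, {
    method: "POST",
    headers: isJson ? { ...AUTH, "Content-Type": "application/json" } : AUTH,
    body,
  });
  const data = await res.json();
  if (!data.success) throw new Error(JSON.stringify(data.errors));
  return data.result as T;
}

async function processAudio(audio: ArrayBuffer) {
  // 1. Speech-to-text: Whisper takes the raw audio bytes as the request body.
  const { text } = await run<{ text: string }>("@cf/openai/whisper", audio, false);

  // 2. Summarize the transcript with Llama 2.
  const { response: summary } = await run<{ response: string }>(
    "@cf/meta/llama-2-7b-chat-int8",
    JSON.stringify({
      messages: [
        { role: "system", content: "Summarize the user's text in one sentence." },
        { role: "user", content: text },
      ],
    }),
  );

  // 3. Sentiment analysis with DistilBERT (returns label/score pairs).
  const sentiment = await run<{ label: string; score: number }[]>(
    "@cf/huggingface/distilbert-sst-2-int8",
    JSON.stringify({ text }),
  );

  // 4. Translation with m2m100; audioflare repeats this for each target language.
  const { translated_text } = await run<{ translated_text: string }>(
    "@cf/meta/m2m100-1.2b",
    JSON.stringify({ text: summary, source_lang: "en", target_lang: "fr" }),
  );

  return { text, summary, sentiment, translated_text };
}
```

The same calls could also be made from inside a Cloudflare Worker via the Workers AI binding rather than the REST API; the REST form is used here only to keep the sketch self-contained.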

Quick Start & Requirements

Highlighted Details

  • Integrates Cloudflare's Speech to Text, LLM, Text Classification, and Translation AI workers.
  • Demonstrates AI Gateway for observability, caching, and rate limiting (see the sketch after this list).
  • Supports drag-and-drop for local audio files and includes sample files.
  • Calculates and displays processing time for each AI task.

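As a rough illustration of the AI Gateway and processing-time bullets above: pointing a Workers AI request at an AI Gateway base URL, instead of the direct API endpoint, is what gives Cloudflare a hook for analytics, logging, caching, and rate limiting, and wrapping the call in a timer yields a per-task duration like the one the UI displays. The gateway URL shape follows Cloudflare's documented pattern, but GATEWAY_ID and the timedRun helper are illustrative, not code from the repo.

```typescript
// Illustrative only: send a Workers AI request through an AI Gateway so the call
// appears in the gateway's analytics/logs and can be cached or rate limited.
// ACCOUNT_ID and GATEWAY_ID are placeholders for values from the Cloudflare dashboard.
const ACCOUNT_ID = "<cloudflare-account-id>";
const GATEWAY_ID = "<ai-gateway-id>";
const GATEWAY_BASE = `https://gateway.ai.cloudflare.com/v1/${ACCOUNT_ID}/${GATEWAY_ID}/workers-ai`;

async function timedRun(model: string, body: BodyInit, headers: HeadersInit) {
  const started = performance.now();
  const res = await fetch(`${GATEWAY_BASE}/${model}`, { method: "POST", headers, body });
  const result = await res.json();
  // Wall-clock duration, analogous to the per-task processing time shown in the UI.
  const elapsedMs = Math.round(performance.now() - started);
  return { result, elapsedMs };
}
```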
Maintenance & Community

Audioflare is a side project by Sean Oliver. Contributions are welcome via pull requests and issues.

Licensing & Compatibility

Distributed under the MIT License. This license permits commercial use and integration with closed-source projects.

Limitations & Caveats

Audio transcription is limited to the first 30 seconds of an uploaded file. The Llama 2 summarization model may struggle with lengthy prompts, and Cloudflare notes that its AI models are still in beta.

Health Check

  • Last commit: 1 year ago
  • Responsiveness: 1 day
  • Pull Requests (30d): 0
  • Issues (30d): 0
Star History
7 stars in the last 90 days

