Collection of demos for multimodal AI applications
Top 84.2% on sourcepulse
This repository serves as a collection of code examples and demonstrations for building advanced AI applications, primarily focusing on multi-agent systems and real-time multimodal interactions. It targets developers and researchers interested in leveraging large language models (LLMs) like Gemini and Llama for complex tasks, offering practical implementations for voice chat, screen sharing, document analysis, and custom agent workflows.
How It Works
The project showcases various frameworks and techniques for orchestrating multiple AI agents, including AutoGen, CrewAI, and Swarm. It demonstrates real-time data processing for multimodal inputs (voice, camera, screen) and outputs, often integrating with LLM APIs (Google Gemini, Groq, OpenRouter) and local models (Gemma, Llama). Key architectural patterns include Retrieval Augmented Generation (RAG) for document interaction and function calling for agent capabilities, enabling sophisticated AI-driven applications.
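The function-calling pattern mentioned above can be sketched as a minimal dispatch loop. This is an illustrative example only, not code from the repository: the tool name `get_weather`, its schema, and the simulated model output are all hypothetical stand-ins for what an LLM API (such as Gemini's tool-calling interface) would emit.

```python
import json

# Hypothetical tool the model may request (illustrative, not from the repo).
def get_weather(city: str) -> str:
    return f"Sunny in {city}"

# Registry mapping tool names the LLM knows about to Python callables.
TOOLS = {"get_weather": get_weather}

def dispatch(tool_call: dict) -> str:
    """Route a model-emitted tool call to the matching local function."""
    fn = TOOLS[tool_call["name"]]
    args = json.loads(tool_call["arguments"])  # arguments arrive as a JSON string
    return fn(**args)

# Simulated model output: in a real demo this dict would come from the
# LLM API's function-calling response, not be hand-written.
call = {"name": "get_weather", "arguments": json.dumps({"city": "Paris"})}
print(dispatch(call))  # Sunny in Paris
```

Each framework (AutoGen, CrewAI, Swarm) wraps this dispatch step differently, but the core idea is the same: the model emits a structured call, the host program executes it, and the result is fed back into the conversation.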
Quick Start & Requirements
Dependencies are installed with `pip`; specific setup varies per demo, so consult each demo's own instructions.
Highlighted Details
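A typical per-demo setup might look like the following sketch. The file name `requirements.txt` and the `GOOGLE_API_KEY` variable are assumptions for illustration; each demo may use a different requirements file or API provider (Groq, OpenRouter, etc.).

```shell
# Illustrative setup only -- check each demo's own instructions.
python -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt      # per-demo requirements file (assumed name)
export GOOGLE_API_KEY="your-key"     # many demos need an LLM API key (assumed variable)
```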
Maintenance & Community
The repository is maintained by Yeyu Lab, with a strong emphasis on YouTube video tutorials accompanying each code example. Further community engagement or support channels are not explicitly detailed in the README.
Licensing & Compatibility
The repository's licensing is not specified in the README. Compatibility for commercial use or closed-source linking would require explicit clarification from the maintainer.
Limitations & Caveats
The project is a collection of demos, not a cohesive framework, meaning integration between different examples may require significant adaptation. Some demos may rely on specific, potentially costly, API versions or require substantial hardware for local execution.