ha-llmvision  by valentinfrlch

Home Assistant integration for multimodal LLM vision

created 1 year ago
990 stars

Top 37.5% on SourcePulse

GitHubView on GitHub
1 Expert Loves This Project
Project Summary

This project provides a Home Assistant integration for analyzing images and video streams using multimodal Large Language Models (LLMs). It targets Home Assistant users who want to leverage AI for intelligent analysis of camera feeds, video files, and events, enabling features like object recognition, event summarization, and timeline tracking.

How It Works

LLM Vision integrates with various LLM providers, including OpenAI, Anthropic Claude, Google Gemini, and local solutions like Ollama and LocalAI. It processes visual data (images, video, live feeds) and uses LLMs to extract information, identify objects, people, or pets, and maintain a chronological timeline of events. This allows for intelligent sensor updates and natural language querying of past events.

Quick Start & Requirements

Highlighted Details

  • Supports a wide range of LLM providers, including OpenAI-compatible endpoints.
  • Analyzes images, video files, live camera feeds, and Frigate events.
  • Maintains a timeline of analyzed events, viewable on a dashboard.
  • Can intelligently summarize camera event notifications via a blueprint.

Maintenance & Community

  • Project is actively maintained.
  • Community discussions are available via the Home Assistant Community forum.

Licensing & Compatibility

  • The repository does not explicitly state a license in the provided README.

Limitations & Caveats

  • The specific license is not mentioned, which may impact commercial use or integration into closed-source projects.
Health Check
Last commit

1 day ago

Responsiveness

1 day

Pull Requests (30d)
5
Issues (30d)
13
Star History
88 stars in the last 30 days

Explore Similar Projects

Starred by Chip Huyen Chip Huyen(Author of "AI Engineering", "Designing Machine Learning Systems") and Omar Khattab Omar Khattab(Author of DSPy, ColBERT; Professor at MIT).

langwatch by langwatch

0.9%
2k
LLM ops platform for traces, analytics, evaluations, datasets, and prompt optimization
created 1 year ago
updated 1 day ago
Feedback? Help us improve.