ha-llmvision by valentinfrlch

Home Assistant integration for multimodal LLM vision

Created 1 year ago
1,088 stars

Top 34.9% on SourcePulse

Project Summary

This project provides a Home Assistant integration for analyzing images and video streams using multimodal Large Language Models (LLMs). It targets Home Assistant users who want to leverage AI for intelligent analysis of camera feeds, video files, and events, enabling features like object recognition, event summarization, and timeline tracking.

How It Works

LLM Vision integrates with various LLM providers, including OpenAI, Anthropic Claude, Google Gemini, and local solutions like Ollama and LocalAI. It processes visual data (images, video, live feeds) and uses LLMs to extract information, identify objects, people, or pets, and maintain a chronological timeline of events. This allows for intelligent sensor updates and natural language querying of past events.
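
To make this concrete: analyses of this kind are exposed as Home Assistant services, so they can be triggered from automations, scripts, or externally over the documented Home Assistant REST API. The Python sketch below is illustrative only; the llmvision.image_analyzer service name and its fields (provider, message, image_entity, max_tokens) are assumptions about the integration's interface and may differ between versions, so check Developer Tools -> Services for the real schema.

```python
# Minimal sketch: invoking an LLM Vision analysis service through the
# Home Assistant REST API using a long-lived access token. The service
# name and its fields below are assumptions, not a confirmed schema.
import requests

HA_URL = "http://homeassistant.local:8123"   # your Home Assistant instance
TOKEN = "YOUR_LONG_LIVED_ACCESS_TOKEN"       # created under your HA profile

resp = requests.post(
    # ?return_response asks HA to include the service's response data
    # (supported in recent Home Assistant versions).
    f"{HA_URL}/api/services/llmvision/image_analyzer?return_response",
    headers={"Authorization": f"Bearer {TOKEN}"},
    json={
        "provider": "OpenAI",                 # assumed field and value
        "message": "What do you see on this camera?",
        "image_entity": "camera.front_door",  # assumed field name
        "max_tokens": 100,
    },
    timeout=60,
)
resp.raise_for_status()
print(resp.json())
```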

Quick Start & Requirements

Requires a running Home Assistant instance plus credentials for at least one supported LLM provider, or a local endpoint such as Ollama. The integration is distributed as a custom component, typically installed through HACS, and configured from the Home Assistant UI.

Highlighted Details

  • Supports a wide range of LLM providers, including OpenAI-compatible endpoints.
  • Analyzes images, video files, live camera feeds, and Frigate events.
  • Maintains a timeline of analyzed events, viewable on a dashboard.
  • Can intelligently summarize camera event notifications via a blueprint (a rough sketch of this flow follows this list).
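
The summarization blueprint itself is YAML, but the same motion-to-summary flow can be sketched in Python as an AppDaemon app. Everything below is hypothetical scaffolding: the llmvision/stream_analyzer service name, its fields, and the entity IDs are assumptions for illustration, not the integration's confirmed API.

```python
# Hypothetical AppDaemon app: when a motion sensor turns on, ask LLM Vision
# to summarize a short window of the camera's live feed. The service name
# and fields (entity_id, provider, message, duration) are assumptions;
# check the services exposed by your installed LLM Vision version.
import appdaemon.plugins.hass.hassapi as hass

class MotionSummary(hass.Hass):
    def initialize(self):
        # Example entity ID; replace with your own motion sensor.
        self.listen_state(self.on_motion,
                          "binary_sensor.front_door_motion", new="on")

    def on_motion(self, entity, attribute, old, new, kwargs):
        # Request an analysis of the live feed (assumed service/fields).
        self.call_service(
            "llmvision/stream_analyzer",
            entity_id="camera.front_door",
            provider="OpenAI",
            message="Briefly summarize what is happening.",
            duration=5,
        )
        self.log("Requested LLM Vision summary for camera.front_door")
```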

Maintenance & Community

  • Project is actively maintained.
  • Community discussions are available via the Home Assistant Community forum.

Licensing & Compatibility

  • The repository's README does not explicitly state a license.

Limitations & Caveats

  • No license is stated, which may complicate commercial use or integration into closed-source projects.

Health Check

  • Last Commit: 4 days ago
  • Responsiveness: 1 day
  • Pull Requests (30d): 13
  • Issues (30d): 12

Star History

  • 44 stars in the last 30 days
