machina by PsyChip

CCTV viewer for realtime object tagging

Created 1 year ago

780 stars

Top 45.0% on SourcePulse

1 Expert Loves This Project

dguido

Cofounder of Trail of Bits

Project Summary

MACHINA is a video surveillance system that leverages OpenCV, YOLO, and LLAVA for real-time object tagging and scene captioning. It is designed for users who need to monitor video streams and gain insights into detected objects and overall scene context. The system aims to provide a headless security solution.

How It Works

MACHINA connects to RTSP streams and processes frames in a separate thread. YOLO detects objects and assigns each a unique ID based on position and time. A background thread tags objects by sending requests to an Ollama server running LLAVA. For scene captioning, BLIP generates a caption every 30 frames, and CLIP matches frames against pre-generated caption text every 10 frames, which keeps scene descriptions close to real time.
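The frame cadence described above can be sketched as a small scheduling function. This is a minimal illustration, not the project's code: the function name `tasks_for_frame` and the task labels are hypothetical, and running detection on every frame is an assumption. Only the 30-frame and 10-frame intervals come from the summary.

```python
CAPTION_EVERY = 30  # frames between BLIP captioning runs (from the summary)
MATCH_EVERY = 10    # frames between CLIP matching runs (from the summary)


def tasks_for_frame(frame_index: int) -> list[str]:
    """Return the work items scheduled for a given frame index.

    Hypothetical helper: the task names are illustrative labels only.
    """
    tasks = ["detect"]  # assumption: YOLO detection runs on every frame
    if frame_index % MATCH_EVERY == 0:
        tasks.append("match_caption")      # CLIP matching, every 10th frame
    if frame_index % CAPTION_EVERY == 0:
        tasks.append("generate_caption")   # BLIP captioning, every 30th frame
    return tasks


if __name__ == "__main__":
    for index in (10, 15, 30):
        print(index, tasks_for_frame(index))
```

Keeping the slow captioning step on a longer interval than the cheaper matching step is what lets both fit alongside per-frame detection.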

Quick Start & Requirements

Install: Clone the repository, install dependencies (pip install -r requirements.txt), uninstall the CPU build of PyTorch, install the CUDA build (pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118), then run the app (py app.py).
Prerequisites: Python 3.12.x, Ollama server with LLAVA model, CUDA-enabled PyTorch, Visual C++ redistributables (Windows).
Notes: CUDA-enabled PyTorch is essential for real-time performance. Adjust vsize based on the YOLO model used.
Links: Microsoft VC++ Redistributables
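Since the prerequisites include an Ollama server with the LLAVA model, a tagging request to it might look like the sketch below. It uses Ollama's `/api/generate` endpoint on the default local port. The prompt text, the function names `build_tag_request` and `request_tag`, and the model tag `llava` are assumptions for illustration; the project's actual prompt and request code are not shown in the summary.

```python
import base64
import json
import urllib.request

# Ollama's default local endpoint; adjust if the server runs elsewhere.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_tag_request(jpeg_bytes: bytes,
                      prompt: str = "Describe the object in a few words.") -> dict:
    """Build the JSON payload for one image-tagging request (hypothetical prompt)."""
    return {
        "model": "llava",  # assumed model tag
        "prompt": prompt,
        # Ollama expects images as base64-encoded strings
        "images": [base64.b64encode(jpeg_bytes).decode("ascii")],
        "stream": False,   # ask for a single JSON response instead of a stream
    }


def request_tag(jpeg_bytes: bytes, timeout: float = 30.0) -> str:
    """Send a cropped object image to the server and return the generated text."""
    body = json.dumps(build_tag_request(jpeg_bytes)).encode("utf-8")
    req = urllib.request.Request(
        OLLAMA_URL, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read())["response"].strip()
```

Calling `request_tag` requires a running Ollama server with the model already pulled; `build_tag_request` can be inspected without one.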

Highlighted Details

Averages 600ms for captioning and 47ms for caption matching on an RTX 3060.
Processes frames at 640x480, achieving 20ms inference time with YOLOv11-small on a GTX 1060.
Implements object matching based on detection box centers with a 16px tolerance.
Supports snapshotting (S), scene captioning (C), and recording (R).
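The center-based object matching mentioned above can be sketched as follows. This is an illustrative approximation under stated assumptions: the class `CenterMatcher`, the `(x1, y1, x2, y2)` box layout, and the per-axis distance check are all hypothetical choices, and the time component the summary mentions is left out for brevity. Only the 16px tolerance comes from the summary.

```python
from dataclasses import dataclass
from itertools import count

TOLERANCE_PX = 16  # center tolerance quoted in the summary


@dataclass
class Track:
    track_id: int
    cx: float
    cy: float


def box_center(box):
    """Center of a box given as (x1, y1, x2, y2); the layout is an assumption."""
    x1, y1, x2, y2 = box
    return (x1 + x2) / 2, (y1 + y2) / 2


class CenterMatcher:
    """Hypothetical tracker that reuses an ID when box centers stay close."""

    def __init__(self, tolerance=TOLERANCE_PX):
        self.tolerance = tolerance
        self.tracks = []
        self._ids = count(1)

    def assign(self, box):
        cx, cy = box_center(box)
        for track in self.tracks:
            # Assumption: the tolerance is applied per axis, not as a radius.
            if (abs(track.cx - cx) <= self.tolerance
                    and abs(track.cy - cy) <= self.tolerance):
                track.cx, track.cy = cx, cy  # follow the object as it moves
                return track.track_id
        track = Track(next(self._ids), cx, cy)
        self.tracks.append(track)
        return track.track_id
```

A detection whose center moves only a few pixels between frames keeps its ID, while a detection elsewhere in the frame receives a new one.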

Maintenance & Community

This is a personal project developed in spare time.
Feature requests can be prioritized with donations via Ko-fi or Bitcoin.
Contact: root@psychip.net

Licensing & Compatibility

The README does not explicitly state a license.

Limitations & Caveats

The project is marked as "[WIP]" (Work In Progress).
Network conditions can delay the stream; a frame-skip mechanism is implemented to handle this.
Pre-trained YOLO models may lack accuracy on low-resolution streams; custom training is recommended.
The README does not state which operating systems are supported beyond listing Windows prerequisites.
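The summary does not say how the frame-skip mechanism works. One common approach, shown here purely as an assumption, is a single-slot buffer in which the capture thread overwrites any frame the processing thread has not yet consumed, so processing always sees the newest frame. The class name `LatestFrameBuffer` and its `skipped` counter are hypothetical.

```python
import threading


class LatestFrameBuffer:
    """Single-slot buffer: a new frame replaces an unread one (illustrative only)."""

    def __init__(self):
        self._lock = threading.Lock()
        self._frame = None
        self.skipped = 0  # how many unread frames were overwritten

    def put(self, frame):
        """Called by the capture thread for every frame read from the stream."""
        with self._lock:
            if self._frame is not None:
                self.skipped += 1  # the previous frame was never processed
            self._frame = frame

    def get(self):
        """Called by the processing thread; returns None if nothing new arrived."""
        with self._lock:
            frame, self._frame = self._frame, None
            return frame
```

With this design, a slow detector never builds up a backlog; it trades completeness for low latency.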

Health Check

Last Commit: 2 months ago
Responsiveness: 1 day
Pull Requests (30d): 0
Issues (30d): 0
Star History: 0 stars in the last 30 days
