CCTV viewer for realtime object tagging
MACHINA is a video surveillance system that leverages OpenCV, YOLO, and LLAVA for real-time object tagging and scene captioning. It is designed for users who need to monitor video streams and gain insights into detected objects and overall scene context. The system aims to provide a headless security solution.
How It Works
MACHINA connects to RTSP streams and processes frames in a separate thread. YOLO detects objects, and each detection is assigned a unique ID based on its position and time. A background thread sends requests to an LLM (LLAVA served by an Ollama server) to tag the detected objects. For scene captioning, BLIP generates a caption every 30 frames, and CLIP matches frames against pre-generated text every 10 frames, which keeps scene descriptions close to real time.
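The ID assignment and the frame cadence described above can be sketched roughly as follows. This is a minimal illustration, not MACHINA's actual code: the Tracker class, the tasks_for_frame helper, and the MAX_DISTANCE and MAX_AGE thresholds are hypothetical names and values chosen for the example. Only the 30-frame and 10-frame intervals come from the description.

```python
import math
from dataclasses import dataclass, field
from itertools import count

# Assumed values, not taken from MACHINA: tune for your camera and frame rate.
MAX_DISTANCE = 50.0   # pixels a box centre may move and still match a track
MAX_AGE = 2.0         # seconds a track survives without being seen
CAPTION_EVERY = 30    # BLIP captioning interval (from the description)
MATCH_EVERY = 10      # CLIP matching interval (from the description)


@dataclass
class Tracker:
    """Hypothetical tracker: reuses an ID when a detection is near a recent one."""
    tracks: dict = field(default_factory=dict)  # id -> (x, y, last_seen)
    _ids: count = field(default_factory=count)  # yields 0, 1, 2, ...

    def assign(self, x: float, y: float, now: float) -> int:
        # Time: forget tracks that have not been seen within MAX_AGE seconds.
        self.tracks = {
            i: t for i, t in self.tracks.items() if now - t[2] <= MAX_AGE
        }
        # Position: pick the nearest remaining track within MAX_DISTANCE.
        best_id, best_dist = None, MAX_DISTANCE
        for track_id, (tx, ty, _) in self.tracks.items():
            dist = math.hypot(x - tx, y - ty)
            if dist <= best_dist:
                best_id, best_dist = track_id, dist
        # No match: this is a new object, so hand out a fresh ID.
        if best_id is None:
            best_id = next(self._ids)
        self.tracks[best_id] = (x, y, now)
        return best_id


def tasks_for_frame(frame_index: int) -> list[str]:
    """Return the work scheduled for a frame: detection always, CLIP and BLIP periodically."""
    tasks = ["detect"]
    if frame_index % MATCH_EVERY == 0:
        tasks.append("clip_match")
    if frame_index % CAPTION_EVERY == 0:
        tasks.append("blip_caption")
    return tasks


if __name__ == "__main__":
    tracker = Tracker()
    print(tracker.assign(100, 100, now=0.0))  # first object gets a new ID
    print(tracker.assign(105, 102, now=0.1))  # nearby and recent: same ID
    print(tracker.assign(400, 300, now=0.2))  # far away: new ID
    print(tasks_for_frame(30))
```

This sketch matches each detection independently, so two detections in the same frame could claim the same track. A real tracker would need to resolve such conflicts, for example by matching all detections in a frame at once.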
Quick Start & Requirements
Install dependencies (pip install -r requirements.txt), uninstall the CPU build of PyTorch and install the CUDA version (pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118), and run the app (py app.py).
Set vsize based on the YOLO model used.
Highlighted Details
Maintenance & Community
Licensing & Compatibility
Limitations & Caveats