ngxson: Webcam demo using SmolVLM for real-time object detection
Top 10.4% on SourcePulse
This repository provides a real-time webcam demonstration of object detection using the SmolVLM 500M model integrated with llama.cpp. It is designed for developers and researchers interested in on-device, real-time visual question answering and object recognition.
How It Works
The demo leverages the llama.cpp server to host the SmolVLM model, enabling efficient inference. The web interface, served by index.html, captures video frames from the user's webcam, sends them to the llama.cpp server for processing by SmolVLM, and displays the results. This approach allows for local, real-time execution without relying on external cloud APIs.
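As a rough illustration, the client-side loop amounts to encoding the current webcam frame as a base64 data URL and POSTing it to the llama.cpp server's OpenAI-compatible chat completions endpoint. The sketch below is an assumption-laden approximation, not code from the repository: it assumes llama-server is listening on its default port 8080, that the frame has already been drawn onto a canvas, and the function name, prompt, and polling interval are illustrative.

```typescript
// Sketch, assuming llama-server on localhost:8080 with its OpenAI-compatible
// /v1/chat/completions endpoint. Names here are illustrative, not from the repo.
async function describeFrame(canvas: HTMLCanvasElement, prompt: string): Promise<string> {
  // Encode the current webcam frame as a base64 JPEG data URL.
  const frame = canvas.toDataURL("image/jpeg", 0.8);

  const response = await fetch("http://localhost:8080/v1/chat/completions", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({
      max_tokens: 100,
      messages: [
        {
          role: "user",
          content: [
            { type: "text", text: prompt },
            { type: "image_url", image_url: { url: frame } },
          ],
        },
      ],
    }),
  });

  const data = await response.json();
  // The answer comes back in the usual OpenAI-style response shape.
  return data.choices[0].message.content;
}

// Example usage: poll the webcam canvas roughly once per second.
// setInterval(async () => {
//   console.log(await describeFrame(canvas, "What objects do you see?"));
// }, 1000);
```

Because everything stays on localhost, latency is dominated by model inference rather than network round trips, which is what makes the real-time loop practical on-device.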
Quick Start & Requirements
- Start the llama.cpp server with the SmolVLM 500M GGUF model: llama-server -hf ggml-org/SmolVLM-500M-Instruct-GGUF
- For GPU acceleration, add -ngl 99 to the server command.
- Open index.html in a web browser.
- Requirements: a working llama.cpp build and a compatible web browser.
Highlighted Details
- Inference runs locally in real time, served by the llama.cpp server, with no cloud APIs required.
Maintenance & Community
No specific community channels or maintenance details are provided in the README.
Licensing & Compatibility
The repository itself does not specify a license. The underlying SmolVLM model and llama.cpp have their own licenses, which should be consulted for usage terms.
Limitations & Caveats
The demo is presented as a simple example and may require further configuration for optimal performance or specific use cases. GPU acceleration (via the -ngl flag) may be necessary for acceptable frame rates.