smolvlm-realtime-webcam by ngxson

Webcam demo using SmolVLM for real-time object detection

created 2 months ago
4,062 stars

Top 12.3% on sourcepulse

Project Summary

This repository provides a real-time webcam demonstration of object detection using the SmolVLM 500M model integrated with llama.cpp. It is designed for developers and researchers interested in on-device, real-time visual question answering and object recognition.

How It Works

The demo uses the llama.cpp server to host the SmolVLM model for efficient local inference. The web interface (index.html) captures video frames from the user's webcam, sends them to the llama.cpp server for processing by SmolVLM, and displays the results. This allows local, real-time execution without relying on external cloud APIs.
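To make the frame-to-server round trip concrete, here is a minimal Python sketch of the kind of request the page sends for each frame. It assumes the llama.cpp server's OpenAI-compatible chat endpoint (`/v1/chat/completions`, which llama-server exposes); the instruction string and JPEG bytes are placeholders, and the exact payload the demo builds may differ.

```python
import base64

def build_chat_request(jpeg_bytes: bytes, instruction: str) -> dict:
    """Package one webcam frame plus a text instruction as a chat-completion
    request body, with the frame inlined as a base64 data URL."""
    data_url = "data:image/jpeg;base64," + base64.b64encode(jpeg_bytes).decode("ascii")
    return {
        "max_tokens": 100,
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": instruction},
                    {"type": "image_url", "image_url": {"url": data_url}},
                ],
            }
        ],
    }

# Placeholder JPEG header bytes stand in for a captured frame.
payload = build_chat_request(b"\xff\xd8\xff\xe0", "What do you see?")
```

The payload would then be POSTed to `http://localhost:8080/v1/chat/completions` (the server's default address) and the model's text reply displayed next to the video.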

Quick Start & Requirements

  • Install and run llama.cpp server with the SmolVLM 500M GGUF model: run llama-server -hf ggml-org/SmolVLM-500M-Instruct-GGUF.
  • For GPU acceleration (Nvidia/AMD/Intel), add -ngl 99 to the server command.
  • Open index.html in a web browser.
  • Prerequisites: llama.cpp compiled, a compatible web browser.
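The steps above can be sketched as a pair of shell commands (a command fragment, not meant to run unattended; assumes llama.cpp's binaries are on your PATH and defaults to port 8080):

```shell
# Download the model from Hugging Face and start the server (CPU only)
llama-server -hf ggml-org/SmolVLM-500M-Instruct-GGUF

# With GPU acceleration (Nvidia/AMD/Intel): offload all layers
llama-server -hf ggml-org/SmolVLM-500M-Instruct-GGUF -ngl 99
```

Then open index.html in a browser and point it at the server's address.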

Highlighted Details

  • Real-time object detection and visual question answering.
  • Utilizes SmolVLM 500M for efficient on-device processing.
  • Demonstrates integration with llama.cpp server.
  • Customizable instructions for model output.

Maintenance & Community

No specific community channels or maintenance details are provided in the README.

Licensing & Compatibility

The repository itself does not specify a license. The underlying SmolVLM model and llama.cpp have their own licenses, which should be consulted for usage terms.

Limitations & Caveats

The demo is presented as a simple example and may require further configuration for optimal performance or specific use cases. GPU acceleration setup is noted as potentially necessary.

Health Check
Last commit

2 months ago

Responsiveness

Inactive

Pull Requests (30d)
0
Issues (30d)
0
Star History
4,079 stars in the last 90 days

Explore Similar Projects

Starred by Andrej Karpathy (founder of Eureka Labs; formerly at Tesla, OpenAI; author of CS 231n), Nat Friedman (former CEO of GitHub), and 32 more.

llama.cpp by ggml-org

0.4%
84k
C/C++ library for local LLM inference
created 2 years ago
updated 14 hours ago