segment-anything-webui  by Kingfish404

Web UI for Meta's Segment Anything Model (SAM)

created 2 years ago
265 stars

Project Summary

This project provides a web user interface for Meta AI's Segment Anything Model (SAM) and integrates CLIP alongside it. It is aimed at researchers and developers who want an accessible way to experiment with SAM's image segmentation capabilities. The result is a browser-based platform for generating masks and exploring segmentation tasks.

How It Works

The system has two parts: a Python backend that serves the SAM and CLIP models through FastAPI and Uvicorn, and a Node.js frontend for user interaction. The backend API is written as pure functions, which simplifies deployment and maintenance but means each request re-encodes its image. The design trades per-request performance for ease of use and modularity.
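The trade-off described above can be illustrated with a minimal, standard-library-only sketch. It is not the project's code: `encode_image`, `segment_stateless`, `segment_cached`, and `cached_embedding` are hypothetical names, and a SHA-256 hash stands in for the expensive SAM image encoder. The sketch only shows why a pure-function handler repeats the encoding step on every request, and how a cache keyed on the image bytes would avoid that at the cost of holding state in the server.

```python
import hashlib
from functools import lru_cache

# Counts how often the (stand-in) expensive encoder runs.
CALLS = {"encode": 0}


def encode_image(image_bytes: bytes) -> str:
    """Hypothetical stand-in for the expensive SAM image encoder."""
    CALLS["encode"] += 1
    return hashlib.sha256(image_bytes).hexdigest()


def segment_stateless(image_bytes: bytes, point: tuple) -> dict:
    """Pure-function style: the result depends only on the request.

    The image is re-encoded on every call, even if it was seen before.
    """
    embedding = encode_image(image_bytes)
    return {"embedding": embedding[:8], "point": point}


@lru_cache(maxsize=8)
def cached_embedding(image_bytes: bytes) -> str:
    """Same encoder, but results are remembered per distinct image."""
    return encode_image(image_bytes)


def segment_cached(image_bytes: bytes, point: tuple) -> dict:
    """Stateful alternative: reuses the embedding for repeated images."""
    embedding = cached_embedding(image_bytes)
    return {"embedding": embedding[:8], "point": point}


if __name__ == "__main__":
    image = b"fake image bytes"
    for point in [(10, 20), (30, 40), (50, 60)]:
        segment_stateless(image, point)
    print("encoder runs, stateless:", CALLS["encode"])

    CALLS["encode"] = 0
    for point in [(10, 20), (30, 40), (50, 60)]:
        segment_cached(image, point)
    print("encoder runs, cached:", CALLS["encode"])
```

Both variants return the same result for the same input. The stateless one keeps no state between requests, which is what makes it simple to deploy; the cached one is faster for repeated clicks on one image but has to manage memory and cache invalidation.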

Quick Start & Requirements

  • Install:
    • Backend: pip install torch torchvision ftfy regex tqdm git+https://github.com/openai/CLIP.git uvicorn[standard] fastapi python-multipart Pillow click
    • Frontend: npm i
    • SAM: pip install git+https://github.com/facebookresearch/segment-anything.git opencv-python pycocotools matplotlib onnxruntime onnx
  • Models: Download SAM checkpoints (vit_b, vit_l, vit_h) to a model/ directory.
  • Run:
    • Backend: python3 scripts/server.py
    • Frontend: npm run dev
    • Docker: docker compose up
  • Prerequisites: Python >= 3.8.13, Node >= 18.15.0 (LTS), CUDA or MPS (optional).
  • Links: Segment Anything, CLIP
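Before starting the backend, the prerequisites above can be checked with a short script such as the following sketch. It makes several assumptions: the checkpoint filenames are the ones published by the Segment Anything repository, and whether this project expects exactly those names under `model/` is not confirmed by the summary; the helper names (`python_ok`, `find_checkpoints`, `pick_device`) are illustrative only. The device check falls back to `"cpu"` when PyTorch is not installed.

```python
import sys
from pathlib import Path

# Minimum Python version listed in the prerequisites.
MIN_PYTHON = (3, 8, 13)

# Checkpoint filenames as published by the Segment Anything repository.
# Assumption: the web UI looks for these names inside model/.
CHECKPOINTS = {
    "vit_b": "sam_vit_b_01ec64.pth",
    "vit_l": "sam_vit_l_0b3195.pth",
    "vit_h": "sam_vit_h_4b8939.pth",
}


def python_ok(version=None) -> bool:
    """Return True if the given (or running) Python meets the minimum."""
    if version is None:
        version = sys.version_info[:3]
    return tuple(version) >= MIN_PYTHON


def find_checkpoints(model_dir: str = "model") -> dict:
    """Map each SAM variant to whether its checkpoint file is present."""
    root = Path(model_dir)
    return {name: (root / fname).is_file() for name, fname in CHECKPOINTS.items()}


def pick_device() -> str:
    """Prefer CUDA, then MPS, else CPU; CPU if PyTorch is missing."""
    try:
        import torch
    except ImportError:
        return "cpu"
    if torch.cuda.is_available():
        return "cuda"
    mps = getattr(torch.backends, "mps", None)
    if mps is not None and mps.is_available():
        return "mps"
    return "cpu"


if __name__ == "__main__":
    print("Python version OK:", python_ok())
    for name, present in find_checkpoints().items():
        print(f"checkpoint {name}: {'found' if present else 'missing'}")
    print("device:", pick_device())
```

Running it from the repository root prints which checkpoints are present and which device PyTorch would use, without loading any model.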

Highlighted Details

  • Integrates CLIP alongside SAM for potentially richer segmentation features.
  • Backend API is designed as pure functions for ease of deployment and maintenance.
  • Supports running the model server remotely from the web UI.

Maintenance & Community

  • Developed by Kingfish404.
  • No explicit community links (Discord/Slack) or roadmap mentioned in the README.

Licensing & Compatibility

  • License: MIT.
  • Compatible with commercial use and closed-source linking.

Limitations & Caveats

The README notes that the current backend is slow because it re-encodes the image on every request. It gives no performance benchmarks and documents no configuration options beyond changing the server address.

Health Check

  • Last commit: 4 months ago
  • Responsiveness: 1 week
  • Pull requests (30d): 0
  • Issues (30d): 0
  • Star history: 8 stars in the last 90 days
