sam.cpp by YavorGIvanov

C/C++ inference for Meta's Segment Anything Model (SAM)

Created 2 years ago

1,282 stars

Top 30.9% on SourcePulse

2 Experts Love This Project

Jonathan Ragan-Kelley

Professor at MIT

Georgi Gerganov

Author of llama.cpp, whisper.cpp

Project Summary

This project provides a pure C/C++ implementation of Meta's Segment Anything Model (SAM) that runs inference on-device without Python dependencies at run time. It targets developers and researchers who need to integrate image segmentation into C/C++ applications.

How It Works

The project leverages the ggml library for tensor computation, allowing SAM to run efficiently on CPUs. It converts the original PyTorch model checkpoints (.pth) into a custom ggml format (.bin). The inference process involves loading the ggml model, preprocessing the input image to the required 1024x1024 resolution, and then running the SAM model to generate segmentation masks.
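The preprocessing step can be illustrated with a small sketch of the size arithmetic. It follows the convention of the reference PyTorch SAM pipeline, which scales the longer image side to 1024 pixels and pads the remainder to reach a square 1024x1024 input. Whether sam.cpp performs the resize in exactly this way is an assumption, and the names PreprocessShape, compute_preprocess_shape, and kSamInputSize are hypothetical rather than taken from the project's source.

```cpp
#include <algorithm>

// Hypothetical names for illustration only; not taken from sam.cpp.
constexpr int kSamInputSize = 1024;  // square input side expected by the SAM image encoder

struct PreprocessShape {
    int resized_w;   // image width after scaling
    int resized_h;   // image height after scaling
    int pad_right;   // columns of padding needed to reach the target width
    int pad_bottom;  // rows of padding needed to reach the target height
};

// Scale the image so that its longer side equals `target`, preserving the
// aspect ratio, and report how much padding completes the square.
// Assumes width > 0 and height > 0.
PreprocessShape compute_preprocess_shape(int width, int height,
                                         int target = kSamInputSize) {
    const double scale = static_cast<double>(target) / std::max(width, height);

    PreprocessShape shape;
    // Adding 0.5 before truncation rounds to the nearest integer.
    shape.resized_w = static_cast<int>(width * scale + 0.5);
    shape.resized_h = static_cast<int>(height * scale + 0.5);
    shape.pad_right = target - shape.resized_w;
    shape.pad_bottom = target - shape.resized_h;
    return shape;
}
```

Under these assumptions, calling compute_preprocess_shape(1920, 1080) yields a 1024x576 resized image with 448 rows of bottom padding. Keeping the aspect ratio and padding, rather than stretching to a square, avoids distorting object shapes before segmentation.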

Quick Start & Requirements

Install: git clone --recursive https://github.com/YavorGIvanov/sam.cpp && cd sam.cpp
Prerequisites: Python 3, PyTorch, NumPy (for model conversion), CMake, SDL2 (for GUI).
Model Conversion: Download a .pth checkpoint, then run python convert-pth-to-ggml.py <path_to_pth> <output_dir>.
Build: mkdir build && cd build && cmake .. && make -j4
Inference: ./bin/sam -t <threads> -i <image_path> -m <ggml_model_path>
Docs: https://github.com/YavorGIvanov/sam.cpp

Highlighted Details

Pure C/C++ implementation of SAM.
Utilizes ggml for CPU-optimized inference.
Supports conversion of official PyTorch checkpoints.
Includes basic GUI via SDL2 for visualization.

Maintenance & Community

The project is maintained by YavorGIvanov, though the last commit was 2 years ago and the repository is currently listed as inactive. The README does not mention any community interaction channels.

Licensing & Compatibility

The project appears to be licensed under the MIT License, allowing for commercial use and integration into closed-source projects.

Limitations & Caveats

The project is incomplete: the README lists several features as "Next steps" that are not yet implemented, including GPU support, mask/box input, and further performance optimizations. It also notes some output differences compared to the PyTorch implementation.

Health Check

Last Commit

2 years ago

Responsiveness

Inactive

Star History

4 stars in the last 30 days