clip.cpp by monatis

Plain C/C++ CLIP inference, dependency-free

Created 2 years ago
520 stars

Top 60.5% on SourcePulse

View on GitHub
Project Summary

This project provides a dependency-free C/C++ implementation of OpenAI's CLIP model, enabling efficient inference on resource-constrained devices. It targets developers and researchers who need to integrate CLIP for tasks like semantic search or zero-shot labeling without the overhead of large ML frameworks. The result is a lightweight, fast inference engine with no heavyweight runtime dependencies.

How It Works

Leveraging the GGML tensor library, clip.cpp offers optimized inference with support for 4-bit, 5-bit, and 8-bit quantization. Quantization significantly reduces model size and memory footprint, making the engine suitable for edge devices and serverless deployments. It supports text-only, vision-only, and two-tower CLIP variants, providing flexibility for various applications; a usage sketch follows.
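
For a concrete sense of the workflow, here is a minimal sketch of a two-tower text–image similarity query through the project's Python bindings with a quantized GGUF model. This is not code from the repository: the model path is a placeholder, and the method names (tokenize, encode_text, load_preprocess_encode_image, calculate_similarity) are assumptions about the clip_cpp binding API; check the bindings' README for the exact signatures.

  # Minimal sketch (assumed API): two-tower similarity with the clip_cpp bindings.
  # The model path is a placeholder and the method names are assumptions.
  from clip_cpp import Clip

  MODEL_PATH = "./models/clip-vit-base-patch32_q4_0.gguf"  # placeholder GGUF file

  model = Clip(MODEL_PATH, verbosity=1)

  # Text tower: tokenize the prompt, then encode it into an embedding.
  tokens = model.tokenize("a photo of a cat")
  text_embedding = model.encode_text(tokens)

  # Vision tower: load, preprocess, and encode the image in one call.
  image_embedding = model.load_preprocess_encode_image("cat.jpg")

  # Cosine similarity between the two embeddings (higher = better match).
  score = model.calculate_similarity(text_embedding, image_embedding)
  print(f"text-image similarity: {score:.3f}")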

Quick Start & Requirements

  • Install: pip install clip_cpp (packaged for x86-64 Linux with AVX2). For other systems or instruction sets, build from source with cmake -DBUILD_SHARED_LIBS=ON .. followed by make.
  • Prerequisites: Python and pip for the package install; building from source requires CMake and a C/C++ toolchain. Models are available on HuggingFace (tagged clip-cpp-gguf); a download-and-load sketch follows this list.
  • Resources: 4-bit quantized models are approximately 85.6 MB.
  • Links: Colab Notebook, HuggingFace Models
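
If you prefer to script the model download rather than fetch a GGUF file from HuggingFace manually, something along these lines works. Note that huggingface_hub is an extra dependency not required by clip_cpp itself, and the repo_id and filename below are placeholders, not real checkpoints; browse models tagged clip-cpp-gguf for actual repositories.

  # Sketch: fetch a quantized GGUF checkpoint, then load it with the bindings.
  # huggingface_hub is an optional extra dependency (pip install huggingface_hub);
  # repo_id and filename are placeholders -- substitute a real clip-cpp-gguf model.
  from huggingface_hub import hf_hub_download
  from clip_cpp import Clip

  model_path = hf_hub_download(
      repo_id="someuser/clip-vit-base-patch32-gguf",  # placeholder repo id
      filename="model_q4_0.gguf",                     # placeholder file name
  )
  model = Clip(model_path, verbosity=1)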

Highlighted Details

  • Dependency-free C/C++ inference engine.
  • Supports 4-bit, 5-bit, and 8-bit quantization.
  • Python bindings available with no external Python package dependencies.
  • Includes examples for basic inference, zero-shot labeling, and semantic image search (a rough zero-shot sketch follows this list).
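
As a rough illustration of how zero-shot labeling works conceptually, the sketch below scores one image against several candidate prompts and applies a scaled softmax. It is a hedged approximation rather than the repository's own example code: the clip_cpp method names and model path are assumptions, and the logit scale of 100 mirrors the value typically learned by trained CLIP models.

  # Hedged zero-shot labeling sketch: score an image against candidate prompts.
  # Method names and model path are assumptions; prefer the repo's own example.
  import math
  from clip_cpp import Clip

  model = Clip("./models/clip-vit-base-patch32_q4_0.gguf", verbosity=0)  # placeholder
  labels = ["cat", "dog", "car"]

  image_embedding = model.load_preprocess_encode_image("photo.jpg")
  sims = [
      model.calculate_similarity(
          model.encode_text(model.tokenize(f"a photo of a {label}")),
          image_embedding,
      )
      for label in labels
  ]

  # Softmax over scaled similarities (CLIP's learned logit scale is roughly 100).
  exps = [math.exp(100.0 * s) for s in sims]
  probs = [e / sum(exps) for e in exps]
  for label, p in zip(labels, probs):
      print(f"{label}: {p:.2%}")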

Maintenance & Community

The project is actively maintained. Recent updates include Clojure bindings and a switch to the GGUF model format. Discussions and support can be found via GitHub issues.

Licensing & Compatibility

The project is licensed under the MIT License, permitting commercial use and integration with closed-source applications.

Limitations & Caveats

Image preprocessing uses bilinear interpolation, which may differ numerically from PIL's bicubic interpolation with antialiasing, so embeddings can deviate slightly from the reference PyTorch implementation. The move to the GGUF format is a breaking change: model files in the older .bin format are no longer compatible.

Health Check

  • Last Commit: 3 months ago
  • Responsiveness: 1 day
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 7 stars in the last 30 days

Explore Similar Projects

Starred by Jared Palmer (Ex-VP AI at Vercel; Founder of Turborepo; Author of Formik, TSDX), Eugene Yan (AI Scientist at AWS), and 2 more.

starcoder.cpp by bigcode-project
C++ example for StarCoder inference
456 stars · Created 2 years ago · Updated 2 years ago

Starred by Luca Soldaini (Research Scientist at Ai2), Edward Sun (Research Scientist at Meta Superintelligence Lab), and 4 more.

parallelformers by tunib-ai
Toolkit for easy model parallelization
790 stars · Created 4 years ago · Updated 2 years ago

Starred by Junyang Lin (Core Maintainer at Alibaba Qwen), Chip Huyen (Author of "AI Engineering", "Designing Machine Learning Systems"), and 3 more.

neural-compressor by intel
Python library for model compression (quantization, pruning, distillation, NAS)
2k stars · Created 5 years ago · Updated 16 hours ago

Starred by Andrej Karpathy (Founder of Eureka Labs; Formerly at Tesla, OpenAI; Author of CS 231n), Jeff Hammerbacher (Cofounder of Cloudera), and 4 more.

gemma_pytorch by google
PyTorch implementation for Google's Gemma models
6k stars · Created 1 year ago · Updated 3 months ago