Awesome-CLIP  by yzhuoning

CLIP resources list

Created 4 years ago
1,215 stars

Top 32.3% on SourcePulse

Project Summary

This repository is an "Awesome List" curating research resources, papers, and code related to CLIP (Contrastive Language-Image Pre-Training). It serves as a comprehensive catalog for researchers and practitioners exploring CLIP's capabilities and applications across various domains like image generation, object detection, and video understanding.

How It Works

The list organizes projects based on their application area, providing links to associated research papers and their corresponding code repositories. It highlights how CLIP's ability to connect text and image modalities is leveraged for tasks ranging from zero-shot learning to complex generative processes.
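As a rough illustration of the zero-shot idea mentioned above, the sketch below scores one image embedding against several text-prompt embeddings by cosine similarity and turns the scores into probabilities with a softmax. It is a minimal sketch, not code from any listed project: the three-dimensional vectors are made-up stand-ins for real encoder outputs, the helper names `normalize` and `zero_shot_scores` are hypothetical, and the temperature of 100 is an assumed scaling value.

```python
import math


def normalize(vec):
    """Scale a vector to unit length so a dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def zero_shot_scores(image_emb, text_embs, temperature=100.0):
    """Return softmax probabilities over text prompts for one image embedding.

    temperature is an assumed scaling factor applied to the cosine similarities.
    """
    img = normalize(image_emb)
    logits = []
    for text_emb in text_embs:
        txt = normalize(text_emb)
        cosine = sum(a * b for a, b in zip(img, txt))
        logits.append(temperature * cosine)
    # Subtract the max logit before exponentiating for numerical stability.
    top = max(logits)
    exps = [math.exp(l - top) for l in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Toy data: hypothetical embeddings, not real CLIP encoder outputs.
labels = ["a photo of a cat", "a photo of a dog"]
toy_text_embs = [[0.9, 0.1, 0.0], [0.1, 0.9, 0.0]]
toy_image_emb = [0.8, 0.2, 0.1]

probs = zero_shot_scores(toy_image_emb, toy_text_embs)
best = labels[probs.index(max(probs))]
print(best)
```

With real models, the embeddings would come from the image and text encoders of whichever CLIP implementation a listed project uses; the scoring step stays the same in spirit, though individual projects may differ in detail.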

Quick Start & Requirements

This is a curated list, not a runnable project. To use any of the listed resources, refer to the individual project's repository for installation and usage instructions.

Highlighted Details

  • Extensive coverage of CLIP applications including GANs, object detection, information retrieval, representation learning, text-to-3D generation, prompt learning, video understanding, image captioning, image editing, segmentation, 3D recognition, audio, and language tasks.
  • Includes links to foundational CLIP papers, training implementations (OpenCLIP, Paddle-CLIP), and numerous downstream task adaptations.
  • Features projects demonstrating CLIP's zero-shot capabilities and its use in guiding generative models like diffusion and StyleGAN.
  • Covers research on adapting CLIP for specific modalities like audio and point clouds, as well as multilingual and fine-tuned versions.

Maintenance & Community

The list is maintained by yzhuoning and welcomes contributions via issues. It is inspired by the "Awesome Visual-Transformer" list.

Licensing & Compatibility

The licensing of individual projects varies and must be checked within each linked repository. This list itself does not impose specific licensing restrictions.

Limitations & Caveats

This is a static list of resources; it does not provide a unified interface or framework for using CLIP. Users must engage with each individual project's setup and dependencies.

Health Check

  • Last commit: 1 year ago
  • Responsiveness: Inactive
  • Pull requests (30d): 0
  • Issues (30d): 0
  • Star history: 7 stars in the last 30 days
