Awesome_Matching_Pretraining_Transfering by Paranioar

Curated paper list for multimodal AI research

Created 4 years ago
432 stars

Top 68.8% on SourcePulse

Project Summary

This repository is a curated list of resources and research papers on large multi-modality models (LMMM), parameter-efficient fine-tuning (PEFT), and vision-language pretraining (VLP). It aims to give researchers and practitioners preliminary insights into, and a structured overview of, these rapidly evolving fields.

How It Works

The project organizes academic papers and related resources into distinct categories: LMMM (further broken down into perception, generation, and unification), PEFT methods (such as prompt tuning and adapter tuning), VLP (image-language and video-language pretraining), and conventional image-text matching techniques. This categorization supports a structured exploration of the landscape and highlights key concepts, datasets, and learning paradigms.
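The category layout described above can be pictured as a small nested mapping. The following is only an illustrative sketch: the names `TAXONOMY`, `find_topic`, and `flatten` are hypothetical, and the labels are taken from this summary rather than from the repository's actual files, which contain no code.

```python
from typing import Dict, List, Optional

# Hypothetical sketch of the category tree named in this summary.
# The labels mirror the summary's wording, not the repository's files.
TAXONOMY: Dict[str, List[str]] = {
    "LMMM": ["Perception", "Generation", "Unification"],
    "PEFT": ["Prompt Tuning", "Adapter Tuning", "Partially Tuning"],
    "VLP": ["Image-Language Pretraining", "Video-Language Pretraining"],
    "Image-Text Matching": [
        "Generic-Feature Extraction",
        "Cross-Modal Interaction",
    ],
}


def find_topic(subcategory: str) -> Optional[str]:
    """Return the top-level topic that lists the subcategory, or None."""
    for topic, subcategories in TAXONOMY.items():
        if subcategory in subcategories:
            return topic
    return None


def flatten(taxonomy: Dict[str, List[str]]) -> List[str]:
    """List every 'topic / subcategory' pair in insertion order."""
    return [
        f"{topic} / {sub}"
        for topic, subs in taxonomy.items()
        for sub in subs
    ]
```

A reader looking for adapter papers would, under this sketch, call `find_topic("Adapter Tuning")` to learn that they sit under PEFT, which matches how the list groups its sections.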

Quick Start & Requirements

This repository is a curated list of papers and does not contain executable code. No installation or specific requirements are necessary to browse the content.

Highlighted Details

  • Covers four areas: Large Multi-Modality Models, Parameter-Efficient Fine-Tuning, Vision-Language Pretraining, and Image-Text Matching.
  • Includes sections on related surveys, benchmarks, and datasets for each topic.
  • Lists various PEFT techniques such as Prompt Tuning, Adapter Tuning, and Partially Tuning.
  • Covers conventional image-text matching approaches including Generic-Feature Extraction and Cross-Modal Interaction.

Maintenance & Community

Paranioar maintains the project and keeps an update log; the most recent logged update is dated 2024.12.15. The maintainer can be reached by email at r1228240468@gmail.com.

Licensing & Compatibility

The repository is released under the MIT license, permitting broad use and modification.

Limitations & Caveats

The project is a static list of papers and does not provide implementations or code. The LMMM section is still being updated and may not be comprehensive as of the last log entry.

Health Check

Last Commit: 2 weeks ago
Responsiveness: Inactive
Pull Requests (30d): 0
Issues (30d): 0
Star History: 5 stars in the last 30 days
