mona by LeiyiHU

Adapter tuning research for visual recognition tasks

created 1 year ago
344 stars

Top 81.6% on sourcepulse

View on GitHub
Project Summary

Mona is a novel adapter-based tuning method designed to surpass the performance of full fine-tuning in visual recognition tasks, particularly for challenging areas like instance and semantic segmentation. It targets researchers and practitioners in computer vision seeking efficient and high-performing alternatives to traditional fine-tuning.

How It Works

Mona (Multi-cognitive Visual Adapter) is a parameter-efficient tuning method: the pre-trained backbone stays frozen while lightweight adapter modules inserted into it are trained. Unlike plain bottleneck adapters, Mona equips the adapter with multiple vision-friendly filters so it can process visual signals at several scales, aiming to improve transfer efficiency and break the performance ceiling of earlier delta-tuning methods relative to full fine-tuning.
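
To make the mechanism concrete, below is a minimal PyTorch sketch of a multi-branch bottleneck adapter in the spirit of Mona. The module name, bottleneck width, and kernel sizes are illustrative assumptions, not the repository's exact implementation:

```python
import torch
import torch.nn as nn

class MultiCognitiveAdapter(nn.Module):
    """Sketch of a Mona-style adapter: normalize, down-project, run
    several depth-wise convolutions over the token grid at different
    receptive fields, aggregate, and up-project, with a residual
    connection. Hyper-parameters here are illustrative."""

    def __init__(self, dim, bottleneck=64, kernel_sizes=(3, 5, 7)):
        super().__init__()
        self.norm = nn.LayerNorm(dim)
        self.down = nn.Linear(dim, bottleneck)
        # One depth-wise conv branch per kernel size (the "multi-cognitive" filters).
        self.branches = nn.ModuleList([
            nn.Conv2d(bottleneck, bottleneck, k, padding=k // 2, groups=bottleneck)
            for k in kernel_sizes
        ])
        self.act = nn.GELU()
        self.up = nn.Linear(bottleneck, dim)

    def forward(self, x, hw):
        # x: (B, N, C) tokens from a frozen backbone block; hw: (H, W) with N == H * W.
        h, w = hw
        residual = x
        z = self.down(self.norm(x))                         # (B, N, bottleneck)
        z2d = z.transpose(1, 2).reshape(z.size(0), -1, h, w)
        z2d = sum(b(z2d) for b in self.branches) / len(self.branches)
        z = z2d.flatten(2).transpose(1, 2)                  # back to (B, N, bottleneck)
        return residual + self.up(self.act(z))

# Example: adapt a 7x7 grid of 49 tokens with channel width 768.
adapter = MultiCognitiveAdapter(dim=768)
out = adapter(torch.randn(2, 49, 768), hw=(7, 7))
```

In this style of tuning, only the adapter parameters (a few percent of the full model) receive gradients, which is what makes the method parameter-efficient.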

Quick Start & Requirements

  • Installation: Set up the environments and prepare the datasets following the Swin-Transformer-Object-Detection, Swin-Transformer-Semantic-Segmentation, and Swin-Transformer-Classification repositories.
  • Prerequisites: PyTorch, CUDA (implied for GPU training), and datasets in the expected formats (COCO, ADE20K, VOC, Oxford Flower, Oxford Pet). ImageNet-22K-supervised (IM22K) Swin-Base/Large pre-trained weights are required.
  • Training: Run the dist_train.sh scripts with configuration files modified so that data_root and load_from point to your dataset and pre-trained weights (see the config sketch after this list).
  • Links: CVPR Homepage, arXiv
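
As referenced in the training step above, configuration overrides in these Swin/MMDetection-style repositories are Python files. A hypothetical sketch, where the file names and paths are placeholders rather than the repo's actual configs:

```python
# Hypothetical config override in the MMDetection-style Python config format.
# The base config name and all paths below are placeholders.
_base_ = ['./mask_rcnn_swin_base_coco.py']  # assumed base config in the repo

data_root = 'data/coco/'                    # root of your prepared COCO dataset
load_from = 'pretrained/swin_base_22k.pth'  # IM22K-supervised Swin-Base weights

# Distributed training is then launched with the repo's script, e.g.:
#   bash tools/dist_train.sh <this_config>.py <num_gpus>
```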

Highlighted Details

  • Outperforms full fine-tuning on COCO object detection and instance segmentation (53.4 box AP and 46.0 mask AP with Swin-Base) and on ADE20K semantic segmentation (51.36 mIoU with Swin-Large).
  • Achieves faster convergence compared to other delta-tuning methods.
  • Demonstrates the potential for adapter-tuning to replace full fine-tuning across various visual tasks.

Maintenance & Community

The project is associated with the CVPR 2025 conference. Key contributors include Dongshuo Yin, Leiyi Hu, and Bin Li. It acknowledges contributions from Swin-Transformer, mmclassification, NOAH, LoRA, and AdaptFormer.

Licensing & Compatibility

The README does not explicitly state a license. Compatibility for commercial use or closed-source linking is not specified.

Limitations & Caveats

The project is research code accompanying a CVPR 2025 paper and may still be evolving. Specific limitations and unsupported platforms are not detailed in the README.

Health Check

  • Last commit: 1 month ago
  • Responsiveness: 1 week
  • Pull Requests (30d): 0
  • Issues (30d): 2
  • Star History: 105 stars in the last 90 days
