keras_cv_attention_models  by leondgarse

Keras implementations for CV attention models

created 4 years ago
615 stars

Top 54.3% on sourcepulse

Project Summary

This repository provides a comprehensive collection of Keras implementations for various computer vision attention models, including architectures like ConvNeXt, Swin Transformers, and EfficientNet variants. It targets researchers and practitioners needing a unified library for experimenting with state-of-the-art vision models for tasks such as image classification, object detection, and segmentation. The library offers pre-trained weights and facilitates easy model building, training, and conversion.

How It Works

The library leverages Keras (with support for TensorFlow and PyTorch backends) to implement a wide array of modern vision architectures. It focuses on providing modular components and pre-built models, allowing users to quickly integrate and fine-tune these architectures. The project emphasizes ease of use with features like automatic weight loading, model surgery for modifications, and integrated training scripts for common benchmarks like ImageNet and COCO.

Quick Start & Requirements

  • Install via pip: pip install -U kecam or pip install -U keras-cv-attention-models
  • Backend: Requires TensorFlow or PyTorch to be installed.
  • For PyTorch backend: export KECAM_BACKEND='torch'
  • For TensorFlow backend with TF >= 2.16.0: install a matching tf-keras with pip install tf-keras~=$(pip show tensorflow | awk -F ': ' '/Version/{print $2}'), then either export TF_USE_LEGACY_KERAS=1 or import kecam before TensorFlow.
  • Official Docs: https://github.com/leondgarse/keras_cv_attention_models
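
The backend settings above are environment variables, so they presumably need to be in place before the library is first imported. The following is a minimal Python sketch of that ordering, not code from the project: the helper names configure_backend and tf_keras_requirement are hypothetical, and the sketch assumes the variables are read at import time, as the note about importing kecam before TensorFlow suggests.

```python
import os
from importlib import metadata


def configure_backend(backend: str) -> dict:
    """Set the environment variables named in the Quick Start list.

    Hypothetical helper, not part of kecam. Call it before importing
    kecam, TensorFlow, or PyTorch. It does not clear variables that
    are already set in the current process.
    """
    if backend == "torch":
        # Same effect as: export KECAM_BACKEND='torch'
        os.environ["KECAM_BACKEND"] = "torch"
    elif backend == "tensorflow":
        # Same effect as: export TF_USE_LEGACY_KERAS=1 (for TF >= 2.16.0)
        os.environ["TF_USE_LEGACY_KERAS"] = "1"
    else:
        raise ValueError(f"unsupported backend: {backend!r}")
    keys = ("KECAM_BACKEND", "TF_USE_LEGACY_KERAS")
    return {k: os.environ[k] for k in keys if k in os.environ}


def tf_keras_requirement(tf_version=None):
    """Build the pip requirement that pins tf-keras to the TensorFlow version.

    Mirrors the shell form tf-keras~=$(pip show tensorflow | awk ...).
    Returns None when TensorFlow is not installed and no version is given.
    """
    if tf_version is None:
        try:
            tf_version = metadata.version("tensorflow")
        except metadata.PackageNotFoundError:
            return None
    return f"tf-keras~={tf_version}"


if __name__ == "__main__":
    print(configure_backend("torch"))
    print(tf_keras_requirement("2.16.1"))
```

Setting the variables from Python rather than the shell keeps the configuration next to the import it affects, which is useful in notebooks where the shell environment is not under the user's control.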

Highlighted Details

  • Supports over 100 models, covering recognition, detection, and language models.
  • Offers flexible backend support (TensorFlow, PyTorch, Keras Core).
  • Includes utilities for model surgery, FLOPs calculation, ONNX export, and TFLite conversion.
  • Provides detailed benchmarks and T4 inference performance metrics for many models.

Maintenance & Community

  • The project is actively maintained by leondgarse.
  • Community support is available via GitHub issues.

Licensing & Compatibility

  • Code is licensed under MIT.
  • Pretrained weights may carry restrictions from the licenses of the datasets they were trained on (e.g., ImageNet, which is limited to non-commercial research). Users should verify licenses before commercial use.

Limitations & Caveats

  • Not compatible with Keras 3.x.
  • COCO training and evaluation scripts are noted as "still under testing."
  • Some models (e.g., VOLO, HaloNet, NFNets) have limited PyTorch backend support.
  • TFLite conversion has limitations with certain operations and models (e.g., VOLO, HaloNet).

Health Check

  • Last commit: 3 months ago
  • Responsiveness: 1 day
  • Pull requests (30d): 0
  • Issues (30d): 1

Star History

  • 3 stars in the last 90 days
