AnomalyGPT by CASIA-IVA-Lab

Industrial anomaly detection via vision-language model

created 1 year ago
984 stars

Top 38.4% on sourcepulse

Project Summary

AnomalyGPT is a novel approach to industrial anomaly detection (IAD) that leverages Large Vision-Language Models (LVLMs) to identify defects in images without requiring manual threshold tuning. It is designed for researchers and practitioners in industrial quality control seeking a more intuitive and descriptive anomaly detection system.

How It Works

AnomalyGPT aligns industrial images with textual descriptions using a pre-trained image encoder and a Large Language Model (LLM). A lightweight decoder based on visual-textual feature matching localizes anomalies, and a prompt learner injects fine-grained semantic information into the LLM, which is then fine-tuned. The result is anomaly detection and localization with descriptive output, and the model can generalize to unseen items from only a few normal samples.
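The feature-matching idea described above can be illustrated with a minimal sketch. It is not the repository's decoder: it assumes each image patch and each of two text prompts ("normal" and "abnormal") has already been embedded into a shared vector space, and the function names, the toy embeddings, and the `temperature` value are all hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def anomaly_map(patch_features, normal_text, abnormal_text, temperature=0.07):
    """Return one anomaly probability per patch.

    Each patch is compared with a 'normal' and an 'abnormal' text embedding;
    a two-way softmax over the scaled similarities gives the abnormal share.
    The temperature is an illustrative assumption, not a value from the project.
    """
    scores = []
    for patch in patch_features:
        s_normal = cosine(patch, normal_text) / temperature
        s_abnormal = cosine(patch, abnormal_text) / temperature
        # Subtract the maximum for a numerically stable softmax.
        m = max(s_normal, s_abnormal)
        e_normal = math.exp(s_normal - m)
        e_abnormal = math.exp(s_abnormal - m)
        scores.append(e_abnormal / (e_normal + e_abnormal))
    return scores

# Toy 2-D embeddings standing in for real encoder outputs.
normal_text = [1.0, 0.0]
abnormal_text = [0.0, 1.0]
patches = [[0.9, 0.1], [0.1, 0.9], [1.0, 1.0]]
scores = anomaly_map(patches, normal_text, abnormal_text)
```

Reshaping the per-patch scores back to the patch grid gives a coarse localization map. Because the score is a probability between the two prompts, a patch that is equally close to both lands at 0.5.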

Quick Start & Requirements

  • Install: pip install -r requirements.txt
  • Prerequisites:
    • ImageBind checkpoint (imagebind_huge.pth)
    • Vicuna checkpoint (base LLM weights)
    • PandaGPT delta weights (e.g., openllmplayground/pandagpt_7b_max_len_1024)
    • AnomalyGPT weights (e.g., AnomalyGPT/train_supervised)
    • MVTec-AD dataset, VisA dataset, and PandaGPT pre-training data.
  • Demo: Run python web_demo.py after setup.
  • Links: Project Page, Online Demo, Paper
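Since the demo depends on several separately downloaded assets, a quick check before launching can save a failed start. The sketch below assumes a purely hypothetical directory layout; the relative paths are placeholders, so replace them with the locations given in the repository's README.

```python
from pathlib import Path

# Hypothetical layout: these relative paths are placeholders, not the
# locations the repository actually expects.
REQUIRED_ASSETS = {
    "ImageBind checkpoint": "checkpoints/imagebind_huge.pth",
    "Vicuna checkpoint": "checkpoints/vicuna",
    "PandaGPT delta weights": "checkpoints/pandagpt_delta",
    "AnomalyGPT weights": "checkpoints/anomalygpt",
    "MVTec-AD dataset": "data/mvtec_ad",
    "VisA dataset": "data/visa",
}

def missing_assets(root, assets=REQUIRED_ASSETS):
    """Return the names of assets whose path does not exist under root."""
    root = Path(root)
    return [name for name, rel in assets.items() if not (root / rel).exists()]

if __name__ == "__main__":
    missing = missing_assets(".")
    if missing:
        print("Missing before running the demo:")
        for name in missing:
            print("  -", name)
    else:
        print("All listed assets found.")
```

The check only confirms that a file or directory exists at each path; it does not verify checksums or that the weights are compatible with each other.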

Highlighted Details

  • First LVLM-based IAD method.
  • Detects anomalies without manual thresholding.
  • Provides anomaly location and descriptive information.
  • Capable of few-shot detection for unseen items.
  • Trained on MVTec-AD, VisA, MVTec-LOCO-AD, and CrackForest datasets.
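One common way to realize few-shot detection for unseen items is to compare patch features of a query image against a memory bank built from a handful of normal samples. The sketch below illustrates that general technique under stated assumptions; this summary does not say exactly how AnomalyGPT implements it, and the function names and toy features are hypothetical.

```python
import math

def cosine_distance(u, v):
    """1 - cosine similarity; 0 means identical direction."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return 1.0 - (dot / norm if norm else 0.0)

def patch_anomaly_scores(query_patches, memory_bank):
    """Score each query patch by its distance to the nearest normal patch.

    memory_bank holds patch features taken from the few normal reference
    images; a patch unlike anything in the bank gets a high score.
    """
    return [
        min(cosine_distance(q, m) for m in memory_bank)
        for q in query_patches
    ]

# Toy features: two normal reference patches and two query patches.
memory_bank = [[1.0, 0.0], [0.9, 0.1]]
query_patches = [[1.0, 0.0], [0.0, 1.0]]
few_shot_scores = patch_anomaly_scores(query_patches, memory_bank)
```

No training is involved in this step, which is why a few normal images are enough: adding a new item type only means building a new memory bank.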

Maintenance & Community

The project is associated with CASIA-IVA-Lab. Further community or maintenance details are not explicitly provided in the README.

Licensing & Compatibility

  • License: CC BY-NC-SA 4.0 (Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International).
  • Compatibility: The non-commercial clause restricts use in commercial products.

Limitations & Caveats

The setup requires downloading several large checkpoints and datasets, which can be time-consuming and resource-intensive. In addition, the CC BY-NC-SA 4.0 license rules out commercial use.

Health Check

  • Last commit: 1 year ago
  • Responsiveness: 1 day
  • Pull requests (30d): 0
  • Issues (30d): 2
  • Star history: 67 stars in the last 90 days
