AnomalyGPT by CASIA-LMC-Lab

Industrial anomaly detection via vision-language model

Created 2 years ago

1,070 stars

Top 35.2% on SourcePulse

Project Summary

AnomalyGPT is a novel approach to industrial anomaly detection (IAD) that leverages Large Vision-Language Models (LVLMs) to identify defects in images without requiring manual threshold tuning. It is designed for researchers and practitioners in industrial quality control seeking a more intuitive and descriptive anomaly detection system.

How It Works

AnomalyGPT aligns industrial images with textual descriptions using a pre-trained image encoder and a Large Language Model (LLM). A lightweight decoder based on visual-textual feature matching handles localization, and a prompt learner supplies fine-grained semantic information to the LLM through prompt embeddings, which are used to fine-tune the LVLM. The model can therefore detect and localize anomalies, describe them in text, and generalize to unseen items from only a few normal samples.
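The feature-matching idea behind localization can be illustrated with a toy sketch. It is not the authors' implementation: the function names, the 2-D feature vectors, and the temperature value are all assumptions made for illustration. Each image patch feature is compared with one "normal" and one "abnormal" text feature by cosine similarity, and a softmax over the two scores gives a per-patch anomaly probability.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def anomaly_map(patch_feats, normal_text, abnormal_text, temperature=0.07):
    """Return an H x W grid of P(abnormal) for an H x W grid of patch features.

    temperature is an assumed hyperparameter, not a value from the project.
    """
    heat = []
    for row in patch_feats:
        out_row = []
        for feat in row:
            s_norm = cosine(feat, normal_text) / temperature
            s_abn = cosine(feat, abnormal_text) / temperature
            m = max(s_norm, s_abn)  # subtract the max for numerical stability
            e_norm = math.exp(s_norm - m)
            e_abn = math.exp(s_abn - m)
            out_row.append(e_abn / (e_norm + e_abn))
        heat.append(out_row)
    return heat

# Toy 2 x 2 grid: only the bottom-right patch resembles the "abnormal" text feature.
normal_text = [1.0, 0.0]
abnormal_text = [0.0, 1.0]
patches = [
    [[1.0, 0.0], [0.9, 0.1]],
    [[1.0, 0.1], [0.1, 1.0]],
]
heat = anomaly_map(patches, normal_text, abnormal_text)
```

In this toy grid the bottom-right entry of heat is close to 1 and the other entries are close to 0. A real system would work on encoder features at image resolution rather than hand-written 2-D vectors.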

Quick Start & Requirements

Install: pip install -r requirements.txt
Prerequisites:
- ImageBind checkpoint (imagebind_huge.pth)
- Vicuna checkpoint (base LLM weights)
- PandaGPT delta weights (e.g., openllmplayground/pandagpt_7b_max_len_1024)
- AnomalyGPT weights (e.g., AnomalyGPT/train_supervised)
- MVTec-AD dataset, VisA dataset, and PandaGPT pre-training data.
Demo: Run python web_demo.py after setup.
Links: Project Page, Online Demo, Paper
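Because setup depends on several separately downloaded artifacts, a small pre-flight check can save a failed demo launch. The sketch below is only an illustration: apart from the file name imagebind_huge.pth, every relative path in EXPECTED is a hypothetical placeholder, so adjust it to wherever the README tells you to place each item.

```python
from pathlib import Path

# Hypothetical layout: these relative paths are assumptions for illustration,
# not the directory structure required by the project.
EXPECTED = {
    "ImageBind checkpoint": "checkpoints/imagebind_huge.pth",
    "Vicuna checkpoint": "checkpoints/vicuna",
    "PandaGPT delta weights": "checkpoints/pandagpt",
    "AnomalyGPT weights": "checkpoints/anomalygpt",
    "MVTec-AD dataset": "data/mvtec_ad",
    "VisA dataset": "data/visa",
}

def missing_prerequisites(root, expected=EXPECTED):
    """Return the names of prerequisites whose path does not exist under root."""
    root = Path(root)
    return [name for name, rel in expected.items() if not (root / rel).exists()]
```

Calling missing_prerequisites(".") from the project root returns an empty list once everything is in place; otherwise it names what still needs to be downloaded before running python web_demo.py.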

Highlighted Details

First LVLM-based IAD method.
Detects anomalies without manual thresholding.
Provides anomaly location and descriptive information.
Capable of few-shot detection for unseen items.
Trained on MVTec-AD, VisA, MVTec-LOCO-AD, and CrackForest datasets.

Maintenance & Community

The project is associated with CASIA-IVA-Lab. Further community or maintenance details are not explicitly provided in the README.

Licensing & Compatibility

License: CC BY-NC-SA 4.0 (Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International).
Compatibility: The non-commercial clause restricts use in commercial products.

Limitations & Caveats

The setup requires downloading multiple large checkpoints and datasets, which can be time-consuming and resource-intensive. The CC BY-NC-SA 4.0 license restricts commercial use.

Health Check

Last Commit: 2 years ago
Responsiveness: Inactive

Star History

10 stars in the last 30 days