Industrial anomaly detection via vision-language model
AnomalyGPT is a novel approach to industrial anomaly detection (IAD) that leverages Large Vision-Language Models (LVLMs) to identify defects in images without requiring manual threshold tuning. It is designed for researchers and practitioners in industrial quality control seeking a more intuitive and descriptive anomaly detection system.
How It Works
AnomalyGPT aligns industrial images with textual descriptions using a pre-trained image encoder and a Large Language Model (LLM). It employs a lightweight, visual-textual feature-matching decoder for localization and a prompt learner to inject fine-grained semantic information into the LLM, which is then fine-tuned. This method allows for anomaly detection and localization, along with descriptive insights, and can generalize to unseen items with minimal normal samples.
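The feature-matching idea behind the localization decoder can be illustrated with a small NumPy sketch. This is not the project's implementation: the function names, the `temperature` value, and the toy 4-dimensional features are all assumptions made for illustration. It assumes each image patch has a feature vector and that two text feature vectors are available, one describing a normal state and one describing an abnormal state. Each patch is scored by a softmax over its cosine similarity to the two text vectors.

```python
import numpy as np


def l2_normalize(x, axis=-1, eps=1e-8):
    """Scale vectors to unit length so dot products become cosine similarities."""
    return x / (np.linalg.norm(x, axis=axis, keepdims=True) + eps)


def anomaly_map(patch_feats, text_feats, grid_hw, temperature=0.07):
    """Hypothetical helper: per-patch probability of the 'abnormal' text.

    patch_feats: (H*W, D) patch features from an image encoder.
    text_feats:  (2, D) text features, rows ordered [normal, abnormal].
    grid_hw:     (H, W) layout of the patches.
    temperature: softmax sharpness (0.07 is an assumed value).
    """
    h, w = grid_hw
    p = l2_normalize(patch_feats)
    t = l2_normalize(text_feats)
    logits = p @ t.T / temperature                 # (H*W, 2) scaled cosine similarity
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    probs = np.exp(logits)
    probs = probs / probs.sum(axis=1, keepdims=True)
    return probs[:, 1].reshape(h, w)               # column 1 = abnormal


# Toy data: 16 patches that resemble the "normal" direction, one planted defect.
rng = np.random.default_rng(0)
normal_dir = np.array([1.0, 0.0, 0.0, 0.0])
abnormal_dir = np.array([0.0, 1.0, 0.0, 0.0])
text_feats = np.stack([normal_dir, abnormal_dir])

patches = np.tile(normal_dir, (16, 1)) + 0.05 * rng.standard_normal((16, 4))
patches[5] = abnormal_dir                          # patch 5 sits at row 1, col 1

amap = anomaly_map(patches, text_feats, (4, 4))
row, col = np.unravel_index(amap.argmax(), amap.shape)
print(f"most anomalous patch: row={row}, col={col}")
```

In a real pipeline the low-resolution map would typically be upsampled to the image size, and the text features would come from learned prompts rather than hand-picked vectors.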
Quick Start & Requirements
Install dependencies: pip install -r requirements.txt
Required checkpoints:
ImageBind: imagebind_huge.pth
PandaGPT: openllmplayground/pandagpt_7b_max_len_1024
AnomalyGPT: AnomalyGPT/train_supervised
Demo: run python web_demo.py after setup.
Highlighted Details
Maintenance & Community
The project is associated with CASIA-IVA-Lab. Further community or maintenance details are not explicitly provided in the README.
Licensing & Compatibility
Limitations & Caveats
The setup requires downloading multiple large checkpoints and datasets, which can be time-consuming and resource-intensive. The CC BY-NC-SA 4.0 license restricts commercial use.