Image captcha solver (digits, text, rotation, object similarity)
This repository provides a collection of deep learning-based solutions for image CAPTCHA recognition, aimed at developers and researchers who want to automate or analyze CAPTCHA systems. It offers practical implementations for fixed-length text, sliding puzzle, point-and-click text, rotation, and similar-object CAPTCHAs, with the goal of bypassing or understanding these security measures.
How It Works
The project leverages several deep learning techniques tailored to specific CAPTCHA types. For fixed-length text CAPTCHAs, it uses CNNs trained on custom datasets with specific naming conventions. Sliding CAPTCHAs are addressed using either OpenCV's template matching for simplicity or YOLOv5 for more robust object detection of the slider gap. Point-and-click CAPTCHAs involve object detection (YOLOv5) to locate candidate characters, followed by matching strategies that may include OCR or Siamese networks for image-based character comparison. Rotation CAPTCHAs are tackled via regression (predicting rotation angle) or classification, often using ResNet50 for feature extraction. Similar object CAPTCHAs utilize YOLOv5 for object detection and classification.
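The difference between treating rotation as classification and as regression comes down to how the angle label is encoded. The following is a minimal sketch of that label encoding only, not the repository's code: the function names, the choice of 360 one-degree classes, and the sine/cosine regression target are all assumptions made for illustration.

```python
import math

# Assumption for illustration: one class per degree.
NUM_CLASSES = 360


def angle_to_class(angle_deg: float, num_classes: int = NUM_CLASSES) -> int:
    """Classification target: map an angle in degrees to the nearest bin index."""
    bin_width = 360.0 / num_classes
    # Wrap into [0, 360), round to the nearest bin, and wrap the last bin to 0.
    return int(round((angle_deg % 360.0) / bin_width)) % num_classes


def class_to_angle(index: int, num_classes: int = NUM_CLASSES) -> float:
    """Decode a predicted class index back to an angle in degrees."""
    return index * (360.0 / num_classes)


def angle_to_sincos(angle_deg: float) -> tuple:
    """Regression target: (sin, cos) avoids the jump between 359 and 0 degrees."""
    rad = math.radians(angle_deg)
    return math.sin(rad), math.cos(rad)


def sincos_to_angle(sin_value: float, cos_value: float) -> float:
    """Decode a predicted (sin, cos) pair back to an angle in degrees."""
    return math.degrees(math.atan2(sin_value, cos_value)) % 360.0


def circular_error(pred_deg: float, true_deg: float) -> float:
    """Smallest absolute difference between two angles, in degrees."""
    diff = abs(pred_deg - true_deg) % 360.0
    return min(diff, 360.0 - diff)
```

With a coarser classifier, for example `angle_to_class(90, 8)`, the angle falls into one of eight 45-degree bins, which trades angular precision for an easier learning problem. Whichever encoding is used, an evaluation metric like `circular_error` is a reasonable way to compare predictions, because it treats 350 and 10 degrees as 20 degrees apart rather than 340.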
Quick Start & Requirements
Install dependencies with pip install -r requirements.txt (comprehensive list, install as needed).
Highlighted Details
labelme_json_to_yolov5_format.py
Maintenance & Community
The repository is maintained by anexplore. No specific community channels (Discord, Slack) or roadmap links are provided in the README.
Licensing & Compatibility
The README does not explicitly state a license. The presence of requirements.txt suggests compatibility with standard Python environments. Commercial use would require clarification on licensing.
Limitations & Caveats
Performance is heavily dependent on the size and quality of the training dataset for each CAPTCHA type. Some methods, like OCR for deformed text, may not be effective. The project notes that large models, while promising, currently have slower inference speeds and higher resource requirements, and may require fine-tuning for optimal performance.