ddddocr by 86maid

Rust OCR toolkit for CAPTCHA recognition and more

Created 2 years ago
273 stars

Top 94.7% on SourcePulse

Summary

86maid/ddddocr is a Rust implementation of Optical Character Recognition (OCR) and CAPTCHA solving that runs cross-platform without OpenCV. It ships as both an OCR library and a deployable API server, and offers optional GPU acceleration via CUDA. It is aimed at developers who want image recognition that is easy to integrate.

How It Works

The project is written in Rust for performance and cross-platform compatibility, and it avoids external dependencies such as OpenCV. It implements core OCR functionality (content recognition with several models, plus color filtering) alongside separate modules for target detection and for slide puzzle matching that does not rely on a neural network. It can be used either as a library integrated directly into an application or as a standalone ocr_api_server for simpler deployment. An optional cuda feature enables GPU-accelerated inference.
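
To make the idea of slide matching without a neural network concrete, here is a minimal sketch of brute-force template matching using the sum of absolute differences (SAD). It is an illustration only and is not taken from the project: the Gray type and best_match function are hypothetical names, the technique is a generic one swapped in for demonstration, and no ddddocr API is used.

```rust
/// Grayscale image stored row-major. Hypothetical helper type for this sketch.
struct Gray {
    width: usize,
    height: usize,
    pixels: Vec<u8>,
}

impl Gray {
    fn at(&self, x: usize, y: usize) -> u8 {
        self.pixels[y * self.width + x]
    }
}

/// Returns the top-left (x, y) offset in `background` where `piece` has the
/// lowest sum of absolute differences, or None if the piece does not fit.
/// Ties are resolved in favor of the first offset found in row-major order.
fn best_match(background: &Gray, piece: &Gray) -> Option<(usize, usize)> {
    if piece.width == 0
        || piece.height == 0
        || piece.width > background.width
        || piece.height > background.height
    {
        return None;
    }
    // (score, x, y) of the best candidate seen so far.
    let mut best: Option<(u64, usize, usize)> = None;
    for oy in 0..=(background.height - piece.height) {
        for ox in 0..=(background.width - piece.width) {
            let mut sad: u64 = 0;
            for py in 0..piece.height {
                for px in 0..piece.width {
                    let b = background.at(ox + px, oy + py) as i32;
                    let p = piece.at(px, py) as i32;
                    sad += (b - p).unsigned_abs() as u64;
                }
            }
            if best.map_or(true, |(score, _, _)| sad < score) {
                best = Some((sad, ox, oy));
            }
        }
    }
    best.map(|(_, x, y)| (x, y))
}

fn main() {
    // 6x4 dark background with a bright 2x2 patch whose top-left corner is (3, 1).
    let mut pixels = vec![10u8; 6 * 4];
    for y in 1..3 {
        for x in 3..5 {
            pixels[y * 6 + x] = 200;
        }
    }
    let background = Gray { width: 6, height: 4, pixels };
    let piece = Gray { width: 2, height: 2, pixels: vec![200; 4] };
    println!("{:?}", best_match(&background, &piece));
}
```

The exhaustive search costs time proportional to the background area times the piece area, which is acceptable for small CAPTCHA images. A real slider CAPTCHA would usually also need edge extraction or transparency handling before matching.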

Quick Start & Requirements

Pre-compiled binaries are available for quick deployment. Alternatively, the library can be added via Cargo (ddddocr = { git = "https://github.com/86maid/ddddocr.git", branch = "master" }). For GPU acceleration, enable the cuda feature (features = ["cuda"]), which requires installing CUDA and cuDNN manually (e.g., CUDA 12 with cuDNN 9.x). Building from source may require specific C++ runtimes (Windows) or glibc versions (Linux). Detailed troubleshooting is available at ort.pyke.io.
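
The two Cargo fragments quoted in the paragraph above could be combined in a Cargo.toml roughly as follows. This is a sketch assembled from those fragments only; keep one of the two entries, not both.

```toml
[dependencies]
# CPU-only build, tracking the master branch of the repository.
ddddocr = { git = "https://github.com/86maid/ddddocr.git", branch = "master" }

# Alternative: the same dependency with the optional cuda feature enabled.
# This assumes CUDA and cuDNN are already installed on the machine.
# ddddocr = { git = "https://github.com/86maid/ddddocr.git", branch = "master", features = ["cuda"] }
```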

Highlighted Details

  • Cross-Platform: Supports Windows (64/32-bit), Linux (64/ARM64), and macOS (x64, M1/M2/M3).
  • No OpenCV: Eliminates a common, heavy dependency.
  • GPU Acceleration: Optional CUDA support for enhanced performance.
  • API Server: Includes a simple, deployable ocr_api_server with RESTful endpoints.
  • MCP Support: Integrates with AI agents via the Model Context Protocol (MCP).
  • Advanced Features: Offers OCR probability output, customizable character ranges, color filtering, and multiple slide-matching algorithms.
  • Custom Models: Supports importing custom ONNX models trained with dddd_trainer.

Maintenance & Community

No information is given about maintainers, community channels (e.g., Discord, Slack), or a project roadmap.

Licensing & Compatibility

The project's license is not specified. This requires clarification for commercial use or integration into proprietary software.

Limitations & Caveats

32-bit Linux is unsupported. The CUDA feature requires a specific environment setup and does not support static linking. Building from source may run into OS-specific dependency issues (e.g., glibc versions on Linux, VC++ runtimes on Windows). The absence of a specified license is a significant adoption blocker.

Health Check

  • Last Commit: 2 weeks ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 1
  • Issues (30d): 1
  • Star History: 8 stars in the last 30 days
