OCR tool using QwenLM
Top 98.6% on sourcepulse
This project provides an OCR interface leveraging QwenLM's capabilities, designed for users needing to extract text from images, including specialized support for mathematical formulas and CAPTCHAs. It offers a user-friendly web UI with drag-and-drop and clipboard support, alongside API access for programmatic integration.
How It Works
The system acts as a proxy, interacting with the QwenLM API to perform OCR. It handles image uploads, text extraction, and crucially, formats the output according to a detailed prompt that prioritizes LaTeX for mathematical content and specific rules for CAPTCHAs. This approach allows users to benefit from QwenLM's advanced models without direct API key management, while the prompt engineering ensures structured and usable output.
Quick Start & Requirements
docker run -p 3000:3000 sexgirls/qwen-ocr-app:latest
worker.js
to Cloudflare.Highlighted Details
Maintenance & Community
The project is actively maintained by Cunninger. Further community engagement details (like Discord/Slack) are not explicitly mentioned in the README.
Licensing & Compatibility
Limitations & Caveats
The project relies on external QwenLM API access, which may be subject to rate limits or changes. The provided test cookies have upload limitations, necessitating the use of personal cookies for stable operation. The API documentation link provided points to an Apifox page that notes potential debugging issues.
2 months ago
1 day