Ollama-OCR by imanoop7

OCR package for extracting text from images/PDFs using vision language models via Ollama

created 8 months ago
1,603 stars

Top 26.7% on sourcepulse

Project Summary

This package provides Optical Character Recognition (OCR) by running vision-language models through Ollama. It targets developers and users who need to extract text from images and PDFs, and it ships as both a Python library and a Streamlit web application.

How It Works

The package uses Ollama to serve vision-language models such as LLaVA, Granite3.2-vision, Moondream, and Minicpm-v. Users select a model and process either a single image or a batch. Options include custom prompts, a choice of output format (Markdown, Plain Text, JSON, Structured, Key-Value, Table), and language specification. Image preprocessing is also supported.
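
As a rough illustration of the flow described above, the sketch below sends one image to a locally running Ollama server through Ollama's REST endpoint /api/generate. It is not taken from the package's source. The helper names (build_request, run_ocr), the prompt wording, and the default host http://localhost:11434 are assumptions made for illustration, and the package itself may call Ollama differently.

```python
import base64
import json
import urllib.request

# Hypothetical prompt templates, one per output format.
# The wording is illustrative and is not taken from the package.
FORMAT_PROMPTS = {
    "markdown": "Extract all text from this image and format it as Markdown.",
    "text": "Extract all text from this image as plain text.",
    "json": "Extract all text from this image and return it as a JSON object.",
}


def build_request(image_bytes, model, output_format="markdown", language="English"):
    """Build the JSON body for Ollama's /api/generate endpoint."""
    prompt = f"{FORMAT_PROMPTS[output_format]} The text is in {language}."
    return {
        "model": model,
        "prompt": prompt,
        # Ollama expects images as base64-encoded strings.
        "images": [base64.b64encode(image_bytes).decode("ascii")],
        "stream": False,  # ask for one complete response instead of chunks
    }


def run_ocr(image_path, model, host="http://localhost:11434", **options):
    """Send one image to a running Ollama server and return the model's text."""
    with open(image_path, "rb") as f:
        body = build_request(f.read(), model, **options)
    req = urllib.request.Request(
        f"{host}/api/generate",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]
```

With Ollama running and a model pulled, a call such as run_ocr("invoice.png", "llama3.2-vision:11b", output_format="json") would return the raw model output as a string. The file name here is a placeholder.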

Quick Start & Requirements

  • Installation: pip install ollama-ocr
  • Prerequisites: Ollama must be installed and running, and each model you plan to use must be pulled first (e.g., ollama pull llama3.2-vision:11b).
  • Resources: Requires Ollama and downloaded vision models.
  • Docs: Ollama OCR on Colab, Example Notebook
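
The prerequisites above can be checked before running any OCR. The following is a minimal sketch that asks a local Ollama server which models it has by querying Ollama's /api/tags endpoint. The function names and the default host http://localhost:11434 are assumptions for illustration and are not part of the ollama-ocr package.

```python
import json
import urllib.request


def installed_models(tags_payload):
    """Return the model names listed in a /api/tags response body."""
    return {m["name"] for m in tags_payload.get("models", [])}


def model_is_pulled(model, host="http://localhost:11434"):
    """Return True if `model` is available on the Ollama server at `host`."""
    try:
        with urllib.request.urlopen(f"{host}/api/tags", timeout=5) as resp:
            payload = json.loads(resp.read())
    except OSError:
        # Connection refused or timed out: Ollama is probably not running.
        return False
    return model in installed_models(payload)
```

Ollama lists models with their tag, so the check expects a full name such as llama3.2-vision:11b rather than the bare model family.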

Highlighted Details

  • Supports PDF and image files.
  • Offers multiple output formats including structured data and tables.
  • Includes batch processing with parallel workers and progress tracking.
  • Provides a Streamlit web application with a drag-and-drop interface.
  • Allows custom prompts and language specification for enhanced accuracy.
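
Batch processing with parallel workers and progress tracking, as listed above, can be sketched with Python's standard thread pool. This is an illustrative pattern under stated assumptions, not the package's implementation: process_batch, ocr_fn, and on_progress are hypothetical names, and ocr_fn stands in for any function that takes a file path and returns extracted text.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def process_batch(paths, ocr_fn, max_workers=4, on_progress=None):
    """Run ocr_fn over paths in parallel; return (results, errors) dicts."""
    results, errors = {}, {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Map each future back to the path it is processing.
        futures = {pool.submit(ocr_fn, p): p for p in paths}
        for done, future in enumerate(as_completed(futures), start=1):
            path = futures[future]
            try:
                results[path] = future.result()
            except Exception as exc:  # keep going if one file fails
                errors[path] = str(exc)
            if on_progress:
                # Report how many files have finished out of the total.
                on_progress(done, len(futures))
    return results, errors
```

Threads suit this workload because each worker mostly waits on the model server. Collecting failures in a separate dictionary lets one unreadable file be reported without aborting the rest of the batch.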

Maintenance & Community

  • No specific contributors, sponsorships, or roadmap details are highlighted in the README.

Licensing & Compatibility

  • MIT License, which permits commercial use and integration with closed-source projects.

Limitations & Caveats

The README notes that the LLaVA model can sometimes generate incorrect output. Specific performance benchmarks or detailed error handling mechanisms are not provided.

Health Check

  • Last commit: 4 months ago
  • Responsiveness: 1 week
  • Pull Requests (30d): 0
  • Issues (30d): 1

Star History

125 stars in the last 90 days
