sparrow by katanaml

Data processing & instruction calling tool using ML, LLM, and Vision LLM

Created 3 years ago
4,988 stars

Top 10.0% on SourcePulse

Project Summary

Sparrow is an open-source framework for universal document processing, offering data extraction, instruction calling, and workflow orchestration powered by ML, LLMs, and Vision LLMs. It targets developers and power users needing to automate the extraction and processing of structured data from diverse document types like invoices, receipts, and tables, providing a flexible, API-first solution.

How It Works

Sparrow employs a pluggable architecture with distinct pipelines: Sparrow Parse for Vision LLM-based document extraction, Sparrow Instructor for text-based instruction processing, and Sparrow Agents for orchestrating complex multi-step workflows. It supports multiple backends including MLX (Apple Silicon), Ollama, vLLM, PyTorch, and Hugging Face Cloud GPUs, enabling flexible deployment and hardware optimization. The system extracts data into structured JSON format, with optional schema validation and bounding box annotations.
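The flow described above (pick a pipeline, get JSON back, optionally validate it against a schema) can be illustrated with a minimal sketch. This is not Sparrow's actual code or API: the registry `PIPELINES`, the functions `register`, `validate`, and `run`, the stand-in `fake_parse` pipeline, and the sample invoice fields are all hypothetical names chosen for illustration, and the Vision LLM call is replaced by a hard-coded JSON string.

```python
import json
from typing import Any, Callable, Dict, List, Optional

# Hypothetical registry: pipeline name -> callable returning extracted JSON text.
PIPELINES: Dict[str, Callable[[str], str]] = {}


def register(name: str):
    """Decorator that plugs a pipeline into the registry under `name`."""
    def wrap(fn: Callable[[str], str]) -> Callable[[str], str]:
        PIPELINES[name] = fn
        return fn
    return wrap


@register("parse")
def fake_parse(document_path: str) -> str:
    # Stand-in for a Vision LLM extraction call; returns a fixed JSON string.
    return json.dumps({
        "invoice_no": "INV-001",
        "total": 118.5,
        # Assumed bounding-box layout: [x_min, y_min, x_max, y_max] per field.
        "bbox": {"total": [410, 720, 480, 742]},
    })


def validate(data: Dict[str, Any], schema: Dict[str, type]) -> List[str]:
    """Return a list of problems; an empty list means the data fits the schema."""
    problems = []
    for field, expected in schema.items():
        if field not in data:
            problems.append(f"missing field: {field}")
        elif not isinstance(data[field], expected):
            actual = type(data[field]).__name__
            problems.append(f"{field}: expected {expected.__name__}, got {actual}")
    return problems


def run(pipeline: str, document_path: str,
        schema: Optional[Dict[str, type]] = None) -> Dict[str, Any]:
    """Dispatch to a registered pipeline, parse its JSON, optionally validate."""
    data = json.loads(PIPELINES[pipeline](document_path))
    if schema is not None:
        problems = validate(data, schema)
        if problems:
            raise ValueError("; ".join(problems))
    return data


result = run("parse", "invoice.pdf", schema={"invoice_no": str, "total": float})
print(result["invoice_no"], result["total"])
```

Keeping validation optional mirrors the "optional schema validation" wording above: callers that only need raw JSON skip the schema argument, while stricter callers get an error listing every mismatched field at once.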

Quick Start & Requirements

  • Install: Clone the repository, set up Python 3.10.4+ via pyenv, create virtual environments, and install pipeline-specific requirements (e.g., pip install -r requirements_sparrow_parse.txt).
  • Prerequisites: macOS, Linux, or Windows. poppler is required for PDF processing (e.g., brew install poppler on macOS). A GPU is recommended for performance.
  • Demo: Try Sparrow Online at sparrow.katanaml.io.
  • Docs: Detailed setup guide and API documentation available.
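Before installing the pipeline requirements, it can help to confirm the two prerequisites named above. The following is a small sketch, not part of Sparrow: the function `check_prerequisites` is a hypothetical helper, and it assumes that finding poppler's `pdftoppm` utility on the PATH is a reasonable proxy for a working poppler install.

```python
import shutil
import sys

# Minimum interpreter version stated in the install step above.
MIN_PYTHON = (3, 10, 4)


def check_prerequisites(version_info=sys.version_info, which=shutil.which):
    """Return a list of human-readable problems; empty means all checks passed.

    `version_info` and `which` are injectable so the check can be tested
    without changing the real interpreter or PATH.
    """
    problems = []
    if tuple(version_info[:3]) < MIN_PYTHON:
        found = ".".join(str(part) for part in version_info[:3])
        needed = ".".join(str(part) for part in MIN_PYTHON)
        problems.append(f"Python {needed}+ required, found {found}")
    # Assumption: pdftoppm ships with poppler, so its absence suggests
    # poppler is not installed or not on the PATH.
    if which("pdftoppm") is None:
        problems.append("poppler not found on PATH (pdftoppm is missing)")
    return problems


if __name__ == "__main__":
    issues = check_prerequisites()
    for issue in issues:
        print("MISSING:", issue)
    if not issues:
        print("All prerequisites found.")
```

GPU availability is left out on purpose, since the right check depends on which backend (MLX, Ollama, vLLM, PyTorch) is in use.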

Highlighted Details

  • Universal document processing for invoices, receipts, forms, bank statements, and tables.
  • Pluggable architecture supporting Sparrow Parse (Vision LLM), Instructor (Text LLM), and Agents (Workflows).
  • Multiple backends: MLX (Apple Silicon), Ollama, vLLM, PyTorch, Hugging Face Cloud GPU.
  • API-first design with RESTful APIs and interactive Swagger documentation.
  • Sparrow Agent for complex workflow orchestration with visual monitoring via Prefect.

Maintenance & Community

The project is led by Andrej Baranovskij and Katana ML. Community support is available via GitHub Issues. Commercial support and licensing are offered via abaranovskis@redsamuraiconsulting.com.

Licensing & Compatibility

Licensed under GPL 3.0 and free for open-source projects and for organizations with under $5M in revenue. Dual licensing is available for proprietary use, enterprise features, and dedicated support.

Limitations & Caveats

Performance on CPU-only configurations is significantly slower than with a GPU. MLX is optimized for Apple Silicon, while other backends may require specific GPU/CUDA setups. The README mentions "Enterprise Ready" features such as rate limiting and usage analytics, but details on how they are implemented are not immediately apparent.

Health Check

  • Last commit: 1 day ago
  • Responsiveness: 1 day
  • Pull requests (30d): 0
  • Issues (30d): 1
  • Star history: 36 stars in the last 30 days
