sparrow by katanaml

Data processing & instruction calling tool using ML, LLM, and Vision LLM

created 3 years ago
4,930 stars

Top 10.3% on sourcepulse

Project Summary

Sparrow is an open-source framework for universal document processing, offering data extraction, instruction calling, and workflow orchestration powered by ML, LLMs, and Vision LLMs. It targets developers and power users needing to automate the extraction and processing of structured data from diverse document types like invoices, receipts, and tables, providing a flexible, API-first solution.

How It Works

Sparrow employs a pluggable architecture with distinct pipelines: Sparrow Parse for Vision LLM-based document extraction, Sparrow Instructor for text-based instruction processing, and Sparrow Agents for orchestrating complex multi-step workflows. It supports multiple backends including MLX (Apple Silicon), Ollama, vLLM, PyTorch, and Hugging Face Cloud GPUs, enabling flexible deployment and hardware optimization. The system extracts data into structured JSON format, with optional schema validation and bounding box annotations.
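Since the pipelines above emit structured JSON with optional schema validation, a minimal sketch of what downstream validation of an extraction result might look like. The payload shape, field names, and schema format here are hypothetical examples for illustration, not Sparrow's documented output:

```python
# Sketch of validating Sparrow-style extracted JSON against a simple schema.
# The payload and field names below are hypothetical, not Sparrow's actual
# output format.
extracted = {
    "invoice_number": "INV-001",
    "total": 149.90,
    "line_items": [
        {"description": "Widget", "quantity": 2, "price": 74.95},
    ],
}

# A minimal schema: required field name -> expected Python type.
schema = {
    "invoice_number": str,
    "total": float,
    "line_items": list,
}

def validate(payload: dict, schema: dict) -> list:
    """Return a list of validation errors (empty means the payload passes)."""
    errors = []
    for field, expected_type in schema.items():
        if field not in payload:
            errors.append(f"missing field: {field}")
        elif not isinstance(payload[field], expected_type):
            errors.append(
                f"wrong type for {field}: {type(payload[field]).__name__}"
            )
    return errors

print(validate(extracted, schema))  # an empty list means the payload is valid
```

In practice a schema library (e.g. Pydantic or JSON Schema) would replace the hand-rolled check, but the shape of the step is the same: extract, then validate before handing data downstream.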

Quick Start & Requirements

  • Install: Clone the repository, set up Python 3.10.4+ via pyenv, create virtual environments, and install pipeline-specific requirements (e.g., pip install -r requirements_sparrow_parse.txt).
  • Prerequisites: macOS, Linux, or Windows; poppler for PDF processing (e.g., brew install poppler on macOS); a GPU is recommended for performance.
  • Demo: Try Sparrow Online at sparrow.katanaml.io.
  • Docs: Detailed setup guide and API documentation available.
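Because Sparrow is API-first, client code typically posts a document-processing request to a REST endpoint and receives JSON back. A standard-library-only sketch of constructing such a request; the endpoint path, host, and payload fields are assumptions for illustration, not Sparrow's documented API (check the project's Swagger docs for the real routes):

```python
import json
import urllib.request

# Hypothetical endpoint -- the real path is in the Swagger documentation.
API_URL = "http://localhost:8000/api/v1/extract"

payload = {
    "query": "invoice_number, total",  # fields to extract (illustrative)
    "file_path": "invoice.pdf",
}

request = urllib.request.Request(
    API_URL,
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
    method="POST",
)

# The request is only constructed here, not sent; sending it would require a
# running Sparrow server:
# with urllib.request.urlopen(request) as response:
#     result = json.load(response)
print(request.full_url, request.get_method())
```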

Highlighted Details

  • Universal document processing for invoices, receipts, forms, bank statements, and tables.
  • Pluggable architecture supporting Sparrow Parse (Vision LLM), Instructor (Text LLM), and Agents (Workflows).
  • Multiple backends: MLX (Apple Silicon), Ollama, vLLM, PyTorch, Hugging Face Cloud GPU.
  • API-first design with RESTful APIs and interactive Swagger documentation.
  • Sparrow Agent for complex workflow orchestration with visual monitoring via Prefect.

Maintenance & Community

The project is led by Andrej Baranovskij and Katana ML. Community support is available via GitHub Issues. Commercial support and licensing are offered via abaranovskis@redsamuraiconsulting.com.

Licensing & Compatibility

Licensed under GPL 3.0, free for open-source projects and organizations under $5M revenue. Dual licensing is available for proprietary use, enterprise features, and dedicated support.

Limitations & Caveats

Performance on CPU-only configurations is significantly slower. While MLX is optimized for Apple Silicon, other backends may require specific GPU/CUDA setups. The README mentions "Enterprise Ready" features like rate limiting and usage analytics, but details on their implementation are not immediately apparent.

Health Check

  • Last commit: 4 weeks ago
  • Responsiveness: 1 day
  • Pull Requests (30d): 0
  • Issues (30d): 2
  • Star History: 438 stars in the last 90 days

Explore Similar Projects

Starred by Jeff Hammerbacher (Cofounder of Cloudera), Stas Bekman (Author of Machine Learning Engineering Open Book; Research Engineer at Snowflake), and 2 more.

gpustack by gpustack

GPU cluster manager for AI model deployment
  • 3k stars, top 1.6%
  • created 1 year ago, updated 2 days ago
  • Starred by Chip Huyen (Author of AI Engineering, Designing Machine Learning Systems), Tim J. Baek (Founder of Open WebUI), and 2 more.

llmware by llmware-ai

Framework for enterprise RAG pipelines using small, specialized models
  • 14k stars, top 0.2%
  • created 1 year ago, updated 1 week ago
  • Starred by Andrej Karpathy (Founder of Eureka Labs; formerly at Tesla, OpenAI; author of CS 231n), Nat Friedman (Former CEO of GitHub), and 32 more.

llama.cpp by ggml-org

C/C++ library for local LLM inference
  • 84k stars, top 0.4%
  • created 2 years ago, updated 14 hours ago