docling-mcp  by docling-project

Agentic document processing service

Created 7 months ago
268 stars


Project Summary

Docling MCP is an agentic document processing service that lets AI clients use document conversion, processing, and generation tools through the Model Context Protocol (MCP). It targets developers building RAG applications and users of AI desktop clients, and it caches converted documents locally to speed up repeated access.

How It Works

The service uses the Docling library to convert PDF documents into a structured JSON format (DoclingDocument). Each capability is exposed as an MCP tool, so client applications can drive document operations programmatically. A local cache speeds up repeated access to the same document. The system also includes memory management for handling large files and a logging system for monitoring.
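The caching idea described above can be sketched roughly as follows. This is a minimal illustration, not the project's implementation: the function name `cached_convert`, the hash-based cache key, the on-disk JSON layout, and the `fake_convert` stand-in are all assumptions introduced here, and the real service may key and store its cache differently.

```python
import hashlib
import json
import tempfile
from pathlib import Path
from typing import Callable


def cached_convert(source: str, convert: Callable[[str], dict], cache_dir: Path) -> dict:
    """Return the converted document for `source`, reusing a cached copy if present.

    Hypothetical helper: the cache key is the SHA-256 of the source string,
    and each result is stored as one JSON file in `cache_dir`.
    """
    cache_dir.mkdir(parents=True, exist_ok=True)
    key = hashlib.sha256(source.encode("utf-8")).hexdigest()
    cache_file = cache_dir / f"{key}.json"

    if cache_file.exists():
        # Cache hit: skip the expensive conversion entirely.
        return json.loads(cache_file.read_text(encoding="utf-8"))

    # Cache miss: convert once, then persist the result for later calls.
    result = convert(source)
    cache_file.write_text(json.dumps(result), encoding="utf-8")
    return result


# Stand-in for a real converter; it records each call so reuse is visible.
calls = []


def fake_convert(source: str) -> dict:
    calls.append(source)
    return {"source": source, "texts": []}


with tempfile.TemporaryDirectory() as tmp:
    first = cached_convert("report.pdf", fake_convert, Path(tmp))
    second = cached_convert("report.pdf", fake_convert, Path(tmp))
```

In this sketch the second call is served from disk, so the converter runs only once per distinct source. Keying on the source string alone would miss changes to a file's contents; hashing the file bytes instead is one way to address that.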

Quick Start & Requirements

Installation and execution are primarily handled via uvx. The core command is uvx --from docling-mcp docling-mcp-server, with the --transport argument specifying the protocol (e.g., stdio, sse, streamable-http). For development, refer to docs/development.md. Integration with AI desktop clients often involves adding a specific configuration entry to client settings files, with examples provided for Claude for Desktop and LM Studio.
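As a rough illustration of the client configuration step, the snippet below builds a configuration entry that launches the server through uvx and prints it as JSON. The command and transport names come from the text above; the `mcpServers` key follows the convention used by Claude for Desktop, while the server label `docling` and the helper `build_client_config` are assumptions made for this sketch. Check the project's integration docs for the exact entry your client expects.

```python
import json

# Transport names listed in the Quick Start text above.
ALLOWED_TRANSPORTS = {"stdio", "sse", "streamable-http"}


def build_client_config(transport: str = "stdio") -> dict:
    """Build a client config entry that starts the Docling MCP server via uvx.

    Hypothetical helper: the "docling" label is an arbitrary name chosen here.
    """
    if transport not in ALLOWED_TRANSPORTS:
        raise ValueError(f"unsupported transport: {transport!r}")
    return {
        "mcpServers": {
            "docling": {
                "command": "uvx",
                "args": [
                    "--from",
                    "docling-mcp",
                    "docling-mcp-server",
                    "--transport",
                    transport,
                ],
            }
        }
    }


# Print the entry so it can be pasted into a client settings file.
print(json.dumps(build_client_config(), indent=2))
```

Desktop clients that launch the server as a subprocess typically use the stdio transport; sse and streamable-http suit setups where the server runs as a separate network service.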

Highlighted Details

  • Conversion Tools: Enables PDF document conversion to a structured JSON format (DoclingDocument).
  • Generation Tools: Supports creating documents within the DoclingDocument structure, exportable to various formats.
  • Caching: Implements local document caching for performance optimization.
  • RAG Support: Facilitates Retrieval-Augmented Generation (RAG) applications, including Milvus upload and retrieval.
  • Client Integration: Designed for seamless integration with MCP-enabled AI desktop clients.

Maintenance & Community

Docling MCP is hosted as a project under the LF AI & Data Foundation. The project originated from the AI for knowledge team at IBM Research Zurich.

Licensing & Compatibility

The Docling MCP codebase is distributed under the MIT license. Users should consult the licenses of individual models used within the system for specific restrictions, particularly concerning commercial use or integration into closed-source applications.

Limitations & Caveats

The provided documentation does not explicitly list limitations, unsupported platforms, or known bugs. The docs/development.md file and the ./docs/integrations/ directory may contain more specific caveats.

Health Check

  • Last Commit: 1 week ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 5
  • Issues (30d): 2
  • Star History: 73 stars in the last 30 days
