mcp-pandoc  by vivekVells

Document conversion server using Pandoc

created 8 months ago
345 stars

Top 81.5% on SourcePulse

Project Summary

This project provides a Model Context Protocol (MCP) server for document format conversion using Pandoc. It targets developers and users who need to transform content between formats while preserving structure and formatting, and it exposes those conversions through a standardized protocol instead of hand-written Pandoc invocations.

How It Works

The server uses Pandoc through a Python package to perform document conversions. It supports a wide range of formats, including Markdown, HTML, and DOCX as both input and output, and PDF as output only. Advanced features include YAML defaults files that define reusable conversion templates, and custom Pandoc filters for additional processing.
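As a rough illustration of how a defaults file and filters reach Pandoc, the sketch below assembles a plain pandoc command line in Python. It is not the server's own code: the function names build_pandoc_command and run_conversion are hypothetical, and the file names are placeholders. Only the pandoc options -o, --defaults, and --filter are taken from the Pandoc CLI itself.

```python
import shutil
import subprocess
from typing import List, Optional, Sequence


def build_pandoc_command(
    input_path: str,
    output_path: str,
    defaults_file: Optional[str] = None,
    filters: Sequence[str] = (),
) -> List[str]:
    """Assemble a pandoc argv list. Nothing is executed here."""
    cmd = ["pandoc", input_path, "-o", output_path]
    if defaults_file:
        # A YAML defaults file acts as a reusable conversion template.
        cmd.append(f"--defaults={defaults_file}")
    for filter_path in filters:
        # Filters are applied in the order they are listed.
        cmd.append(f"--filter={filter_path}")
    return cmd


def run_conversion(cmd: List[str]) -> None:
    """Run the assembled command, failing early if pandoc is not installed."""
    if shutil.which(cmd[0]) is None:
        raise RuntimeError("pandoc was not found on PATH")
    subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Hypothetical file names, shown only to illustrate the argument layout.
    print(build_pandoc_command("notes.md", "notes.docx", "defaults.yaml", ["filter.py"]))
```

Keeping command assembly separate from execution makes the argument layout easy to inspect without Pandoc installed.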

Quick Start & Requirements

  • Installation: The published version can be installed with npx -y @smithery/cli install mcp-pandoc --client claude. Local development requires cloning the repository and configuring claude_desktop_config.json.
  • Prerequisites: Pandoc must be installed (brew install pandoc on macOS, apt-get install pandoc on Ubuntu/Debian, or a download from pandoc.org). The uv package manager is also required (brew install uv on macOS, pip install uv on Linux/Windows).
  • PDF Conversion: Requires a TeX Live installation (brew install texlive on macOS, apt-get install texlive-xetex on Ubuntu/Debian, or MiKTeX/TeX Live on Windows).
  • File Paths: For advanced formats (PDF, DOCX, RST, LaTeX, EPUB), a complete output_file path including filename and extension is mandatory.
  • Documentation: A CHEATSHEET.md is available for examples and workflows.
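The output_file rule above can be checked before a request is sent. The following is a minimal sketch, assuming lowercase format identifiers and a hypothetical helper named check_output_file. It mirrors the stated requirement and is not taken from the project's code.

```python
from pathlib import PurePath
from typing import Optional

# Formats that, per the summary above, need a complete output path.
ADVANCED_FORMATS = {"pdf", "docx", "rst", "latex", "epub"}


def check_output_file(output_format: str, output_file: Optional[str]) -> None:
    """Raise ValueError if an advanced format lacks a usable output_file."""
    if output_format.lower() not in ADVANCED_FORMATS:
        return  # Other formats do not require an output path.
    if not output_file:
        raise ValueError(f"{output_format} output requires an output_file path")
    if not PurePath(output_file).suffix:
        # A directory or extension-less name is not a complete file path.
        raise ValueError("output_file must include a filename and extension")


if __name__ == "__main__":
    check_output_file("pdf", "/tmp/report.pdf")  # passes silently
    check_output_file("html", None)              # passes: not an advanced format
```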

Highlighted Details

  • Supports bidirectional conversion between Markdown, HTML, TXT, DOCX, RST, LaTeX, EPUB, IPYNB, and ODT.
  • PDF is an output-only format; conversion from PDF is not supported.
  • DOCX output supports custom styling via reference documents.
  • Configuration via YAML defaults files and Pandoc filters is supported.
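The format rules in this list can be expressed as a small lookup. This is a sketch under stated assumptions: the lowercase identifiers and the helper names can_convert and accepts_reference_doc are illustrative and may not match the names the server actually accepts.

```python
# Formats listed above as convertible in both directions.
BIDIRECTIONAL = {
    "markdown", "html", "txt", "docx", "rst", "latex", "epub", "ipynb", "odt",
}
# PDF can be produced but not read.
OUTPUT_ONLY = {"pdf"}


def can_convert(source_format: str, target_format: str) -> bool:
    """Return True if the pair is allowed by the rules in the list above."""
    src, dst = source_format.lower(), target_format.lower()
    return src in BIDIRECTIONAL and dst in (BIDIRECTIONAL | OUTPUT_ONLY)


def accepts_reference_doc(target_format: str) -> bool:
    """Reference documents for custom styling apply to DOCX output only."""
    return target_format.lower() == "docx"


if __name__ == "__main__":
    print(can_convert("markdown", "pdf"))
    print(can_convert("pdf", "markdown"))
```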

Maintenance & Community

The project is in early development, with functionality subject to change. Contributions are welcomed via GitHub Issues and Pull Requests.

Licensing & Compatibility

The repository does not explicitly state a license in the provided README.

Limitations & Caveats

PDF support is still under development and may change. Converting to PDF requires a TeX Live installation, and PDF cannot be used as an input format. Reference documents are supported only for DOCX output.

Health Check

  • Last commit: 1 day ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 2
  • Issues (30d): 1
  • Star History: 158 stars in the last 90 days
