Table extraction library (deprecated, functionality moved to `marker`)
This library extracts tables from PDFs and images into Markdown, CSV, or HTML formats. It's designed for researchers and developers needing to process tabular data embedded in documents, offering automated detection, layout analysis, and cell formatting.
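To make the three output formats concrete, here is a minimal sketch that renders an already-extracted grid of cell strings as Markdown, CSV, and HTML. It uses only the Python standard library. The function names (`to_markdown`, `to_csv`, `to_html`) are hypothetical and are not part of the library's API; the sketch also assumes the first row is a header.

```python
import csv
import html
import io


def to_markdown(rows):
    """Render rows as a pipe table; the first row is assumed to be the header."""
    header, *body = rows

    def line(cells):
        # Escape literal pipes so they do not split a cell.
        return "| " + " | ".join(c.replace("|", "\\|") for c in cells) + " |"

    separator = "| " + " | ".join("---" for _ in header) + " |"
    return "\n".join([line(header), separator, *map(line, body)])


def to_csv(rows):
    """Render rows as CSV; the csv module handles quoting of commas and quotes."""
    buf = io.StringIO()
    csv.writer(buf, lineterminator="\n").writerows(rows)
    return buf.getvalue()


def to_html(rows):
    """Render rows as a bare HTML table with escaped cell text."""
    header, *body = rows
    head = "".join(f"<th>{html.escape(c)}</th>" for c in header)
    lines = ["<table>", f"  <tr>{head}</tr>"]
    for row in body:
        cells = "".join(f"<td>{html.escape(c)}</td>" for c in row)
        lines.append(f"  <tr>{cells}</tr>")
    lines.append("</table>")
    return "\n".join(lines)


# Illustrative data only, not output from the library.
rows = [["Item", "Qty"], ["Widget, large", "3"], ["A|B", "7"]]
print(to_markdown(rows))
print(to_csv(rows))
print(to_html(rows))
```

The point of the sketch is that each format has its own escaping rule (pipes in Markdown, commas and quotes in CSV, angle brackets and ampersands in HTML), so the cell text should be kept raw until the final rendering step.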
How It Works
Tabled leverages the Surya library for initial table detection within documents. It then employs a layout analysis model to identify rows and columns, followed by a recognition model to extract and format cell content. This multi-stage approach aims for high accuracy in parsing complex table structures.
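The staged flow described above (detect the table, find rows and columns, then fill cells) can be sketched as follows. This is not Tabled's or Surya's API: every name here (`Word`, `detect_table`, `analyze_layout`, `recognize_cells`, `extract_table`) is hypothetical, and simple coordinate heuristics stand in for the learned detection, layout, and recognition models. The input is assumed to be a list of words with known positions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Word:
    text: str
    x: float  # left edge of the word
    y: float  # top edge of the word


def detect_table(words):
    """Stage 1 stand-in: treat the bounding box of all words as one table."""
    xs = [w.x for w in words]
    ys = [w.y for w in words]
    return min(xs), min(ys), max(xs), max(ys)


def cluster(values, tol):
    """Return the first coordinate of each run of values separated by gaps > tol."""
    starts = []
    prev = None
    for v in sorted(values):
        if prev is None or v - prev > tol:
            starts.append(v)
        prev = v
    return starts


def analyze_layout(words, tol=5.0):
    """Stage 2 stand-in: infer row and column positions from word coordinates."""
    rows = cluster([w.y for w in words], tol)
    cols = cluster([w.x for w in words], tol)
    return rows, cols


def nearest(starts, value):
    """Index of the row or column start closest to value."""
    return min(range(len(starts)), key=lambda i: abs(starts[i] - value))


def recognize_cells(words, rows, cols):
    """Stage 3 stand-in: place each word's text into its nearest cell."""
    grid = [["" for _ in cols] for _ in rows]
    for w in sorted(words, key=lambda w: (w.y, w.x)):
        r, c = nearest(rows, w.y), nearest(cols, w.x)
        grid[r][c] = (grid[r][c] + " " + w.text).strip()
    return grid


def extract_table(words):
    """Run the three stages in order and return a grid of cell strings."""
    x0, y0, x1, y1 = detect_table(words)
    inside = [w for w in words if x0 <= w.x <= x1 and y0 <= w.y <= y1]
    rows, cols = analyze_layout(inside)
    return recognize_cells(inside, rows, cols)


# Illustrative input only: four words laid out as a 2x2 table.
words = [
    Word("Name", 10, 10),
    Word("Score", 110, 11),
    Word("Ada", 10, 40),
    Word("92", 111, 41),
]
grid = extract_table(words)
print(grid)
```

Keeping the stages as separate functions mirrors the description: each stage consumes the previous stage's output, so any one of them could be swapped for a model-based implementation without touching the others.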
Quick Start & Requirements
pip install tabled-pdf
Highlighted Details
Maintenance & Community
Functionality has moved to `marker`.
Licensing & Compatibility
Limitations & Caveats
The project is officially deprecated; users are advised to migrate to `marker` for continued development and support.