Document AI pipeline for RAG and model training
This library orchestrates document layout analysis and extraction for RAG and targets researchers and developers building Document AI pipelines. It offers a unified framework for training, evaluating, and running inference with models, which simplifies complex document understanding tasks.
How It Works
Deepdoctection integrates multiple state-of-the-art models for layout analysis, OCR, and document classification. For core vision tasks it uses either PyTorch with Detectron2 and Transformers, or TensorFlow with Tensorpack. For OCR, it supports Tesseract, docTR, and AWS Textract. Document and token classification are handled by LayoutLM family models, LiLT, and BERT-style architectures, with features such as sliding windows. Additional utilities include text mining for PDFs, language detection, and image deskewing.
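The sliding-window feature mentioned above can be illustrated with a small, library-independent sketch. It assumes the usual meaning of the term: a long token sequence is split into overlapping chunks so that each chunk fits a model's maximum input length. The function name `sliding_windows` and the parameters `window_size` and `stride` are hypothetical and chosen for illustration; this is not deepdoctection's own implementation.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def sliding_windows(tokens: Sequence[T], window_size: int, stride: int) -> List[List[T]]:
    """Split `tokens` into overlapping windows (hypothetical helper).

    Consecutive windows start `stride` items apart, so they overlap by
    `window_size - stride` items. The final window may be shorter.
    """
    if window_size <= 0 or stride <= 0:
        raise ValueError("window_size and stride must be positive")
    if stride > window_size:
        # A stride larger than the window would skip tokens entirely.
        raise ValueError("stride must not exceed window_size")
    if not tokens:
        return []

    windows: List[List[T]] = []
    start = 0
    while True:
        windows.append(list(tokens[start:start + window_size]))
        # Stop once the current window reaches the end of the sequence.
        if start + window_size >= len(tokens):
            break
        start += stride
    return windows


if __name__ == "__main__":
    words = "invoice number 4711 issued by ACME Corp on Monday".split()
    for window in sliding_windows(words, window_size=4, stride=2):
        print(window)
```

Each window would be classified separately, and predictions for tokens that appear in more than one window would then need to be merged, for example by keeping the first prediction or by averaging scores. That merging step is left out of this sketch.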
Quick Start & Requirements
pip install deepdoctection (with the [pt] or [tf] extra for full features).
Highlighted Details
Maintenance & Community
Licensing & Compatibility
Limitations & Caveats
Windows is not officially supported, though a Docker solution exists. TensorFlow support is being phased out for newer Python versions.
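The caveats above can be turned into a quick pre-install check. The following is a minimal sketch using only the Python standard library; the helper names `choose_extra` and `environment_notes` are hypothetical, and preferring PyTorch over TensorFlow is an assumption based on the note that TensorFlow support is being phased out.

```python
import importlib.util
import platform
from typing import List, Optional


def choose_extra(has_torch: bool, has_tensorflow: bool) -> Optional[str]:
    """Pick the pip extra to request (hypothetical helper).

    PyTorch is preferred here because TensorFlow support is being phased out.
    """
    if has_torch:
        return "pt"
    if has_tensorflow:
        return "tf"
    return None


def environment_notes(system: str, has_torch: bool, has_tensorflow: bool) -> List[str]:
    """Collect human-readable notes about the current environment."""
    notes: List[str] = []
    if system == "Windows":
        notes.append("Windows is not officially supported; consider the Docker solution.")
    extra = choose_extra(has_torch, has_tensorflow)
    if extra is None:
        notes.append("No PyTorch or TensorFlow installation detected.")
    else:
        notes.append(f"Suggested install: pip install deepdoctection[{extra}]")
    if extra == "tf":
        notes.append("TensorFlow support is being phased out for newer Python versions.")
    return notes


if __name__ == "__main__":
    # find_spec returns None when a top-level package is not importable.
    torch_found = importlib.util.find_spec("torch") is not None
    tf_found = importlib.util.find_spec("tensorflow") is not None
    for note in environment_notes(platform.system(), torch_found, tf_found):
        print(note)
```

Keeping the decision logic in pure functions and the detection in the `__main__` block makes the check easy to test without either framework installed.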