weAIDB / ST-Raptor: Answering questions over complex semi-structured tables
Top 93.8% on SourcePulse
ST-Raptor is a tool for answering natural language questions over semi-structured tables in diverse formats such as HTML, CSV, and Markdown. It targets users who need precise answers from complex tables, handling intricate layouts and integrating flexibly with various LLMs and VLMs, all without requiring any model fine-tuning.
How It Works
ST-Raptor combines a Vision-Language Model (VLM) with a hierarchical organization tree (HO-Tree) construction algorithm. This VLM-LLM integration lets it interpret complex table structures and extract the relevant cells, while a two-stage validation mechanism checks the reliability and accuracy of generated answers. Because no task-specific fine-tuning is required, the system adapts readily to new datasets and table types.
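To make the pipeline concrete, here is a toy sketch of the HO-Tree idea and the two-stage validation step. All names (`HONode`, `build_ho_tree`, `answer`, `validate`) and the flat-row input are illustrative assumptions, not ST-Raptor's actual API; the real system builds the tree from VLM output over a table image.

```python
from dataclasses import dataclass, field


@dataclass
class HONode:
    """A node in a hierarchical organization (HO) tree: a header or a cell value."""
    label: str
    children: list = field(default_factory=list)


def build_ho_tree(rows):
    """Toy HO-Tree construction: nest cell values under their row header.

    ST-Raptor derives this structure from a VLM's reading of the table layout;
    here we start from plain (header, value) pairs for illustration.
    """
    root = HONode("table")
    by_header = {}
    for header, value in rows:
        node = by_header.get(header)
        if node is None:
            node = HONode(header)
            by_header[header] = node
            root.children.append(node)
        node.children.append(HONode(value))
    return root


def answer(tree, question_key):
    """Stage 1: traverse the tree to collect values under the queried header."""
    for node in tree.children:
        if node.label == question_key:
            return [child.label for child in node.children]
    return []


def validate(tree, question_key, candidate):
    """Stage 2 of a two-stage validation sketch: independently re-derive the
    answer and only accept the candidate if the two derivations agree."""
    return candidate == answer(tree, question_key)


rows = [("Revenue", "2.4M"), ("Revenue", "3.1M"), ("Staff", "120")]
tree = build_ho_tree(rows)
ans = answer(tree, "Revenue")
assert validate(tree, "Revenue", ans)
```

The design point this illustrates: once the table is lifted into a tree, answering becomes structured traversal rather than free-form text generation, and validation can cheaply re-check the traversal.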
Quick Start & Requirements
Installation involves cloning the repository, creating a conda environment (conda create -n straptor python=3.10; conda activate straptor; pip install -r requirements.txt), and installing wkhtmltox plus the fonts-noto-cjk and fonts-wqy-microhei font packages. Model configuration requires significant resources: the recommended local setup (Deepseek-V3, InternVL2.5 26B, Multilingual-E5-Large-Instruct) demands roughly 160GB of GPU memory. Alternatively, API calls can be configured for the LLM, VLM, and embedding models. Model and API-endpoint settings are managed in ./utils/constants.py.
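As a rough sketch of what the configuration module might hold, the fragment below shows the two deployment choices side by side: local model names versus API endpoints. Every identifier and value here is an assumption for illustration only, not the repo's actual constants file.

```python
# Hypothetical sketch of settings like those in ./utils/constants.py.
# All names and values below are illustrative assumptions.

# Local-model route (recommended setup, ~160GB GPU memory total):
LLM_MODEL = "deepseek-v3"                       # reasoning model
VLM_MODEL = "internvl2.5-26b"                   # vision model for table parsing
EMBED_MODEL = "multilingual-e5-large-instruct"  # embedding model

# API route (avoids the local GPU requirement):
LLM_API_BASE = "https://api.example.com/v1"  # placeholder endpoint
LLM_API_KEY = "YOUR_KEY_HERE"                # supplied by the user
```

Whichever route is chosen, the rest of the pipeline reads these settings rather than hard-coding model choices, which is what makes the LLM/VLM backends swappable.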
Maintenance & Community
The project maintains an active community through a WeChat group for discussions on complex semi-structured table analysis.
Licensing & Compatibility
ST-Raptor is released under the MIT License, permitting broad use and compatibility with closed-source applications.
Limitations & Caveats
The project roadmap indicates planned support for image inputs and expansion of the table extraction module to handle table types beyond the current scope. The high GPU memory requirement (160GB) for the recommended local model configuration presents a significant barrier to entry for users without substantial hardware resources.