Research paper for table pre-training via neural SQL execution
Top 90.6% on sourcepulse
TAPEX is a pre-training approach designed to imbue generative language models with table reasoning capabilities. It targets researchers and practitioners working with structured data, offering state-of-the-art performance on table-based question answering tasks by learning to execute SQL queries over tables.
How It Works
TAPEX trains a model to mimic the process of executing SQL queries against a table. This is achieved by synthesizing a large corpus of (SQL query, flattened table, SQL execution result) tuples. The core idea is that by learning to faithfully execute SQL, the model develops a deep understanding of table structures and gains an inductive bias for reasoning over them. This approach allows for systematic generation of diverse and high-quality pre-training data.
Quick Start & Requirements
pip install --editable ./
(within a Python 3.8 virtual environment).fairseq
(version >= 0.12.0), Python 3.8 recommended.Highlighted Details
tapex.base
, tapex.large
) and fine-tuned weights for various datasets.Maintenance & Community
Licensing & Compatibility
Limitations & Caveats
tapex-large
model experienced a bug related to bart-large
which may affect performance.2 years ago
Inactive