Framework for temporal-textual multimodal question answering
ITFormer is an open-source framework for temporal-textual multimodal question answering, designed to bridge time series data with natural language understanding. It targets researchers and practitioners in multimodal AI, offering a lightweight yet high-performing solution that claims to outperform ChatGPT-4o on temporal QA tasks.
How It Works
ITFormer uses an "Instruct Time Transformer" architecture that processes time series data together with its textual context and integrates the two for question answering. The project's stated advantage is reaching state-of-the-art results with efficient, deployable models.
Quick Start & Requirements
Requires git-lfs for large file downloads.
Requires downloading base LLM models (Qwen2.5-Instruct) and ITFormer checkpoints (0.5B, 3B, or 7B).
Configure infer.yaml with data and model paths.
Run python inference.py --config yaml/infer.yaml with optional --model_checkpoint argument.
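The quick-start steps above can be sketched as a small Python helper that checks for git-lfs and assembles the inference command. This is a minimal sketch, not part of the ITFormer repository: the function names are hypothetical, and the value passed to --model_checkpoint is a placeholder, since the summary does not say what form that argument takes.

```python
import shlex
import shutil


def has_git_lfs():
    """Return True if a git-lfs executable is found on PATH."""
    return shutil.which("git-lfs") is not None


def build_inference_command(config_path="yaml/infer.yaml", model_checkpoint=None):
    """Assemble the inference command as an argument list.

    Only the script name, --config, and --model_checkpoint come from the
    project summary; everything else here is illustrative.
    """
    cmd = ["python", "inference.py", "--config", config_path]
    if model_checkpoint is not None:
        # Assumption: the flag takes a single value such as a checkpoint path.
        cmd += ["--model_checkpoint", model_checkpoint]
    return cmd


if __name__ == "__main__":
    if not has_git_lfs():
        print("git-lfs not found; install it before downloading checkpoints")
    # Print the command rather than running it, since inference.py only
    # exists inside a checkout of the repository.
    print(shlex.join(build_inference_command()))
```

Returning an argument list rather than a single string means the command can be handed to subprocess.run unchanged once the repository and checkpoints are in place.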
Maintenance & Community
The project is associated with ICML 2025 and appears to be an official implementation. Contact is via GitHub issues.
Licensing & Compatibility
MIT License. Permissive for commercial use and closed-source linking.
Limitations & Caveats
The README details setup for inference with pre-trained models; training functionality is not explicitly mentioned. Performance claims are based on the ICML 2025 paper.