ITFormer-ICML25 by Pandalin98

Framework for temporal-textual multimodal question answering

Created 3 months ago
370 stars

Top 76.3% on SourcePulse

Project Summary

ITFormer is an open-source framework for temporal-textual multimodal question answering (QA), designed to bridge time series data and natural language understanding. It targets researchers and practitioners in multimodal AI and offers lightweight models that the project claims outperform ChatGPT-4o on temporal QA tasks.

How It Works

ITFormer is built on an "Instruct Time Transformer" architecture that integrates time series data with textual context so that a question can be answered against both. According to the project, this lets it reach state-of-the-art results with models small enough to deploy efficiently.
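
To make the idea of pairing a time series with a text question more concrete, the following is a minimal sketch of how such an input might be prepared. It is not ITFormer's actual pipeline or API: `TemporalQASample`, `patchify`, `build_model_input`, and the `patch_len` parameter are all hypothetical names introduced here for illustration, and splitting the series into fixed-length patches is only one common way to turn a signal into token-like units.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class TemporalQASample:
    """Hypothetical record pairing one time series with a question."""
    series: List[float]
    question: str


def patchify(series: List[float], patch_len: int) -> List[List[float]]:
    """Split a series into consecutive patches; the last one may be shorter."""
    if patch_len <= 0:
        raise ValueError("patch_len must be positive")
    return [series[i:i + patch_len] for i in range(0, len(series), patch_len)]


def build_model_input(sample: TemporalQASample, patch_len: int) -> dict:
    """Bundle the temporal patches and the question text into one structure."""
    return {
        "temporal_patches": patchify(sample.series, patch_len),
        "text": sample.question,
    }


# Illustrative values only; not taken from any real dataset.
sample = TemporalQASample(
    series=[0.1, 0.2, 0.4, 0.8, 1.6],
    question="Is the signal trending upward?",
)
model_input = build_model_input(sample, patch_len=2)
```

In a real system, each patch would typically be passed through a learned encoder before being combined with the text representation; this sketch stops at the data-preparation step.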

Quick Start & Requirements

Highlighted Details

  • Offers pre-trained models of varying sizes (0.5B, 3B, 7B).
  • The 0.5B model reportedly outperforms ChatGPT-4o in temporal QA.
  • Includes the large-scale EngineMT-QA dataset (118K+ samples).
  • Reports state-of-the-art results on temporal-textual QA benchmarks.

Maintenance & Community

The project is associated with ICML 2025 and appears to be an official implementation. Contact is via GitHub issues.

Licensing & Compatibility

MIT License. Permissive for commercial use and closed-source linking.

Limitations & Caveats

The README details setup for inference with pre-trained models; training functionality is not explicitly mentioned. Performance claims are based on the ICML 2025 paper.

Health Check

  • Last Commit: 2 weeks ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 3
  • Issues (30d): 4

Star History

  • 35 stars in the last 30 days
