Time-MoE by Time-MoE

Research code for time series foundation models built on a Mixture-of-Experts architecture

created 10 months ago
708 stars

Top 49.4% on sourcepulse

View on GitHub
Project Summary

Time-MoE introduces a family of billion-scale, decoder-only time series foundation models utilizing a Mixture-of-Experts architecture. It addresses the need for universal forecasting across arbitrary horizons and context lengths, targeting researchers and practitioners in time series analysis. The primary benefit is enabling state-of-the-art performance on diverse time series tasks with a single, scalable model.

How It Works

Time-MoE employs a Mixture-of-Experts (MoE) design within a decoder-only transformer framework. This allows for efficient scaling to billions of parameters by selectively activating experts for different inputs. The auto-regressive nature enables flexible forecasting with context lengths up to 4096, making it suitable for complex time series patterns.
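
For intuition, the sketch below shows a generic top-k expert-routing feed-forward layer of the kind used in MoE transformers. It is an illustrative simplification, not Time-MoE's actual implementation; the class name, layer sizes, and the `num_experts`/`top_k` values are placeholders.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SparseMoELayer(nn.Module):
    """Generic top-k Mixture-of-Experts feed-forward layer (illustrative only)."""

    def __init__(self, d_model: int, d_ff: int, num_experts: int = 8, top_k: int = 2):
        super().__init__()
        self.top_k = top_k
        self.gate = nn.Linear(d_model, num_experts, bias=False)  # router
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model))
            for _ in range(num_experts)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, d_model) -> flatten tokens for per-token routing
        tokens = x.reshape(-1, x.size(-1))
        scores = self.gate(tokens)                          # (num_tokens, num_experts)
        weights, indices = scores.topk(self.top_k, dim=-1)  # pick top-k experts per token
        weights = F.softmax(weights, dim=-1)

        out = torch.zeros_like(tokens)
        for slot in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = indices[:, slot] == e                # tokens routed to expert e
                if mask.any():
                    out[mask] += weights[mask, slot, None] * expert(tokens[mask])
        return out.reshape_as(x)
```

Only the selected experts run for each token, which is what lets parameter count grow to billions while per-token compute stays close to that of a much smaller dense model.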

Quick Start & Requirements

  • Installation: pip install -r requirements.txt
  • Prerequisites: Python 3.10+, transformers==4.40.1. Optional: flash-attn for performance.
  • Dataset: Time-300B dataset available on Hugging Face.
  • Inference: Models available on Hugging Face (e.g., Maple728/TimeMoE-50M); see the sketch after this list.
  • Documentation: Paper Page
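
As referenced above, here is a minimal inference sketch. It assumes the Maple728/TimeMoE-50M checkpoint loads through transformers with `trust_remote_code=True` and exposes a `generate` method over normalized series tensors; exact shapes, normalization, and the toy `prediction_length` used here should be checked against the upstream README.

```python
import torch
from transformers import AutoModelForCausalLM

# Load the pretrained checkpoint from Hugging Face. Custom model code is pulled
# in via trust_remote_code, so pin transformers==4.40.1 as listed above.
model = AutoModelForCausalLM.from_pretrained(
    "Maple728/TimeMoE-50M",
    device_map="cpu",
    trust_remote_code=True,
)

# Toy input: a batch of 2 series, each with a 12-point context window.
seqs = torch.randn(2, 12)

# Normalize each series before forecasting, then invert the scaling afterwards.
mean = seqs.mean(dim=-1, keepdim=True)
std = seqs.std(dim=-1, keepdim=True)
normed = (seqs - mean) / std

prediction_length = 6
output = model.generate(normed, max_new_tokens=prediction_length)  # (2, 12 + 6)
forecast = output[:, -prediction_length:] * std + mean
print(forecast.shape)  # expected: torch.Size([2, 6])
```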

Highlighted Details

  • First work to scale time series foundation models to 2.4 billion parameters.
  • Introduces Time-300B, a 300 billion time point dataset across 9 domains.
  • Achieved ICLR 2025 Spotlight (Top 5.1%).
  • Supports context lengths up to 4096.

Maintenance & Community

  • Project accepted to ICLR 2025 Spotlight.
  • Time-300B dataset and models released on Hugging Face.
  • Preprint available on arXiv.

Licensing & Compatibility

  • Licensed under Apache-2.0.
  • Permissive license suitable for commercial use and integration into closed-source projects.

Limitations & Caveats

The project is still under active development, with a TODO list that includes covariate support and fine-tuning for dynamic features. Pretrained models and fine-tuning instructions are provided, but custom datasets must first be converted into the expected training format, which adds setup overhead for users without pre-formatted data.
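
As a rough illustration of that conversion step, the sketch below turns each column of a wide CSV into one JSON object per series. The one-object-per-line layout with a "sequence" field is an assumption about the expected training format, and the file names are hypothetical; verify both against the repository's fine-tuning instructions.

```python
import json
import pandas as pd

# Hypothetical conversion: one series per CSV column -> one JSON object per line.
# The {"sequence": [...]} layout is an assumption and should be checked upstream.
df = pd.read_csv("my_series.csv")  # hypothetical input file

with open("my_dataset.jsonl", "w") as f:
    for col in df.columns:
        values = df[col].dropna().astype(float).tolist()
        f.write(json.dumps({"sequence": values}) + "\n")
```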

Health Check

  • Last commit: 1 month ago
  • Responsiveness: 1 day
  • Pull Requests (30d): 0
  • Issues (30d): 5
  • Star History: 130 stars in the last 90 days

Explore Similar Projects

Starred by Andrej Karpathy (Founder of Eureka Labs; formerly at Tesla, OpenAI; author of CS 231n), George Hotz (Author of tinygrad; founder of the tiny corp, comma.ai), and 10 more.

TinyLlama by jzhang38

  • Top 0.3% · 9k stars
  • Tiny pretraining project for a 1.1B Llama model
  • Created 1 year ago; updated 1 year ago