Time series foundation model research paper using a Mixture-of-Experts architecture
Top 49.4% on sourcepulse
Time-MoE introduces a family of billion-scale, decoder-only time series foundation models utilizing a Mixture-of-Experts architecture. It addresses the need for universal forecasting across arbitrary horizons and context lengths, targeting researchers and practitioners in time series analysis. The primary benefit is enabling state-of-the-art performance on diverse time series tasks with a single, scalable model.
How It Works
Time-MoE employs a Mixture-of-Experts (MoE) design within a decoder-only transformer framework. Only a subset of experts is activated for each input, which lets the model scale to billions of parameters while keeping compute roughly proportional to the activated parameters. Its auto-regressive design supports arbitrary forecast horizons with context lengths of up to 4096 points, making it suitable for complex time series patterns.
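To make the routing idea concrete, the following is a minimal, illustrative PyTorch sketch of a sparse Mixture-of-Experts feed-forward layer with top-k gating. It is not Time-MoE's actual implementation; the layer sizes, expert count, and top_k value are placeholder assumptions.

import torch
import torch.nn as nn
import torch.nn.functional as F

class SparseMoEFeedForward(nn.Module):
    # Each token is routed to only top_k of num_experts expert MLPs,
    # so compute grows with the active experts, not the total parameter count.
    def __init__(self, d_model=384, d_ff=1536, num_experts=8, top_k=2):
        super().__init__()
        self.top_k = top_k
        self.gate = nn.Linear(d_model, num_experts, bias=False)
        self.experts = nn.ModuleList([
            nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model))
            for _ in range(num_experts)
        ])

    def forward(self, x):                        # x: (batch, seq_len, d_model)
        scores = self.gate(x)                    # router logits per token
        weights, idx = scores.topk(self.top_k, dim=-1)
        weights = F.softmax(weights, dim=-1)     # normalize over the chosen experts
        out = torch.zeros_like(x)
        for k in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = idx[..., k] == e          # tokens whose k-th choice is expert e
                if mask.any():
                    out[mask] += weights[..., k][mask].unsqueeze(-1) * expert(x[mask])
        return out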
Quick Start & Requirements
Install dependencies with pip install -r requirements.txt (the pinned transformers==4.40.1 is required). Optional: flash-attn for performance. Pre-trained checkpoints are available on Hugging Face (e.g., Maple728/TimeMoE-50M).
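As a hedged sketch of zero-shot forecasting with the published checkpoint, the snippet below loads Maple728/TimeMoE-50M through transformers' AutoModelForCausalLM with trust_remote_code and decodes auto-regressively with generate. The per-series normalization and the exact generate arguments are assumptions, so verify them against the repository's README.

import torch
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "Maple728/TimeMoE-50M",
    trust_remote_code=True,                    # model code ships with the checkpoint
)

context = torch.randn(2, 512)                  # (batch, context_length) toy series
mean = context.mean(dim=-1, keepdim=True)      # simple per-series normalization
std = context.std(dim=-1, keepdim=True)
normed = (context - mean) / std

horizon = 96                                   # illustrative forecast length
out = model.generate(normed, max_new_tokens=horizon)   # auto-regressive decoding
forecast = out[:, -horizon:] * std + mean      # keep the new tokens, de-normalize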
Highlighted Details
Maintenance & Community
Licensing & Compatibility
Limitations & Caveats
The project is still under active development, with a TODO list that includes covariate support and fine-tuning for dynamic features. Pre-trained models and fine-tuning instructions for custom datasets are available, but following them may be non-trivial for users whose data is not already in the expected format.
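As a rough illustration of the kind of pre-formatting that may be involved, the snippet below writes univariate series to a jsonl file with one {"sequence": [...]} record per line. This schema is a hypothetical assumption for illustration only; check the repository's data-preparation documentation for the actual format before fine-tuning.

import json
import numpy as np

# Hypothetical layout (assumption): one JSON object per line holding a raw value sequence.
series = [np.sin(np.linspace(0, 20, 1024)), np.random.randn(2048).cumsum()]
with open("my_dataset.jsonl", "w") as f:
    for s in series:
        f.write(json.dumps({"sequence": s.tolist()}) + "\n")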