Time-MoE by Time-MoE

Research code for time series foundation models built on a Mixture-of-Experts architecture

created 10 months ago
708 stars

Top 49.4% on sourcepulse

View on GitHub
Project Summary

Time-MoE introduces a family of billion-scale, decoder-only time series foundation models utilizing a Mixture-of-Experts architecture. It addresses the need for universal forecasting across arbitrary horizons and context lengths, targeting researchers and practitioners in time series analysis. The primary benefit is enabling state-of-the-art performance on diverse time series tasks with a single, scalable model.

How It Works

Time-MoE employs a Mixture-of-Experts (MoE) design within a decoder-only transformer framework. This allows for efficient scaling to billions of parameters by selectively activating experts for different inputs. The auto-regressive nature enables flexible forecasting with context lengths up to 4096, making it suitable for complex time series patterns.
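
For intuition, the sketch below shows a generic top-k expert-routing feed-forward layer of the kind used in MoE transformers. It is an illustrative simplification, not Time-MoE's actual implementation; the class name, layer sizes, and the `num_experts`/`top_k` values are placeholders.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SparseMoELayer(nn.Module):
    """Generic top-k Mixture-of-Experts feed-forward layer (illustrative only)."""

    def __init__(self, d_model: int, d_ff: int, num_experts: int = 8, top_k: int = 2):
        super().__init__()
        self.top_k = top_k
        self.gate = nn.Linear(d_model, num_experts, bias=False)  # router
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model))
            for _ in range(num_experts)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, d_model) -> flatten tokens for per-token routing
        tokens = x.reshape(-1, x.size(-1))
        scores = self.gate(tokens)                          # (num_tokens, num_experts)
        weights, indices = scores.topk(self.top_k, dim=-1)  # pick top-k experts per token
        weights = F.softmax(weights, dim=-1)

        out = torch.zeros_like(tokens)
        for slot in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = indices[:, slot] == e                # tokens routed to expert e
                if mask.any():
                    out[mask] += weights[mask, slot, None] * expert(tokens[mask])
        return out.reshape_as(x)
```

Only the selected experts run for each token, which is what lets parameter count grow to billions while per-token compute stays close to that of a much smaller dense model.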

Quick Start & Requirements

  • Installation: pip install -r requirements.txt
  • Prerequisites: Python 3.10+, transformers==4.40.1. Optional: flash-attn for performance.
  • Dataset: Time-300B dataset available on Hugging Face.
  • Inference: Models available on Hugging Face (e.g., Maple728/TimeMoE-50M); see the sketch after this list.
  • Documentation: Paper Page
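
As referenced above, here is a minimal inference sketch. It assumes the Maple728/TimeMoE-50M checkpoint loads through transformers with `trust_remote_code=True` and exposes a `generate` method over normalized series tensors; exact shapes, normalization, and the toy `prediction_length` used here should be checked against the upstream README.

```python
import torch
from transformers import AutoModelForCausalLM

# Load the pretrained checkpoint from Hugging Face. Custom model code is pulled
# in via trust_remote_code, so pin transformers==4.40.1 as listed above.
model = AutoModelForCausalLM.from_pretrained(
    "Maple728/TimeMoE-50M",
    device_map="cpu",
    trust_remote_code=True,
)

# Toy input: a batch of 2 series, each with a 12-point context window.
seqs = torch.randn(2, 12)

# Normalize each series before forecasting, then invert the scaling afterwards.
mean = seqs.mean(dim=-1, keepdim=True)
std = seqs.std(dim=-1, keepdim=True)
normed = (seqs - mean) / std

prediction_length = 6
output = model.generate(normed, max_new_tokens=prediction_length)  # (2, 12 + 6)
forecast = output[:, -prediction_length:] * std + mean
print(forecast.shape)  # expected: torch.Size([2, 6])
```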

Highlighted Details

  • First work to scale time series foundation models to 2.4 billion parameters.
  • Introduces Time-300B, a 300 billion time point dataset across 9 domains.
  • Achieved ICLR 2025 Spotlight (Top 5.1%).
  • Supports context lengths up to 4096.

Maintenance & Community

  • Project accepted to ICLR 2025 Spotlight.
  • Time-300B dataset and models released on Hugging Face.
  • Preprint available on arXiv.

Licensing & Compatibility

  • Licensed under Apache-2.0.
  • Permissive license suitable for commercial use and integration into closed-source projects.

Limitations & Caveats

The project is still under active development, with a TODO list that includes covariate support and fine-tuning for dynamic features. Pretrained models and fine-tuning instructions are provided, but custom datasets must first be converted into the expected training format, which adds setup overhead for users without pre-formatted data.
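
As a rough illustration of that conversion step, the sketch below turns each column of a wide CSV into one JSON object per series. The one-object-per-line layout with a "sequence" field is an assumption about the expected training format, and the file names are hypothetical; verify both against the repository's fine-tuning instructions.

```python
import json
import pandas as pd

# Hypothetical conversion: one series per CSV column -> one JSON object per line.
# The {"sequence": [...]} layout is an assumption and should be checked upstream.
df = pd.read_csv("my_series.csv")  # hypothetical input file

with open("my_dataset.jsonl", "w") as f:
    for col in df.columns:
        values = df[col].dropna().astype(float).tolist()
        f.write(json.dumps({"sequence": values}) + "\n")
```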

Health Check

  • Last commit: 1 month ago
  • Responsiveness: 1 day
  • Pull Requests (30d): 0
  • Issues (30d): 5
  • Star History: 130 stars in the last 90 days

Explore Similar Projects

Starred by Andrej Karpathy (Founder of Eureka Labs; formerly at Tesla, OpenAI; author of CS 231n), George Hotz (Author of tinygrad; founder of the tiny corp, comma.ai), and 10 more.

TinyLlama by jzhang38

  • Top 0.3% · 9k stars
  • Tiny pretraining project for a 1.1B Llama model
  • Created 1 year ago; updated 1 year ago