microsoft: LLM inference system simulator
Top 59.0% on SourcePulse
Vidur is a high-fidelity LLM inference system simulator designed for researchers and engineers. It enables detailed performance analysis, capacity planning, and rapid prototyping of new scheduling algorithms and optimizations without requiring direct GPU access for most testing.
How It Works
Vidur simulates LLM inference by modeling request arrival, scheduling, execution, and resource utilization. It supports various workload traces and synthetic request generation, allowing users to evaluate metrics like Time To First Token (TTFT) and Total Request Time. The simulator's extensibility allows for the integration of novel scheduling algorithms and optimization techniques, such as speculative decoding, offering a flexible platform for system-level LLM research.
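The modeling loop described above can be illustrated with a minimal sketch. This is not Vidur's code or API: the names `Request`, `simulate_fcfs`, `ttft`, and `total_time` are hypothetical, and the per-token costs are made-up constants standing in for a real execution time predictor. It assumes a single replica serving requests one at a time in arrival order, and shows how TTFT and total request time fall out of arrival, scheduling, and execution times.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Request:
    arrival: float                 # arrival time in seconds
    prompt_tokens: int             # tokens processed during prefill
    output_tokens: int             # tokens generated (including the first)
    first_token_time: Optional[float] = None
    finish_time: Optional[float] = None


def simulate_fcfs(
    requests: List[Request],
    prefill_s_per_token: float = 0.001,   # assumed cost, not a measured value
    decode_s_per_token: float = 0.02,     # assumed cost, not a measured value
) -> List[Request]:
    """Serve requests one at a time, first come first served, on one replica."""
    clock = 0.0
    for req in sorted(requests, key=lambda r: r.arrival):
        # A request starts when it has arrived and the replica is free.
        start = max(clock, req.arrival)
        # Prefill over the prompt produces the first output token.
        req.first_token_time = start + req.prompt_tokens * prefill_s_per_token
        # The remaining output tokens are decoded one per step.
        req.finish_time = (
            req.first_token_time + (req.output_tokens - 1) * decode_s_per_token
        )
        clock = req.finish_time
    return requests


def ttft(req: Request) -> float:
    """Time To First Token: first token time minus arrival time."""
    return req.first_token_time - req.arrival


def total_time(req: Request) -> float:
    """Total request time: completion time minus arrival time."""
    return req.finish_time - req.arrival


if __name__ == "__main__":
    trace = [Request(0.0, 100, 11), Request(0.05, 200, 6)]
    for r in simulate_fcfs(trace):
        print(f"TTFT={ttft(r):.3f}s total={total_time(r):.3f}s")
```

Swapping the body of the loop for a different policy (for example, batching several requests per step) is the kind of change a scheduler experiment would make; a real simulator would also replace the constant per-token costs with a predictor profiled on the target hardware.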
Quick Start & Requirements
mamba env create -p ./env -f ./environment.yml, or a venv environment with python -m pip install -r requirements.txt.
wandb integration for logging.
Highlighted Details
Maintenance & Community
Licensing & Compatibility
Limitations & Caveats
The simulator's accuracy depends on the fidelity of its execution time predictor, which may require initial profiling on the target hardware. Not every model is supported on every hardware configuration (e.g., H100, 8xA40).