Simulator for large-scale AI training analysis and optimization
SimAI is a full-stack, high-precision simulator designed for analyzing and optimizing large-scale AI training, particularly for Large Language Models (LLMs). It targets researchers and engineers seeking to understand and improve training performance by modeling various layers of the training stack, from framework parameters to network topology.
How It Works
SimAI integrates four core components: AICB for workload modeling, SimCCL for collective communication analysis, astra-sim-alibabacloud for network simulation, and ns-3-alibabacloud for detailed network communication modeling. This modular design allows for flexible simulation scenarios, ranging from fast analytical estimations using bus bandwidth to high-fidelity, full-stack simulations that capture intricate network behaviors. The project leverages extensions from astra-sim and integrates NCCL algorithms for realistic performance evaluation.
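To make the "fast analytical estimation using bus bandwidth" idea concrete, here is a minimal sketch of how a collective's duration can be estimated from a bus-bandwidth figure. It follows the bus-bandwidth convention documented for nccl-tests (bus bandwidth = algorithm bandwidth multiplied by a per-collective factor). It is not SimAI's actual implementation: the function names, the set of collectives, and the GB/s unit are assumptions made for illustration.

```python
# Illustrative sketch only; names and units are assumptions, not SimAI's API.
# Convention (nccl-tests): busbw = algbw * factor, with algbw = bytes / time.

def busbw_factor(collective: str, n_ranks: int) -> float:
    """Return the factor relating algorithm bandwidth to bus bandwidth."""
    if n_ranks < 2:
        raise ValueError("a collective needs at least 2 ranks")
    if collective == "all_reduce":
        return 2.0 * (n_ranks - 1) / n_ranks
    if collective in ("all_gather", "reduce_scatter"):
        return (n_ranks - 1) / n_ranks
    raise ValueError(f"unsupported collective: {collective}")


def estimate_collective_seconds(
    collective: str, message_bytes: float, n_ranks: int, busbw_gb_per_s: float
) -> float:
    """Estimate the duration of one collective from a bus-bandwidth figure."""
    if busbw_gb_per_s <= 0:
        raise ValueError("bus bandwidth must be positive")
    busbw_bytes_per_s = busbw_gb_per_s * 1e9  # assumes decimal GB/s
    # time = bytes / algbw = bytes * factor / busbw
    return message_bytes * busbw_factor(collective, n_ranks) / busbw_bytes_per_s


if __name__ == "__main__":
    # Hypothetical numbers: 1 GB all-reduce across 8 ranks at 100 GB/s busbw.
    t = estimate_collective_seconds("all_reduce", 1e9, 8, 100.0)
    print(f"estimated all-reduce time: {t * 1e3:.2f} ms")
```

An estimate of this kind ignores latency, congestion, and topology effects, which is the gap the high-fidelity, full-stack simulation path is described as covering.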
Quick Start & Requirements
Build: ./scripts/build.sh -c analytical or ./scripts/build.sh -c ns3
Run (analytical example): ../bin/SimAI_analytical -w example/workload_analytical.txt -g 9216 -g_p_s 8 -r test- -busbw example/busbw.yaml
Simulation mode requires network topology generation and specific environment variables.
Highlighted Details
Maintenance & Community
Licensing & Compatibility
Limitations & Caveats
The "SimAI-Physical" mode is in beta and undergoing internal testing. The README does not specify the project's license, which could affect commercial adoption.