LLM-Viewer by hahnyuan

LLM analysis tool for inference performance on diverse hardware

created 1 year ago
521 stars

Top 61.5% on sourcepulse

Project Summary

LLM-Viewer is a tool designed to analyze and visualize the inference performance of Large Language Models (LLMs) across various hardware platforms. It targets researchers and engineers seeking to understand and optimize LLM inference by examining computation, memory, and transmission aspects through a user-friendly interface and a hardware roofline model.

How It Works

LLM-Viewer analyzes an LLM by gathering layer-specific information (computation, tensor shapes, dependencies), defining hardware capabilities (compute capacity, memory bandwidth), and configuring inference parameters (batch size, sequence length). It then applies a roofline model to estimate per-layer and whole-network performance, track memory usage, and classify each layer as compute-bound or memory-bound, revealing which factors dominate inference time.
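The roofline estimate at the heart of this workflow can be sketched in a few lines: a layer's time is bounded below by its compute time (FLOPs divided by peak throughput) and by its memory time (bytes moved divided by bandwidth), and the larger bound dominates. The sketch below is illustrative only; the hardware numbers and layer figures are assumptions, not values produced by LLM-Viewer.

```python
# Minimal roofline-model sketch (illustrative; not LLM-Viewer's actual code).

def roofline_time(flops: float, bytes_moved: float,
                  peak_flops: float, bandwidth: float) -> tuple[float, str]:
    """Return (estimated seconds, bottleneck) for one layer."""
    compute_time = flops / peak_flops    # lower bound from arithmetic
    memory_time = bytes_moved / bandwidth  # lower bound from data movement
    if compute_time >= memory_time:
        return compute_time, "compute-bound"
    return memory_time, "memory-bound"

# Assumed hardware: ~300 TFLOP/s peak, ~2 TB/s memory bandwidth (A100-like).
PEAK, BW = 300e12, 2e12

# A large prefill matmul: high arithmetic intensity -> compute-bound.
t, kind = roofline_time(flops=1e12, bytes_moved=1e9, peak_flops=PEAK, bandwidth=BW)
print(f"{kind}: {t * 1e3:.3f} ms")

# Decoding a single token: few FLOPs per weight byte read -> memory-bound.
t, kind = roofline_time(flops=2e9, bytes_moved=1e10, peak_flops=PEAK, bandwidth=BW)
print(f"{kind}: {t * 1e3:.3f} ms")
```

This is why batch-1 decoding is typically memory-bound: the whole weight matrix must be streamed from memory for only a handful of FLOPs per element, while prefill amortizes that traffic across many tokens.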

Quick Start & Requirements

Highlighted Details

  • Analyzes computation, storage, and transmission aspects of LLM inference.
  • Generates hardware roofline models for performance insights.
  • Supports analysis of Transformer and DiT models.
  • Provides detailed reports on performance bottlenecks and memory footprint.
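To illustrate the kind of memory-footprint accounting such a report involves, the sketch below estimates KV-cache size from standard transformer dimensions. The formula (two cached tensors, K and V, per layer) is the conventional one; the model dimensions are assumptions (roughly LLaMA-2-7B-like), not output from LLM-Viewer.

```python
def kv_cache_bytes(num_layers: int, num_kv_heads: int, head_dim: int,
                   seq_len: int, batch_size: int, bytes_per_elem: int = 2) -> int:
    """Standard KV-cache size: 2 cached tensors (K and V) per layer."""
    return (2 * num_layers * num_kv_heads * head_dim
            * seq_len * batch_size * bytes_per_elem)

# Assumed dims: 32 layers, 32 KV heads, head_dim 128, fp16 (2 bytes/elem).
gib = kv_cache_bytes(32, 32, 128, seq_len=4096, batch_size=1) / 2**30
print(f"KV cache: {gib:.2f} GiB")  # 2.00 GiB at a 4096-token context
```

The footprint grows linearly with both batch size and sequence length, which is why the KV cache, rather than the weights, often becomes the limiting factor at long contexts or large batches.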

Maintenance & Community

This is an ongoing project with a TODO list indicating planned enhancements. No specific community channels or contributor details are provided in the README.

Licensing & Compatibility

The repository is available on GitHub under an unspecified license. The README does not detail licensing restrictions or compatibility for commercial use.

Limitations & Caveats

The tool's time estimations represent theoretical performance and should be used only for relative comparisons. Support for LLMs and hardware configurations is expanding, with some features (e.g., non-transformer layers, full network visualization) still under development.

Health Check

  • Last commit: 10 months ago
  • Responsiveness: 1 week
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 73 stars in the last 90 days

Explore Similar Projects

Starred by Ying Sheng (author of SGLang) and Stas Bekman (author of Machine Learning Engineering Open Book; Research Engineer at Snowflake).

llm-analysis by cli99

441 stars
CLI tool for LLM latency/memory analysis during training/inference
created 2 years ago
updated 3 months ago