LLM analysis tool for inference performance on diverse hardware
LLM-Viewer is a tool designed to analyze and visualize the inference performance of Large Language Models (LLMs) across various hardware platforms. It targets researchers and engineers seeking to understand and optimize LLM inference by examining computation, memory, and transmission aspects through a user-friendly interface and a hardware roofline model.
How It Works
LLM-Viewer analyzes LLMs by gathering layer-specific information (computation, tensor shapes, dependencies), defining hardware capabilities (compute capacity, memory bandwidth), and configuring inference parameters (batch size, sequence length). It then applies a roofline model to estimate layer and network performance, track memory usage, and identify bottlenecks, providing insights into factors influencing inference.
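The roofline estimate described above can be sketched in a few lines. This is a minimal illustration of the general technique, not LLM-Viewer's actual implementation; the layer size, peak-FLOPs, and bandwidth numbers are assumed for the example.

```python
def roofline_time(flops, bytes_moved, peak_flops, bandwidth):
    """Estimate a layer's runtime with a simple roofline model.

    A layer is compute-bound when its arithmetic intensity (FLOPs per
    byte) exceeds the hardware ridge point (peak_flops / bandwidth);
    otherwise it is memory-bound and limited by data movement.
    """
    compute_time = flops / peak_flops      # time if compute-bound (s)
    memory_time = bytes_moved / bandwidth  # time if memory-bound (s)
    return max(compute_time, memory_time)

# Illustrative example: one 4096x4096 fp16 linear layer during
# single-token decoding (batch size 1) on a hypothetical accelerator
# with 312 TFLOP/s peak compute and 2 TB/s memory bandwidth.
flops = 2 * 4096 * 4096          # multiply-accumulate operations
bytes_moved = 4096 * 4096 * 2    # weight bytes (fp16), dominant traffic
t = roofline_time(flops, bytes_moved, peak_flops=312e12, bandwidth=2e12)
# Here memory_time dwarfs compute_time, so this decode-phase layer
# is memory-bandwidth bound, not compute bound.
```

This mirrors the common observation that batch-1 decoding is dominated by weight traffic, which is exactly the kind of bottleneck a roofline analysis surfaces.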
Quick Start & Requirements
pip install transformers flask flask_cors easydict
Model definitions are loaded through the transformers library.
python3 analyze_cli.py <model_name> <hardware_name> [options]
Maintenance & Community
This is an ongoing project; its TODO list indicates planned enhancements. The README provides no community channels or contributor guidelines.
Licensing & Compatibility
The repository is available on GitHub under an unspecified license. The README does not detail licensing restrictions or compatibility for commercial use.
Limitations & Caveats
The tool's time estimations represent theoretical performance and should only be used for relative comparisons. Support for LLMs and hardware configurations is expanding, with some features (e.g., non-transformer layers, full network visualization) still under development.