gguf-parser-go by gpustack: Analyze GGUF models and estimate inference resources
Top 98.2% on SourcePulse
This project provides a Go-based utility for inspecting GGUF model files without downloading them in full. It estimates memory usage and maximum tokens per second (TPS), so users can quickly evaluate a model and plan its deployment. The tool is aimed at ML engineers and researchers working with large language models in the GGUF format.
How It Works
gguf-parser-go uses chunked reading to parse metadata from remote GGUF files, so the full model never has to be downloaded. This makes the initial assessment of a model much faster. Because it is written in Go, the tool benefits from the language's performance and concurrency support. It can also estimate the maximum tokens per second (TPS) from user-supplied device metrics (CPU/GPU FLOPS and memory bandwidth), which gives a predicted upper bound on performance. Finally, it classifies GGUF files by intended use, such as embedding, reranking, or general model inference.
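To make the idea of chunked reading concrete, here is a minimal sketch of fetching only the fixed-size GGUF header with an HTTP Range request. It is an illustration under stated assumptions, not gguf-parser-go's actual API: `Header`, `parseHeader`, `fetchRange`, and `fakeModel` are hypothetical names, and the 24-byte layout (magic, version, tensor count, metadata key-value count, all little-endian) is the GGUF version 2+ header. A real parser would go on to read the metadata key-value pairs and tensor descriptions in further chunks. A local test server stands in for the remote host so the example runs offline.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"errors"
	"fmt"
	"io"
	"log"
	"net/http"
	"net/http/httptest"
	"time"
)

// Header holds the fixed-size fields at the start of a GGUF file (version 2+).
type Header struct {
	Version     uint32
	TensorCount uint64
	KVCount     uint64
}

// headerSize = 4 (magic) + 4 (version) + 8 (tensor count) + 8 (KV count).
const headerSize = 24

// parseHeader decodes the little-endian GGUF header from the first bytes of a file.
func parseHeader(b []byte) (Header, error) {
	if len(b) < headerSize {
		return Header{}, errors.New("short header")
	}
	if string(b[:4]) != "GGUF" {
		return Header{}, errors.New("bad magic")
	}
	return Header{
		Version:     binary.LittleEndian.Uint32(b[4:8]),
		TensorCount: binary.LittleEndian.Uint64(b[8:16]),
		KVCount:     binary.LittleEndian.Uint64(b[16:24]),
	}, nil
}

// fetchRange requests bytes [start, end] (inclusive) of url via a Range header.
func fetchRange(url string, start, end int64) ([]byte, error) {
	req, err := http.NewRequest(http.MethodGet, url, nil)
	if err != nil {
		return nil, err
	}
	req.Header.Set("Range", fmt.Sprintf("bytes=%d-%d", start, end))
	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusPartialContent {
		return nil, fmt.Errorf("server did not honor Range: %s", resp.Status)
	}
	return io.ReadAll(resp.Body)
}

// fakeModel builds a valid header followed by 1 MiB of padding that stands in
// for tensor data. The counts are made-up sample values.
func fakeModel() []byte {
	b := []byte("GGUF")
	b = binary.LittleEndian.AppendUint32(b, 3)
	b = binary.LittleEndian.AppendUint64(b, 291)
	b = binary.LittleEndian.AppendUint64(b, 24)
	return append(b, make([]byte, 1<<20)...)
}

func main() {
	data := fakeModel()
	// http.ServeContent handles Range requests, like most model hosts do.
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		http.ServeContent(w, r, "model.gguf", time.Time{}, bytes.NewReader(data))
	}))
	defer srv.Close()

	chunk, err := fetchRange(srv.URL, 0, headerSize-1)
	if err != nil {
		log.Fatal(err)
	}
	h, err := parseHeader(chunk)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Printf("version=%d tensors=%d kv=%d (read %d of %d bytes)\n",
		h.Version, h.TensorCount, h.KVCount, len(chunk), len(data))
}
```

The point of the sketch is the ratio in the last line: only 24 bytes cross the network even though the file is over a megabyte, and the same pattern scales to multi-gigabyte models as long as the host supports Range requests.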
Quick Start & Requirements
Highlighted Details
Maintenance & Community
The README does not mention notable contributors, sponsorships, or community channels (e.g., Discord, Slack).
Licensing & Compatibility
Limitations & Caveats
general.file_type conventions.