calculate-flops.pytorch by MrYxJ

PyTorch tool to calculate FLOPs, MACs, and parameters for neural networks

Created 2 years ago

900 stars

Top 40.2% on SourcePulse

1 Expert Loves This Project

Yaowei Zheng

Author of LLaMA-Factory

Project Summary

This library calculates FLOPs, MACs, and parameters for a range of neural networks in PyTorch, including CNNs, RNNs, GCNs, and Transformers (such as BERT and LLaMA). It is aimed at researchers and engineers who need to understand a model's computational complexity and resource consumption, and it reports a detailed breakdown per submodule.

How It Works

The tool instruments PyTorch models to count operations. For Hugging Face models, it can run the calculation from the model name alone, without downloading the weights, by relying on meta-device inference; a tokenizer may be required to generate the input. Custom models are supported by tracing the operations that go through torch.nn.functional.
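To make the idea of counting operations concrete, here is a minimal sketch of the per-layer arithmetic such a counter typically relies on. It is not the library's implementation: the function names are hypothetical, bias terms are ignored, and it uses the common convention that one MAC (multiply-accumulate) equals two FLOPs.

```python
def linear_macs(in_features, out_features, batch=1):
    """MACs of a fully connected layer: one multiply-accumulate per weight per sample."""
    return batch * in_features * out_features


def conv2d_macs(in_ch, out_ch, kernel_h, kernel_w, out_h, out_w, batch=1, groups=1):
    """MACs of a 2D convolution: each output element needs (in_ch / groups) * kH * kW MACs."""
    per_output = (in_ch // groups) * kernel_h * kernel_w
    return batch * out_ch * out_h * out_w * per_output


def macs_to_flops(macs):
    """Assumed convention: one MAC is one multiply plus one add, i.e. two FLOPs."""
    return 2 * macs


if __name__ == "__main__":
    fc = linear_macs(128, 64)
    conv = conv2d_macs(in_ch=3, out_ch=64, kernel_h=11, kernel_w=11, out_h=55, out_w=55)
    print("linear MACs:", fc, "FLOPs:", macs_to_flops(fc))
    print("conv2d MACs:", conv, "FLOPs:", macs_to_flops(conv))
```

Summing these counts over every submodule is what gives a per-submodule breakdown; a real counter also has to handle activations, normalization layers, and attention.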

Quick Start & Requirements

Install via pip: pip install --upgrade calflops
For Hugging Face models, an access token may be required.
Official docs: https://github.com/MrYxJ/calculate-flops.pytorch
Hugging Face Space demo: https://huggingface.co/spaces/MrYXJ/calculate-model-flops
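A minimal usage sketch after installation, assuming both torch and calflops are available. The calculate_flops call and its keyword arguments follow the project's README as best understood here, so check them against the installed version; the small MLP and the function name count_mlp_flops are illustrative only.

```python
def count_mlp_flops():
    """Count FLOPs, MACs and parameters of a tiny MLP with calflops."""
    # Imports are deferred so that defining this function does not
    # require the packages; calling it does.
    import torch.nn as nn
    from calflops import calculate_flops

    # Illustrative model: two linear layers with a ReLU in between.
    model = nn.Sequential(nn.Linear(128, 64), nn.ReLU(), nn.Linear(64, 10))

    # input_shape is (batch_size, features) for this model.
    flops, macs, params = calculate_flops(
        model=model,
        input_shape=(1, 128),
        output_as_string=True,
        output_precision=4,
    )
    return flops, macs, params
```

Calling count_mlp_flops() returns the three values as formatted strings.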

Highlighted Details

Supports calculating FLOPs for Hugging Face models by name, avoiding local downloads.
Can include backward pass FLOPs with a configurable factor.
Provides detailed breakdowns of FLOPs, MACs, and parameters for each submodule.
Offers flexible output formatting (string, precision, units).
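The output formatting options can be pictured with a small helper that scales a raw count to a unit prefix and rounds it to a chosen precision. This is a hypothetical sketch of the idea, not the library's own formatter; format_count and its parameters are made up for illustration.

```python
def format_count(value, suffix, precision=2):
    """Render a raw count with a K/M/G/T prefix and a fixed number of decimals."""
    for scale, prefix in ((1e12, "T"), (1e9, "G"), (1e6, "M"), (1e3, "K")):
        if abs(value) >= scale:
            return f"{value / scale:.{precision}f} {prefix}{suffix}"
    # Values below 1000 are shown without a prefix.
    return f"{value:.{precision}f} {suffix}"


if __name__ == "__main__":
    print(format_count(16384, "FLOPS"))
    print(format_count(70276800, "MACs", precision=3))
```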

Maintenance & Community

The project is maintained by MrYxJ. Further community engagement details are not explicitly provided in the README.

Licensing & Compatibility

The README does not explicitly state a license. Compatibility for commercial use or closed-source linking is not specified.

Limitations & Caveats

The reported numbers are theoretical operation counts derived from tracing the PyTorch model, so they may not reflect hardware-specific optimizations. Some models might require specific tokenizer configurations for accurate input generation. Per the README, FLOPs for activation recomputation are not included by default; they can be approximated by taking the total as 4 times the forward FLOPs, which corresponds to setting compute_bp_factor=3 (the forward pass plus a backward pass counted as 3 times the forward pass).
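The backward-pass arithmetic in this caveat can be written out as a short sketch. It assumes the backward pass is modeled as a fixed multiple of the forward pass, which is how compute_bp_factor is described; the function name training_flops is hypothetical and the default of 2.0 is an assumption about the usual forward-to-backward ratio.

```python
def training_flops(forward_flops, bp_factor=2.0):
    """Total FLOPs for one forward plus one backward pass.

    The backward pass is approximated as bp_factor times the forward pass.
    """
    return forward_flops * (1.0 + bp_factor)


if __name__ == "__main__":
    forward = 100.0
    # Backward counted as 2x forward: total is 3x forward.
    print(training_flops(forward, bp_factor=2.0))
    # With activation recomputation, backward counted as 3x forward: total is 4x forward.
    print(training_flops(forward, bp_factor=3.0))
```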

Health Check

Last Commit

1 year ago

Responsiveness

Inactive

Star History

13 stars in the last 30 days