koila by rentruewang

Tool to prevent CUDA out-of-memory errors in PyTorch

created 3 years ago
1,824 stars

Top 24.2% on sourcepulse

Project Summary

Koila is a Python library that aims to prevent PyTorch's common "CUDA out of memory" errors with a single line of code. It targets PyTorch users who run into GPU memory limits during model training, and it works by evaluating computation lazily and adjusting batch sizes automatically.

How It Works

Koila acts as a lightweight wrapper around PyTorch tensors. It employs a lazy evaluation strategy, similar to TensorFlow's static graphs, to build a computational graph before execution. By analyzing the shapes of intermediate tensors, Koila can predict memory requirements and dynamically adjust batch sizes to fit available GPU memory, preventing out-of-memory errors. It also automatically splits batches into powers of two for potential speedups.
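
The idea above can be sketched without a GPU or PyTorch. The snippet below is a minimal illustration, not Koila's implementation: `LazyTensor`, `activation_bytes`, and `largest_fitting_batch` are hypothetical names, and the float32 element size, the two-layer linear model, and the fixed byte budget are assumptions chosen for the example. It tracks only shapes, estimates the bytes the intermediate tensors would need, and halves a power-of-two batch size until the estimate fits.

```python
from dataclasses import dataclass
from math import prod

BYTES_PER_FLOAT32 = 4  # assumption: every tensor holds float32 values


@dataclass(frozen=True)
class LazyTensor:
    """Hypothetical stand-in for a lazy tensor: it stores a shape, never data."""

    shape: tuple

    def matmul(self, other: "LazyTensor") -> "LazyTensor":
        # Shape inference only; nothing is allocated or multiplied here.
        if self.shape[-1] != other.shape[0]:
            raise ValueError("inner dimensions do not match")
        return LazyTensor(self.shape[:-1] + other.shape[1:])

    def nbytes(self) -> int:
        return prod(self.shape) * BYTES_PER_FLOAT32


def activation_bytes(batch: int, features: int, hidden: int, classes: int) -> int:
    """Estimate bytes for the input and both intermediates of a two-layer linear model."""
    x = LazyTensor((batch, features))
    h = x.matmul(LazyTensor((features, hidden)))
    y = h.matmul(LazyTensor((hidden, classes)))
    return x.nbytes() + h.nbytes() + y.nbytes()


def largest_fitting_batch(batch: int, budget_bytes: int, **dims: int) -> int:
    """Start at the largest power of two <= batch and halve until the estimate fits."""
    size = 1 << (batch.bit_length() - 1)
    while size > 1 and activation_bytes(size, **dims) > budget_bytes:
        size //= 2
    return size


if __name__ == "__main__":
    chosen = largest_fitting_batch(64, 100_000, features=1024, hidden=512, classes=10)
    print(chosen)
```

Because the estimate depends only on shapes, it can be computed before any real computation runs, which is the property that lazy evaluation provides. A real estimate would also have to account for weights, gradients, and allocator overhead, all of which this sketch ignores.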

Quick Start & Requirements

  • Install via pip: pip install koila
  • Requires PyTorch.
  • The main branch is being restructured; refer to the v0.1.1 tag for a working proof of concept.

Highlighted Details

  • Prevents CUDA out-of-memory errors with a single line of code.
  • Automatically accumulates gradients for large batch sizes.
  • Lazily evaluates PyTorch code to save computing power.
  • Splits batch dimensions automatically for GPU efficiency.
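
To make the gradient accumulation and batch splitting points concrete, here is a minimal pure-Python sketch. It is an illustration under stated assumptions, not Koila's code: the function names are hypothetical, the model is a one-parameter linear fit with a mean-squared-error loss, and the chunking policy (greedy powers of two) is one plausible reading of "splits batch dimensions". It shows that accumulating size-weighted gradients over smaller chunks reproduces the gradient of the full batch.

```python
from typing import List


def split_power_of_two(total: int, max_chunk: int) -> List[int]:
    """Greedily split `total` samples into power-of-two chunks of at most `max_chunk`."""
    chunks = []
    remaining = total
    while remaining > 0:
        # Largest power of two that fits both the cap and what is left.
        chunk = 1 << (min(remaining, max_chunk).bit_length() - 1)
        chunks.append(chunk)
        remaining -= chunk
    return chunks


def mse_grad(w: float, xs: List[float], ys: List[float]) -> float:
    """Gradient of mean((w * x - y) ** 2) with respect to w."""
    return sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)


def accumulated_grad(w: float, xs: List[float], ys: List[float], max_chunk: int) -> float:
    """Accumulate per-chunk gradients, weighting each chunk by its share of the batch."""
    total = len(xs)
    grad = 0.0
    start = 0
    for chunk in split_power_of_two(total, max_chunk):
        part_x = xs[start:start + chunk]
        part_y = ys[start:start + chunk]
        grad += mse_grad(w, part_x, part_y) * chunk / total
        start += chunk
    return grad


if __name__ == "__main__":
    xs = [float(i) for i in range(1, 11)]
    ys = [2.0 * x + 1.0 for x in xs]
    print(split_power_of_two(len(xs), 4))
    print(mse_grad(0.5, xs, ys), accumulated_grad(0.5, xs, ys, 4))
```

The weighting by `chunk / total` matters when chunks differ in size; without it, the smaller trailing chunks would count as much as the larger ones and the accumulated gradient would no longer match the full-batch result.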

Maintenance & Community

The project is currently undergoing a significant restructure, and the main branch is largely empty as a result. The v0.1.1 tag is the working proof of concept.

Licensing & Compatibility

  • License: Apache License.
  • Designed to wrap existing PyTorch code; compatibility is partial (see Limitations & Caveats).

Limitations & Caveats

The library is a work in progress and, because of limited development time, is not yet fully compatible with PyTorch. It is not recommended for production environments.

Health Check

  • Last commit: 2 months ago
  • Responsiveness: 1 day
  • Pull requests (30d): 0
  • Issues (30d): 0
  • Star history: 8 stars in the last 90 days
