SDK for finetuning LLMs on consumer GPUs
LLMTools is a Python library for finetuning and running Large Language Models (LLMs) on consumer-grade GPUs with significantly reduced memory requirements. It targets researchers and developers who need to adapt LLMs in low-resource environments, enabling efficient finetuning through novel quantization techniques.
How It Works
LLMTools is built around the ModuLoRA algorithm, which combines low-precision LoRA finetuning with modular quantizers. This makes it possible to finetune LLMs quantized to 2-bit, 3-bit, and 4-bit precision, a significant advance over previous methods. The library's modular architecture supports a range of LLMs, quantizers (such as QuIP# and OPTQ), and optimization algorithms, making it easy to experiment and to integrate with the HuggingFace ecosystem.
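The underlying pattern, a frozen (quantized) base model with small trainable low-rank adapters on top, can be sketched with HuggingFace PEFT, which the library integrates with. The sketch below is illustrative only: it uses PEFT's generic LoRA API rather than llmtools' own quantized loaders, and the model name and adapter hyperparameters are assumptions, not values from this project.

```python
# Minimal LoRA-over-frozen-base sketch using HuggingFace PEFT.
# llmtools' 2/3/4-bit ModuLoRA loaders are not shown; the model id
# and hyperparameters below are illustrative assumptions.
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

base = AutoModelForCausalLM.from_pretrained("huggyllama/llama-7b")  # any causal LM

lora_cfg = LoraConfig(
    r=8,                                   # low-rank adapter dimension
    lora_alpha=16,                         # adapter scaling factor
    target_modules=["q_proj", "v_proj"],   # attention projections, per common LLaMA practice
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(base, lora_cfg)
model.print_trainable_parameters()  # only the LoRA weights receive gradients
```

Because only the adapter weights are trained, memory is dominated by the frozen base weights; quantizing those to 2-4 bits is what brings finetuning within reach of consumer GPUs.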
Quick Start & Requirements
Clone the repository with `git clone --recursive`, set up a conda environment (Python 3.9.18, PyTorch 2.1.1 with CUDA 12.1), install dependencies from `requirements.txt`, and then run `python setup.py install` for both `quiptools` and `llmtools`.
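A sketch of those steps as shell commands; the repository URL is elided above (a placeholder is used below), and the `quiptools` directory location is an assumption based on the package names.

```bash
# Sketch of the setup steps; <repository-url> is a placeholder (the URL is
# not given above) and the quiptools/ directory location is an assumption.
git clone --recursive <repository-url>
cd llmtools

conda create -n llmtools python=3.9.18 -y
conda activate llmtools
conda install -y pytorch==2.1.1 pytorch-cuda=12.1 -c pytorch -c nvidia  # CUDA 12.1 build

pip install -r requirements.txt

cd quiptools && python setup.py install && cd ..  # build the quantizer kernels
python setup.py install                           # install llmtools itself
```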
Maintenance & Community
This is a research project from Cornell University. Feedback can be sent to Junjie Oscar Yin and Volodymyr Kuleshov. The project cites foundational work from the Relax-ML Lab, HuggingFace PEFT, and the LLaMA/OPT/BLOOM models.
Licensing & Compatibility
The README does not explicitly state a license for the repository, and the project builds on other projects that may carry their own licenses. Suitability for commercial use or closed-source linking is not specified.
Limitations & Caveats
This is experimental work in progress: out-of-the-box support for additional LLMs and quantizers is still under development, and, as noted above, the README does not specify a license.