abacaj: CPU inference code for MPT-30B
Top 56.1% on SourcePulse
This repository provides Python code for running inference on the MPT-30B model using only a CPU, aimed at users who want to run large language models without expensive GPUs. It relies on a ggml-quantized version of the model and the ctransformers Python library for efficient CPU execution.
How It Works
The project leverages ggml, a C library for machine learning that enables efficient tensor operations on CPUs. By using a ggml quantized version of the MPT-30B model, the memory footprint and computational requirements are significantly reduced, making it feasible to run on consumer-grade hardware. The ctransformers library provides Python bindings to ggml, simplifying the integration and inference process.
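The inference path described above can be sketched in a few lines. The weights filename below is hypothetical (the actual path depends on what download_model.py fetches); the `AutoModelForCausalLM.from_pretrained` call and `model_type="mpt"` selector are the standard ctransformers API for ggml models.

```python
# Minimal CPU-inference sketch using ctransformers' ggml bindings.
# MODEL_PATH is a placeholder; point it at the quantized weights file
# produced by the repository's download_model.py script.

MODEL_PATH = "models/mpt-30b.ggml.bin"  # hypothetical filename


def generate(prompt: str, max_new_tokens: int = 64) -> str:
    """Load the quantized MPT-30B model on CPU and return a completion."""
    from ctransformers import AutoModelForCausalLM

    llm = AutoModelForCausalLM.from_pretrained(
        MODEL_PATH,
        model_type="mpt",  # tells ggml which architecture to expect
    )
    return llm(prompt, max_new_tokens=max_new_tokens)
```

Calling `generate(...)` triggers the full model load, so the first call pays the cost of reading the ~19GB weights file from disk.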
Quick Start & Requirements
pip install -r requirements.txt
python download_model.py
python inference.py

Highlighted Details
- ggml quantized model weights (approx. 19GB download)
- ctransformers Python library

Maintenance & Community
No specific information on contributors, sponsorships, or community channels is provided in the README.
Licensing & Compatibility
The repository does not explicitly state a license. Compatibility for commercial use or closed-source linking is not specified.
Limitations & Caveats
The project requires a substantial amount of RAM (32GB minimum). Performance benchmarks or comparisons to GPU inference are not yet available.
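The 32GB figure is consistent with back-of-envelope arithmetic, assuming roughly 4-bit quantization (the exact bits-per-weight of the ggml format used is an assumption, not stated in the source):

```python
# Rough RAM estimate for a 4-bit-quantized 30B-parameter model.
# The 0.5 extra bits per weight approximate ggml's per-block scale
# overhead; both figures are assumptions for illustration.
params = 30e9
bits_per_weight = 4 + 0.5
weight_bytes = params * bits_per_weight / 8
print(f"{weight_bytes / 1e9:.1f} GB")  # → 16.9 GB
```

That estimate lines up with the ~19GB download; add working buffers for activations and the KV cache, and a 32GB minimum is plausible.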