languagemodels by jncraton

Python SDK for local LLM exploration

created 2 years ago
1,197 stars

Top 33.4% on sourcepulse

View on GitHub
Project Summary

This Python package provides building blocks for exploring large language models (LLMs) locally, even on systems with as little as 512MB of RAM. It targets developers, researchers, and educators seeking a simple, privacy-preserving way to integrate LLM capabilities into applications, offering faster CPU inference than Hugging Face Transformers.

How It Works

The package leverages CTranslate2 and int8 quantization for efficient CPU inference, significantly reducing memory footprint and improving speed. Users can adjust the max_ram configuration to select progressively larger and more capable models, with options for GPU acceleration via CUDA. It also includes utilities for external data retrieval (web, weather, date) and semantic document storage for context augmentation.

Quick Start & Requirements

  • Install via pip: pip install languagemodels
  • Initial run downloads ~250MB of model data.
  • Optional GPU acceleration requires NVIDIA GPU with CUDA.
  • See examples for usage.

Highlighted Details

  • Outperforms Hugging Face Transformers on CPU inference (11s vs 22s for 20 questions, 0.34GB vs 1.77GB RAM).
  • Supports instruction following, text completion, and semantic search.
  • Includes external retrieval helpers for real-time data and web content.
  • Configurable model selection based on available RAM, from 248M to 7B parameters.

Maintenance & Community

The project appears to be maintained by a single author, jncraton. There are no explicit links to community channels or roadmaps provided in the README.

Licensing & Compatibility

The package itself is licensed for commercial use. However, users must verify the licenses of the specific models used, as they may not be commercially compatible. The require_model_license function can filter models by license type (e.g., "apache|bsd|mit").
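A brief sketch of license filtering, using the `require_model_license` function named above (the regex-style pattern is taken from the README):

```python
import languagemodels as lm

# Exclude any model whose license does not match the pattern,
# restricting selection to permissively licensed models
lm.require_model_license("apache|bsd|mit")

result = lm.do("What is the capital of France?")
print(result)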

Limitations & Caveats

The default models are significantly smaller than state-of-the-art LLMs and are primarily intended for learning and exploration. Quantization improves speed and memory use with only a negligible impact on output quality. Model license compatibility for commercial use must be independently verified.

Health Check
Last commit

1 week ago

Responsiveness

1 day

Pull Requests (30d)
3
Issues (30d)
2
Star History
9 stars in the last 90 days
