CLI tool for running LLMs locally on Cloud Workstations
This repository provides a tool, local-llm, for running quantized Large Language Models (LLMs) locally, primarily targeting Google Cloud Workstations. It simplifies deploying and interacting with LLMs such as Llama 2, giving developers and researchers a managed environment for experimenting with these models without a complex local setup.
How It Works
The project leverages llama-cpp-python's web server to serve quantized LLMs. It provides a Dockerfile for building a custom Cloud Workstations image and a CLI tool (local-llm) for managing model downloads, serving, and interaction. The CLI abstracts away the complexities of model loading and API exposure, allowing users to run models with simple commands.
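Because serving is handled by llama-cpp-python's web server, a running model exposes an OpenAI-compatible HTTP API. As a minimal sketch, assuming a model is already being served on localhost port 8000 (substitute whichever port your instance actually uses), a completion request looks like this:

```sh
# Query the llama-cpp-python web server's OpenAI-compatible completions endpoint.
# Port 8000 is an assumption; use the port your model server was started on.
curl -s http://localhost:8000/v1/completions \
  -H "Content-Type: application/json" \
  -d '{"prompt": "Explain quantization in one sentence.", "max_tokens": 64}'
```

The same server also exposes /v1/chat/completions for chat-style requests.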
Quick Start & Requirements
- Install: clone the repo, then pip3 install ./local-llm/.
- Requirements: gcloud CLI and Docker. Setup involves creating Cloud Workstation clusters and configurations, which can take up to 20 minutes (see the sketch below).
- Recommended machine type: e2-standard-32 (32 vCPU, 16 cores, 128 GB memory).
- Models are cached in ~/.cache/huggingface/hub/, and .gguf files are supported.
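Cluster and configuration creation is driven by gcloud. The following is an illustrative sketch only: the cluster, config, and workstation names, the region, and the Artifact Registry path for the image built from the repo's Dockerfile are all assumptions, so follow the repo's documented gcloud command sequence for the exact flags.

```sh
# Illustrative only: names, region, and image path below are assumptions;
# the repository documents the exact gcloud sequence it expects.
gcloud workstations clusters create my-cluster --region=us-central1

gcloud workstations configs create my-config \
  --cluster=my-cluster --region=us-central1 \
  --machine-type=e2-standard-32 \
  --container-custom-image=us-central1-docker.pkg.dev/MY_PROJECT/my-repo/local-llm:latest

gcloud workstations create my-workstation \
  --cluster=my-cluster --config=my-config --region=us-central1
```

Cluster creation is the long-running step, which is where the up-to-20-minute setup time comes from.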
Highlighted Details
- Environment setup driven by a gcloud command sequence (sketched above).
- local-llm CLI for managing the model lifecycle: run, list, ps, kill, pull, rm (see the example after this list).
- Quantized GGUF models (e.g., Q4_K_S.gguf).
- querylocal.py script for direct model interaction.
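A typical model lifecycle with the CLI might look like the session below. The subcommand names come from the list above, but the model identifier and port arguments are assumptions made for illustration; check local-llm --help for the exact syntax.

```sh
# Subcommands are from the CLI above; the model ID and port arguments
# are illustrative assumptions, not documented syntax.
local-llm pull TheBloke/Llama-2-13B-chat-GGUF       # download a quantized model
local-llm run  TheBloke/Llama-2-13B-chat-GGUF 8000  # serve it locally
local-llm ps                                        # show running model servers
local-llm list                                      # list downloaded models
local-llm kill 8000                                 # stop the server
local-llm rm   TheBloke/Llama-2-13B-chat-GGUF       # remove the cached model
```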
Maintenance & Community
The last commit was about a year ago, and the project is marked inactive.
Licensing & Compatibility
The repository's license applies to the local-llm tool itself.
Limitations & Caveats
The project is designed primarily for Google Cloud Workstations; running it locally outside that environment may require manually setting up its dependencies. The README also includes a disclaimer that content generated by the LLMs should be verified and that no liability is assumed for it.