localllm by GoogleCloudPlatform

CLI tool for running LLMs locally on Cloud Workstations

created 1 year ago
1,554 stars

Top 27.3% on sourcepulse

Project Summary

This repository provides local-llm, a tool for running quantized Large Language Models (LLMs) locally, primarily targeting Google Cloud Workstations. It simplifies deploying and interacting with LLMs such as Llama-2, giving developers and researchers a managed environment for experimenting with these models without complex local setup.

How It Works

The project leverages llama-cpp-python's webserver to serve quantized LLMs. It provides a Dockerfile for building a custom Cloud Workstations image and a CLI tool (local-llm) for managing model downloads, serving, and interaction. The CLI abstracts away the complexities of model loading and API exposure, allowing users to run models with simple commands.
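A minimal sketch of this flow, assuming the CLI is invoked as local-llm (as named above) and that the Hugging Face model repo and port are placeholders; llama-cpp-python's web server does expose OpenAI-compatible routes, so a plain HTTP request can exercise a served model:

    # Serve a quantized model on a local port (subcommand name from the
    # CLI list below; the model repo and port are illustrative).
    local-llm run TheBloke/Llama-2-13B-chat-GGUF 8000

    # llama-cpp-python's web server exposes OpenAI-compatible endpoints,
    # so the served model can be queried over plain HTTP.
    curl http://localhost:8000/v1/chat/completions \
      -H "Content-Type: application/json" \
      -d '{"messages": [{"role": "user", "content": "Say hello."}]}'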

Quick Start & Requirements

  • Installation: pip3 install ./local-llm/. (after cloning the repo; a hedged command sequence follows this list)
  • Prerequisites: Google Cloud Project, gcloud CLI, Docker. The setup involves creating Cloud Workstation clusters and configurations, which can take up to 20 minutes.
  • Recommended Machine Type: e2-standard-32 (32 vCPUs, 16 physical cores, 128 GB memory).
  • Model Cache: Assumes models are downloaded to ~/.cache/huggingface/hub/ and supports .gguf files.
  • Documentation: served models expose OpenAPI documentation for the HTTP API.
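A hedged install sequence consistent with the bullets above (the repository URL follows from the project header; the package directory name is taken from the Installation bullet). The gcloud cluster and configuration setup is a longer sequence covered in the repository's README and is omitted here.

    # Clone the repository and install the CLI into the current Python
    # environment (directory name as given in the Installation bullet).
    git clone https://github.com/GoogleCloudPlatform/localllm.git
    cd localllm
    pip3 install ./local-llm/.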

Highlighted Details

  • Streamlined deployment on Google Cloud Workstations via a comprehensive gcloud command sequence.
  • local-llm CLI for managing the model lifecycle: run, list, ps, kill, pull, rm (an illustrative session follows this list).
  • Supports specific model files (e.g., quantized versions like Q4_K_S.gguf).
  • Includes a querylocal.py script for direct model interaction.
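An illustrative lifecycle session, assuming the subcommands behave like their Docker namesakes; only the subcommand names come from the list above, and all arguments are placeholders:

    # Download a model into the Hugging Face cache (~/.cache/huggingface/hub/).
    local-llm pull TheBloke/Llama-2-13B-chat-GGUF
    # Serve it on a port, then list running models.
    local-llm run TheBloke/Llama-2-13B-chat-GGUF 8000
    local-llm ps
    # Stop the running server and delete the cached weights.
    local-llm kill 8000
    local-llm rm TheBloke/Llama-2-13B-chat-GGUF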

Maintenance & Community

  • Developed by Google Cloud Platform.
  • No community links (Discord, Slack) or roadmap are mentioned in the README.

Licensing & Compatibility

  • The README does not specify a license for the local-llm tool itself.
  • It facilitates running freely available LLMs; each model remains governed by its own license, which users must observe.

Limitations & Caveats

The project is primarily designed for Google Cloud Workstations, and running it "locally" outside this environment might require manual setup of dependencies. The README also includes a disclaimer regarding the verification and liability of content generated by the LLMs.

Health Check

  • Last commit: 1 year ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 17 stars in the last 90 days
