gpt-llama.cpp by keldenl

API wrapper for local LLM inference, emulating OpenAI's GPT endpoints

created 2 years ago
598 stars

Top 55.3% on sourcepulse

Project Summary

This project provides a local API server that emulates OpenAI's GPT endpoints, allowing GPT-powered applications to run with local llama.cpp models. It targets developers and users seeking cost savings, enhanced privacy, and offline capabilities for their AI applications.

How It Works

gpt-llama.cpp acts as middleware, routing requests intended for OpenAI's GPT APIs to a local instance of llama.cpp. Because it presents the familiar OpenAI API interface, existing GPT-based applications can integrate with it simply by pointing at the local server. It leverages llama.cpp's efficient C++ implementation for local model inference, supporting various quantization levels and model architectures.
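As a sketch of what the drop-in replacement means in practice, a client only has to aim its OpenAI-style requests at the local server. The endpoint path (/v1/chat/completions) is OpenAI's own; the localhost host, port 8000 (matching the PORT example under Quick Start), and the model path shown are illustrative assumptions, not values from this project's docs:

```python
import json

LOCAL_BASE = "http://localhost:8000"  # assumed; set via PORT=8000 npm start

def chat_request(model_path, messages):
    """Build (url, headers, body) for an OpenAI-style chat completion call,
    redirected to the local gpt-llama.cpp server."""
    url = f"{LOCAL_BASE}/v1/chat/completions"
    headers = {
        # gpt-llama.cpp reads the local model path from the usual API-key slot
        "Authorization": f"Bearer {model_path}",
        "Content-Type": "application/json",
    }
    body = json.dumps({"messages": messages})
    return url, headers, body

url, headers, body = chat_request(
    "/models/ggml-model-q4_0.bin",
    [{"role": "user", "content": "Hello!"}],
)
```

Everything except the base URL and the key-as-model-path convention is a standard OpenAI request, which is why unmodified GPT clients can talk to it.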

Quick Start & Requirements

  • Installation: Clone the repository, cd gpt-llama.cpp, npm install.
  • Prerequisites: Requires a pre-configured llama.cpp installation. Follow the llama.cpp README for setup on macOS (ARM/Intel) or Windows. Python dependencies are installed via pip install -r requirements.txt within the llama.cpp directory.
  • Running: npm start. Advanced configurations can be passed as arguments (e.g., PORT=8000 npm start mlock threads 8).
  • Resources: Requires local model files (e.g., .bin format).
  • Docs: Swagger API docs are available after starting the server.

Highlighted Details

  • Drop-in replacement for OpenAI GPT APIs.
  • Supports interactive mode for faster chat responses.
  • Automatic adoption of llama.cpp improvements.
  • Tested compatibility with applications like chatbot-ui, Auto-GPT, langchain, DiscGPT, and ChatGPT-Siri.
  • Supports embeddings via sentence transformers (EMBEDDINGS=py).
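The embeddings support above mirrors OpenAI's /v1/embeddings endpoint, with EMBEDDINGS=py delegating to sentence transformers on the server side. A minimal sketch of the client-side request (host and port are assumptions; the input field is OpenAI's standard embeddings parameter):

```python
import json

def embeddings_request(text, base_url="http://localhost:8000"):
    """Build an OpenAI-style /v1/embeddings request aimed at the local server."""
    url = f"{base_url}/v1/embeddings"
    body = json.dumps({"input": text})
    return url, body

url, body = embeddings_request("local inference, no cloud round-trip")
```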

Maintenance & Community

  • Development added Langchain and Openplayground support, though the last commit is now roughly two years old.
  • Community support available via a Discord server.

Licensing & Compatibility

  • Licensed under the MIT License, permitting commercial use and integration with closed-source applications.

Limitations & Caveats

The authentication token for API requests must be set to the absolute path of the local llama model file. The test-installation.sh script is currently only supported on Mac.
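Because the "API key" is really the absolute path to the model file, the most common misconfiguration is passing a relative path. A hypothetical client-side guard (the helper name and the sample path are illustrative, not from this project):

```python
import os

def auth_header(model_path):
    """Build the Authorization header gpt-llama.cpp expects: the bearer
    token is the absolute path to the local llama model file."""
    if not os.path.isabs(model_path):
        raise ValueError(f"model path must be absolute, got: {model_path}")
    return {"Authorization": f"Bearer {model_path}"}
```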

Health Check

  • Last commit: 2 years ago
  • Responsiveness: 1 day
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 0 stars in the last 90 days
