API wrapper for local LLM inference, emulating OpenAI's GPT endpoints
This project provides a local API server that emulates OpenAI's GPT endpoints, allowing GPT-powered applications to run with local llama.cpp models. It targets developers and users seeking cost savings, enhanced privacy, and offline capabilities for their AI applications.
How It Works
gpt-llama.cpp acts as a middleware, routing requests intended for OpenAI's GPT APIs to a local instance of llama.cpp. This approach enables seamless integration with existing GPT-based applications by presenting a familiar API interface. It leverages llama.cpp's efficient C++ implementation for local model inference, supporting various quantization levels and model architectures.
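As a concrete illustration of that drop-in pattern, the sketch below points the official openai client at the local server instead of api.openai.com. The port (8000) and model path are placeholders rather than project defaults, and the API key carries the path to the local model file, as noted under Limitations & Caveats.

```typescript
import OpenAI from "openai";

// Redirect an existing GPT-powered app to the local gpt-llama.cpp server.
// Assumes the server was started with `PORT=8000 npm start`; adjust the port
// and model path to your setup.
const client = new OpenAI({
  baseURL: "http://localhost:8000/v1",        // local server instead of api.openai.com
  apiKey: "/absolute/path/to/your/model.bin", // auth token = path to the local llama model
});

const completion = await client.chat.completions.create({
  model: "gpt-3.5-turbo",                     // the request is routed to the local model
  messages: [{ role: "user", content: "Hello from a local model!" }],
});

console.log(completion.choices[0].message.content);
```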
Quick Start & Requirements
Setup: cd gpt-llama.cpp, then npm install.
Requires a working llama.cpp installation. Follow the llama.cpp README for setup on macOS (ARM/Intel) or Windows. Python dependencies are installed via pip install -r requirements.txt within the llama.cpp directory.
Start the server with npm start. Advanced configurations can be passed as arguments (e.g., PORT=8000 npm start mlock threads 8).
Requires a local llama model (in .bin format).
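Once the server is running, a quick sanity check is a single chat-completion request. This is a sketch assuming the server was started with PORT=8000 npm start; the model path is a placeholder.

```typescript
// Minimal smoke test for a running gpt-llama.cpp server (port and model path are placeholders).
const MODEL_PATH = "/absolute/path/to/your/model.bin";

const res = await fetch("http://localhost:8000/v1/chat/completions", {
  method: "POST",
  headers: {
    "Content-Type": "application/json",
    Authorization: `Bearer ${MODEL_PATH}`, // the model path doubles as the API key
  },
  body: JSON.stringify({
    messages: [{ role: "user", content: "Reply with a single short sentence." }],
  }),
});

console.log(res.status, JSON.stringify(await res.json(), null, 2));
```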
Highlighted Details
Benefits from ongoing upstream llama.cpp improvements.
Compatible with chatbot-ui, Auto-GPT, langchain, DiscGPT, and ChatGPT-Siri.
Supports embeddings (enabled with EMBEDDINGS=py).
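If embeddings are enabled with EMBEDDINGS=py, a request shaped like OpenAI's embeddings call can be sent to the same server. The sketch below assumes an OpenAI-style /v1/embeddings route is exposed and again uses a placeholder port and model path.

```typescript
// Sketch of an embeddings request (assumes gpt-llama.cpp was started with EMBEDDINGS=py
// and exposes an OpenAI-style /v1/embeddings route; port and model path are placeholders).
const MODEL_PATH = "/absolute/path/to/your/model.bin";

const res = await fetch("http://localhost:8000/v1/embeddings", {
  method: "POST",
  headers: {
    "Content-Type": "application/json",
    Authorization: `Bearer ${MODEL_PATH}`,
  },
  body: JSON.stringify({ input: "Text to embed locally." }),
});

const body = await res.json();
console.log("embedding length:", body.data?.[0]?.embedding?.length);
```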
Maintenance & Community
Licensing & Compatibility
Limitations & Caveats
The authentication token for API requests must be set to the absolute path of the local llama model file. The test-installation.sh
script is currently only supported on Mac.