keldenl/gpt-llama.cpp: API wrapper for local LLM inference, emulating OpenAI's GPT endpoints
This project provides a local API server that emulates OpenAI's GPT endpoints, allowing GPT-powered applications to run with local llama.cpp models. It targets developers and users seeking cost savings, enhanced privacy, and offline capabilities for their AI applications.
How It Works
gpt-llama.cpp acts as a middleware, routing requests intended for OpenAI's GPT APIs to a local instance of llama.cpp. This approach enables seamless integration with existing GPT-based applications by presenting a familiar API interface. It leverages llama.cpp's efficient C++ implementation for local model inference, supporting various quantization levels and model architectures.
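As a sketch of what "presenting a familiar API interface" means in practice: a GPT client keeps the standard OpenAI request shape and only changes where the request is sent and what it uses as the token. The port and model path below are hypothetical placeholders, and (per this project's convention) the bearer token carries the path of the local model file rather than a real API key:

```python
import json

# Hypothetical values: the port is whatever the server was started with,
# and the model path must point at your own local .bin model file.
BASE_URL = "http://localhost:8000/v1"
MODEL_PATH = "/absolute/path/to/models/ggml-model-q4_0.bin"

def build_chat_request(messages):
    """Build an OpenAI-style chat completion request aimed at gpt-llama.cpp.

    The only differences from a request to api.openai.com are the host and
    the Authorization token, which gpt-llama.cpp reads as the model path.
    """
    return {
        "url": BASE_URL + "/chat/completions",
        "headers": {
            "Authorization": "Bearer " + MODEL_PATH,
            "Content-Type": "application/json",
        },
        "body": json.dumps({"messages": messages}),
    }

req = build_chat_request([{"role": "user", "content": "Hello"}])
```

Because the request shape is unchanged, existing GPT-based applications need only be pointed at the local base URL to switch backends.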
Quick Start & Requirements
- Install: cd gpt-llama.cpp, then npm install.
- llama.cpp installation: follow the llama.cpp README for setup on macOS (ARM/Intel) or Windows. Python dependencies are installed via pip install -r requirements.txt within the llama.cpp directory.
- Run: npm start. Advanced configurations can be passed as arguments (e.g., PORT=8000 npm start mlock threads 8).
- Models must be in llama.cpp's .bin format.
Highlighted Details
- Benefits from ongoing upstream llama.cpp improvements.
- Works with GPT-powered applications including chatbot-ui, Auto-GPT, langchain, DiscGPT, and ChatGPT-Siri.
- Embeddings support can be enabled via a Python backend (EMBEDDINGS=py).
Maintenance & Community
Licensing & Compatibility
Limitations & Caveats
The authentication token for API requests must be set to the absolute path of the local llama model file. The test-installation.sh script is currently only supported on Mac.
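Since the "API key" here is really a file path, a hedged sketch of that caveat is a small helper (hypothetical, not part of the project) that builds the Authorization header and rejects relative paths before any request is sent:

```python
import os

def model_auth_header(model_path):
    """Return the Authorization header gpt-llama.cpp expects.

    The token is not a secret: it is the absolute path of the local llama
    model file, so anything relative is rejected up front.
    """
    if not os.path.isabs(model_path):
        raise ValueError("expected an absolute model path, got: " + model_path)
    return {"Authorization": "Bearer " + model_path}

header = model_auth_header("/models/ggml-model-q4_0.bin")
```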
Last updated 2 years ago; the project is currently inactive.