edgen by edgenai

Local GenAI server for private, offline AI

created 1 year ago
364 stars

Top 78.4% on sourcepulse

Project Summary

Edgen provides a local, private GenAI server that acts as a drop-in replacement for OpenAI's API, targeting developers and users who need on-device inference for LLMs, speech-to-text, and other AI models. It offers data privacy, reliability, and cost savings by leveraging user hardware, eliminating the need for cloud infrastructure and API keys.

How It Works

Built in Rust, Edgen is natively compiled for Windows, macOS, and Linux, abstracting the complexities of AI model optimization across different hardware and platforms. It supports model caching to avoid redundant downloads and offers modularity for easy integration of new models and runtimes. The server implements an OpenAI-compatible API, exposing endpoints for chat completions, audio transcriptions, and embeddings, with plans for image generation and multimodal chat.
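
Because the API mirrors OpenAI's, an existing OpenAI client can usually be pointed at the local server just by changing the base URL. Below is a minimal sketch using only the Python standard library; the 127.0.0.1:33322 address and the model name are assumptions for illustration, not values from the README, so check Edgen's documentation for the actual defaults.

```python
import json
import urllib.request

def build_chat_request(base_url, messages, model="default"):
    """Build an OpenAI-style chat completion request aimed at a local server."""
    url = f"{base_url}/v1/chat/completions"
    payload = {"model": model, "messages": messages}
    req = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    return req, payload

# Build (but do not send) a request against an assumed local address.
req, payload = build_chat_request(
    "http://127.0.0.1:33322",  # hypothetical default; see Edgen's docs
    [{"role": "user", "content": "Hello!"}],
)

# Against a running server, the response follows the OpenAI schema:
#   with urllib.request.urlopen(req) as resp:
#       reply = json.load(resp)["choices"][0]["message"]["content"]
```

Note that no API key header is needed, which is one of the points of running inference locally.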

Quick Start & Requirements

  • Install/Run: Download and run the binary. The serve command starts the server (default behavior).
  • Prerequisites: No Docker required. Optional GPU support via Vulkan, CUDA, or Metal requires respective SDKs/Toolkits.
  • Resources: The README does not state a specific resource footprint; the server is designed to run on local user hardware.
  • Links: Documentation, Blog, Discord, Roadmap, EdgenChat

Highlighted Details

  • OpenAI-compatible API for seamless integration.
  • Supports LLMs (Llama2, Mistral, Mixtral) and Whisper speech-to-text.
  • Native compilation for Windows, macOS, and Linux; no Docker needed.
  • Optional GPU acceleration via Vulkan, CUDA, and Metal.
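
The Whisper endpoint follows the same OpenAI convention: an audio file and a model field POSTed as a multipart form to /v1/audio/transcriptions. A standard-library sketch of building such a request is below; the server address, model name, and content type are illustrative assumptions, not values from the README.

```python
import io
import urllib.request
import uuid

def build_transcription_request(base_url, audio_bytes, filename, model="whisper"):
    """Build an OpenAI-style multipart request for audio transcription."""
    boundary = uuid.uuid4().hex
    body = io.BytesIO()

    def part(name, value, fname=None, ctype=None):
        # One multipart/form-data field, with optional filename and content type.
        body.write(f"--{boundary}\r\n".encode())
        disposition = f'Content-Disposition: form-data; name="{name}"'
        if fname:
            disposition += f'; filename="{fname}"'
        body.write((disposition + "\r\n").encode())
        if ctype:
            body.write(f"Content-Type: {ctype}\r\n".encode())
        body.write(b"\r\n")
        body.write(value if isinstance(value, bytes) else value.encode())
        body.write(b"\r\n")

    part("model", model)
    part("file", audio_bytes, fname=filename, ctype="audio/wav")
    body.write(f"--{boundary}--\r\n".encode())

    return urllib.request.Request(
        f"{base_url}/v1/audio/transcriptions",
        data=body.getvalue(),
        headers={"Content-Type": f"multipart/form-data; boundary={boundary}"},
        method="POST",
    )

# Build a request against an assumed local address (not sent here).
req = build_transcription_request("http://127.0.0.1:33322", b"\x00\x01", "clip.wav")
```

Clients written against OpenAI's audio API should therefore work unchanged once pointed at the local server.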

Maintenance & Community

  • Active development with a public roadmap.
  • Community support via Discord and GitHub discussions/issues.
  • Acknowledges contributions from llama.cpp, whisper.cpp, and ggml.

Licensing & Compatibility

  • License not explicitly stated in the README. Compatibility for commercial or closed-source use is not specified.

Limitations & Caveats

Image generation and multimodal chat completions are listed as future features. The README notes that Vulkan, CUDA, and Metal GPU features cannot be enabled simultaneously.

Health Check

  • Last commit: 1 year ago
  • Responsiveness: 1 week
  • Pull Requests (30d): 0
  • Issues (30d): 0

Star History

  • 10 stars in the last 90 days

Explore Similar Projects

Starred by Jeff Hammerbacher (Cofounder of Cloudera), Stas Bekman (Author of Machine Learning Engineering Open Book; Research Engineer at Snowflake), and 2 more.

gpustack by gpustack

  • GPU cluster manager for AI model deployment
  • Top 1.6%; 3k stars; created 1 year ago; updated 2 days ago
Starred by Chip Huyen (Author of AI Engineering, Designing Machine Learning Systems), Andre Zayarni (Cofounder of Qdrant), and 2 more.

RealChar by Shaunwei

  • Real-time AI character/companion creation and interaction codebase
  • Top 0.1%; 6k stars; created 2 years ago; updated 1 year ago
Starred by Chip Huyen (Author of AI Engineering, Designing Machine Learning Systems), Pietro Schirano (Founder of MagicPath), and 1 more.

SillyTavern by SillyTavern

  • LLM frontend for power users
  • Top 3.2%; 17k stars; created 2 years ago; updated 3 days ago
Starred by Andrej Karpathy (Founder of Eureka Labs; formerly at Tesla, OpenAI; Author of CS 231n), Nat Friedman (Former CEO of GitHub), and 32 more.

llama.cpp by ggml-org

  • C/C++ library for local LLM inference
  • Top 0.4%; 84k stars; created 2 years ago; updated 13 hours ago