edgen by edgenai

Local GenAI server for private, offline AI

Created 1 year ago
367 stars

Top 76.8% on SourcePulse

Project Summary

Edgen provides a local, private GenAI server that acts as a drop-in replacement for OpenAI's API, targeting developers and users who need on-device inference for LLMs, speech-to-text, and other AI models. It offers data privacy, reliability, and cost savings by leveraging user hardware, eliminating the need for cloud infrastructure and API keys.

How It Works

Built in Rust, Edgen is natively compiled for Windows, macOS, and Linux, abstracting the complexities of AI model optimization across different hardware and platforms. It supports model caching to avoid redundant downloads and offers modularity for easy integration of new models and runtimes. The server implements an OpenAI-compatible API, exposing endpoints for chat completions, audio transcriptions, and embeddings, with plans for image generation and multimodal chat.
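To illustrate the OpenAI-compatible surface, the sketch below builds a chat-completion request of the kind Edgen's `/v1/chat/completions` endpoint is designed to accept. The base URL, port, and model name here are assumptions for illustration, not documented defaults; consult the Edgen documentation for the actual values.

```python
import json
import urllib.request

# Assumed local Edgen endpoint -- the port is an illustration, not a
# documented default; check the Edgen docs for the real value.
EDGEN_CHAT_URL = "http://localhost:33322/v1/chat/completions"

def build_chat_request(prompt: str, model: str = "default") -> urllib.request.Request:
    """Build (but do not send) an OpenAI-style chat completion request."""
    body = json.dumps({
        "model": model,  # Edgen resolves this to a locally cached model
        "messages": [{"role": "user", "content": prompt}],
    }).encode()
    return urllib.request.Request(
        EDGEN_CHAT_URL,
        data=body,
        headers={"Content-Type": "application/json"},  # no API key required
        method="POST",
    )

req = build_chat_request("Hello from a local GenAI server!")
```

Because the API mirrors OpenAI's, existing OpenAI client libraries can usually be pointed at the local server simply by overriding their base URL.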

Quick Start & Requirements

  • Install/Run: Download and run the binary. The serve command starts the server (default behavior).
  • Prerequisites: No Docker required. Optional GPU support via Vulkan, CUDA, or Metal requires respective SDKs/Toolkits.
  • Resources: The README does not state a specific resource footprint; Edgen is designed to run on local consumer hardware.
  • Links: Documentation, Blog, Discord, Roadmap, EdgenChat
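Once the server is running, the embeddings endpoint can be exercised the same way as any OpenAI-compatible API. The sketch below constructs such a request with the standard library only; the base URL and port are assumptions for illustration, and no API key is needed for a local server.

```python
import json
import urllib.request

# Assumed local Edgen base URL; the port is illustrative, not a
# documented default.
EDGEN_BASE = "http://localhost:33322/v1"

def build_embeddings_request(text: str, model: str = "default") -> urllib.request.Request:
    """Build (but do not send) an OpenAI-style embeddings request."""
    body = json.dumps({"model": model, "input": text}).encode()
    return urllib.request.Request(
        f"{EDGEN_BASE}/embeddings",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = build_embeddings_request("local-first, private AI")
```

Sending it with `urllib.request.urlopen(req)` would return an OpenAI-style JSON body containing the embedding vector, assuming the server is up and a model is cached.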

Highlighted Details

  • OpenAI-compatible API for seamless integration.
  • Supports LLMs (Llama2, Mistral, Mixtral) and Whisper speech-to-text.
  • Native compilation for Windows, macOS, and Linux; no Docker needed.
  • Optional GPU acceleration via Vulkan, CUDA, and Metal.

Maintenance & Community

  • Active development with a public roadmap.
  • Community support via Discord and GitHub discussions/issues.
  • Acknowledges contributions from llama.cpp, whisper.cpp, and ggml.

Licensing & Compatibility

  • License not explicitly stated in the README. Compatibility for commercial or closed-source use is not specified.

Limitations & Caveats

Image generation and multimodal chat completions are listed as future features. The README notes that Vulkan, CUDA, and Metal GPU features cannot be enabled simultaneously.

Health Check

  • Last Commit: 1 year ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 3 stars in the last 30 days
