openrouter-runner  by OpenRouterTeam

Inference engine for deploying open-source models

Created 2 years ago
1,163 stars

Top 33.3% on SourcePulse

Project Summary

This project provides an inference engine for deploying open-source AI models on OpenRouter.ai, targeting developers and researchers who want to host and serve models efficiently. It aims to make model deployment faster and cheaper, with incentives for performance improvements.

How It Works

The OpenRouter Runner is a monolithic inference engine built on the Modal platform. It leverages Modal's serverless infrastructure to deploy and manage various open-source models. The core design involves defining model-specific containers (e.g., using vLLM) and registering them within the runner's configuration. This approach allows for scalable and on-demand inference, abstracting away much of the underlying infrastructure management.
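The "define a container, then register it in the configuration" pattern described above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the runner's actual code: ContainerSpec, register, resolve, the model ids, and the GPU labels are all hypothetical names chosen for the example, and no Modal APIs are used.

```python
from dataclasses import dataclass
from typing import Callable, Dict

# Hypothetical illustration only: none of these names come from the
# openrouter-runner codebase.


@dataclass(frozen=True)
class ContainerSpec:
    """Describes how one model should be served."""
    model_id: str  # e.g. a Hugging Face repo id
    engine: str    # e.g. "vllm"
    gpu: str       # label for the hardware the container requests


# Maps a lowercased model id to a factory that builds its spec on demand.
_REGISTRY: Dict[str, Callable[[], ContainerSpec]] = {}


def register(model_id: str, engine: str = "vllm", gpu: str = "A100") -> None:
    """Add a model to the registry without constructing anything yet."""
    key = model_id.lower()
    if key in _REGISTRY:
        raise ValueError(f"model already registered: {model_id}")
    _REGISTRY[key] = lambda: ContainerSpec(model_id, engine, gpu)


def resolve(model_id: str) -> ContainerSpec:
    """Look up the spec for a requested model id (case-insensitive)."""
    try:
        factory = _REGISTRY[model_id.lower()]
    except KeyError:
        raise KeyError(f"unknown model: {model_id}") from None
    return factory()


# Adding a model that fits the default container is a one-line change;
# a model with unusual needs overrides the defaults.
register("example-org/example-7b")
register("example-org/example-70b", gpu="H100")

spec = resolve("Example-Org/Example-70B")
print(spec.engine, spec.gpu)
```

Keeping the registry as a table of factories means a spec is only built when a request for that model arrives, which loosely mirrors the on-demand behaviour the summary attributes to the serverless setup.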

Quick Start & Requirements

  • Install/Run: poetry install, poetry shell, modal token new, modal environment create dev, modal config set-environment dev, modal secret create ..., modal run runner::download, modal deploy runner.
  • Prerequisites: Modal account, Hugging Face account (with token), Poetry installed.
  • Setup: Requires configuring Modal secrets for Hugging Face token, runner API key, and optionally Sentry/Datadog. Model download can take time depending on size.
  • Docs: Adding Models To OpenRouter (Video)
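
Before running the commands above, it can help to confirm the local prerequisites. The following is a small sketch, not part of the project: the function name missing_prerequisites and the environment variable name HUGGINGFACE_TOKEN are assumptions made for illustration (the project itself stores the token as a Modal secret), and only Poetry is checked on PATH because the modal CLI may only become available after poetry install.

```python
import os
import shutil
from typing import Callable, List, Mapping, Optional


def missing_prerequisites(
    which: Callable[[str], Optional[str]] = shutil.which,
    env: Mapping[str, str] = os.environ,
    token_var: str = "HUGGINGFACE_TOKEN",  # assumed name, not from the project
) -> List[str]:
    """Return human-readable descriptions of prerequisites that look unmet."""
    problems: List[str] = []
    # Poetry must be installed before any of the listed commands can run.
    if which("poetry") is None:
        problems.append("'poetry' not found on PATH")
    # A Hugging Face token is needed later when creating the Modal secret.
    if not env.get(token_var):
        problems.append(f"environment variable {token_var} is not set")
    return problems
```

Calling missing_prerequisites() with no arguments inspects the real PATH and environment; the which and env parameters exist only so the check can be exercised with fake values. A Modal account cannot be verified this way and still has to be set up through modal token new.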

Highlighted Details

  • Serves as the inference engine for openrouter.ai.
  • Incentivizes community contributions for faster and cheaper model serving.
  • Supports adding new models by updating configuration or creating new container definitions.
  • Includes a testing framework with example scripts.

Maintenance & Community

  • Open to contributions for adding more open-source models.
  • Encourages adherence to a code of conduct for community health.

Licensing & Compatibility

  • The README does not explicitly state a license. Compatibility for commercial use or closed-source linking is not specified.

Limitations & Caveats

  • Creating new containers for models with unique requirements can be complex.
  • The project does not explicitly state its license, which may impact commercial adoption.

Health Check

  • Last Commit: 1 month ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 0
  • Issues (30d): 0

Star History

15 stars in the last 30 days
