recipes by vllm-project

LLM inference recipes

Created 5 months ago
330 stars

Top 83.1% on SourcePulse

Project Summary

This repository provides a curated collection of community-maintained recipes for running the vLLM inference engine with a wide array of large language models. It targets engineers and researchers seeking practical, ready-to-use configurations for deploying specific models on diverse hardware for various tasks, simplifying the often complex process of model deployment.

How It Works

The project functions as a central hub for practical examples, answering the recurring question of how to run a specific model (such as Llama, Qwen, or DeepSeek) with vLLM. Each recipe typically bundles configuration files, command-line examples, and sometimes helper scripts tailored to a particular model architecture, version, and task (e.g., OCR, vision-language). The community-driven approach helps keep recipes current and broad in coverage.
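As an illustrative sketch only (the model ID, flag values, and hardware assumptions below are guesses, not taken from any specific recipe), the core of a recipe often reduces to a single vllm serve invocation:

    # Hypothetical serving command; real recipes pin their own model ID and flags.
    # --tensor-parallel-size shards the model across 2 GPUs;
    # --max-model-len caps the context window to fit GPU memory.
    vllm serve meta-llama/Llama-3.1-8B-Instruct \
        --tensor-parallel-size 2 \
        --max-model-len 8192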

Quick Start & Requirements

To build the documentation locally, users should set up a virtual environment (uv venv), activate it (source .venv/bin/activate), install dependencies (uv pip install -r requirements.txt), and then serve the documentation (uv run mkdocs serve). Specific hardware or software prerequisites for running the models themselves are detailed within individual recipes, not in the main README.
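In sequence, those documented steps are:

    uv venv                              # create a virtual environment
    source .venv/bin/activate            # activate it
    uv pip install -r requirements.txt   # install the documentation dependencies
    uv run mkdocs serve                  # build and serve the docs locally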

Highlighted Details

  • Extensive model support, including recipes for DeepSeek, Ernie, GLM, Llama (e.g., Llama3.3-70B, Llama3.1), MiniMax, Moonshotai, OpenAI (gpt-oss), PaddlePaddle, Qwen (e.g., Qwen2.5-VL), Seed, and Tencent-Hunyuan models.
  • Covers diverse tasks such as OCR and Vision-Language (VL) capabilities (see the sketch after this list).
  • Community-driven contribution model encourages ongoing expansion and updates.
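To make the VL point concrete: once a vision-language model such as Qwen2.5-VL is served, vLLM exposes an OpenAI-compatible endpoint that accepts image inputs. The model ID, port, and image URL below are placeholders, not taken from any specific recipe:

    # Hypothetical request; assumes a Qwen2.5-VL model served on the default port 8000.
    curl http://localhost:8000/v1/chat/completions \
      -H "Content-Type: application/json" \
      -d '{
            "model": "Qwen/Qwen2.5-VL-7B-Instruct",
            "messages": [{
              "role": "user",
              "content": [
                {"type": "text", "text": "Transcribe the text in this image."},
                {"type": "image_url", "image_url": {"url": "https://example.com/receipt.png"}}
              ]
            }]
          }'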

Maintenance & Community

The repository relies on community contributions via Pull Requests (PRs) to add new recipes or improve existing ones. While specific community channels like Discord/Slack are not mentioned, the contribution model itself fosters a collaborative environment.

Licensing & Compatibility

The project is licensed under the Apache License 2.0. This permissive license generally allows for commercial use and integration into closed-source projects, with standard attribution and notice requirements.

Limitations & Caveats

This repository focuses on running existing models with vLLM and does not provide the models themselves or the vLLM engine. Users are expected to have vLLM installed and to adapt the provided recipes to their specific environment and model weights. The effectiveness of recipes may vary depending on the exact model version and hardware configuration.
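As a minimal sketch of that prerequisite, assuming a Linux machine with a supported CUDA GPU (other platforms need different steps; see the vLLM documentation):

    # Baseline install; individual recipes may require specific vLLM versions or extras.
    pip install vllm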

Health Check
Last Commit: 2 days ago
Responsiveness: Inactive
Pull Requests (30d): 28
Issues (30d): 2
Star History: 57 stars in the last 30 days

Starred by Yineng Zhang (Inference Lead at SGLang; Research Scientist at Together AI), Nikola Borisov (Founder and CEO of DeepInfra), and 3 more.

Explore Similar Projects

tensorrtllm_backend by triton-inference-server
Triton backend for serving TensorRT-LLM models
Top 0.1% on SourcePulse · 912 stars · Created 2 years ago · Updated 2 days ago