mistral-inference by mistralai

Inference library for Mistral models

Created 2 years ago

10,615 stars

Top 4.8% on SourcePulse


Project Summary

This repository provides the official inference library for Mistral AI's large language models, enabling users to run and interact with models like Mistral 7B, Mixtral 8x7B, and Codestral. It's designed for researchers and developers who need direct control over model execution and integration into custom applications.

How It Works

The library offers a reference implementation for running Mistral models, leveraging PyTorch for efficient computation. It supports various model architectures and features like function calling and multimodal capabilities. The core design prioritizes minimal dependencies for straightforward integration, while also providing options for multi-GPU inference and deployment via vLLM.
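To make the flow concrete, here is a minimal sketch of a single chat turn. It assumes the import paths and call signatures shown in the project's README at the time of writing, which may differ between releases. The helper name chat_once, the directory layout, and the sampling values are illustrative assumptions, not part of the library. Tokenization comes from the companion mistral-common package.

```python
from pathlib import Path


def chat_once(model_dir: str, tokenizer_file: str, prompt: str, max_tokens: int = 64) -> str:
    """Generate one reply from an instruct model stored in model_dir.

    chat_once is a hypothetical helper, not part of mistral-inference.
    """
    folder = Path(model_dir).expanduser()
    if not folder.is_dir():
        raise FileNotFoundError(f"model directory not found: {folder}")

    # Imported lazily: these packages are only usable on a GPU-capable install.
    from mistral_common.protocol.instruct.messages import UserMessage
    from mistral_common.protocol.instruct.request import ChatCompletionRequest
    from mistral_common.tokens.tokenizers.mistral import MistralTokenizer
    from mistral_inference.generate import generate
    from mistral_inference.transformer import Transformer

    # Load the tokenizer file and the weights from the same folder.
    tokenizer = MistralTokenizer.from_file(str(folder / tokenizer_file))
    model = Transformer.from_folder(str(folder))

    # Turn the user message into token ids with the model's chat template.
    request = ChatCompletionRequest(messages=[UserMessage(content=prompt)])
    tokens = tokenizer.encode_chat_completion(request).tokens

    raw = tokenizer.instruct_tokenizer.tokenizer
    out_tokens, _ = generate(
        [tokens], model, max_tokens=max_tokens, temperature=0.0, eos_id=raw.eos_id
    )
    return raw.decode(out_tokens[0])
```

A call would look like chat_once("~/mistral_models/7B-Instruct-v0.3", "tokenizer.model.v3", "Hello"), where both the path and the tokenizer file name depend on which weights were downloaded.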

Quick Start & Requirements

Install via pip: pip install mistral-inference
Requires a GPU for installation due to xformers dependency.
Model weights must be downloaded separately from provided direct links or Hugging Face Hub.
Official Documentation: https://docs.mistral.ai/
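As a sketch of the Hugging Face Hub route for fetching weights, the snippet below wraps snapshot_download from the huggingface_hub package. The helper names, the ~/mistral_models layout, and any repository ID or file pattern passed in are assumptions for illustration. Check the model card for the actual file names, and note that some repositories may require accepting terms and authenticating first.

```python
from pathlib import Path


def default_model_dir(name: str) -> Path:
    """Return ~/mistral_models/<name>; this layout is an assumption of the sketch."""
    return Path.home() / "mistral_models" / name


def download_weights(repo_id: str, name: str, patterns: list[str]) -> Path:
    """Fetch the files matching `patterns` from a Hub repository."""
    # Imported lazily so this module loads without huggingface_hub installed.
    from huggingface_hub import snapshot_download

    target = default_model_dir(name)
    target.mkdir(parents=True, exist_ok=True)
    snapshot_download(repo_id=repo_id, allow_patterns=patterns, local_dir=target)
    return target
```

For example, download_weights("mistralai/Mistral-7B-Instruct-v0.3", "7B-Instruct-v0.3", ["params.json", "consolidated.safetensors", "tokenizer.model.v3"]) would place the files under ~/mistral_models/7B-Instruct-v0.3, assuming that repository and those file names exist as recalled here.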

Highlighted Details

Supports a wide range of Mistral models including 7B, 8x7B, 8x22B, Codestral, Mathstral, and Nemo.
Features include function calling, multimodal instruction following, and Fill-in-the-Middle (FIM) for code completion.
CLI tools (mistral-demo, mistral-chat) for easy testing and interaction.
Deployment options include building a vLLM Docker image.
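Fill-in-the-Middle can be illustrated with a short sketch: the model receives the code before and after a gap and generates the missing part. The calls below follow the FIM example in the project's README as recalled here, so treat the import paths and the choice of MistralTokenizer.v3() as assumptions that may not match every release. fill_in_the_middle and extract_middle are hypothetical helper names.

```python
def extract_middle(generated: str, suffix: str) -> str:
    """Keep only the text produced before the model reproduces the suffix."""
    if not suffix:
        return generated.strip()
    return generated.split(suffix)[0].strip()


def fill_in_the_middle(model_dir: str, prefix: str, suffix: str, max_tokens: int = 256) -> str:
    """Ask a FIM-capable model (for example Codestral) for the code between prefix and suffix."""
    # Imported lazily: these packages are only usable on a GPU-capable install.
    from mistral_common.tokens.instruct.request import FIMRequest
    from mistral_common.tokens.tokenizers.mistral import MistralTokenizer
    from mistral_inference.generate import generate
    from mistral_inference.transformer import Transformer

    tokenizer = MistralTokenizer.v3()
    model = Transformer.from_folder(model_dir)

    # Encode prefix and suffix into a single FIM prompt.
    tokens = tokenizer.encode_fim(FIMRequest(prompt=prefix, suffix=suffix)).tokens

    raw = tokenizer.instruct_tokenizer.tokenizer
    out_tokens, _ = generate(
        [tokens], model, max_tokens=max_tokens, temperature=0.0, eos_id=raw.eos_id
    )
    return extract_middle(raw.decode(out_tokens[0]), suffix)
```

With prefix "def add(" and suffix "    return sum", the returned string would be the parameter list and body the model proposes for the gap.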

Maintenance & Community

Active development by Mistral AI.
Community support available via Discord: https://discord.com/invite/mistralai

Licensing & Compatibility

Most models are released under permissive licenses allowing commercial use.
However, the codestral-22B-v0.1.tar and mistral-large-instruct-2407.tar weights are subject to custom non-commercial licenses: the Mistral AI Non-Production License (MNPL) and the Mistral AI Research License (MRL), respectively.

Limitations & Caveats

Installation requires a GPU because of the xformers dependency.
Codestral 22B and Mistral Large Instruct 2407 are restricted to non-commercial use.
Multi-GPU setup is necessary for larger models like 8x7B and 8x22B.
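The multi-GPU case can be sketched as a small command builder: the CLI tools are launched directly on one GPU and through torchrun on several. The torchrun flags and the mistral-chat options used here are the ones recalled from the project's README. The function cli_command and the model paths are hypothetical.

```python
import shlex


def cli_command(tool: str, model_dir: str, n_gpus: int = 1, extra: tuple[str, ...] = ()) -> list[str]:
    """Build the argv for mistral-demo or mistral-chat, wrapped in torchrun when n_gpus > 1."""
    if tool not in ("mistral-demo", "mistral-chat"):
        raise ValueError(f"unknown tool: {tool}")
    if n_gpus < 1:
        raise ValueError("n_gpus must be at least 1")
    base = [tool, model_dir, *extra]
    if n_gpus == 1:
        return base
    # --no-python tells torchrun to launch the console script directly.
    return ["torchrun", "--nproc-per-node", str(n_gpus), "--no-python", *base]


# Single-GPU chat session with an instruct model (path is a placeholder).
print(shlex.join(cli_command("mistral-chat", "/models/7B-instruct", extra=("--instruct", "--max_tokens", "256"))))
# Two-GPU demo run for a larger model (path is a placeholder).
print(shlex.join(cli_command("mistral-demo", "/models/8x7B", n_gpus=2)))
```

The resulting list can be passed to subprocess.run on a machine where the package and weights are installed.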

Health Check

Last Commit: 1 month ago
Responsiveness: 1 week
Pull Requests (30d): 1
Issues (30d): 0

Star History

45 stars in the last 30 days
