mistral-inference by mistralai

Inference library for Mistral models

Created 2 years ago
10,469 stars

Top 4.8% on SourcePulse

Project Summary

This repository provides the official inference library for Mistral AI's large language models, enabling users to run and interact with models like Mistral 7B, Mixtral 8x7B, and Codestral. It's designed for researchers and developers who need direct control over model execution and integration into custom applications.

How It Works

The library offers a reference implementation for running Mistral models, leveraging PyTorch for efficient computation. It supports various model architectures and features like function calling and multimodal capabilities. The core design prioritizes minimal dependencies for straightforward integration, while also providing options for multi-GPU inference and deployment via vLLM.
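As a rough illustration of the flow described above, the sketch below loads a model folder and generates one chat reply. It is modeled on the usage pattern in the upstream README, not a definitive implementation. Assumptions: the module paths (mistral_inference.transformer, mistral_inference.generate, and the mistral_common imports) match the installed release, since older releases used different paths; the tokenizer file names in TOKENIZER_CANDIDATES are a guess at what a model folder contains; find_tokenizer_file and chat_once are hypothetical helper names introduced here.

```python
from pathlib import Path

# Assumed tokenizer file names; check the contents of your own model folder.
TOKENIZER_CANDIDATES = ("tekken.json", "tokenizer.model.v3", "tokenizer.model")


def find_tokenizer_file(model_dir):
    """Return the first known tokenizer file inside model_dir, or None."""
    for name in TOKENIZER_CANDIDATES:
        candidate = Path(model_dir) / name
        if candidate.is_file():
            return candidate
    return None


def chat_once(model_dir, prompt, max_tokens=256, temperature=0.35):
    """Load a model folder and generate one reply (needs a GPU and weights)."""
    # Imported lazily so find_tokenizer_file works without the heavy packages.
    # These module paths are assumptions and may differ between releases.
    from mistral_common.protocol.instruct.messages import UserMessage
    from mistral_common.protocol.instruct.request import ChatCompletionRequest
    from mistral_common.tokens.tokenizers.mistral import MistralTokenizer
    from mistral_inference.generate import generate
    from mistral_inference.transformer import Transformer

    tokenizer_file = find_tokenizer_file(model_dir)
    if tokenizer_file is None:
        raise FileNotFoundError(f"no tokenizer file found in {model_dir}")

    tokenizer = MistralTokenizer.from_file(str(tokenizer_file))
    model = Transformer.from_folder(str(model_dir))

    # Encode a single user message into prompt tokens.
    request = ChatCompletionRequest(messages=[UserMessage(content=prompt)])
    tokens = tokenizer.encode_chat_completion(request).tokens

    eos_id = tokenizer.instruct_tokenizer.tokenizer.eos_id
    out_tokens, _ = generate(
        [tokens], model,
        max_tokens=max_tokens, temperature=temperature, eos_id=eos_id,
    )
    return tokenizer.instruct_tokenizer.tokenizer.decode(out_tokens[0])
```

Calling chat_once("/path/to/model", "Hello") only works on a machine with a GPU, the library installed, and the weights already downloaded; the lazy imports keep the rest of the module importable elsewhere.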

Quick Start & Requirements

  • Install via pip: pip install mistral-inference
  • Installation requires a GPU, because the xformers dependency needs one to install.
  • Model weights are not bundled with the package; download them separately, either from the direct links the repository provides or from the Hugging Face Hub.
  • Official Documentation: https://docs.mistral.ai/
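The weight-download step above could look roughly like the following sketch, which uses snapshot_download from the huggingface_hub package. Assumptions: the repo id, the file patterns, and the ~/mistral_models folder are example values and not confirmed by this page; local_model_dir and download_weights are hypothetical helper names. Some repositories may also require accepting terms and authenticating before the download succeeds.

```python
from pathlib import Path

# Assumed example values: a Hub repo id and the files a model folder needs.
REPO_ID = "mistralai/Mistral-7B-Instruct-v0.3"
WEIGHT_FILES = ["params.json", "consolidated.safetensors", "tokenizer.model.v3"]


def local_model_dir(name, root=None):
    """Build the folder path for a model; nothing is created on disk."""
    base = Path(root) if root is not None else Path.home() / "mistral_models"
    return base / name


def download_weights(repo_id=REPO_ID, name="7B-Instruct-v0.3", root=None):
    """Fetch only the listed files from the Hub into the local model folder."""
    # Third-party package, imported lazily so the path helper has no dependencies.
    from huggingface_hub import snapshot_download

    target = local_model_dir(name, root)
    target.mkdir(parents=True, exist_ok=True)
    snapshot_download(repo_id=repo_id, allow_patterns=WEIGHT_FILES, local_dir=target)
    return target
```

Restricting the download with allow_patterns avoids pulling every file in the repository when only the consolidated weights, parameters, and tokenizer are needed.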

Highlighted Details

  • Supports a wide range of Mistral models including 7B, 8x7B, 8x22B, Codestral, Mathstral, and Nemo.
  • Features include function calling, multimodal instruction following, and Fill-in-the-Middle (FIM) for code completion.
  • Ships CLI tools (mistral-demo, mistral-chat) for quick testing and interactive chat.
  • Deployment options include building a vLLM Docker image.
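To make the CLI tools listed above more concrete, here is a small sketch that assembles the command lines for mistral-demo and mistral-chat, wrapping them in torchrun when more than one GPU is requested. Assumptions: the --instruct, --max_tokens, and --temperature flags and the torchrun wrapper follow the upstream README as best recalled and should be checked against the installed version; cli_command is a hypothetical helper, and the model paths are placeholders.

```python
import shlex


def cli_command(tool, model_dir, n_gpus=1, instruct=False,
                max_tokens=None, temperature=None):
    """Return the argv list for a mistral-inference CLI tool."""
    if tool not in ("mistral-demo", "mistral-chat"):
        raise ValueError(f"unknown tool: {tool}")
    if instruct and tool != "mistral-chat":
        # Assumption: only the chat tool takes the instruct flag.
        raise ValueError("--instruct is only used with mistral-chat here")

    argv = [tool, str(model_dir)]
    if instruct:
        argv.append("--instruct")
    if max_tokens is not None:
        argv += ["--max_tokens", str(max_tokens)]
    if temperature is not None:
        argv += ["--temperature", str(temperature)]

    if n_gpus > 1:
        # torchrun starts one process per GPU; --no-python runs the tool directly.
        argv = ["torchrun", "--nproc-per-node", str(n_gpus), "--no-python"] + argv
    return argv


# Placeholder paths; print the commands instead of running them.
print(shlex.join(cli_command("mistral-chat", "/models/7B-Instruct",
                             instruct=True, max_tokens=256)))
print(shlex.join(cli_command("mistral-demo", "/models/8x7B", n_gpus=2)))
```

Building an argv list instead of a shell string keeps paths with spaces safe if the result is later passed to subprocess.run.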

Licensing & Compatibility

  • Most models are released under permissive licenses allowing commercial use.
  • However, codestral-22B-v0.1.tar and mistral-large-instruct-2407.tar are subject to custom non-commercial licenses (MNPL and MRL respectively).

Limitations & Caveats

  • Installation requires a GPU.
  • Some models have non-commercial use restrictions.
  • Multi-GPU setup is necessary for larger models like 8x7B and 8x22B.

Health Check

  • Last commit: 6 months ago
  • Responsiveness: 1 week
  • Pull requests (30d): 1
  • Issues (30d): 0
  • Star history: 64 stars in the last 30 days
