mistral-inference  by mistralai

Inference library for Mistral models

created 1 year ago
10,389 stars

Top 4.9% on sourcepulse

Project Summary

This repository provides the official inference library for Mistral AI's large language models, enabling users to run and interact with models like Mistral 7B, Mixtral 8x7B, and Codestral. It's designed for researchers and developers who need direct control over model execution and integration into custom applications.

How It Works

The library is a reference implementation for running Mistral models, built on PyTorch. It covers several model architectures and supports features such as function calling and multimodal input. The core design keeps dependencies minimal so the code is straightforward to integrate, and it also offers options for multi-GPU inference and deployment via vLLM.
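
As a rough illustration of the reference implementation in use, the sketch below follows the usage pattern from the project README: load a tokenizer and a model from a local folder, encode a chat request, and generate a reply. The helper name `run_instruct`, its default sampling values, and the paths passed to it are assumptions made for illustration. The imports are deferred into the function body, so defining it does not require a GPU; calling it requires `mistral-inference`, `mistral-common`, and downloaded weights.

```python
from pathlib import Path


def run_instruct(model_dir: Path, tokenizer_file: str, prompt: str,
                 max_tokens: int = 64, temperature: float = 0.0) -> str:
    """Generate one reply from an instruct model stored in model_dir.

    Hypothetical helper: the name and the defaults are illustrative only.
    """
    # Deferred imports: these packages are only needed when the function runs.
    from mistral_common.protocol.instruct.messages import UserMessage
    from mistral_common.protocol.instruct.request import ChatCompletionRequest
    from mistral_common.tokens.tokenizers.mistral import MistralTokenizer
    from mistral_inference.generate import generate
    from mistral_inference.transformer import Transformer

    # The tokenizer file name depends on the model, so the caller supplies it.
    tokenizer = MistralTokenizer.from_file(str(model_dir / tokenizer_file))
    model = Transformer.from_folder(str(model_dir))

    # Wrap the prompt as a single user message and turn it into token ids.
    request = ChatCompletionRequest(messages=[UserMessage(content=prompt)])
    tokens = tokenizer.encode_chat_completion(request).tokens

    eos_id = tokenizer.instruct_tokenizer.tokenizer.eos_id
    out_tokens, _ = generate([tokens], model, max_tokens=max_tokens,
                             temperature=temperature, eos_id=eos_id)
    return tokenizer.instruct_tokenizer.tokenizer.decode(out_tokens[0])
```

A call might look like `run_instruct(Path("/path/to/model"), "tokenizer.model.v3", "Explain attention briefly.")`, where both the folder and the tokenizer file name are placeholders to adapt to the model you downloaded.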

Quick Start & Requirements

  • Install via pip: pip install mistral-inference
  • Installation requires a GPU, because the xformers dependency needs one to install.
  • Model weights are not bundled with the package; download them separately from the direct links provided in the repository or from the Hugging Face Hub.
  • Official Documentation: https://docs.mistral.ai/
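
The weight download step can be sketched in Python with `huggingface_hub.snapshot_download`. The repository ID and file names below are example values for one model and are assumptions here; other models use different repositories and file names. `local_model_dir` and `download_weights` are hypothetical helper names, and the `~/mistral_models` default location is only a convention, not something the library requires.

```python
from pathlib import Path
from typing import Optional, Sequence

# Assumed example values; substitute the repository and files for your model.
REPO_ID = "mistralai/Mistral-7B-Instruct-v0.3"
FILES = ["params.json", "consolidated.safetensors", "tokenizer.model.v3"]


def local_model_dir(name: str, root: Optional[Path] = None) -> Path:
    """Return the folder in which the weights for `name` will be stored."""
    base = root if root is not None else Path.home() / "mistral_models"
    return base / name


def download_weights(repo_id: str, files: Sequence[str], target: Path) -> Path:
    """Download only the listed files from the Hugging Face Hub into `target`."""
    # Deferred import so the helpers above work without huggingface_hub installed.
    from huggingface_hub import snapshot_download

    target.mkdir(parents=True, exist_ok=True)
    snapshot_download(repo_id=repo_id, allow_patterns=list(files), local_dir=target)
    return target
```

Calling `download_weights(REPO_ID, FILES, local_model_dir("7B-Instruct-v0.3"))` would fetch the files over the network; some repositories may additionally require accepting a license or authenticating on the Hub first.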

Highlighted Details

  • Supports a wide range of Mistral models including 7B, 8x7B, 8x22B, Codestral, Mathstral, and Nemo.
  • Features include function calling, multimodal instruction following, and Fill-in-the-Middle (FIM) for code completion.
  • CLI tools (mistral-demo, mistral-chat) for easy testing and interaction.
  • Deployment options include building a vLLM Docker image.
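
The CLI tools mentioned above can also be driven from a script. The following is a minimal sketch that only assembles the argument lists; the helper names `demo_command` and `chat_command` are hypothetical, and the flags (`--instruct`, `--max_tokens`, `--temperature`) are the ones used in the project README's examples, so check `mistral-chat --help` on your installed version before relying on them.

```python
import shlex
from typing import List


def demo_command(model_dir: str) -> List[str]:
    """Argument list for a quick smoke test of a downloaded model."""
    return ["mistral-demo", model_dir]


def chat_command(model_dir: str, max_tokens: int = 256,
                 temperature: float = 0.35, instruct: bool = True) -> List[str]:
    """Argument list for an interactive chat session."""
    cmd = ["mistral-chat", model_dir]
    if instruct:
        # Instruct-tuned checkpoints expect the chat template to be applied.
        cmd.append("--instruct")
    cmd += ["--max_tokens", str(max_tokens), "--temperature", str(temperature)]
    return cmd


# Render the command as a copy-pasteable shell line (nothing is executed here).
shell_line = shlex.join(chat_command("/path/to/model"))
```

To actually launch the tool, pass the list to `subprocess.run(..., check=True)` on a machine where `mistral-inference` is installed.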

Licensing & Compatibility

  • Most models are released under permissive licenses allowing commercial use.
  • However, codestral-22B-v0.1.tar and mistral-large-instruct-2407.tar are subject to custom non-commercial licenses: the Mistral AI Non-Production License (MNPL) and the Mistral AI Research License (MRL), respectively.

Limitations & Caveats

  • Installation requires a GPU.
  • Some models have non-commercial use restrictions.
  • Multi-GPU setup is necessary for larger models like 8x7B and 8x22B.
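
For the multi-GPU case, the project README launches the CLI tools through `torchrun`. The sketch below wraps an existing command in that launcher; `multi_gpu_command` is a hypothetical helper name, and the right number of GPUs depends on the model size and on the memory of your hardware.

```python
from typing import List


def multi_gpu_command(cli_command: List[str], num_gpus: int) -> List[str]:
    """Prefix a mistral-inference CLI command with a torchrun launcher.

    --no-python tells torchrun that the target is an executable on PATH
    (such as mistral-demo) rather than a Python script.
    """
    if num_gpus < 1:
        raise ValueError("num_gpus must be at least 1")
    return ["torchrun", "--nproc-per-node", str(num_gpus), "--no-python", *cli_command]
```

For example, `multi_gpu_command(["mistral-demo", "/path/to/8x7B"], 2)` produces the argument list for a two-GPU run, which can then be passed to `subprocess.run`.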

Health Check

  • Last commit: 4 months ago
  • Responsiveness: Inactive
  • Pull requests (30d): 2
  • Issues (30d): 3
  • Star history: 234 stars in the last 90 days
