mistral-common  by mistralai

Inference library for Mistral models preprocessing

Created 1 year ago
793 stars

Top 44.3% on SourcePulse

GitHubView on GitHub
Project Summary

This library provides the official inference tools for Mistral AI models, focusing on advanced tokenization for structured conversations and tool parsing. It's designed for developers and researchers working with Mistral's diverse model ecosystem, offering efficient pre-processing and validation capabilities.

How It Works

The library implements custom tokenizers (v1, v2, v3) that go beyond standard text-to-token conversion. They are specifically designed to parse and handle structured data, including tool calls and conversational formats, which is crucial for instruction-following and function-calling models. This approach allows for more robust and accurate interaction with Mistral's models.

Quick Start & Requirements

  • Install via pip: pip install mistral-common
  • Alternatively, install from source using Poetry: poetry install
  • Requires Python.

Highlighted Details

  • Supports tokenization for various Mistral models including Mistral 7B, Mixtral 8x7B, Mixtral 8x22B, Codestral, and Mathstral.
  • Includes validation and normalization code used in the Mistral API.
  • Provides examples for tokenizing chat completion requests with tool definitions.

Maintenance & Community

  • Official library from Mistral AI.

Licensing & Compatibility

  • License not specified in the README.

Limitations & Caveats

The specific license for this repository is not detailed in the README, which may impact commercial use or integration into closed-source projects.

Health Check
Last Commit

3 days ago

Responsiveness

1 day

Pull Requests (30d)
16
Issues (30d)
3
Star History
15 stars in the last 30 days

Explore Similar Projects

Feedback? Help us improve.