instructor by 567-labs

SDK for structured LLM outputs using Pydantic models

Created 2 years ago

12,119 stars

Top 4.1% on SourcePulse

32 Experts Love This Project

David Cournapeau

Author of scikit-learn

Gregor Zunic

Cofounder of Browser Use

Jason Liu

Author of Instructor

Will Brown

Research Lead at Prime Intellect

and 28 more!

Project Summary

Instructor is a Python library designed to simplify obtaining structured outputs from Large Language Models (LLMs). It targets developers building LLM-powered applications, enabling them to define Pydantic models for desired outputs, handle retries, validate responses, and stream results, thereby streamlining LLM integration and improving output reliability.

How It Works

Instructor uses Pydantic models to define the schema for LLM outputs. It patches LLM client libraries (like OpenAI's) to inject instructions for the LLM to return JSON conforming to the specified Pydantic model, then validates the response against that model. Callers get typed, validated data without hand-writing most of the prompt engineering that structured extraction would otherwise need.
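
The define-schema, inject-instruction, validate, and retry flow described above can be illustrated with a minimal standard-library sketch. This is not Instructor's implementation or API: a dataclass stands in for a Pydantic model, and schema_instruction, parse, extract, and fake_llm are hypothetical names introduced only for this example. The fake model returns a badly typed reply first so the retry path is exercised.

```python
import json
from dataclasses import dataclass, fields


@dataclass
class User:
    """Target schema (a dataclass standing in for a Pydantic model)."""
    name: str
    age: int


def schema_instruction(model) -> str:
    """Build the instruction text asking for JSON that matches the schema."""
    spec = {f.name: getattr(f.type, "__name__", str(f.type)) for f in fields(model)}
    return "Reply only with JSON matching: " + json.dumps(spec)


def parse(model, raw: str):
    """Parse raw text as JSON and check field names and types.

    Raises ValueError on any mismatch (json.JSONDecodeError is a subclass).
    """
    data = json.loads(raw)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    for f in fields(model):
        if f.name not in data:
            raise ValueError(f"missing field: {f.name}")
        if not isinstance(data[f.name], f.type):
            raise ValueError(f"field {f.name} should be {f.type.__name__}")
    return model(**{f.name: data[f.name] for f in fields(model)})


def extract(model, prompt: str, call_llm, max_retries: int = 2):
    """Ask for structured output; on invalid output, feed the error back and retry."""
    messages = [schema_instruction(model), prompt]
    for _ in range(max_retries + 1):
        raw = call_llm(messages)
        try:
            return parse(model, raw)
        except ValueError as err:
            messages.append(f"Previous reply was invalid ({err}). Try again.")
    raise ValueError(f"no valid output after {max_retries + 1} attempts")


def fake_llm(messages):
    """Hypothetical stand-in for a model: wrong type first, valid afterwards."""
    if len(messages) == 2:
        return '{"name": "Ada", "age": "thirty-six"}'
    return '{"name": "Ada", "age": 36}'


user = extract(User, "Ada is 36 years old.", fake_llm)
print(user)
```

In the real library the schema would be a Pydantic model and the call would go through a patched provider client; the sketch only shows why feeding the validation error back to the model tends to make a retry more useful than repeating the same request.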

Quick Start & Requirements

Install via pip: pip install -U instructor
Requires Python 3.9+
Supports various LLM providers including OpenAI, Anthropic, Google Gemini, Mistral, and more.
Official documentation: https://docs.instructor.ai/

Highlighted Details

Over 1 million monthly downloads, indicating significant community adoption.
Supports synchronous and asynchronous operations across multiple LLM providers with a unified interface.
Features a hook system for intercepting and logging LLM interaction stages.
Includes CLI tools for managing OpenAI fine-tuning jobs, files, and usage monitoring.
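
To make the hook idea in the list above concrete, here is a small standard-library sketch of a hook registry that lets callers observe the stages of a model call. It is an assumption-laden illustration rather than Instructor's hook API: HookRegistry, complete, and the event names "request", "response", and "error" are hypothetical, and a lambda stands in for the LLM client.

```python
from collections import defaultdict


class HookRegistry:
    """Maps event names to lists of handler callables."""

    def __init__(self):
        self._handlers = defaultdict(list)

    def on(self, event, handler):
        """Register a handler to run whenever the event is emitted."""
        self._handlers[event].append(handler)

    def emit(self, event, payload):
        """Call every handler registered for the event, in registration order."""
        for handler in self._handlers[event]:
            handler(payload)


def complete(hooks, call_llm, **kwargs):
    """Run a model call, emitting hooks before, after, and on failure."""
    hooks.emit("request", kwargs)
    try:
        response = call_llm(**kwargs)
    except Exception as err:
        hooks.emit("error", err)
        raise
    hooks.emit("response", response)
    return response


# Usage: record each stage in a list instead of printing it.
log = []
hooks = HookRegistry()
hooks.on("request", lambda kw: log.append(("request", kw["prompt"])))
hooks.on("response", lambda r: log.append(("response", r)))

result = complete(hooks, lambda prompt: prompt.upper(), prompt="hello")
print(log)
```

Keeping logging in handlers rather than inside the call path means the call itself stays unchanged when logging needs change.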

Maintenance & Community

Actively maintained with contributions from a community of developers.
Development environment setup uses uv for faster dependency management.
Code quality enforced via ruff for formatting/linting and pyright for type checking.
Community channels are available on Discord/Slack.

Licensing & Compatibility

Licensed under the MIT License.
Permissive license suitable for commercial use and integration into closed-source applications.

Limitations & Caveats

The library's core functionality relies on patching LLM client libraries, which could be subject to breaking changes in upstream provider SDKs. While it supports numerous providers, the quality of structured output generation may vary depending on the underlying LLM's capabilities and the specific provider's tool-calling implementation.

Health Check

Last Commit

16 hours ago

Responsiveness

1 day

Star History

169 stars in the last 30 days