instructor by 567-labs

SDK for structured LLM outputs using Pydantic models

created 2 years ago
11,096 stars

Top 4.7% on sourcepulse

Project Summary

Instructor is a Python library for obtaining structured outputs from Large Language Models (LLMs). It is aimed at developers building LLM-powered applications: they describe the desired output as a Pydantic model, and the library handles retries, validates responses, and streams results. This simplifies LLM integration and makes outputs more reliable.

How It Works

Instructor uses Pydantic models to define the schema for LLM outputs. It patches LLM client libraries (such as OpenAI's) to inject instructions telling the LLM to return JSON that conforms to the specified Pydantic model. Each response is validated against that model, so the caller receives typed data rather than raw text and does not have to hand-write prompts for structured data extraction.
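The schema-inject, validate, and retry cycle described above can be illustrated with a minimal sketch. It uses only the standard library (a dataclass stands in for a Pydantic model) and a fake LLM function. The names `schema_prompt`, `validate`, `create`, and `fake_llm` are hypothetical and are not part of Instructor's API; the real library's internals may differ.

```python
import json
from dataclasses import dataclass, fields


@dataclass
class User:
    # Stand-in for a Pydantic model describing the desired output.
    name: str
    age: int


def schema_prompt(model) -> str:
    """Build an instruction that names the expected JSON keys and types."""
    spec = {f.name: f.type.__name__ for f in fields(model)}
    return "Reply with JSON only, matching: " + json.dumps(spec)


def validate(model, raw: str):
    """Parse raw JSON and check each field's type; raise ValueError on mismatch."""
    data = json.loads(raw)  # json.JSONDecodeError is a subclass of ValueError
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    for f in fields(model):
        if f.name not in data:
            raise ValueError(f"missing field: {f.name}")
        if not isinstance(data[f.name], f.type):
            raise ValueError(f"{f.name} must be {f.type.__name__}")
    return model(**{f.name: data[f.name] for f in fields(model)})


def create(llm, model, prompt: str, max_retries: int = 2):
    """Call the LLM, validate the reply, and retry with the error as feedback."""
    messages = [schema_prompt(model), prompt]
    for _ in range(max_retries + 1):
        raw = llm(messages)
        try:
            return validate(model, raw)
        except ValueError as err:
            messages.append(f"Previous reply was invalid ({err}). Try again.")
    raise ValueError("no valid response after retries")


def fake_llm(messages):
    # Stand-in for a real model: wrong type first, corrected after feedback.
    if len(messages) == 2:
        return '{"name": "Ada", "age": "thirty-six"}'
    return '{"name": "Ada", "age": 36}'


user = create(fake_llm, User, "Extract: Ada is 36 years old.")
print(user)
```

In this sketch the first reply fails validation because `age` is a string, the error message is appended to the conversation, and the second reply passes. Feeding the validation error back to the model is the design choice that lets a retry improve on the previous attempt.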

Quick Start & Requirements

  • Install via pip: pip install -U instructor
  • Requires Python 3.7+
  • Supports various LLM providers including OpenAI, Anthropic, Google Gemini, Mistral, and more.
  • Official documentation: https://docs.instructor.ai/

Highlighted Details

  • Over 1 million monthly downloads, indicating significant community adoption.
  • Supports synchronous and asynchronous operations across multiple LLM providers with a unified interface.
  • Features a hook system for intercepting and logging LLM interaction stages.
  • Includes CLI tools for managing OpenAI fine-tuning jobs, files, and usage monitoring.
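As a rough illustration of what a hook system for LLM interaction stages involves, the sketch below registers handlers per event and fires them around a call. It is a generic standard-library example, not Instructor's implementation: the `Hooks` class, the `complete` function, and the event names `request`, `response`, and `error` are assumptions made for illustration, and the library's actual event names should be taken from its documentation.

```python
from collections import defaultdict
from typing import Any, Callable


class Hooks:
    """Minimal event registry: handlers are stored per event name."""

    def __init__(self) -> None:
        self._handlers = defaultdict(list)

    def on(self, event: str, handler: Callable[..., None]) -> None:
        # Register a handler to run whenever `event` is emitted.
        self._handlers[event].append(handler)

    def emit(self, event: str, *args: Any) -> None:
        # Call every handler registered for `event`, in registration order.
        for handler in self._handlers[event]:
            handler(*args)


def complete(hooks: Hooks, llm: Callable[[str], str], prompt: str) -> str:
    """Run one LLM call, emitting hypothetical events at each stage."""
    hooks.emit("request", prompt)
    try:
        reply = llm(prompt)
    except Exception as err:
        hooks.emit("error", err)
        raise
    hooks.emit("response", reply)
    return reply


# Usage: log each stage of a call made against a stand-in LLM.
log = []
hooks = Hooks()
hooks.on("request", lambda p: log.append(("request", p)))
hooks.on("response", lambda r: log.append(("response", r)))
result = complete(hooks, lambda p: p.upper(), "hello")
print(log)
```

Keeping handlers outside the call path means logging or metrics can be attached without changing the code that talks to the provider.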

Maintenance & Community

  • Actively maintained with contributions from a community of developers.
  • Development environment setup uses uv for faster dependency management.
  • Code quality enforced via ruff for formatting/linting and pyright for type checking.
  • Community engagement channels are available via Discord/Slack.

Licensing & Compatibility

  • Licensed under the MIT License.
  • Permissive license suitable for commercial use and integration into closed-source applications.

Limitations & Caveats

The library's core functionality relies on patching LLM client libraries, which could be subject to breaking changes in upstream provider SDKs. While it supports numerous providers, the quality of structured output generation may vary depending on the underlying LLM's capabilities and the specific provider's tool-calling implementation.

Health Check

  • Last commit: 1 day ago
  • Responsiveness: Inactive
  • Pull requests (30d): 89
  • Issues (30d): 36
  • Star history: 831 stars in the last 90 days
