instructor-js by 567-labs

TypeScript tool for structured extraction from LLMs

created 1 year ago
738 stars

Top 47.9% on sourcepulse

Project Summary

This library provides structured data extraction from Large Language Models (LLMs) using TypeScript, OpenAI's function calling API, and Zod for schema validation. It's designed for developers needing to reliably parse LLM outputs into typed data structures, offering simplicity, transparency, and control over the extraction process.

How It Works

Instructor extends the OpenAI SDK client, enabling structured extraction by leveraging Zod schemas. It supports multiple modes (TOOLS, JSON, MD_JSON, JSON_SCHEMA) to guide LLM output formatting. The core mechanism involves passing a Zod schema to the response_model parameter in chat.completions.create, allowing the LLM to generate output that conforms to the defined structure, which is then validated and parsed by Zod.
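
The validate-and-parse step described above can be illustrated without any dependencies. This is a minimal sketch, not Instructor's implementation: SchemaLike, UserRecord, userSchema, and parseStructured are hypothetical names, a hand-written checker stands in for a Zod schema, and the raw string stands in for the arguments a model might return from a tool call.

```typescript
// Hypothetical stand-in for a Zod schema: parse() returns typed data or throws.
interface SchemaLike<T> {
  parse(input: unknown): T;
}

interface UserRecord {
  name: string;
  age: number;
}

// Hand-written checker playing the role that a Zod object schema
// with a string `name` and a numeric `age` would play.
const userSchema: SchemaLike<UserRecord> = {
  parse(input: unknown): UserRecord {
    if (typeof input !== "object" || input === null) {
      throw new Error("expected an object");
    }
    const record = input as Record<string, unknown>;
    if (typeof record.name !== "string") {
      throw new Error("name must be a string");
    }
    if (typeof record.age !== "number") {
      throw new Error("age must be a number");
    }
    return { name: record.name, age: record.age };
  },
};

// Parse the raw model output as JSON, then validate it against the schema.
// Either step throws if the output does not conform.
function parseStructured<T>(raw: string, schema: SchemaLike<T>): T {
  return schema.parse(JSON.parse(raw));
}

// Simulated tool-call arguments (assumption: the model returned valid JSON).
const user = parseStructured('{"name": "Jason Liu", "age": 30}', userSchema);
console.log(user.name, user.age);
```

In the library itself, the schema would be a real Zod schema passed through the response_model parameter, and the validation would be done by Zod rather than by a hand-written checker like this one.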

Quick Start & Requirements

  • Install with bun add @instructor-ai/instructor zod openai, npm i @instructor-ai/instructor zod openai, or pnpm add @instructor-ai/instructor zod openai.
  • Requires a Node.js environment, an OpenAI API key, and optionally API keys for other providers.
  • Official documentation: https://github.com/567-labs/instructor-js

Highlighted Details

  • Supports streaming of partial extraction results.
  • Integrates with various LLM providers (Anyscale, Together, Anthropic, Azure, Cohere) via llm-polyglot.
  • Built on Island AI toolkit packages: zod-stream, schema-stream, llm-polyglot.
  • Leverages Zod for robust, customizable data validation.
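
To make the idea of streaming partial extraction results concrete, here is a naive, dependency-free sketch. It is an assumption-laden illustration, not how zod-stream or schema-stream work: parsePartial and collectPartials are hypothetical helpers, and the regular expression only handles flat objects with string or number values.

```typescript
// Matches a complete `"key": value` pair in a flat JSON object. The value must be
// a closed string, or a number followed by `,` or `}` (so it is not truncated).
const COMPLETE_PAIR = /"([^"\\]+)"\s*:\s*("(?:[^"\\]|\\.)*"|-?\d+(?:\.\d+)?)\s*(?=[,}])/g;

// Best-effort read of a possibly incomplete JSON buffer (naive; flat objects only).
function parsePartial(buffer: string): Record<string, unknown> {
  try {
    // The buffer is already complete JSON.
    return JSON.parse(buffer) as Record<string, unknown>;
  } catch (_err) {
    // Otherwise keep only the key/value pairs that have fully arrived.
    const partial: Record<string, unknown> = {};
    COMPLETE_PAIR.lastIndex = 0;
    let match: RegExpExecArray | null;
    while ((match = COMPLETE_PAIR.exec(buffer)) !== null) {
      partial[match[1]] = JSON.parse(match[2]);
    }
    return partial;
  }
}

// Accumulate text chunks and record a snapshot of the partial object after each one.
function collectPartials(chunks: string[]): Array<Record<string, unknown>> {
  const result: Array<Record<string, unknown>> = [];
  let buffer = "";
  for (const chunk of chunks) {
    buffer += chunk;
    result.push(parsePartial(buffer));
  }
  return result;
}

// Simulated stream: one JSON object split across three chunks.
const snapshots = collectPartials(['{"name": "Jas', 'on Liu", "a', 'ge": 30}']);
console.log(snapshots);
```

With these three chunks, the snapshots grow from an empty object, to one containing only name, to the full object with name and age. A consumer could render each snapshot as it arrives instead of waiting for the complete response.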

Maintenance & Community

  • Developed by Dimitri Kennedy (creator of Island AI) and Jason Liu (author of original Python Instructor).
  • Community support and contributions are encouraged via GitHub issues.
  • Instructor is also available for Python (the original implementation) and Elixir.

Licensing & Compatibility

  • MIT License.
  • Compatible with commercial use and closed-source applications.

Limitations & Caveats

The library relies on LLM providers correctly implementing OpenAI's API specifications for seamless integration. Specific model capabilities and adherence to tool/function calling formats can influence extraction accuracy.

Health Check

  • Last commit: 6 months ago
  • Responsiveness: Inactive
  • Pull requests (30d): 1
  • Issues (30d): 1
  • Star history: 30 stars in the last 90 days
