kor  by eyurtsev

LLM wrapper for structured data extraction

created 2 years ago
1,682 stars

Top 25.8% on sourcepulse

Project Summary

Kor is a Python library designed for extracting structured data from text using Large Language Models (LLMs), particularly those without native tool-calling capabilities. It targets developers needing to parse unstructured text into predefined schemas, offering a flexible alternative to newer chat model APIs.

How It Works

Kor generates a prompt that contains a user-defined schema and examples, sends it to a specified LLM, and parses the LLM's output. It supports two schema definition styles: Kor's own Object and Text definitions, and Pydantic models. Because the approach relies only on prompting, it works with any LLM that can follow a prompt and generate text, whether or not the model supports features such as JSON mode or function calling.
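The prompt-then-parse flow described above can be illustrated with a minimal, standard-library-only sketch. This is not Kor's API: `build_prompt`, `parse_output`, `extract`, and `fake_llm` are hypothetical names invented for illustration, and the schema is assumed to be a plain mapping from field name to description.

```python
import json
from typing import Callable, Dict, List, Tuple


def build_prompt(schema: Dict[str, str],
                 examples: List[Tuple[str, dict]],
                 text: str) -> str:
    """Render the schema, worked examples, and input text into one prompt."""
    lines = ["Extract the following fields and answer with a JSON object only:"]
    for name, description in schema.items():
        lines.append(f"- {name}: {description}")
    for example_text, example_output in examples:
        lines.append(f"Input: {example_text}")
        lines.append(f"Output: {json.dumps(example_output)}")
    lines.append(f"Input: {text}")
    lines.append("Output:")
    return "\n".join(lines)


def parse_output(raw: str, schema: Dict[str, str]) -> dict:
    """Pull the outermost {...} span out of the reply and keep schema fields only."""
    start, end = raw.find("{"), raw.rfind("}")
    if start == -1 or end < start:
        return {}  # the model produced no JSON object at all
    try:
        data = json.loads(raw[start:end + 1])
    except json.JSONDecodeError:
        return {}  # the model produced malformed JSON
    return {key: value for key, value in data.items() if key in schema}


def extract(llm: Callable[[str], str],
            schema: Dict[str, str],
            examples: List[Tuple[str, dict]],
            text: str) -> dict:
    """Prompt any text-in, text-out model and parse what comes back."""
    return parse_output(llm(build_prompt(schema, examples, text)), schema)


def fake_llm(prompt: str) -> str:
    """Hypothetical stand-in for a real model; returns a canned, chatty reply."""
    return 'Sure! {"name": "Ada", "age": 36, "mood": "happy"}'
```

Because `extract` only needs a callable that maps a prompt string to a reply string, any model that can generate text fits, which is the property the section highlights. A real model's reply may be malformed, so the parser falls back to an empty result instead of raising.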

Highlighted Details

  • Supports both Kor's custom schema and Pydantic v1/v2 models for defining extraction targets.
  • Offers flexibility by working with LLMs that lack native tool-calling or JSON modes.
  • Can be integrated with the LangChain framework.
  • Extraction quality depends on the choice of LLM; larger models are recommended for better results, at the cost of speed.

Maintenance & Community

  • The project is marked as a "half-baked prototype" with an unstable API.
  • Users are encouraged to open issues for discussion and feature requests.
  • Alternatives like Promptify and MiniChain are suggested.

Licensing & Compatibility

  • The README does not explicitly state a license.
  • Compatibility with commercial use or closed-source linking is not specified.

Limitations & Caveats

Kor is a prototype with an unstable API. It is slow and can crash on long text inputs that exceed the LLM's context window. Extraction quality depends heavily on the quality of the provided examples and schema documentation.
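One common way to stay under a context window is to split long input into smaller pieces and run extraction on each piece separately. The sketch below shows that idea only; it is not something the source attributes to Kor. `split_into_chunks` is a hypothetical helper, and character count is used as a crude stand-in for the model's real token count.

```python
from typing import List


def split_into_chunks(text: str, max_chars: int) -> List[str]:
    """Greedily pack whitespace-separated words into chunks of at most max_chars.

    A single word longer than max_chars is kept whole in its own chunk,
    so that one chunk can exceed the limit.
    """
    if max_chars <= 0:
        raise ValueError("max_chars must be positive")
    chunks: List[str] = []
    current = ""
    for word in text.split():
        candidate = f"{current} {word}" if current else word
        if len(candidate) <= max_chars or not current:
            current = candidate  # the word still fits, or it starts a new chunk
        else:
            chunks.append(current)  # close the full chunk and start the next
            current = word
    if current:
        chunks.append(current)
    return chunks
```

Each chunk would then be sent through extraction on its own and the results merged. Splitting can separate facts that belong together, so chunk boundaries are a trade-off and not a full fix.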

Health Check

  • Last commit: 6 months ago
  • Responsiveness: 1 day
  • Pull requests (30d): 0
  • Issues (30d): 0
  • Star history: 17 stars in the last 90 days
