llmparser by kyang6

LLM tool for structured data extraction and classification

Created 3 years ago

426 stars

Top 68.8% on SourcePulse

4 Experts Love This Project

John Resig

Author of jQuery; Chief Software Architect at Khan Academy

Sasha Rush

Research Scientist at Cursor; Professor at Cornell Tech

Bryan Helmig

Cofounder of Zapier

Nicolae Rusan

Cofounder of Magnet, Clay

Project Summary

LLMParser classifies text and extracts structured data using large language models (LLMs). It targets one specific problem: getting reliable JSON output from an LLM. It is aimed at developers and researchers who need to turn unstructured text into a predictable format, for tasks such as resume parsing, contract analysis, and sentiment classification.

How It Works

LLMParser enforces a consistent JSON input and output format for LLM interactions. Users define categories and fields with descriptions, which are then used to prompt the LLM. The library handles the LLM API calls and parses the response, aiming to ensure structured and reliable JSON output, even for complex extraction tasks.
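The flow described above (define categories and fields, prompt the model, parse the reply) can be sketched roughly as follows. This is an illustration of the general technique, not LLMParser's actual API: the names CategorySpec, FieldSpec, buildPrompt, and parseResponse are hypothetical, and the model reply is a hard-coded string so the sketch runs without an API key.

```typescript
// Hypothetical types for illustration; the real llmparser API may differ.
type FieldType = "string" | "number" | "boolean";

interface FieldSpec {
  name: string;
  description: string;
  type: FieldType;
}

interface CategorySpec {
  name: string;
  description: string;
  fields: FieldSpec[];
}

interface ParsedResult {
  category: string;
  fields: Record<string, unknown>;
}

// Build a prompt that lists every category with its fields and asks for JSON only.
function buildPrompt(text: string, categories: CategorySpec[]): string {
  const schema = categories
    .map((c) => {
      const fields = c.fields
        .map((f) => `    - ${f.name} (${f.type}): ${f.description}`)
        .join("\n");
      return `- ${c.name}: ${c.description}\n${fields}`;
    })
    .join("\n");
  return [
    "Classify the document into exactly one category and extract its fields.",
    "Categories:",
    schema,
    'Respond with JSON only: {"category": string, "fields": {name: value}}',
    "Document:",
    text,
  ].join("\n");
}

// Parse the raw model reply, tolerating extra text around the JSON object.
// Unknown categories are rejected; undeclared fields are dropped.
function parseResponse(raw: string, categories: CategorySpec[]): ParsedResult {
  const start = raw.indexOf("{");
  const end = raw.lastIndexOf("}");
  if (start === -1 || end <= start) {
    throw new Error("No JSON object found in response");
  }
  const data = JSON.parse(raw.slice(start, end + 1)) as Partial<ParsedResult>;
  const category = categories.filter((c) => c.name === data.category)[0];
  if (!category) {
    throw new Error(`Unknown category: ${String(data.category)}`);
  }
  const rawFields: Record<string, unknown> = data.fields ?? {};
  const fields: Record<string, unknown> = {};
  for (const f of category.fields) {
    const value = rawFields[f.name];
    // Keep a value only if its runtime type matches the declared type.
    fields[f.name] = typeof value === f.type ? value : null;
  }
  return { category: category.name, fields };
}

// Usage with a canned reply standing in for a real LLM call.
const categories: CategorySpec[] = [
  {
    name: "resume",
    description: "A job applicant's resume",
    fields: [
      { name: "full_name", description: "Applicant's full name", type: "string" },
      { name: "years_experience", description: "Years of work experience", type: "number" },
    ],
  },
];
const reply =
  'Here is the result: {"category":"resume","fields":{"full_name":"Ada Lovelace","years_experience":"many"}}';
const result = parseResponse(reply, categories);
console.log(buildPrompt("Ada Lovelace, analyst ...", categories));
console.log(result);
```

Validating the reply against the declared fields, rather than trusting the model's JSON as-is, is one way to keep the output shape predictable even when the model adds commentary or returns a value of the wrong type.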

Quick Start & Requirements

Install: npm install llmparser
Prerequisites: OpenAI API key.
Usage: Server-side TypeScript/JavaScript.
Documentation: https://github.com/kyang6/llmparser

Highlighted Details

Enforces consistent JSON input/output for LLMs.
Supports classification and extraction of multiple fields.
Provides confidence scores for extracted data.
The README's example output shows per-field extraction results with source attribution.
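As a rough sketch of how confidence scores and source attribution might be consumed downstream, the snippet below accepts a field only when its source snippet actually appears in the input and its confidence clears a threshold. The ExtractedField shape, the 0 to 1 confidence range, and the triage helper are assumptions made for illustration; check the README for the library's real output format.

```typescript
// Assumed per-field result shape; not taken from the llmparser source.
interface ExtractedField {
  value: string | number | boolean | null;
  source: string; // snippet of the input the value is attributed to
  confidence: number; // assumed to lie in [0, 1]
}

interface TriageResult {
  accepted: Record<string, ExtractedField["value"]>;
  needsReview: string[];
}

// Accept a field only if its source snippet occurs verbatim in the input text
// and its confidence is at or above the threshold; flag everything else.
function triage(
  text: string,
  fields: Record<string, ExtractedField>,
  minConfidence: number = 0.8
): TriageResult {
  const accepted: Record<string, ExtractedField["value"]> = {};
  const needsReview: string[] = [];
  for (const name of Object.keys(fields)) {
    const field = fields[name];
    const grounded = field.source !== "" && text.indexOf(field.source) !== -1;
    if (grounded && field.confidence >= minConfidence) {
      accepted[name] = field.value;
    } else {
      needsReview.push(name);
    }
  }
  return { accepted, needsReview };
}

// Usage with made-up extraction results.
const sampleText = "Ada Lovelace has 5 years of experience in analytics.";
const extracted: Record<string, ExtractedField> = {
  full_name: { value: "Ada Lovelace", source: "Ada Lovelace", confidence: 0.97 },
  years_experience: { value: 5, source: "5 years of experience", confidence: 0.55 },
  email: { value: "ada@example.com", source: "ada@example.com", confidence: 0.9 },
};
const triaged = triage(sampleText, extracted);
console.log(triaged);
```

Here full_name is accepted, years_experience is flagged for low confidence, and email is flagged because its claimed source does not occur in the input text.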

Maintenance & Community

Primarily maintained by kyang6.
No explicit community channels or roadmap links provided in the README.

Licensing & Compatibility

License: Not specified in the README.
Compatibility: Designed for server-side Node.js environments; client-side usage is discouraged due to API key exposure.

Limitations & Caveats

The library relies on an external LLM provider (e.g., OpenAI) and requires an API key, so usage incurs API costs. Extraction quality depends on the underlying LLM's capabilities and on how well the category and field descriptions are written. No license is specified, which may restrict commercial use.

Health Check

Last Commit: 3 years ago
Responsiveness: Inactive

Star History

0 stars in the last 30 days