langchain-extract by langchain-ai

FastAPI web server for LLM-powered data extraction

Created 1 year ago

1,178 stars

Top 32.9% on SourcePulse

Project Summary

This project provides a FastAPI web server for extracting structured information from text and files using Large Language Models (LLMs). It's designed as a reference implementation and starting point for developers building custom data extraction applications, offering a REST API, JSON schema definition for extraction targets, and support for few-shot examples to improve accuracy.

How It Works

The server leverages LangChain for LLM orchestration and FastAPI for its web framework. Extraction logic is defined via JSON schemas, allowing users to specify the desired output structure. The system supports incorporating few-shot examples, provided via a separate API endpoint, to guide the LLM and enhance the quality of extracted results. It stores extractors and examples in a PostgreSQL database.
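As a rough illustration of the schema-plus-examples idea described above, the sketch below defines an extraction target as a JSON schema and pairs it with one few-shot example. The schema title, field names, example text, and the missing_required helper are all hypothetical and are not taken from the project; the helper only checks required keys and is not a full JSON Schema validator.

```python
import json

# Hypothetical extraction target; the field names are illustrative only.
person_schema = {
    "title": "Person",
    "type": "object",
    "properties": {
        "name": {"type": "string", "description": "Full name of the person"},
        "age": {"type": "integer", "description": "Age in years, if stated"},
    },
    "required": ["name"],
}

# Hypothetical few-shot example: input text paired with the expected output.
few_shot_example = {
    "text": "Alice is 31 and works with Bob.",
    "output": [{"name": "Alice", "age": 31}, {"name": "Bob"}],
}


def missing_required(record: dict, schema: dict) -> list[str]:
    """Return the required keys that are absent from record.

    A minimal check only, not full JSON Schema validation.
    """
    return [key for key in schema.get("required", []) if key not in record]


# Confirm the example output is consistent with the schema before using it.
for record in few_shot_example["output"]:
    assert missing_required(record, person_schema) == []

# The schema serializes to plain JSON, which is the form a REST API would accept.
print(json.dumps(person_schema, indent=2))
```

Checking examples against the schema before submitting them is one way to avoid guiding the LLM with output that contradicts the structure being requested.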

Quick Start & Requirements

Install/Run: Run docker compose build, then docker compose up.
Prerequisites: OpenAI API key (required), Fireworks or Together API keys (optional for additional models).
Setup: Requires Docker. API key configuration via .local.env.
Docs: extract.langchain.com

Highlighted Details

REST API with OpenAPI documentation.
Supports extraction from text and binary files (e.g., HTML, PDF).
LangServe endpoint for integration with LangChain RemoteRunnable.
Ability to create, save, and manage extractors and examples in a database.
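To make the REST workflow more concrete, here is a minimal sketch that builds (but does not send) a request to register an extractor. The base URL, the /extractors path, the x-key header, and the payload field names are assumptions made for illustration; the server's OpenAPI documentation is the place to confirm the actual contract.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"  # assumed local address of the server


def build_create_extractor_request(
    user_id: str, name: str, schema: dict
) -> urllib.request.Request:
    """Build a POST request that would register an extractor.

    The path, header name, and payload keys are assumptions, not confirmed API.
    """
    payload = {
        "name": name,
        "description": "",   # assumed optional field
        "schema": schema,    # JSON schema describing the extraction target
        "instruction": "",   # assumed optional field
    }
    return urllib.request.Request(
        f"{BASE_URL}/extractors",  # assumed endpoint path
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "x-key": user_id,  # assumed header carrying the user ID
        },
        method="POST",
    )


request = build_create_extractor_request(
    user_id="00000000-0000-4000-8000-000000000000",  # placeholder user ID
    name="Person",
    schema={"type": "object", "properties": {"name": {"type": "string"}}},
)
print(request.get_method(), request.full_url)

# Sending is left commented out so the sketch runs without a live server:
# with urllib.request.urlopen(request) as response:
#     print(json.load(response))
```

Building the request separately from sending it keeps the sketch runnable offline and makes the assumed contract easy to adjust once the real endpoint details are known.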

Maintenance & Community

This project is under active development by LangChain AI. While pull requests are not currently accepted, feedback via issues and discussions is encouraged.

Licensing & Compatibility

The README does not explicitly state a license. Confirm the licensing terms in the repository before commercial use or closed-source linking.

Limitations & Caveats

The project is under active development, and breaking changes are expected between releases. The main branch should not be used directly; check out a release instead. User authentication is not implemented; access is controlled by a user ID generated via uuidgen.
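For environments where the uuidgen command is not available, a user ID of the same general shape can be produced with Python's standard library. This is a sketch under the assumption that any random UUID string is acceptable as the user ID; the new_user_id helper name is hypothetical.

```python
import uuid


def new_user_id() -> str:
    """Return a random (version 4) UUID as a 36-character string."""
    return str(uuid.uuid4())


user_id = new_user_id()
print(user_id)
```

Because there is no real authentication, anyone who holds this ID can presumably act as that user, so it seems prudent to treat it like a shared key rather than a public identifier.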

Explore Similar Projects

apicat by apicat

langchain-swift by buhe

workgpt by team-openpm

instructor-js by 567-labs

gpt4all-datalake by nomic-ai

llm-api-engine by developersdigest

openapi-mcp-server by janwilmake

kor by eyurtsev

ontogpt by monarch-initiative

rag-web-ui by rag-web-ui

firecrawl by firecrawl

private-gpt by zylon-ai