langchain-extract by langchain-ai

FastAPI web server for LLM-powered data extraction

created 1 year ago
1,148 stars

Top 34.3% on sourcepulse

Project Summary

This project provides a FastAPI web server for extracting structured information from text and files using Large Language Models (LLMs). It's designed as a reference implementation and starting point for developers building custom data extraction applications, offering a REST API, JSON schema definition for extraction targets, and support for few-shot examples to improve accuracy.

How It Works

The server leverages LangChain for LLM orchestration and FastAPI for its web framework. Extraction logic is defined via JSON schemas, allowing users to specify the desired output structure. The system supports incorporating few-shot examples, provided via a separate API endpoint, to guide the LLM and enhance the quality of extracted results. It stores extractors and examples in a PostgreSQL database.
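As a rough illustration of the schema-driven approach described above, the Python sketch below builds a JSON Schema for a hypothetical extraction target and wraps it in a request body. The Person schema, the helper build_extractor_payload, and the payload field names (name, description, schema, instruction) are assumptions made for illustration, not the server's documented API; the project's OpenAPI docs are the authority on the real request shape.

```python
import json

# Hypothetical extraction target: people mentioned in a piece of text.
person_schema = {
    "title": "Person",
    "type": "object",
    "properties": {
        "name": {"type": "string", "description": "Full name of the person"},
        "age": {"type": "integer", "description": "Age in years, if stated"},
    },
    "required": ["name"],
}


def build_extractor_payload(name, description, schema, instruction=""):
    """Assemble a request body describing an extractor.

    The field names used here are assumed for illustration only.
    """
    if schema.get("type") != "object":
        raise ValueError("extraction schema should describe a JSON object")
    return {
        "name": name,
        "description": description,
        "schema": schema,
        "instruction": instruction,
    }


payload = build_extractor_payload(
    name="people",
    description="Extract people mentioned in the text",
    schema=person_schema,
    instruction="Only extract people who are named explicitly.",
)

# Serialize to the JSON string that would be sent as the request body.
body = json.dumps(payload, indent=2)
print(body)
```

Keeping the schema as plain JSON means the same definition can be stored in the database, shown in API docs, and handed to the LLM without translation.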

Quick Start & Requirements

  • Install/Run: Run docker compose build, then docker compose up.
  • Prerequisites: OpenAI API key (required), Fireworks or Together API keys (optional for additional models).
  • Setup: Requires Docker. API keys are configured in a .local.env file.
  • Docs: extract.langchain.com
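Before starting the containers, it can help to confirm that the env file actually contains the required key. The sketch below is a minimal pre-flight check in Python, assuming .local.env uses plain KEY=value lines and that the OpenAI key is named OPENAI_API_KEY. The optional key names FIREWORKS_API_KEY and TOGETHER_API_KEY are guesses, and parse_env and missing_keys are hypothetical helpers, not part of the project.

```python
from pathlib import Path

REQUIRED_KEYS = {"OPENAI_API_KEY"}
# Assumed names for the optional providers; check the project docs.
OPTIONAL_KEYS = {"FIREWORKS_API_KEY", "TOGETHER_API_KEY"}


def parse_env(text):
    """Parse simple KEY=value lines, skipping blanks and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(env_text, required=REQUIRED_KEYS):
    """Return the required keys that are absent or empty, sorted by name."""
    values = parse_env(env_text)
    return sorted(key for key in required if not values.get(key))


if __name__ == "__main__":
    env_file = Path(".local.env")
    text = env_file.read_text() if env_file.exists() else ""
    missing = missing_keys(text)
    if missing:
        print("Missing required keys:", ", ".join(missing))
    else:
        print("All required keys are set.")
```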

Highlighted Details

  • REST API with OpenAPI documentation.
  • Supports extraction from text and binary files (e.g., HTML, PDF).
  • LangServe endpoint for integration with LangChain RemoteRunnable.
  • Ability to create, save, and manage extractors and examples in a database.

Maintenance & Community

This project is under active development by LangChain AI. While pull requests are not currently accepted, feedback via issues and discussions is encouraged.

Licensing & Compatibility

The provided README does not explicitly state a license. Check the repository's licensing terms before using the project commercially or linking it into closed-source software.

Limitations & Caveats

The project is under active development, and breaking changes are expected between releases. Do not use the main branch directly; check out a tagged release instead. User authentication is not implemented: access is controlled only by a user ID generated with uuidgen.
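Because the user ID is the only thing separating one user's data from another's, it is worth generating and handling it deliberately. The Python sketch below produces an ID equivalent to what uuidgen emits and attaches it to a request header. The header name x-key and the helpers new_user_id and request_headers are assumptions for illustration; confirm how the server expects the ID to be sent in its OpenAPI docs.

```python
import uuid


def new_user_id():
    """Generate a random (version 4) UUID string, much like `uuidgen`."""
    return str(uuid.uuid4())


def request_headers(user_id, header_name="x-key"):
    """Build headers carrying the user ID.

    The default header name is an assumption, not a confirmed API detail.
    """
    # uuid.UUID raises ValueError if the string is not a well-formed UUID.
    uuid.UUID(user_id)
    return {header_name: user_id}


user_id = new_user_id()
headers = request_headers(user_id)
print(headers)
```

Anyone who learns the ID can act as that user, so it should be stored like a credential rather than logged or shared.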

Health Check

  • Last commit: 1 month ago
  • Responsiveness: 1 day
  • Pull Requests (30d): 0
  • Issues (30d): 0

Star History

26 stars in the last 90 days
