tensorlake  by tensorlakeai

Serverless platform for agentic AI applications and document intelligence

Created 1 year ago
875 stars

Top 41.2% on SourcePulse

Project Summary

Tensorlake provides a serverless platform and Document Ingestion API designed to accelerate data extraction from documents and simplify the deployment of scalable, durable agentic applications and AI workflows. It targets developers and data engineers seeking to build applications that process unstructured data efficiently and deploy complex AI agents without managing underlying infrastructure. The primary benefit is rapid development and deployment of high-throughput, resilient data processing pipelines and AI-driven workflows.

How It Works

The platform comprises two core components: a Document Ingestion API and an Agentic Runtime. The Document Ingestion API leverages state-of-the-art layout detection and table recognition models to parse various document formats (PDFs, DOCX, spreadsheets, images, text) into Markdown or extract structured data using defined schemas. It offers extensive customization for parsing, including table output modes and summarization. The Agentic Runtime provides a serverless, durable execution environment for Python-based agentic applications. It features sandboxed compute infrastructure that scales automatically, eliminating the need for developers to manage queues, background jobs, or retry logic.
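To make concrete what "retry logic" means here, the following is a minimal sketch of the kind of hand-written retry wrapper that a durable runtime is described as making unnecessary. It is not Tensorlake code and does not use the Tensorlake SDK; the name `run_with_retries` and its parameters are hypothetical, chosen only for illustration.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def run_with_retries(
    task: Callable[[], T],
    max_attempts: int = 3,
    base_delay: float = 0.1,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `task`, retrying with exponential backoff when it raises.

    Hypothetical helper for illustration only; not part of any Tensorlake API.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    attempt = 1
    while True:
        try:
            return task()
        except Exception:
            # Give up and re-raise once the attempt budget is exhausted.
            if attempt == max_attempts:
                raise
            # Wait base_delay, then 2x, then 4x, ... before the next attempt.
            sleep(base_delay * 2 ** (attempt - 1))
            attempt += 1
```

Without a durable runtime, a wrapper like this usually sits alongside a queue and a background worker, and progress is lost if the process dies between attempts. The summary above suggests the Agentic Runtime takes over those concerns.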

Quick Start & Requirements

Highlighted Details

  • Document Ingestion features state-of-the-art layout detection and table recognition models, with benchmarks available.
  • Agentic Applications offer durable execution, sandboxed compute, and automatic scaling, abstracting away infrastructure concerns.
  • Customizable parsing options include strike-through detection, table output modes (e.g., HTML), figure/table summarization, and signature detection.
  • Structured data extraction supports both Pydantic models and JSON schemas for precise field retrieval.
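As an illustration of schema-driven extraction, the sketch below defines a small JSON schema and checks an extracted record against it using only the Python standard library. The invoice fields, `INVOICE_SCHEMA`, and `check_record` are hypothetical; the summary does not show how a schema is passed to the Document Ingestion API, so no Tensorlake calls appear here, and the checker covers only a small subset of JSON Schema.

```python
from typing import Any

# Hypothetical schema: the field names are illustrative, not taken from Tensorlake.
INVOICE_SCHEMA: dict[str, Any] = {
    "type": "object",
    "properties": {
        "invoice_number": {"type": "string"},
        "total": {"type": "number"},
        "currency": {"type": "string"},
    },
    "required": ["invoice_number", "total"],
}

# Mapping from JSON Schema type names to Python types (subset only).
_JSON_TYPES: dict[str, Any] = {
    "string": str,
    "number": (int, float),
    "boolean": bool,
    "object": dict,
    "array": list,
}


def _matches(value: Any, json_type: str) -> bool:
    """Return True if `value` has the Python type corresponding to `json_type`."""
    # bool is a subclass of int in Python, so exclude it from non-boolean types.
    if isinstance(value, bool) and json_type != "boolean":
        return False
    return isinstance(value, _JSON_TYPES[json_type])


def check_record(record: dict[str, Any], schema: dict[str, Any]) -> list[str]:
    """Return a list of problems; an empty list means the record fits the schema."""
    problems = []
    for name in schema.get("required", []):
        if name not in record:
            problems.append(f"missing required field: {name}")
    for name, spec in schema.get("properties", {}).items():
        if name in record and not _matches(record[name], spec["type"]):
            actual = type(record[name]).__name__
            problems.append(f"field {name}: expected {spec['type']}, got {actual}")
    return problems
```

A check like this is useful after extraction regardless of which schema format is submitted, since it catches missing or mistyped fields before they reach downstream code.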

Maintenance & Community

A Slack community channel is available for support and discussion. The README does not specify notable contributors, sponsorships, or a public roadmap.

Licensing & Compatibility

The license type is indicated via a link to a LICENSE file in the repository. Specific compatibility notes for commercial use or closed-source linking are not detailed in the README.

Limitations & Caveats

The example code snippets require the OPENAI_API_KEY environment variable to be set. The platform appears to be primarily a cloud-based offering. The README does not explicitly list detailed performance benchmarks or unsupported document types.
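Because the examples depend on OPENAI_API_KEY, a short pre-flight check can fail fast with a readable message instead of an authentication error deep inside a run. This is a minimal sketch; `require_env` is a hypothetical helper, not part of the project.

```python
import os


def require_env(name: str) -> str:
    """Return the value of environment variable `name`, or raise if it is unset or blank."""
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(f"{name} is not set; export it before running the examples")
    return value
```

Calling `require_env("OPENAI_API_KEY")` at the top of a script surfaces a missing key before any work starts.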

Health Check

  • Last Commit: 4 days ago
  • Responsiveness: Inactive
  • Pull Requests (30d): 18
  • Issues (30d): 9
  • Star History: 12 stars in the last 30 days
