indexify by tensorlakeai

Engine for data-intensive generative AI apps

Created 2 years ago
1,052 stars

Top 35.8% on SourcePulse

View on GitHub
Project Summary

Indexify is an open-source serving engine designed to simplify the creation and deployment of durable, multi-stage data processing workflows for generative AI applications. It targets developers building data-intensive applications, enabling them to orchestrate complex data pipelines involving tasks like document extraction, embedding, and retrieval across distributed compute resources.

How It Works

Indexify utilizes a graph-based approach where workflows are defined as directed acyclic graphs (DAGs) of functions. Each function, decorated with @tensorlake_function, represents a unit of compute that can be independently deployed and scaled. The engine supports dynamic routing, allowing data to be routed to specialized compute functions based on conditional logic. It also features automatic retries and load balancing for functions, ensuring durability and efficient resource utilization across heterogeneous hardware (CPU/GPU).
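
As a rough illustration, a minimal two-stage workflow might look like the sketch below. The @tensorlake_function decorator is the one named above; the Graph class, the add_edge wiring, and the function names are assumptions based on typical usage of the tensorlake SDK and may not match the current API exactly.

    from typing import List

    from tensorlake import Graph, tensorlake_function  # assumed import path

    @tensorlake_function()  # each function is an independently deployable, retryable unit of compute
    def generate_numbers(count: int) -> List[int]:
        return list(range(count))

    @tensorlake_function()  # downstream stage; can be scheduled on different hardware
    def square(x: int) -> int:
        return x * x

    # Wire the functions into a directed acyclic graph: generate_numbers -> square.
    g = Graph(name="example_pipeline", start_node=generate_numbers)
    g.add_edge(generate_numbers, square)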

Quick Start & Requirements

  • Install via pip: pip install indexify tensorlake
  • Requires Python 3.7+
  • Local testing requires only the tensorlake package (see the sketch after this list).
  • Deployment involves running the indexify-server and indexify-cli executor components.
  • Official documentation and examples are available.
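
Continuing the sketch above, a local test run with only the tensorlake package installed might look roughly like this; g.run and g.output are assumed method names for invoking the graph and reading a stage's outputs, so check the official documentation for the exact calls.

    # Run the example graph locally; no indexify-server or executor is needed for testing.
    invocation_id = g.run(count=10)               # start an invocation with the graph's input
    results = g.output(invocation_id, "square")   # read the outputs produced by the "square" stage
    print(results)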

Highlighted Details

  • Supports multi-cloud and multi-region deployment for compute resources.
  • Enables distributed processing and fine-grained resource allocation (CPU/GPU).
  • Functions are durable and automatically retried on failure.
  • Workflows can be exposed as HTTP APIs or Python Remote APIs (see the sketch after this list).
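
As a hedged sketch of the remote path, deploying a graph so it can be called as a Python Remote API might look like the following; RemoteGraph, its deploy/by_name helpers, and the block_until_done flag are assumptions about the SDK surface, shown only to illustrate the idea.

    from tensorlake import RemoteGraph  # assumed import

    # Deploy the locally defined graph to a running indexify-server, making it remotely callable.
    RemoteGraph.deploy(g)

    # Later, invoke it by name from any client process.
    remote = RemoteGraph.by_name("example_pipeline")
    invocation_id = remote.run(block_until_done=True, count=10)
    print(remote.output(invocation_id, "square"))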

Maintenance & Community

Indexify is the core engine powering Tensorlake's Serverless Workflow Engine. Community channels are not explicitly listed in the README.

Licensing & Compatibility

The project is licensed under the Apache License 2.0, which permits commercial use and linking with closed-source applications.

Limitations & Caveats

The project is described as the "Open-Source core compute engine," suggesting potential differences or additional features in Tensorlake's commercial offering. The roadmap indicates features like cyclic graph support and ephemeral graphs are still under development.

Health Check

  • Last Commit: 1 day ago
  • Responsiveness: 1 day
  • Pull Requests (30d): 37
  • Issues (30d): 7
  • Star History: 6 stars in the last 30 days

Explore Similar Projects

Starred by Chip Huyen (author of "AI Engineering", "Designing Machine Learning Systems"), Luis Capelo (cofounder of Lightning AI), and 3 more.

  • LitServe by Lightning-AI: AI inference pipeline framework. 4k stars (0.3%). Created 1 year ago, updated 1 day ago.

Starred by Carol Willing (core contributor to CPython, Jupyter), Chip Huyen (author of "AI Engineering", "Designing Machine Learning Systems"), and 9 more.

  • dynamo by ai-dynamo: Inference framework for distributed generative AI model serving. 5k stars (1.0%). Created 6 months ago, updated 15 hours ago.