txtai by neuml

All-in-one AI framework for semantic search, LLM orchestration, and language model workflows

Created 5 years ago

12,210 stars

Top 4.1% on SourcePulse

16 Experts Love This Project

Tobi Lutke

Cofounder of Shopify

Chip Huyen

Author of "AI Engineering", "Designing Machine Learning Systems"

and 12 more!

Project Summary

txtai is an all-in-one AI framework for semantic search, LLM orchestration, and language model workflows. It targets developers and researchers building applications such as autonomous agents, retrieval-augmented generation (RAG) systems, and multi-model pipelines. Its core benefit is a unified interface to these capabilities, so an application can combine search, LLMs, and workflows without stitching together separate tools.

How It Works

txtai's foundation is an embeddings database that combines vector indexes (sparse and dense), graph networks, and relational databases. This architecture enables powerful vector search and acts as a knowledge source for LLM applications. It supports creating embeddings for diverse data types (text, audio, images, video) and orchestrating complex tasks through pipelines and workflows, which can be chained together and aggregated with business logic. Agents built on this framework can autonomously solve problems by connecting these components.
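The idea of combining sparse and dense indexes can be illustrated with a small standard-library sketch. This is a conceptual toy, not txtai's implementation: the hashed character-trigram "embedding", the term-overlap score, and every function name here (`dense_vector`, `sparse_score`, `hybrid_search`) are assumptions made for illustration only.

```python
import math
import zlib


def dense_vector(text, dims=64):
    """Toy 'dense' embedding: hashed character-trigram counts (stand-in for a model)."""
    vector = [0.0] * dims
    lowered = text.lower()
    for i in range(len(lowered) - 2):
        bucket = zlib.crc32(lowered[i:i + 3].encode("utf-8")) % dims
        vector[bucket] += 1.0
    return vector


def cosine(a, b):
    """Cosine similarity; returns 0.0 when either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def sparse_score(query, document):
    """Toy 'sparse' score: fraction of distinct query terms found in the document."""
    query_terms = set(query.lower().split())
    document_terms = set(document.lower().split())
    if not query_terms:
        return 0.0
    return len(query_terms & document_terms) / len(query_terms)


def hybrid_search(query, documents, limit=3, dense_weight=0.5):
    """Rank documents by a weighted mix of dense and sparse scores."""
    query_vector = dense_vector(query)
    scored = []
    for index, document in enumerate(documents):
        dense = cosine(query_vector, dense_vector(document))
        sparse = sparse_score(query, document)
        scored.append((index, dense_weight * dense + (1 - dense_weight) * sparse))
    # Highest combined score first, as (document index, score) pairs
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:limit]
```

A real embeddings database would replace the toy vector with a trained model and the overlap score with a proper sparse index such as BM25; the weighted combination is the part this sketch is meant to show.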

Quick Start & Requirements

Install via pip: pip install txtai
Requires Python 3.10+.
Official documentation: https://github.com/neuml/txtai#documentation
Example notebooks: https://github.com/neuml/txtai/tree/main/examples
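After installation, a first semantic search might look like the sketch below. It assumes txtai is installed and that a default embedding model can be downloaded on first use; the wrapper function `index_and_search` is a hypothetical helper for this example, not part of txtai.

```python
def index_and_search(documents, query, limit=1):
    """Index a list of strings and return the top matches as (id, score) tuples."""
    # Imported lazily so this module still loads where txtai is not installed.
    import txtai

    # Default configuration; an embedding model is fetched on first use.
    embeddings = txtai.Embeddings()

    # Build the index from plain strings; ids default to list positions.
    embeddings.index(documents)

    # Return up to `limit` results ranked by similarity to the query.
    return embeddings.search(query, limit)
```

Calling index_and_search(["Correct", "Not what we hoped"], "positive") should return the id and similarity score of the closest document. Exact scores depend on the model in use.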

Highlighted Details

Supports semantic search with SQL, object storage, topic modeling, graph analysis, and multimodal indexing.
Offers pipelines for LLM prompts, QA, labeling, transcription, translation, and summarization.
Enables agent-based autonomous problem-solving using the smolagents framework.
Provides Web and Model Context Protocol (MCP) APIs with bindings for JavaScript, Java, Rust, and Go.
Includes default configurations for rapid setup and can scale via container orchestration.
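The way pipelines can be chained into a workflow can be sketched with plain Python callables. This is an analogue of the concept, not txtai's workflow API: `run_workflow`, `normalize`, and `first_words` are hypothetical names, and the truncation step only stands in for a real pipeline such as summarization.

```python
from typing import Callable, Iterable, List

# A task takes a batch of strings and returns a transformed batch.
Task = Callable[[List[str]], List[str]]


def run_workflow(tasks: Iterable[Task], items: Iterable[str]) -> List[str]:
    """Pass a batch through each task in order, feeding outputs forward."""
    batch = list(items)
    for task in tasks:
        batch = task(batch)
    return batch


def normalize(batch: List[str]) -> List[str]:
    """Collapse whitespace and lowercase each item."""
    return [" ".join(text.split()).lower() for text in batch]


def first_words(batch: List[str], limit: int = 3) -> List[str]:
    """Keep the first `limit` words (placeholder for a summarization step)."""
    return [" ".join(text.split()[:limit]) for text in batch]
```

Each task sees the whole batch, so business logic such as filtering or aggregation can be added as just another callable in the list.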

Maintenance & Community

Actively maintained with regular releases (the project references versions 8.0 and 7.0, among others).
Extensive example notebooks and tutorials available on dev.to and Hashnode.
Contribution guide provided for community involvement.

Licensing & Compatibility

Licensed under the Apache 2.0 license.
Models recommended for commercial use are available.
Works with LLM frameworks such as llama.cpp and LiteLLM.

Limitations & Caveats

txtai requires Python 3.10 or later. It ships with many default models, but specific features and advanced use cases may require installing optional dependencies. The breadth of features also means there is a learning curve before all capabilities can be used effectively.

Health Check

Last commit: 1 day ago
Responsiveness: 1 day

Star History

167 stars in the last 30 days